Language models are central to most AI applications and tools, but they can be huge and require large amounts of complex and expensive compute power to run. Most of the well-known tools such as ChatGPT, Gemini and Copilot use large language models that contain hundreds of billions, or even trillions, of parameters. GPT-4, which powers the latest version of ChatGPT, is rumoured to have around 100 trillion.

What is a parameter and why does it matter? Parameters are the learned values a language model uses to interpret input, combine words into sentences and solve problems. They are what the model relies on to determine the best output for a given input. Generally, the more parameters a model has, the more accurate its output can be. The trade-off is that more parameters also mean more compute power is needed to run the model.

This is where small language models, or SLMs, can help. Small language models often still contain billions of parameters, but they are small enough to run on far less powerful devices. For example, you can take Microsoft’s latest Phi model and run it locally on a MacBook Pro.

Why is this useful? Let’s say you want to run a model on a small edge device to interpret data from a bunch of sensors on a piece of machinery and predict maintenance schedules. The device is running in a remote location with no internet connection, which prevents you from connecting to one of the cloud-hosted LLMs. By downloading an SLM and training it on a customised dataset, you can have this running independently and with minimal hardware costs. Another use case is an internal chatbot where you do not want to send your data to a public cloud platform but instead want to run it on your own hardware.

Over the next few weeks I am going to be testing and using some of these small language models to try and solve a few problems, and I will be posting my progress here.

#ai #llm #slm #languagemodels #artificialintelligence
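To give a feel for how little is involved in trying one of these models locally, here is a minimal Python sketch using the Hugging Face transformers library. The model name, prompt and token limit are my own illustrative assumptions, not a prescribed setup:

from transformers import pipeline

# Download a small instruct model once, then run it entirely on the local machine.
# "microsoft/Phi-3-mini-4k-instruct" is used here as an assumed example checkpoint.
generator = pipeline("text-generation", model="microsoft/Phi-3-mini-4k-instruct")

# Hypothetical predictive-maintenance style prompt for the edge-device scenario above.
prompt = "Vibration and temperature readings on a pump have been rising for 48 hours. When should maintenance be scheduled?"

result = generator(prompt, max_new_tokens=100)
print(result[0]["generated_text"])

The same approach applies to the edge-device scenario, with the model fine-tuned on your own sensor data before it is deployed offline.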
Hey Ben, I've been interested myself in how to leverage AI to assist with data engineering tasks, and have started thinking about how it could help generate the metadata around common data engineering objects. AI is opening up some interesting options and different ways of working...
Very informative