Phones Will Drive Usage of Small Language Models Such As OpenAI’s GPT-4o mini
Not all natural language use cases require the power and capability of the latest and greatest large language model (LLM). In fact, small language models that excel at discrete tasks may be the order of the day in many cases.
We recently wrote about RouteLLM, which directs “traffic” across a series of language models and agents, each focused on a specific task (for example, writing portions of code, checking code, or optimizing how models and agents are deployed).
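To make the routing idea concrete, here is a minimal sketch of a dispatcher that sends each request to a task-specific model. It is not RouteLLM's actual API or algorithm: the keyword matching is a deliberately simple stand-in for a real routing policy, and every name in it (`Route`, `pick_route`, `dispatch`, and the model names) is hypothetical. The handlers are stubs standing in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Route:
    name: str                       # hypothetical model/agent name
    keywords: Tuple[str, ...]       # words that trigger this route
    handler: Callable[[str], str]   # stand-in for a real model call


def make_stub(name: str) -> Callable[[str], str]:
    """Return a fake handler that only tags the prompt with the model name."""
    def handler(prompt: str) -> str:
        return f"[{name}] {prompt}"
    return handler


# Task-specific small models (all hypothetical), checked in order.
ROUTES = [
    Route("code-writer", ("write", "implement"), make_stub("code-writer")),
    Route("code-checker", ("review", "check"), make_stub("code-checker")),
]

# Larger general-purpose model used when no specialist matches.
FALLBACK = Route("general-llm", (), make_stub("general-llm"))


def pick_route(prompt: str) -> Route:
    """Pick the first route whose keywords appear in the prompt."""
    words = set(prompt.lower().split())
    for route in ROUTES:
        if words & set(route.keywords):
            return route
    return FALLBACK


def dispatch(prompt: str) -> str:
    """Send the prompt to whichever model the router selected."""
    return pick_route(prompt).handler(prompt)


if __name__ == "__main__":
    print(dispatch("Please review this function"))
    print(dispatch("What is the capital of France"))
```

A production router would more likely use a learned classifier or a cost and quality estimate in place of keyword matching, but the shape is the same: a cheap decision up front, then a call to the smallest model that can handle the task.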
My view is that phones will rely on a number of small language models, each with a small footprint. Those small footprints will let users engage with various on-device natural language AI agents. On-device models will operate without the network latency that one finds when engaging with models deployed in the cloud or at the edge of the network.
OpenAI’s recently announced partnership with Apple will not mark the first time that small AI models have been deployed on-device. As we have previously written, Google has deployed small AI models on its Pixel phones for years.
The new small language models developed by OpenAI and others will make Google Assistant, Siri, and other AI agents operate as intended.
OpenAI’s latest model, GPT-4o mini, outperforms OpenAI’s GPT-3.5 Turbo and supports the same range of languages as GPT-4o, but at significantly lower cost. Small, sophisticated models such as GPT-4o mini will be optimal for on-device usage once deployed on phones and tablets as natural-language-powered AI agents and applications.
Smaller, sophisticated, less expensive natural language models such as GPT-4o mini will make it easier for developers to experiment with and create new natural language applications and services.




