Insight by Red Hat
AI & Data Exchange 2025: Red Hat’s Ben Cushing on value of thinking small when it comes to AI language models
Small language models, hyper-focused on specific domains, can help agencies make fast, smart and cost-effective use of AI, the Red Hat systems architect says.
While the whole of the IT world might seem fixated on talking about large language models, Ben Cushing wants to shift some of that focus to small language models.
Why? Because small models give agencies a practical way to derive value from artificial intelligence, said Cushing, chief architect for health and life sciences at Red Hat.
Red Hat has focused R&D on small models in part because they are “not dependent on expensive hardware both for the inference and training,” Cushing said during Federal News Network’s AI and Data Exchange.
That makes them easier to train, test and scale — a bonus for agencies looking to take quick advantage of AI tools against their federal data stores.
“They are small in terms of the intensity of compute they need. The total number of parameters that make up each model is significantly diminished,” he said.
The IBM Granite family of models that Red Hat works with encompasses 1 billion to 32 billion parameters, “which is significantly smaller than the type of large language models we see in the news.”
Many well-known LLMs rely on trillions of parameters.
What’s the advantage of small language models?
Cushing said small language models bring what he called domain specificity, delivering faster and more reliable results for narrow fields of inquiry than LLMs typically can.
He pointed to the health field for an example of how an SLM could be valuable.
Suppose a large language model produces the name of a particular medication. A medical practitioner might then enter that medication into a patient health record. In that example, “there’s really no step in there for the validation of that output or entering it into the electronic health record — besides the electronic health record actually accepting it,” he said.
By contrast, a small language model could be trained deeply in just that specific medication. It could “judge whether that medication is sufficient and well-formed and can be submitted to the electronic health record,” Cushing explained, meaning the model could gauge whether the output is correct for highly specific situations and uses of the medication.
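To make the pattern concrete, here is a minimal, purely illustrative Python sketch of that validation step. The function names, the medication string and the checks are invented for illustration and do not describe any specific Red Hat or IBM interface: a large general model proposes an order, a narrowly trained small model judges it, and only a passing order reaches the record system.

    def general_model_suggest(prompt: str) -> str:
        # Stand-in for a call to a large, general-purpose language model.
        return "metformin 500 mg oral tablet, twice daily"

    def medication_model_judge(candidate: str) -> dict:
        # Stand-in for a small model trained narrowly on medication orders.
        # It returns a structured verdict rather than free text, so the output
        # can be checked before anything touches the health record.
        well_formed = all(token in candidate for token in ("mg", "daily"))
        return {"valid": well_formed,
                "reason": None if well_formed else "missing dose or frequency"}

    def submit_to_ehr(order: str) -> None:
        # Placeholder for the electronic health record's intake step.
        print(f"EHR accepted order: {order}")

    suggestion = general_model_suggest("Suggest a first-line therapy for type 2 diabetes.")
    verdict = medication_model_judge(suggestion)
    if verdict["valid"]:
        submit_to_ehr(suggestion)
    else:
        print(f"Order rejected: {verdict['reason']}")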
Cushing said he envisions organizations deploying thousands of SLMs. “Imagine thousands of these little, tiny worker models able to create a collective capability that surpasses the large language model itself because each one is so good at its particular task,” he said.
What’s the “Goldilocks” size for an AI model?
The problem with LLMs is the largeness itself, meaning the range of data on which they’re trained might be overly broad for the task at hand.
A researcher might study a molecule’s efficacy as a new way to treat a disease, for instance. In such a case, Cushing said, “You don’t need the model doing that work to also know lots about automobiles. We don’t need all that knowledge when we’re trying to hyper focus on a particular domain.”
He envisions large numbers of hyper-focused SLMs operating in tandem with LLMs in a connected system that routes requests to the correct model based on need and desired outcomes.
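A sketch of that routing layer might look like the following. The domains, keyword matching and model stand-ins below are assumptions made purely for illustration; in practice the dispatch could be another model or a rules engine, and the specialists could be any locally hosted small models.

    from typing import Callable

    # Hypothetical registry of narrow, domain-specific models, keyed by domain keyword.
    SPECIALISTS: dict[str, Callable[[str], str]] = {
        "medication": lambda query: "[medication SLM] " + query,
        "molecule": lambda query: "[chemistry SLM] " + query,
    }

    def general_llm(query: str) -> str:
        # Stand-in for the large general-purpose model used as the fallback.
        return "[general LLM] " + query

    def route(query: str) -> str:
        # Send the request to the first specialist whose domain appears in the
        # query; anything that does not match falls through to the large model.
        for keyword, model in SPECIALISTS.items():
            if keyword in query.lower():
                return model(query)
        return general_llm(query)

    print(route("Is this medication order well-formed?"))
    print(route("Summarize the agency's travel policy."))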
Domain-specific small language models, because they use less data and computing capacity than their larger counterparts, can also help agencies avoid ballooning cloud costs, he pointed out. Instead, the SLMs can run in on-premises systems or data centers in which agencies have already sunk capital costs.
“This is where domain-aligned models really come into play,” he said, “because we can bring them into these environments where you may not have the best hardware, but you can still achieve your mission and still make a dramatic impact.”