SLM vs. LLM: Rightsize data architecture to optimize AI use
It doesn't need to be a binary choice. Enterprises are warming up to smaller AI models to meet compliance and cost needs while reserving large models for complex jobs.
A one‑size‑fits‑all approach to AI models in data-driven applications risks leaving enterprises paying more, waiting longer for analytics results and taking on avoidable business risk.
For data and analytics leaders, the era of "bigger is better" in AI applications is giving way to right-sizing strategies that assign small language models (SLMs) to narrow, repeatable tasks and reserve large language models (LLMs) for complex reasoning workloads. The advantages of SLMs include easier governance, lower run-rate costs and the ability to run on-premises, which is particularly attractive to regulated industries where compliance and data privacy are paramount. Analyst outlooks and vendor guidance show organizations increasingly moving toward this mixed-model configuration.
"There is a space for smaller models to be utilized more," said FTI Consulting managing director and chief data scientist Dimitris Korres. He noted that organizations "can use a mix of large and small models to reduce costs and get better, faster outputs for certain tasks."
Small models are on the rise
This move to rightsize models -- making them fit for purpose rather than defaulting to the largest available model -- emerged as a viable strategy in 2024. It gained further attention when Nvidia researchers published a September 2025 paper titled "Small Language Models are the Future of Agentic AI."
In it, they argued that SLMs are "sufficiently powerful, inherently more suitable, and necessarily more economical" for many agentic AI workloads and recommended the use of heterogeneous agents that send low-level tasks to an SLM and only escalate more advanced work to an LLM.
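The heterogeneous-agent pattern the researchers describe can be sketched as a simple router that defaults to a small model and escalates only when a task looks complex. This is an illustrative sketch, not the paper's implementation; the endpoint names and the keyword heuristic are assumptions for demonstration.

```python
# Minimal sketch of SLM-first routing with LLM escalation.
# Endpoints and the complexity heuristic are illustrative placeholders.

SLM_ENDPOINT = "local-slm"   # e.g., a small model served on-premises
LLM_ENDPOINT = "cloud-llm"   # e.g., a frontier model behind a cloud API

# Phrases that suggest multi-step reasoning beyond a small model's scope.
ESCALATION_HINTS = ("plan", "multi-step", "analyze tradeoffs", "reason about")

def route(task: str) -> str:
    """Return which model endpoint should handle this task."""
    text = task.lower()
    if any(hint in text for hint in ESCALATION_HINTS):
        return LLM_ENDPOINT   # complex reasoning -> escalate to large model
    return SLM_ENDPOINT       # default: cheap, fast small model

print(route("Summarize these meeting notes"))         # -> local-slm
print(route("Plan a multi-step migration strategy"))  # -> cloud-llm
```

In practice, the escalation signal might come from a classifier or from the small model's own confidence rather than keywords, but the control flow is the same: try small first, escalate on demand.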
Interest in SLMs is still relatively limited, as most organizations remain focused on how to use the dominant LLMs to transform their workflows, products and services.
Yet, there is evidence of increasing use of SLMs, signifying a strategic shift toward task‑specific models. Gartner projects that by 2027, organizations will use SLMs three times more than general‑purpose LLMs. A 2025 SNS Insider report valued the global SLM market at $7.9 billion in 2023, predicting it will reach $29.64 billion by 2032.
Map tasks to the model size
LLMs have tens of billions to hundreds of billions of parameters. That scale is what makes them valuable, enabling them to handle a broad range of complex tasks. But it also makes them resource-intensive, requiring massive amounts of compute power. As a result, many enterprises access LLMs through a cloud provider's endpoint rather than deploying them internally.
SLMs typically max out at 10 billion parameters and usually have far fewer -- in some cases, only millions. The smaller size of SLMs often makes them faster to serve up results and more feasible to deploy on-premises or in edge computing environments for specialized workloads.
SLMs have several advantages compared to LLMs:
- Fit for purpose. Due to their size and speed, SLMs are ideal for simpler work, such as document sorting and summarizing meeting notes.
- Better responsiveness. Smaller models can return answers faster on focused tasks because they require fewer computations.
- Lower costs. Nvidia claims AI inference can be 10 to 30 times less expensive with SLMs than with LLMs if the hardware is fully utilized. Using SLMs can also help avoid the higher energy consumption and costs associated with LLMs.
- Private by design. SLMs are especially suited for working with sensitive data -- for example, processing legal contracts, medical records or financial data -- because they can run on in-house servers or edge devices. This keeps the data within the organization's environment rather than in a public cloud.
- Consistent results. SLMs are generally easier to adjust for predictable outputs.
For data leaders, the takeaway is simple: Start small and then move up to an LLM when needed. This approach helps companies maintain service quality while keeping runtime costs predictable. For high‑volume automation, using SLMs provides steadier spending than running large, cloud‑dependent models.
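The cost argument for high-volume automation can be made concrete with a back-of-the-envelope estimate. The prices and request volumes below are illustrative assumptions, not quoted rates; the roughly 20x gap simply sits in the middle of Nvidia's claimed 10-to-30x range.

```python
# Back-of-the-envelope monthly spend comparison for high-volume automation.
# All prices and volumes are assumed for illustration, not real quotes.

LLM_COST_PER_1K_TOKENS = 0.010   # assumed cloud LLM price, USD
SLM_COST_PER_1K_TOKENS = 0.0005  # ~20x cheaper, mid-range of the 10-30x claim

def monthly_cost(requests: int, tokens_per_request: int, price_per_1k: float) -> float:
    """Estimated monthly spend for a given request volume."""
    return requests * tokens_per_request / 1000 * price_per_1k

# 1 million document-sorting requests per month at ~500 tokens each
llm = monthly_cost(1_000_000, 500, LLM_COST_PER_1K_TOKENS)
slm = monthly_cost(1_000_000, 500, SLM_COST_PER_1K_TOKENS)
print(f"LLM: ${llm:,.0f}/mo  SLM: ${slm:,.0f}/mo")  # LLM: $5,000/mo  SLM: $250/mo
```

The absolute numbers matter less than the shape of the curve: at high request volumes, per-token savings compound into a large, predictable gap in run-rate costs.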
Such factors have prompted Zach Rossmiller, CIO at the University of Montana, to consider where smaller models might be a better fit than LLMs.
"It comes down to use cases for us," he said.
He has looked at using SLMs for workloads that keep sensitive data protected in on-premises systems and to support a possible digital tutoring service running on a Raspberry Pi or other end-user devices for students in rural areas that lack reliable high-speed internet access.
Rossmiller said SLMs seem like a good choice for specific use cases like those, where factors such as data privacy and connectivity concerns may prohibit the use of LLMs.
Too many small models might lead to slowdowns
Holger Mueller, principal analyst and vice president at Constellation Research, is more skeptical about SLMs. He doesn't see much value in them, saying that users will quickly reach their limits and ask, "That's it? That's all it can do?"
Mueller said relying on too many SLMs will also push users into an "integrator" role in which they must find ways to coordinate and integrate multiple small models, which could erase the savings in cost and time. He added that another potential disadvantage is the need for specific training on company data for the SLMs to be effective.
SLMs seem advantageous only in a handful of rare circumstances where the scenarios are narrow in scope and stay that way, Mueller said.
However, he said if more LLM developers start adding small models to their offerings, "that would be something to pay attention to."
Consider the architectural realities for a mixed-model platform
While SLMs can run on-premises, their usefulness depends on several factors.
Most organizations will continue to rely on LLMs, which require data platforms to support secure, high-speed transfers to cloud AI platforms. However, adding task-focused SLMs might require additional investments to build a distributed, hybrid architecture that keeps sensitive data where it already resides to meet compliance and privacy requirements and lower data transfer costs.
Nvidia's researchers said enterprises might need to tailor their ecosystems toward AI agents that can automatically route tasks to the right model.
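One way such routing could work in a hybrid architecture is to dispatch on data sensitivity first, so regulated records never leave the organization's environment. This is a hypothetical sketch: the endpoint URLs, sensitivity labels and task types are all placeholder assumptions, not a real API.

```python
# Sketch of sensitivity-aware routing in a hybrid SLM/LLM architecture.
# Endpoints, labels and task types are illustrative placeholders.

LOCAL_SLM = "https://slm.internal.example"  # on-prem, inside compliance boundary
CLOUD_LLM = "https://llm.cloud.example"     # external provider

SENSITIVE_LABELS = {"medical", "financial", "legal"}

def endpoint_for(task_type: str, data_labels: set) -> str:
    """Pick an endpoint so sensitive data stays on-premises."""
    if data_labels & SENSITIVE_LABELS:
        return LOCAL_SLM      # compliance: sensitive data stays local
    if task_type == "complex_reasoning":
        return CLOUD_LLM      # non-sensitive work that needs a large model
    return LOCAL_SLM          # default to the cheaper local model

print(endpoint_for("summarize", {"medical"}))            # -> local SLM
print(endpoint_for("complex_reasoning", {"marketing"}))  # -> cloud LLM
```

The ordering of the checks encodes the policy: data residency overrides capability, so even a reasoning-heavy task on medical records stays on the local model.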
"Cloud is always an option, but there are also self-hosting endpoints that have ridiculously low costs," said Korres. "And if a project requires more control, or isolation of the data flow or data residency, then smaller models can definitely be hosted on local devices."
On-premises models offer privacy and predictability
Because they are smaller and more focused, SLMs can be easier to manage from a governance, risk and compliance perspective, especially if they run on-premises or in a private cloud. The narrow scope of SLMs helps with transparency, monitoring and auditability, reducing the risk of data exposure compared to large, public cloud models.
Furthermore, SLMs running locally can improve reproducibility, meaning "that anyone else can follow the same steps and get the same result," said Jonathan Chang, assistant professor of computer science at Harvey Mudd College. In contrast, reproducibility is not guaranteed with models hosted in public clouds.
Korres also noted that because SLMs are often used for narrowly defined tasks, organizations can more easily evaluate their performance than when using a general-purpose LLM.
However, some risks are similar regardless of the model size.
"Models small and large will hallucinate. They can produce outputs that could be misleading. They could leak training data that could mean privacy concerns," said Chang.
Decision rules for cost vs. performance
As many enterprise technology executives are finding, the costs associated with LLM use can be high and unpredictable, Chang said.
Additionally, he pointed out the environmental impact of LLMs.
"Bigger models require more resources to run. They take more energy, so they emit more carbon," Chang said.
By contrast, SLMs provide lower, more predictable operational costs with less impact on the environment, he said. However, gaining those advantages requires having the right infrastructure to run the models. The financial benefits of running SLMs locally might dissipate if the organization needs to set up a new data center or make other IT investments.
Enterprises need to fully consider the strategic use of SLMs to make sure the model choice fits the task.
"If it is a well-defined task and it is narrow enough, I see more benefits than risks to using a smaller model," said Korres. "But if the function is quite general, or you need something that can manage agents or define things on the fly, that's where larger models make a difference."
Other experts had similar assessments, noting that the workload should determine whether to go with a small or large model.
"If you need a gigantic amount of data, then an LLM is the right answer," said Chang. "But if you're finding that most of your friction is with individual employees needing to automate small tasks, or you're dealing with sensitive data and want full control from end to end, or you care about reproducibility, then small models are a feasible option."
Mary K. Pratt is an award-winning freelance journalist with a focus on covering enterprise IT and cybersecurity management.