A silhouetted office worker is replaced by computer code. (Image credit: Getty)
As enthusiasm for generative AI continues to grow, with the technology being deployed in our apps, devices, and businesses, and new tools and use cases brought to market nearly every day, frontier models with hundreds of billions or trillions of parameters have been the focus for the past two years.
We also know that the rapid growth of large-scale AI models for language, voice, and video is putting a significant strain on resources. This has led to a resurgence of interest in nuclear power, with hyperscalers such as Microsoft, Google, and AWS making significant commitments to it. Nuclear power is expected to help supply the hundreds of billions of dollars' worth of data center infrastructure planned over the next few years.
While models with hundreds of billions or trillions of parameters, such as those developed by researchers at OpenAI, NVIDIA, Google, and Anthropic, are state-of-the-art, these power-hungry next-generation models have also proven to be far more powerful than most use cases require. It's like driving a Formula 1 race car in the middle of rush hour traffic.
This is where smaller models, which can run with far less energy and compute horsepower, come into play.
NVIDIA NIM and IBM Granite 3.0 offer a glimpse into the future of enterprise AI
Increasingly, we hear about small language models, with hundreds of millions to fewer than 10 billion parameters, that are highly accurate while consuming significantly less energy and costing far less per token.
At its GTC conference in March of this year, NVIDIA announced its NIM (NVIDIA Inference Microservices) software technology, which packages an optimized inference engine, industry-standard APIs, and support for AI models into a container for easy deployment. NIM can serve larger models as well as small ones, but the core idea is that an optimized container service with industry-specific models and APIs, whether for visualization, game design, drug discovery, or code writing, can significantly simplify the compute, data, models, and frameworks involved while also reducing the computational power needed to run AI workloads. We believe the recently announced partnership between NVIDIA and Accenture is a great example of combining computing, industry-specific microservices, and expertise to enable rapid adoption of AI within the enterprise.
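To make that concrete, here is a minimal sketch of querying a deployed NIM container through its OpenAI-compatible endpoint; the host, port, and model id below are illustrative assumptions rather than specifics from NVIDIA's documentation.

    from openai import OpenAI

    # Point the standard OpenAI client at the local NIM container instead of a
    # hosted service; NIM microservices expose an OpenAI-compatible REST API.
    client = OpenAI(
        base_url="http://localhost:8000/v1",  # assumed host/port for the container
        api_key="not-used-locally",           # a local deployment typically ignores the key
    )

    # Ask the packaged model a question exactly as you would any OpenAI-style API.
    response = client.chat.completions.create(
        model="meta/llama-3.1-8b-instruct",   # illustrative model id; use your deployed model
        messages=[{"role": "user", "content": "Summarize the key risks in this contract."}],
        max_tokens=200,
    )
    print(response.choices[0].message.content)

Because the interface is OpenAI-compatible, existing applications can often be repointed at a local microservice by changing little more than the base URL.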
Last week, IBM announced its latest Granite 3.0 models, a family of small language models that has shown strong performance against comparably sized small models (7-8 billion parameters) such as Llama and Mistral. All three companies have developed flexible open-source options that can be tailored and optimized for business use cases, with impressive performance in areas such as math, language, and code. Llama has been a staple of open-source model development, but IBM's rapid improvements are notable. I also see these advances as important because they offer open-source models that can be used on clouds like AWS as well as on IBM's own watsonx platform. Enterprise-centric companies like IBM, with their software, models, and extensive consulting arm, understand that solving a given set of use cases often requires not just models but deep industry expertise, and they are an example of how an "AI for the enterprise" strategy can be effectively pursued.
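As a rough illustration of how accessible these open-source small models are, the following sketch loads a Granite 3.0 instruct model with Hugging Face transformers; the model id is an assumption based on IBM's public release, so substitute whichever variant you actually deploy.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumed Hugging Face id for IBM's Granite 3.0 8B instruct model.
    model_id = "ibm-granite/granite-3.0-8b-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Build a chat-formatted prompt and generate a short completion.
    messages = [{"role": "user", "content": "Write a SQL query that totals revenue by region."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

A model of this size can run on a single commodity GPU, which is precisely the point the small-model argument turns on.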
At the heart of it all is a combination of models and flexible infrastructure that allows companies to focus on outcomes-based AI projects, enabling the next wave of technological advancements such as agentic AI, assistants, automation, and digital labor at scale.
Research continues, but for businesses the future will likely be small models
The idea that a one-size-fits-all model with trillions of parameters is the holy grail of enterprise AI fails in many ways. Of particular note are the energy consumption and cost per token in well-defined use cases that actually require only a fraction of those parameters. If a billion parameters (at most) will do the job, you are better off running a small, specialized model tailored to your specific business use case. Additionally, with a small model, data lineage is better understood and data access can be limited to only what the use case requires, rather than the enormous volumes of data that massive models demand at scale, making it far easier to manage and address the growing number of data security, privacy, and sovereignty concerns.
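A quick back-of-the-envelope calculation shows why cost per token dominates this argument; the prices below are deliberately hypothetical placeholders, not quotes from any provider, so substitute your own rates before drawing conclusions.

    # Back-of-the-envelope comparison; both prices are hypothetical placeholders.
    frontier_price = 15.00  # assumed $ per 1M output tokens, frontier-scale model
    small_price = 0.20      # assumed $ per 1M output tokens, ~8B-parameter model

    monthly_tokens = 500_000_000  # e.g., a high-volume summarization workload

    for label, price in [("Frontier model", frontier_price), ("Small model", small_price)]:
        print(f"{label}: ${monthly_tokens / 1_000_000 * price:,.0f} per month")

Even with generous assumptions in the large model's favor, the gap compounds quickly at enterprise volumes.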
And there is no question that we want to continue researching and building the world's most sophisticated AI to support economic growth and help solve complex problems. For enterprises, however, small language and foundation models have proven to be the better option for many business use cases, offering a more sustainable, fit-for-purpose way to deploy AI at scale while significantly reducing costs. That combination cannot and should not be ignored by companies looking to leverage the potential of generative and agentic AI solutions.