NVIDIA NeMo SteerLM Allows Customizing of Model Responses

NVIDIA defines a simpler technique for tuning LLMs to align with user needs.

SteerLM is the latest advance in model customization, a hot area in AI research, according to NVIDIA. Image courtesy of NVIDIA.

By DE Editors

October 13, 2023

Developers have a new artificial intelligence-powered steering wheel to help them hug the road while they drive large language models (LLMs) to their desired locations.

NVIDIA NeMo SteerLM lets companies define knobs to dial in a model’s responses as it’s running in production, a process called inference. It lets a single training run create one model that can serve dozens or even hundreds of use cases.

NVIDIA researchers created SteerLM to teach AI models what users care about, like road signs to follow in their particular use cases or markets. These user-defined attributes can gauge nearly anything—for example, the degree of helpfulness or humor in the model’s responses.

One Model, Many Uses

The result is a new level of flexibility.

With SteerLM, users define all the attributes they want and embed them in a single model. Then they can choose the combination they need for a given use case while the model is running.

For example, a custom model can now be tuned during inference to the needs of an accounting, sales or engineering department or a vertical market.
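
As a rough illustration of choosing attribute combinations while the model is running, the Python sketch below maps each department to a preset of attribute values and prepends them to the user's prompt. The attribute names, the 0 to 9 scale, the `<attributes>` tag and the helper functions are all assumptions made for illustration; they are not NeMo's actual prompt template or API.

```python
# Hypothetical attribute names and value range (0-9); not taken from NVIDIA docs.
ALLOWED_ATTRIBUTES = ("helpfulness", "humor", "verbosity")

# Hypothetical per-department presets, all served by the same single model.
PRESETS = {
    "accounting": {"helpfulness": 9, "humor": 0, "verbosity": 3},
    "sales": {"helpfulness": 8, "humor": 5, "verbosity": 5},
    "engineering": {"helpfulness": 9, "humor": 1, "verbosity": 7},
}


def format_attributes(attrs):
    """Validate attribute values and serialize them in a fixed order."""
    for name, value in attrs.items():
        if name not in ALLOWED_ATTRIBUTES:
            raise ValueError(f"unknown attribute: {name}")
        if not 0 <= value <= 9:
            raise ValueError(f"{name} must be between 0 and 9, got {value}")
    return ",".join(
        f"{name}:{attrs[name]}" for name in ALLOWED_ATTRIBUTES if name in attrs
    )


def build_prompt(user_text, department):
    """Prepend the department's attribute preset to the user's prompt.

    The tag format is an illustrative assumption, not a real template.
    """
    attrs = PRESETS[department]
    return f"<attributes>{format_attributes(attrs)}</attributes>\n{user_text}"


accounting_prompt = build_prompt("Summarize this invoice.", "accounting")
sales_prompt = build_prompt("Summarize this invoice.", "sales")
```

The same user text produces two different conditioned prompts here, which is the point of the approach: the model stays fixed and only the attribute values change per request.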

The method also enables a continuous improvement cycle. Responses from a custom model can serve as data for a future training run that dials the model into new levels of usefulness.

To date, fitting a generative AI model to the needs of a specific application has been the equivalent of rebuilding an engine’s transmission. Developers had to label datasets, write lots of new code, adjust the hyperparameters under the hood of the neural network and retrain the model several times. SteerLM replaces those complex processes with three simple steps:

Using a basic set of prompts, responses and desired attributes, customize an AI model that predicts how responses score on those attributes.
Automatically generate a dataset using this model.
Train the model with the dataset using standard supervised fine-tuning techniques.
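
The data flow of the three steps above can be sketched in Python. This is a toy outline under stated assumptions, not NVIDIA's implementation: the word-averaging "predictor" stands in for a trained attribute-prediction model, the function names and the `<attributes>` tag are hypothetical, and the last step only formats examples for supervised fine-tuning rather than running any training.

```python
from collections import defaultdict


def train_attribute_predictor(seed_examples):
    """Step 1 (toy stand-in): learn average attribute scores per word
    from a small labeled seed set of (prompt, response, attributes)."""
    table = defaultdict(lambda: defaultdict(list))
    for _prompt, response, attrs in seed_examples:
        for word in set(response.lower().split()):
            for name, value in attrs.items():
                table[name][word].append(value)

    def predict(response):
        words = sorted(set(response.lower().split()))
        scores = {}
        for name, per_word in table.items():
            known = [sum(per_word[w]) / len(per_word[w])
                     for w in words if w in per_word]
            scores[name] = round(sum(known) / len(known)) if known else 0
        return scores

    return predict


def annotate(predict, pairs):
    """Step 2: label unlabeled (prompt, response) pairs automatically."""
    return [(p, r, predict(r)) for p, r in pairs]


def to_sft_examples(labeled):
    """Step 3 (data prep only): build attribute-conditioned examples that a
    standard supervised fine-tuning job could consume."""
    examples = []
    for prompt, response, attrs in labeled:
        tag = ",".join(f"{k}:{v}" for k, v in attrs.items())
        examples.append({
            "input": f"{prompt}\n<attributes>{tag}</attributes>",
            "target": response,
        })
    return examples


# Hypothetical seed data and unlabeled pairs, invented for this sketch.
seed = [
    ("How do I reset my password?",
     "Open settings and choose reset password.",
     {"helpfulness": 9, "humor": 0}),
    ("How do I reset my password?",
     "Have you tried remembering it? Ha ha.",
     {"helpfulness": 1, "humor": 8}),
]
unlabeled = [("Where is billing?", "Open settings and choose billing.")]

predictor = train_attribute_predictor(seed)
labeled = annotate(predictor, unlabeled)
sft_examples = to_sft_examples(labeled)
```

Because the attribute values appear in each training input, a model fine-tuned on such examples can later be steered by changing those values in the prompt.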

Game On With SteerLM

To show the potential of SteerLM, NVIDIA demonstrated it on one of its classic applications—gaming.

Some games pack dozens of non-playable characters—characters that the player can’t control—that mechanically repeat prerecorded text, regardless of the user or situation.

SteerLM makes these characters come alive, responding with more personality and emotion to players’ prompts. It’s a tool game developers can use to unlock new experiences for every player.