Kathy Gibson reports from Pinnacle TechScape, Durban – The pace of business today has accelerated beyond most people’s expectations – and the right technology is vital for businesses and individuals to keep up.
Claudio Polla, senior country manager: Africa at Nvidia, points out that artificial intelligence (AI) is the next step in making things more efficient.
Nvidia is the third-largest company in the world – and was briefly the largest a couple of weeks ago.
Behind Nvidia’s rapid growth is AI. “There is a race to the top at the moment,” Polla explains. As organisations like xAI, Meta and others build large language models, they are deploying literally thousands of Nvidia GPUs.
“The iPhone moment for AI has arrived,” Polla says. “This is a technological step forward for us – and it’s going to be big.”
AI is about efficiency of process, he adds.
As a specialist GPU manufacturer, Nvidia set the standard for gaming and graphics. With the advent of the company’s Cuda programming language, AI really got out of the starting blocks.
Today, the global Nvidia ecosystem spans 4,5-million developers, 40 000 companies and over 3 000 applications across the full ecosystem that includes platforms, acceleration libraires, system software and hardware – all supporting application frameworks.
There are about 350 frameworks that address different verticals and applications.
“So we do the hardware, we do the development platform, and we do the pre-trained models and frameworks,” Polla explains.
There are a lot of myths around AI and GPUs, he adds.
The first is that GPUs use a lot of power. And, yes, they use electricity, but they are still the most sustainable and cost-effective way to meet growing compute needs.
More than this, accelerated computing is efficient computing, delivering a reduced footprint and reduced power while increasing the workload.
Polla stresses that GenAI is not the only AI – but it has quickly become a solution for real world issues.
“There always has to be a use case,” Polla says. “This is critically important.”
2022 was the big year for AI: ChatGPT launched and became the quickest adopted technology in the world to date. “This created a lot of excitement,” Polla says.
In 2023, companies moved towards deploying GenAI in pilots. In 2024, those pilots are going into production.
“The next generation of GenAI is becoming more intelligent,” Polla explains. “This year, multi-model GenAI has launched – and now a mixture of experts are becoming mainstream. We will see production inference models.”
As it becomes more widely used, GenAI’s impact on productivity could add $44,4-trillion annually to the global economy, he adds.
In terms of the broader industry, use cases include fraud detection, factory simulation, personalised shopping, AI virtual assistants, and more across verticals like finance, healthcare, retail, telecommunication, and manufacturing.
Enterprises today are using GenAI in a number of ways, either as a managed service or on their own infrastructure. Both these models have merit.
As a managed service, GenAI services are easy to use with low barriers of entry and a quick route to market.
On the other hand, open source deployment lets users run anywhere – from data centre to cloud. They can securely manage their data in a self-hosted environment while custom coding APIs and fine-turning models.
Building a large language model is the foundation of any AI deployment, Polla adds, and there are a number of factors that organisations need to be cognisant of.
* The availability of training data – and lots of it;
* Accelerated computing in the form of GPUs;
* Training and inference tools; and
* AI expertise – people who can work on these models, and train other people on them.
He describes Nvidia AI Enterprise as the glue between GPUs and pre-trained models or frameworks. “It takes a lot of the complexity out of AI,” Polla says, “This lets you achieve a predictable outcome.”
The end-to-end secure cloud-native suite of AI software is available from Nvidia and through OEMs,.
“Nvidia AI Enterprise is a cloud-native platform for AI solutions that allows organisations to develop once and run anywhere,” Polla says.
The solution stack offers the fastest path to production.
Nvidia also offers enterprise-grade software that is built for business, with stability and support that can be run anywhere – in the cloud, on the edge, on-premise, and embedded.
“Because there are more than 4 000 software packages in the AI development stack, it offers a scalable, dependable platform from beginning to end.”
Nvidia’s NeMo LLM allows customers to experience, prototype and deploy the latest AI models.
Nvidia recently launched Nvidia Inference Microservices (NIM) for rapid AI deployment.
“A NIM is a complete AI inference stack delivered as a microservice in a container,” Polla says. “It is the fastest path to AI inference.”
Polla offers some advice for companies looking to start their GenAI journey.
* Identify the business opportunity – target use cases that have meaningful business impact and can be customised with unique data.
* Build out domain and AI teams – identify internal resources and augment them with AI expertise from partners.
* Analyse data for training/customisation – acquire, refine, and safeguard data to build foundation models or to customise existing models.
* Invest in accelerated infrastructure – assess the infrastructure, architecture, and operating mode while considering costs and energy consumption.
* Develop a plan for responsible AI – leverage tools and best practices to ensure responsible AI principles are adopted across the company