Cerebrium, a serverless AI infrastructure platform founded in Cape Town, has announced an $8.5-million seed round led by Gradient, Google’s AI venture fund, with participation from Y Combinator, Authentic Ventures, and several strategic angel investors and operators.
Now headquartered in New York City, the South African-founded company offers a platform that enables teams to build and scale multimodal AI applications without the traditional complexity or cost.
The new funding will allow the team at Cerebrium to invest in new features and meet surging enterprise demand.
The company was founded by Michael Louis and Jonathan Irwin, who previously served as chief technology officer and lead engineer, respectively, at OneCart. Massmart acquired OneCart in 2021, in what was one of South Africa’s largest tech acquisitions at the time.
The duo founded Cerebrium after struggling to build their own AI-driven products.
CEO and co-founder, Michael Louis, comments: “Tooling was fragmented, there was an education gap between theory and production, the unit economics didn’t make sense, and development cycles took months.
“We built Cerebrium so engineers can focus on building AI products that users love with real business impact, instead of hiring an infrastructure team, racking up six-figure cloud bills or worrying about security and compliance.”
Cerebrium powers several companies pushing the boundaries in AI, including Tavus, Deepgram, and Vapi. The platform is purpose-built for high-performance, real-time multimodal AI applications: voice agents, LLM fine-tuning, video models, and large-scale data analytics use cases. “We know that AI is changing the world, and we want Cerebrium, a South African-founded company, to be the platform powering it,” concludes Louis.
While Cerebrium is best known for its serverless GPU infrastructure, it also offers batching, multi-region deployments and large-scale data processing. This lets teams run compute-intensive workloads with minimal setup, scale elastically and pay only for what they use, without managing infrastructure themselves, while still meeting strict security and data residency requirements.