Cerebrium, a serverless AI infrastructure platform designed to simplify the development and scaling of multimodal AI applications, has raised $8.5 million in seed funding.
The round was led by Gradient Ventures, with participation from Y Combinator, Authentic Ventures, and several strategic angel investors and operators.
With the fresh capital, Cerebrium plans to bolster its engineering team to keep pace with enterprise demand and accelerate product development. The company aims to introduce new features to enhance its platform’s capabilities, particularly in real-time AI applications like digital avatars, which could have far-reaching implications for industries such as gaming, entertainment, and telehealth.
Founded by Michael Louis and Jonathan Irwin, Cerebrium was built from the ground up to power the next generation of high-performance AI applications.
Cerebrium emerged from the founders’ own frustrations while building AI-powered products. “Tooling was fragmented, there was an education gap between theory and production, the unit economics didn’t make sense, and development cycles took months,” explained CEO Michael Louis. “We built Cerebrium so engineers can focus on building AI products users love with real business impact without needing a dedicated infrastructure team or incurring massive cloud costs.”
Cerebrium supports applications like voice AI, real-time digital avatars, and healthcare solutions, offering low-latency, cost-effective serverless GPU infrastructure with sub-5-second cold-start times and up to 40% cost savings compared to traditional cloud providers. From real-time voice bots to multimodal inference pipelines and large-scale batch jobs, the platform makes it radically easier for teams to deploy, scale, and operate AI workloads without managing a single server.
Cerebrium’s platform powers AI startups including Tavus, Deepgram, and Vapi. It is specifically optimized for real-time, high-performance use cases such as voice agents, LLM fine-tuning, video model inference, and large-scale data analytics.
Beyond its core offering of serverless GPU infrastructure, Cerebrium provides support for batching, multi-region deployments, and large-scale data processing. This allows engineering teams to seamlessly run compute-heavy workloads with minimal configuration, scaling on demand while only paying for what they use. Importantly, the platform also meets enterprise-level security and data residency requirements, reducing the burden of compliance.
The platform’s performance has won praise from key users. Roey Paz-Priel, ML Engineer at Tavus, noted, “We run a range of real-time audio and video models, and performance is everything. Cerebrium consistently delivers the speed and reliability we need without overhead. Even as we scaled rapidly and went viral, they met our compute demands with stability.”
Eylul Kayin, Partner at Gradient, added, “What the Cerebrium team has accomplished with such a small group is impressive. They’re enabling some of the most advanced AI voice and video applications to scale effortlessly. As real-time AI becomes foundational to digital experiences, specialized elastic infrastructure like Cerebrium’s will be essential.”
Cerebrium was originally founded in Cape Town, South Africa, and is now headquartered in New York City. The company’s move to the U.S. reflects its ambition to compete globally, while its South African origins underscore the growing influence of African tech in AI innovation.