In a signal that the next phase of artificial intelligence may be defined less by scale and more by accessibility, enterprise AI firm Cohere has introduced a new family of compact multilingual models designed to run directly on everyday devices — no cloud connection required.
Unveiled on the sidelines of the India AI Summit, the models, collectively branded Tiny Aya, are open-weight systems supporting more than 70 languages. Their release underscores a strategic shift toward deployable, regionally tuned AI that prioritizes linguistic diversity, hardware efficiency, and developer autonomy.
Tiny Aya’s ability to run offline on consumer hardware positions it as a practical AI solution for low-connectivity and multilingual markets.
Compact Architecture, Broad Language Coverage
The base Tiny Aya model contains 3.35 billion parameters, modest compared with frontier models whose parameter counts run into the hundreds of billions, but deliberately optimized for efficiency and portability. Parameter count reflects the internal complexity of a model and influences both capability and computational cost. By keeping the architecture compact, Cohere is targeting practical, real-world deployments rather than data-center-scale experimentation.
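A rough back-of-the-envelope calculation shows why a model of this size is plausible on consumer hardware. The figures below are illustrative estimates of weight storage alone at common precisions, not Cohere's published numbers, and actual usage would also include activations and cache memory:

```python
# Illustrative estimate of the memory needed just to hold the weights
# of a 3.35B-parameter model at common numeric precisions.
# (Real on-device usage also includes activations and KV cache.)

PARAMS = 3.35e9  # parameter count reported for the base Tiny Aya model

BYTES_PER_PARAM = {
    "fp32 (full precision)":        4,
    "fp16/bf16 (half precision)":   2,
    "int8 (8-bit quantized)":       1,
    "int4 (4-bit quantized)":       0.5,
}

for precision, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes / 1e9
    print(f"{precision:30s} ~{gb:.1f} GB")

# fp32  ~13.4 GB  -- tight even for a well-equipped laptop
# fp16  ~ 6.7 GB  -- feasible on many consumer machines
# int8  ~ 3.4 GB  -- comfortable on most modern laptops
# int4  ~ 1.7 GB  -- small enough for edge devices
```

Whether Tiny Aya ships with quantized variants is not stated here, but the arithmetic explains why a 3.35-billion-parameter model sits within laptop reach while hundred-billion-parameter systems do not.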
The family includes multiple variants. TinyAya-Global is instruction-tuned for general-purpose multilingual use. Regional versions include TinyAya-Earth for African languages, TinyAya-Fire for South Asian languages, and TinyAya-Water for the languages of Asia Pacific, West Asia, and Europe.
South Asian language support includes Bengali, Hindi, Punjabi, Urdu, Gujarati, Tamil, Telugu, and Marathi, addressing a long-standing imbalance in AI systems that have historically centered on English and a small set of European languages.
“This approach allows each model to develop stronger linguistic grounding and cultural nuance, creating systems that feel more natural and reliable for the communities they are meant to serve,” the company said in a statement.
This regional specialization suggests a deliberate move away from one-size-fits-all training strategies toward geographically informed datasets and linguistic fine-tuning.
Training Efficiency and Hardware Strategy
Cohere said Tiny Aya was trained on a single cluster of 64 Nvidia H100 GPUs. In the context of modern large language models, some of which are trained on thousands of GPUs, this represents a comparatively restrained computational footprint.
The efficiency claim is central to the model’s positioning. Cohere said it engineered its inference stack to require less computing power than most comparable multilingual systems, enabling deployment on laptops and other consumer-grade devices.
The on-device capability has several implications. First, it reduces dependence on constant internet connectivity, expanding usability in rural or bandwidth-constrained environments. Second, it lowers cloud infrastructure costs for developers. Third, it strengthens data privacy, since user inputs do not need to be transmitted to remote servers.
In linguistically diverse countries like India, offline AI can support translation tools, educational software, local-language assistants, and enterprise workflows without requiring persistent connectivity.
Open-Weight Strategy and Developer Ecosystem
Unlike proprietary closed models, Tiny Aya is open-weight: developers can access, fine-tune, and redistribute the models, encouraging experimentation and localization. The models are hosted on the Cohere Platform and available for download through Hugging Face, Kaggle, and Ollama for local deployment.
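In practice, an open-weight release means the model can be pulled and run with standard tooling. The sketch below shows what local inference might look like with the Hugging Face transformers library; the repository ID is hypothetical, since the exact hub names are not given here, and the translation prompt is simply an example of the multilingual use the models target:

```python
# Minimal local-inference sketch using the Hugging Face transformers library.
# NOTE: "CohereLabs/tiny-aya-global" is a HYPOTHETICAL repository ID used for
# illustration; check the actual model card on Hugging Face before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereLabs/tiny-aya-global"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Example multilingual task: ask for a translation into Hindi.
prompt = "Translate to Hindi: Education is the key to opportunity."
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Once the weights are cached locally, the same setup runs without a network connection; a local runner such as Ollama would serve a similar purpose once a model tag is published there.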
Cohere is also releasing associated training and evaluation datasets and plans to publish a technical report outlining its methodology. This level of disclosure enhances reproducibility and positions the release within the open research ecosystem.
The open-weight approach aligns with a broader industry split between proprietary API-based models and adaptable open systems. For enterprises concerned about vendor lock-in, compliance, and customization, open-weight alternatives offer greater control.
The launch comes amid intensifying competition in enterprise AI. Cohere has historically positioned itself as a business-focused alternative to consumer-centric AI providers, emphasizing secure deployments and customizable solutions.
By targeting multilingual, compact, and offline-capable systems, the company is differentiating itself from firms competing primarily on model size and benchmark dominance.
The strategy reflects an emerging market thesis: growth may increasingly come from AI tailored to specific regions, industries, and linguistic communities rather than purely from scale-driven performance improvements.
Commercial Momentum and IPO Ambitions
Cohere’s operational expansion coincides with strong financial performance. According to CNBC, the company ended 2025 with $240 million in annual recurring revenue and reported quarter-over-quarter growth of 50% throughout the year.
Chief executive Aidan Gomez has previously said the company plans to go public “soon,” suggesting that scaling enterprise adoption and broadening product lines are part of a longer-term strategy to support an eventual listing.
The Tiny Aya launch strengthens Cohere’s narrative as an AI infrastructure company focused on pragmatic deployment rather than purely experimental research.
Tiny Aya’s release illustrates a broader shift in AI development priorities:
- Efficiency is becoming as important as scale.
- Regional language support is moving from a secondary feature to a core capability.
- Offline deployment is emerging as a competitive advantage in privacy-sensitive and connectivity-limited markets.
As AI adoption deepens globally, particularly in emerging markets, compact multilingual systems may prove critical in bridging accessibility gaps.
Rather than competing solely in the race for ever-larger models, Cohere is making a case that the future of AI may rely on systems that are smaller, regionally aware, open for adaptation — and capable of running wherever users are, even without the cloud.