CIO Insider

Sarvam AI Takes on Giants with Indic LLM

CIO Insider Team | Thursday, 12 February, 2026

While global AI leaders push for ever-larger models with broader scope, Indian startup Sarvam AI is tailoring its technology to local markets. The company is developing a large language model (LLM) designed specifically for Indic languages, with voice interfaces and support for regional scripts.

The effort has government backing and has produced results that rival those of larger models on key multilingual tasks. In April 2025, the Indian government selected Sarvam, under the IndiaAI Mission, to build India's first sovereign large language model.

Sarvam will receive dedicated compute resources to build an original foundation model from scratch. The model is meant to be capable of reasoning, tailored for voice, proficient in multiple Indian languages, and ready for deployment at scale.

Dr. Pratyush Kumar, Co-founder of Sarvam, stated, “Building an AI ecosystem for India has always been core to Sarvam’s mission. As part of the Sovereign LLM proposal, we are developing three model variants: Sarvam-Large for advanced reasoning and generation, Sarvam-Small for real-time interactive applications, and Sarvam-Edge for compact on-device tasks.”

In October 2024, the company unveiled Sarvam-1, a 2-billion-parameter language model built for Indian languages. According to the company, many multilingual models need 4–8 tokens per Indic word, compared with about 1.4 tokens per English word. Sarvam-1 brings this down to 1.4–2.1 tokens per word across the languages it supports.
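The tokens-per-word figures above can be measured for any tokenizer with a few lines of Python. The following is a minimal sketch, not Sarvam's method: `tokens_per_word` and `chunk_tokenizer` are hypothetical names, and the toy tokenizer simply cuts each word into fixed-size character chunks to stand in for a real one.

```python
from typing import Callable, List


def tokens_per_word(text: str, tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return len(tokenize(text)) / len(words)


def chunk_tokenizer(size: int) -> Callable[[str], List[str]]:
    """Toy tokenizer (an assumption for illustration only):
    splits every word into character chunks of at most `size`."""
    def tokenize(text: str) -> List[str]:
        return [
            word[i:i + size]
            for word in text.split()
            for i in range(0, len(word), size)
        ]
    return tokenize


sample = "नमस्ते दुनिया"  # "Hello world" in Hindi

# A tokenizer with larger pieces needs fewer tokens per word.
coarse = tokens_per_word(sample, chunk_tokenizer(6))
fine = tokens_per_word(sample, chunk_tokenizer(2))
print(f"coarse tokenizer: {coarse:.2f} tokens/word")
print(f"fine tokenizer:   {fine:.2f} tokens/word")
```

To compare real models, any tokenizer that maps a string to a list of tokens could be passed in place of the toy one. A lower ratio means fewer tokens for the same text, which generally translates into lower cost and longer effective context.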

The model showed strong accuracy on knowledge and reasoning tasks, particularly in Indic languages, outperforming Gemma-2-2B and Llama-3.2-3B on several established benchmarks.

Ahead of the India AI Impact Summit 2026, Sarvam has unveiled a series of new releases. Among them is Sarvam-Translate, which now supports 22 Indian languages, including Bengali, Marathi, Telugu, Maithili, Santali, Kashmiri, Nepali, Sindhi, Dogri, and Sanskrit. The model supports paragraph-level translation and can translate a variety of structured content across 15 languages.

In evaluations conducted by language specialists, Sarvam-Translate outperformed larger models such as Gemma3-27B-IT, Llama4 Scout, and Llama-3.1-405B-FP8 on quality and accuracy.

Sarvam earlier introduced Bulbul v1, a code-mixed multilingual text-to-speech model, and released Bulbul v3 this year. The latter is designed to deliver more natural, authentic voices for Indian languages that are suitable for professional use.

According to the company's website, Sarvam's Text-to-Speech API, powered by Bulbul v3, supports eleven languages: Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Odia, and English. Each language comes with several speaker voices, each with distinct characteristics.

Saaras v3 is the latest version of Sarvam's speech-to-text technology. It can automatically identify the spoken language and generate transcriptions for all 22 Indian languages it supports, such as Hindi, Bengali, Tamil, Telugu, Gujarati, Kannada, Malayalam, Marathi, Punjabi, Odia, and English. The model handles code-mixed audio and has been tuned for both real-time and batch processing.

Last week, the company introduced Sarvam Vision, extending its sovereign model series from voice and text to vision. The product includes a 3B-parameter state-space vision-language model built for tasks such as image captioning, scene text recognition, chart interpretation, and complex table analysis.

The startup noted in its blog that leading global vision-language models, despite performing well on English documents, frequently fall short on Indian languages and regional scripts. It added that Sarvam's inference-efficient 3B model is designed to close this gap. The model was evaluated on the Sarvam Indic OCR Bench, which comprises 20,267 document samples in 22 official Indian languages, spanning historical and contemporary texts. It outperformed Gemini 3 Pro, Opus 4.5, and GPT 5.2 on both word-level and character-level accuracy, measured using word error rate-based metrics.
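For readers unfamiliar with error-rate metrics, the sketch below shows the standard textbook definitions of word error rate (WER) and character error rate (CER), both based on edit distance. The exact metric definitions used by the Sarvam Indic OCR Bench are not given in the article, so this is an assumption for illustration; the function names and the sample strings are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum number of insertions,
    deletions, and substitutions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference contains no words")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference: str, hypothesis: str) -> float:
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference is empty")
    return edit_distance(reference, hypothesis) / len(reference)


reference = "the quick brown fox"   # ground-truth text
hypothesis = "the quick brown box"  # hypothetical OCR output

wer = word_error_rate(reference, hypothesis)  # 1 wrong word out of 4
cer = char_error_rate(reference, hypothesis)  # 1 wrong character out of 19
print(f"WER = {wer:.3f}, CER = {cer:.3f}")
```

Under these definitions, lower is better, and an accuracy figure is often reported as one minus the error rate. Whether the benchmark reports accuracy this way is not stated in the article.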


