AI Impact Summit and Sarvam AI's Contribution
At the AI Impact Summit held in New Delhi, Sarvam AI, a Bengaluru-based startup, introduced two Large Language Models (LLMs) with 35 billion and 105 billion parameters. The models are designed to be less power- and compute-intensive than comparable models while delivering stronger performance in Indian languages.
Challenges for Indian Language LLMs
- Data Availability:
  - Indian languages are underrepresented in internet data, which makes it difficult to assemble corpora large enough to train LLMs effectively.
- Capital and Resource Scarcity:
  - Training LLMs requires significant financial and computational resources, both of which are scarce in India.
Government Support and Initiatives
- The IndiaAI Mission has subsidized LLM training, commissioning over 36,000 GPUs to support domestic AI development.
- The government provided Sarvam with 4,096 GPUs, with an estimated subsidy of nearly ₹100 crore.
- The Ministry of Electronics and Information Technology encourages the development of domestic LLMs to strengthen the Indian AI ecosystem.
Breakthroughs and Innovations
- Mixture of Experts (MoE) Architecture:
  - An MoE model routes each input token to a small subset of specialized expert sub-networks, so only a fraction of the total parameters is active per token. This reduces compute and power costs per inference while preserving the capacity of a much larger model (see the sketch below).
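To make the routing idea concrete, here is a minimal, illustrative sketch of a top-k gated MoE layer in PyTorch. This is not Sarvam's actual implementation; the dimensions, expert count, and `top_k` value are arbitrary placeholders chosen for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Minimal token-level Mixture-of-Experts layer with top-k gating.

    Only `top_k` of `num_experts` expert FFNs run for each token, so the
    compute per token is a small fraction of the layer's total parameters.
    All sizes below are illustrative, not any production model's config.
    """

    def __init__(self, d_model=64, d_hidden=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # the router
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        logits = self.gate(x)                               # (tokens, experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # pick top-k experts per token
        weights = F.softmax(weights, dim=-1)                 # normalize their scores

        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            # Find the tokens routed to expert e (in any of their top-k slots).
            token_idx, slot_idx = (indices == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue  # this expert is inactive for the whole batch
            # Run the expert only on its assigned tokens and mix by gate weight.
            out[token_idx] += weights[token_idx, slot_idx].unsqueeze(-1) * expert(x[token_idx])
        return out

tokens = torch.randn(10, 64)   # 10 tokens, d_model = 64
layer = MoELayer()
print(layer(tokens).shape)     # torch.Size([10, 64])
```

With 8 experts and top-2 routing, each token touches roughly a quarter of the layer's expert parameters, which is the efficiency property the summary attributes to MoE designs.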
Future Directions
Sarvam plans to deepen and improve its models with future investment, focusing on accuracy and efficiency in the Indian context before scaling to larger models.
Other Indian LLM Developments
- BharatGen, incubated at IIT Bombay, developed a multilingual 17-billion-parameter model for applications in education and healthcare.