AIKosh Initiative
The government plans to integrate content from religious books and scriptures in various languages into the AIKosh database. The initiative will also cover local dialects and oral traditions, giving AI models, including large language models (LLMs), a richer source of wisdom and context.
Objectives and Launch
- AIKosh was launched as a domestic dataset containing non-personal data to facilitate the development of AI models, applications, LLMs, and other AI tools.
- The initiative aims to build a comprehensive AI resource by inviting companies and startups to contribute their non-personal data voluntarily.
Collaboration and Data Sources
- A memorandum of understanding with the Lok Sabha Secretariat allows the use of data from parliamentary questions, government reports, and committee meetings.
- The platform also plans to tap datasets from various ministries, including those on the Open Government Data (OGD) Platform.
Platform Features and Regulations
- AIKosh offers approximately 350 datasets and nearly 150 AI models, including both LLMs and Small Language Models (SLMs).
- AIKosh is part of the ₹10,372 crore IndiaAI Mission. The budget allocation for 2025-26 is ₹200 crore.
- Datasets on AIKosh will not be available for monetization. The platform's sole objective is to provide non-personal public and private sector data for building AI applications.
- The platform implements stringent data protection measures to keep user information secure and confidential, and complies with Indian laws, including the Information Technology Act, 2000, and the Data Protection Bill.
Future Directions
- AIKosh will include diverse inputs from religious texts, scriptures in local dialects, and oral traditions.
- It will draw on the large corpus of parliamentary and government data, and may add datasets from various ministries.