Northeast India AI Research Fellowship 2026
An 8–12 week remote research fellowship to build NLP and language technology for indigenous Northeast Indian languages, including Khasi, Garo, Meitei, and Assamese. Fellows co-author papers, build datasets, and train language models with MWire Labs researchers.
This is one of the few fellowships anywhere focused on AI for indigenous languages. MWire Labs is building the foundational NLP stack for Northeast India's 200+ languages, and fellows directly co-author the papers that come out of it.
The Northeast India AI Research Fellowship is run by MWire Labs, a Shillong-based AI research lab focused on building language technology for the linguistic diversity of Northeast India. The region has over 200 languages, most without any NLP tooling.
What you’ll work on:
- Training and evaluating transformer-based language models (NE-BERT, Kren-M series)
- Designing and curating text, speech, and annotation datasets
- Building tools for language preservation and community access
What you gain:
- Co-authorship on papers, reports, or open-source releases
- Hands-on low-resource NLP experience (fine-tuning, evaluation, dataset design)
- Close mentorship from the MWire Labs research team
- Fellowship certificate + letters of recommendation for top performers
- Portfolio contributions to GitHub and Hugging Face
Who should apply:
- UG/PG students in CS, AI, Linguistics, or related fields
- Native or heritage speakers of Northeast Indian languages are especially encouraged
- Northeast India residents and diaspora are prioritised, but global applicants are welcome
- Prior ML/NLP experience is a plus but not required
Applications are reviewed on a rolling basis. Early applicants are prioritised for selection calls.