Preethi Jyothi: Developing AI to unlock Indian accents, dialects, and languages
Jyothi, an associate professor at the Department of Computer Science and Engineering, IIT-Bombay, is building speech and language technologies to ensure more equitable access for people of varied linguistic and educational backgrounds
Preethi Jyothi, Associate professor, Department of Computer Science and Engineering, IIT-Bombay
Preethi Jyothi had envisioned a career in an artificial intelligence (AI)-related field. What she had not foreseen was that Natural Language Processing (NLP), the ability of computers to understand, interpret and generate human language, would become as popular as it has; it wasn’t a hot field when she was pursuing her PhD.
“There was even concern about another AI winter,” she says. “I recall having a conversation with my advisor about what to minor in to make myself more employable.”
Jyothi, 40, is currently an associate professor at the Department of Computer Science and Engineering, IIT-Bombay. Through her work in automatic speech recognition (ASR) and machine learning (ML), she is advancing technologies that could make voice interfaces more inclusive of low-resource languages, those that lack the resources needed to build ASR technologies, she says.
With recent advances, ASR is now an integral part of many applications, such as chat interfaces like GPT-4o, home devices like Google Home, and transcriptions of WhatsApp voice messages. However, she explains, ASR, the task of transcribing natural speech into text, remains a long-standing AI problem because it does not work equally well across users from diverse demographics, particularly in India, where users span a wide variety of speech accents, dialects and languages.
“My work focuses on building speech and language technologies for such low-resource settings to ensure more equitable access across users of varying linguistic and educational backgrounds,” she explains. “Building speech technologies for Indian languages (at large) could potentially be impactful, especially in India, by making technologies accessible even to users who cannot read or write.”
Jyothi’s fascination with AI began in high school; she recalls reading Douglas Hofstadter’s Gödel, Escher, Bach, which raised questions about human consciousness and its implications for AI. “Building software that could mimic human abilities was interesting to think about,” she says.
A gold-medallist computer science graduate from the National Institute of Technology, Calicut, Jyothi went on to pursue a PhD programme at Ohio State University and became a Beckman Postdoctoral Fellow at the University of Illinois at Urbana-Champaign.
Over the last few years, along with students and collaborators, Jyothi has worked on two projects—building ASR systems that are robust across varying speech accents and building computational models for code-switching where users switch between languages (like Hindi-English or Tamil-English).
“These are both projects that are rooted in the Indian setting and are areas with remaining open problems that we continue to work on. I am also interested in the mechanics of multilingual large language models and how to elicit cross-lingual understanding from such models,” she says.
Her research papers and projects have won her accolades, and she believes AI systems for low-resource languages can be game-changing in real-world scenarios, whether by helping farmers learn about market prices or by enabling disaster communication in remote areas. But she’s also aware of the concerns.
“A constant ethical challenge of working on technologies for languages is that some languages will inevitably be left behind,” she notes. “Speakers of these languages will be pressured to shift to using other languages, thus endangering the former. This is a challenge that one needs to be constantly aware of.”
While she strives for equitable access wherever possible, she also points to the systemic support that AI researchers, especially those working on non-English languages, could receive. One example is targeted research funding for faculty and students developing NLP technologies for low-resource languages. Launching competitions or challenges with broad reach, in which these languages are a core focus, could also draw attention to the field. And enabling the development and release of high-quality multilingual resources, such as pre-trained models and datasets, may serve as a good starting point for further exploration, she adds.
As a woman working in a male-dominated field, she says she has been lucky and feels privileged to not have faced very overt gender biases.
“That said, I am aware of real challenges faced by many women in technical fields,” she says. “As an academic occupying a position of responsibility, I am committed to creating a welcoming space for all students regardless of differences in gender or social backgrounds.”
She advises young women to choose what truly interests and excites them without paying attention to any supposed norms.
“Interest in an area is what keeps one motivated and fuels growth. There truly is nothing better than working on things you actually care about,” she adds.
Ask her what she looks forward to in the rapidly changing world of AI and she says, “It would be exciting to see voice-driven interfaces used seamlessly in regional dialects and languages across Indian cities and towns.”