DOI: 10.1177/10554181261457000 ISSN: 1055-4181

Design and development of an intelligent pen with OCR and text-to-speech capabilities for Gujarati text

Mehulkumar Dalwadi, Faikaf Ahmad, Ehtesham Arshi, Dhruv Gupta, Anchal Maheshwari, Abhinandan Shah

This paper presents an online system that converts handwritten Gujarati text in the Devanagari script (a script traditionally used to write Gujarati in certain scenarios) into speech, using an Internet of Things (IoT) integrated digital pen and artificial neural networks. Most readily available text-to-speech (TTS) systems primarily support English, although users often feel more comfortable communicating in their native language. Addressing this gap, the proposed system focuses on Gujarati, particularly when written in the Devanagari script, which introduces additional challenges due to its complex structure, including modifiers and compound characters. Handwritten character recognition remains a difficult task, especially for scripts like Devanagari that involve intricate patterns and variations in writing styles. The system tackles this challenge by employing a neural network capable of robust character recognition across diverse handwriting styles. An IoT-integrated digital pen captures handwritten input in real time from any surface. The captured data is transmitted to a cloud-based processing unit, where the neural network identifies individual characters and converts them into text. Following recognition, the text undergoes linguistic processing, including syllabification based on Gujarati phonological rules. Each syllable is stored and mapped to pre-recorded audio units, which are then concatenated to generate coherent speech output. The system manages silence and transitions between syllables to keep pronunciation natural, minimizing abrupt pauses and maintaining speech quality. Overall, this approach offers a flexible, real-time, and user-friendly solution for converting handwritten Gujarati text into speech.
By integrating IoT technology with neural networks, the system enhances accessibility and communication for visually impaired individuals, helping bridge language and usability barriers in assistive technologies.
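
The concatenation step described above (syllables mapped to pre-recorded units, joined with managed silence and smooth transitions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the amplitude threshold, the linear crossfade, and the romanized syllable keys are all assumptions chosen for illustration, and audio units are represented as plain lists of float samples.

```python
from typing import Dict, List, Sequence


def trim_silence(samples: Sequence[float], threshold: float = 0.01) -> List[float]:
    """Drop leading and trailing samples whose magnitude is below `threshold`.

    Assumption: silence is approximated by low absolute amplitude.
    """
    start, end = 0, len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return list(samples[start:end])


def crossfade_concat(units: Sequence[Sequence[float]], overlap: int = 4) -> List[float]:
    """Join units, linearly crossfading up to `overlap` samples at each boundary."""
    out: List[float] = []
    for unit in units:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        base = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight of the incoming unit
            out[base + i] = (1 - w) * out[base + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(syllables: Sequence[str],
               unit_bank: Dict[str, Sequence[float]],
               overlap: int = 4) -> List[float]:
    """Look up each syllable's recorded unit, trim its silence, and join the units."""
    units = [trim_silence(unit_bank[s]) for s in syllables]
    return crossfade_concat(units, overlap)


# Hypothetical unit bank keyed by romanized syllables (illustrative values only).
unit_bank = {
    "ka": [0.0, 0.0, 0.5, 0.5, 0.5, 0.0],
    "ma": [0.0, 0.3, 0.3, 0.3, 0.0, 0.0],
}
speech = synthesize(["ka", "ma"], unit_bank, overlap=2)
```

Trimming before joining removes the pauses baked into each recording, and the short crossfade avoids an abrupt jump at each syllable boundary. A real system would work on sampled audio files and would likely tune the overlap per syllable pair.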
