ChatGPT Based Voice Assistant For Blind People

Year : 2024 | Volume :11 | Issue : 02 | Page : –
By

Mohini R. Deore,

Vaishnavi Premkumar Sangave,

Pratiksha Lahu Padwal,

Mayuri Bharat Amte,

  1. Assistant Professor Department of Electronics & Telecommunication Engineering , Sinhgad College of Engineering, Pune Mahrashtra India
  2. Student Department of Electronics & Telecommunication Engineering , Sinhgad College of Engineering, Pune Mahrashtra India
  3. Student Department of Electronics & Telecommunication Engineering , Sinhgad College of Engineering, Pune Mahrashtra India
  4. Student Department of Electronics & Telecommunication Engineering , Sinhgad College of Engineering, Pune Mahrashtra India

Abstract

The proposed system for converting speech input into text format to facilitate interaction with ChatGPT is a sophisticated integration of hardware and cloud-based services. Utilizing state-of-the-art technologies, it facilitates seamless communication between users and the AI model. At the outset, the microphone serves as the input device, capturing audio signals from the user’s speech. These signals are then amplified to ensure clarity and fidelity before being transmitted to the ESP32 microcontroller for further processing. The ESP32, known for its versatility and computational power, plays a central role in signal conversion and transmission. Utilizing its analog-to-digital converter, the ESP32 converts the analog signals from the microphone into digital format. This digital representation of the speech input is then sent to the Google Cloud platform through an API connection. Google Cloud’s Speech-to-Text service, powered by advanced machine learning algorithms, analyzes the digital audio data and accurately transcribes it into text. Once the speech input is transcribed into text, it is handed over to ChatGPT for interpretation and response generation. ChatGPT, an AI language model, processes the text input to understand user queries, provide information, or engage in conversation. Its capacity to produce contextually relevant responses through natural language comprehension renders it an ideal conversational companion. To complete the loop of communication, the system employs a text-to-speech (TTS) library on the ESP32 microcontroller. This library converts the text-based responses generated by ChatGPT back into spoken format. The synthesized speech is then amplified and delivered through a speaker, allowing the user to hear the AI-generated responses in a natural and comprehensible manner. By seamlessly integrating speech recognition and text-to-speech synthesis, the system enables users to interact with ChatGPT through both spoken and text-based inputs. The hybrid approach improves accessibility and usability, accommodating a diverse array of users with different communication preferences. Overall, the system represents a significant advancement in conversational AI technology, paving the way for more intuitive and immersive user experiences.

Keywords: ChatGPT, ESP32, Google cloud, Text to Speech, Speech to Text , API.

[This article belongs to Journal of Operating Systems Development & Trends(joosdt)]

How to cite this article: Mohini R. Deore, Vaishnavi Premkumar Sangave, Pratiksha Lahu Padwal, Mayuri Bharat Amte. ChatGPT Based Voice Assistant For Blind People. Journal of Operating Systems Development & Trends. 2024; 11(02):-.
How to cite this URL: Mohini R. Deore, Vaishnavi Premkumar Sangave, Pratiksha Lahu Padwal, Mayuri Bharat Amte. ChatGPT Based Voice Assistant For Blind People. Journal of Operating Systems Development & Trends. 2024; 11(02):-. Available from: https://journals.stmjournals.com/joosdt/article=2024/view=157426



References

  1. Subhash S. Voice Control Using AI-Based Voice Assistant. In2020 International Conference on Smart Electronics and Communication (ICOSEC), Bangalore, India 2020 (pp. 592-596).
  2. Kiran H, Girish Kumar, Hanumanta DH, Dilshad Ahmad, Lalitha S “Voice Based Virtual Assistant”, 2023 International journal of Scientific Research in Engineering and Management (IJSREM),2023.7(7):1-5.Available from: https://ijsrem.com/download/voice-based-virtual-assistant/
  3. Kuzdeuov A, Mukayev O, Nurgaliyev S, Kunbolsyn A, Varol HA. ChatGPT for visually impaired and blind. In2024 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) 2024 Feb 19 (pp. 722-727). IEEE..
  4. Burbach L, Halbach P, Plettenberg N, Nakayama J, Ziefle M, Valdez AC. ” Hey, Siri”,” Ok, Google”,” Alexa”. Acceptance-Relevant Factors of Virtual Voice-Assistants. In2019 IEEE international professional communication conference (procomm) 2019 Jul 23 (pp. 101-111). IEEE.
  5. Ghadage YH, Shelke SD. Speech to text conversion for multilingual languages. In2016 International Conference on Communication and Signal Processing (ICCSP) 2016 Apr 6 (pp. 0236-0240). IEEE..
  6. Carducci CG, Monti A, Schraven MH, Schumacher M, Mueller D. Enabling ESP32-based IoT applications in building automation systems. In2019 II Workshop on Metrology for Industry 4.0 and IoT (MetroInd4. 0&IoT) 2019 Jun 4 (pp. 306-311). IEEE.
  7. Babiuch M, Foltýnek P, Smutný P. Using the ESP32 microcontroller for data processing. In2019 20th International Carpathian Control Conference (ICCC) 2019 May 26 (pp. 1-6). IEEE.
  8. Mondal A, Dey M, Das D, Nagpal S, Garda K. Chatbot: An automated conversation system for the educational domain. In2018 International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP) 2018 Nov 15 (pp. 1-5). IEEE.
  9. Ye Y, You H, Du J. Improved trust in human-robot collaboration with ChatGPT. IEEE Access. 2023 Jun 1;11:55748-54..
  10. Van der Zee RA, van Tuijl EA. A power-efficient audio amplifier combining switching and linear techniques. IEEE Journal of Solid-State Circuits. 1999 Jul;34(7):985-91.
  11. Kumar V, Singh H, Mohanty A. Real-Time Speech-To-Text/Text-To-Speech Converter with Automatic Text Summarizer Using Natural Language Generation and Abstract Meaning Representation. International Journal of Engineering and Advanced Technology (IJEAT). 2020 Apr 3;9:2361-5.

Regular Issue Subscription Review Article
Volume 11
Issue 02
Received April 24, 2024
Accepted July 12, 2024
Published July 22, 2024