OpenAI is enhancing ChatGPT's capabilities by enabling the tool to 'see, hear, and speak' in the latest updates to the viral chatbot.
OpenAI is releasing updates that will allow ChatGPT to understand spoken prompts and respond with its new voice in a back-and-forth conversation with the user. The chatbot can also respond to image prompts. The changes bring its capabilities more in line with those offered by Apple's Siri, Google Lens and Google's voice assistant, and Amazon's Alexa.
ChatGPT's new voice feature is powered by a text-to-speech model capable of generating human-like audio from text and a few seconds of sample speech.
The company employed professional voice actors to create the chatbot's voices, and it uses Whisper, its open-source speech recognition system, to transcribe spoken words into text.
The company acknowledges that there are risks posed by the new voice technology, including the possibility of fraud or impersonation.
OpenAI said in a statement that it will roll out voice and image capabilities to users of the Plus and Enterprise versions of ChatGPT over the next two weeks.