What are ChatGPT's Latest Updates? As It Can Now See, Hear, Speak

OpenAI's latest breakthrough has brought us exciting enhancements to ChatGPT, ushering in a new era of interactive and immersive AI experiences. ChatGPT can now see, hear, and speak. 

Sonu Vivek

Many of us grew up watching Iron Man, the sassy Avenger, rely on his AI assistant Jarvis to handle countless tasks. At some point, we've all wished for an AI assistant of our own that caters to our every whim. With the advances we are witnessing in AI, that dream may not be far off. ChatGPT, arguably the best-known chatbot today, has become even more dynamic with its latest upgrades. Let's delve into the new features OpenAI has rolled out and who can benefit from them.

What are ChatGPT's Latest Updates? 

OpenAI is at the forefront of innovation in a world where technology evolves at an unprecedented pace, constantly pushing the boundaries of what artificial intelligence can do. Its latest release adds voice and image capabilities to ChatGPT, so the chatbot can now see, hear, and speak.

Voice: Conversations Come Alive

A New Era of Interaction

Voice recognition and synthesis have long been goals of AI research, and OpenAI has taken a giant leap forward by integrating voice capabilities into ChatGPT. This exciting addition allows users to engage in dynamic, real-time conversations with their AI assistant. Whether you're on the go, spending time with family, or resolving a debate at the dinner table, ChatGPT is now your conversational companion.

How to Get Started?

Getting started with voice is a breeze. In the mobile app, head to Settings, open New Features, and opt into voice conversations. Once you've done that, tap the headphone icon in the top-right corner of the home screen and choose from five different voices. Each voice was created in collaboration with a professional voice actor to sound natural and engaging.

Behind the Scenes

ChatGPT's new voice capability is powered by a new text-to-speech model, which can generate human-like audio from just text and a few seconds of sample speech. In the other direction, OpenAI uses Whisper, its open-source speech recognition system, to transcribe your spoken words into text before ChatGPT responds.
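For readers curious how such a voice loop fits together, here is a minimal Python sketch of one conversational turn: speech is transcribed, the text is answered, and the answer is turned back into audio. This is an illustration only, not OpenAI's implementation. The names `run_voice_turn`, `VoiceTurn`, and the three `fake_*` functions are hypothetical placeholders that stand in for the speech recognizer, the chat model, and the text-to-speech model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    """One round trip of a voice conversation."""
    transcript: str     # what the speech recognizer heard
    reply_text: str     # what the chat model answered
    reply_audio: bytes  # synthesized speech for the reply


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    chat: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> VoiceTurn:
    """Run the three stages in order: speech -> text -> reply -> speech."""
    transcript = transcribe(audio_in)
    reply_text = chat(transcript)
    reply_audio = synthesize(reply_text)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Hypothetical stand-ins so the sketch runs without any model or network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are already words


def fake_chat(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")  # pretend these bytes are a waveform


turn = run_voice_turn(
    b"what is the weather", fake_transcribe, fake_chat, fake_synthesize
)
print(turn.reply_text)
```

In a real system each placeholder would be replaced by a call to an actual model. The point of the sketch is only the ordering of the stages, with speech recognition on the way in and speech synthesis on the way out.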

A World of Images: ChatGPT's Visual Proficiency

Exploring the Power of Visual Information

The addition of image capabilities to ChatGPT opens up a world of possibilities. Now, you can effortlessly share images with ChatGPT to troubleshoot issues, plan meals, or analyze complex data. This feature is made possible by the multimodal capabilities of GPT-3.5 and GPT-4, which apply their language reasoning skills to a wide range of images, including photographs, screenshots, and documents containing text and images.

How to Utilize Image Capabilities?

To make the most of ChatGPT's image recognition, simply tap the photo button to capture or select an image. If you're using iOS or Android, tap the plus button first. You can also discuss multiple images or use the drawing tool to guide your assistant's attention to a specific part of an image.

OpenAI's Gradual Approach: Safety and Progress

Ensuring Safe and Beneficial AI

OpenAI's overarching goal is to develop Artificial General Intelligence (AGI) that is not only powerful but also safe and beneficial to humanity. The gradual deployment of voice and image capabilities is a strategic choice that allows for continuous improvement and risk mitigation.

Voice Technology: A Unique Use Case

Voice technology, while promising, introduces new risks, such as malicious actors impersonating public figures or committing fraud. To address these concerns, OpenAI has limited the technology to a specific use case: voice chat. The voices were created with voice actors OpenAI worked with directly, and the company is collaborating with partners such as Spotify, which is using the technology in a pilot of its Voice Translation feature for podcasters.

Vision: Making it Safe and Useful

Vision-based models also bring novel challenges, such as hallucinations about what an image shows and people relying on the model's interpretation of images in high-stakes situations. Before broad deployment, OpenAI tested the model with red teamers and alpha testers to evaluate these risks.

ChatGPT and Your Daily Life

A Companion for All Occasions

ChatGPT's voice and image capabilities are designed to be useful in your daily life. Whether you're a podcaster looking to reach a wider audience through voice translation or a visually impaired person seeking assistance, the underlying technology is meant to help.

Transparency and Limitations

OpenAI is committed to transparency about the model's limitations. While ChatGPT excels at transcribing English text, it may not perform as well with languages that use non-Roman scripts. Users are advised to exercise caution when using ChatGPT for specialized topics and languages outside its primary proficiency.

"Users might depend on ChatGPT for specialized topics, for example in fields like research. We are transparent about the model's limitations and discourage higher-risk use cases without proper verification. We’ve also taken technical measures to significantly limit ChatGPT’s ability to analyze and make direct statements about people since ChatGPT is not always accurate and these systems should respect individuals’ privacy," OpenAI said in a blog. 

Who can access ChatGPT's new updates? 

OpenAI is rolling out voice and image capabilities to Plus and Enterprise users initially, with plans to expand access to other groups shortly. As these capabilities become more accessible, we anticipate even more innovative applications and exciting possibilities on the horizon.

OpenAI's latest advancements in voice and image capabilities have transformed ChatGPT into a versatile and indispensable tool for users across the globe. These enhancements not only open up new avenues for interaction but also underscore OpenAI's commitment to responsible and beneficial AI development.