ChatGPT already listens and speaks. Soon it may see as well

ChatGPT meets a dog
OpenAI

ChatGPT’s Advanced Voice Mode, which allows users to converse with the chatbot in real time, could soon gain the gift of sight, according to code discovered in the platform’s latest beta build. While OpenAI has not yet confirmed a release date for the new feature, code in the ChatGPT v1.2024.317 beta build spotted by Android Authority suggests that the so-called “live camera” could arrive soon.

OpenAI first showed off Advanced Voice Mode’s vision capabilities for ChatGPT in May, when the feature launched in alpha. During a demo posted at the time, the system was able to identify that it was looking at a dog through the phone’s camera feed, recognize the dog from past interactions, recognize the dog’s ball, and associate the dog’s relationship to the ball (i.e., playing fetch).

Dog meets GPT-4o

The feature was an immediate hit with alpha testers as well. X user Manuel Sainsily used it to great effect, answering spoken questions about his new kitten based on the camera’s video feed.


Trying #ChatGPT’s new Advanced Voice Mode that just got released in Alpha. It feels like face-timing a super knowledgeable friend, which in this case was super helpful — reassuring us with our new kitten. It can answer questions in real-time and use the camera as input too! pic.twitter.com/Xx0HCAc4To

— Manuel Sainsily (@ManuVision) July 30, 2024

Advanced Voice Mode was subsequently released in beta to Plus and Enterprise subscribers in September, albeit without its additional visual capabilities. Of course, that didn’t stop users from going wild in testing the feature’s vocal limits. Advanced Voice “offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions,” according to the company.

The addition of digital eyes would certainly set Advanced Voice Mode apart from OpenAI’s primary competitors, Google and Meta, both of which have introduced conversational features of their own in recent months.

Gemini Live may be able to speak more than 40 languages, but it cannot see the world around itself (at least until Project Astra gets off the ground) — nor can Meta’s Natural Voice Interactions, which debuted at the Connect 2024 event in September, use camera inputs.

OpenAI also announced today that Advanced Voice Mode is now available for paid ChatGPT Plus accounts on desktop. Previously exclusive to mobile, the feature can now be accessed right from your laptop or PC as well.

Andrew Tarantola
Former Computing Writer
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…