OpenAI Announces New Feature that Enables ChatGPT to ‘Speak’, ‘Listen’ and Process Images

September 26, 2023 | by Samuel Nwite

ChatGPT has gained the ability to comprehend spoken language, respond with a synthetic voice, and process images, effectively allowing it to “see, hear, and speak”, OpenAI announced on Monday.

The new feature, OpenAI’s biggest since the introduction of GPT-4, arrives amid declining usage of ChatGPT, a drop fueled by competition.

The update lets users hold voice conversations through ChatGPT’s mobile app, with a choice of five distinct synthetic voices for the bot’s responses. Users can also now share images with ChatGPT, highlight specific areas of interest, or request analysis, for example by asking what type of clouds appear in an image.

OpenAI said that these updates will be gradually rolled out to paying users over the next two weeks. Voice functionality will be accessible exclusively through the iOS and Android apps, while image processing capabilities will be available across all platforms.

The launch of ChatGPT late last year set off a frenzy that accelerated investment in artificial intelligence, with more companies pumping billions of dollars into the research and development of chatbots powered by AI language models.

However, the burgeoning technology has raised several concerns that industry leaders have yet to address, including the use of private data.

In a report on Monday, CNBC said OpenAI directed it to the company’s guidance on voice interactions, which states that OpenAI does not retain audio clips and does not use them to improve its models. However, the guidance also notes that transcriptions are treated as inputs and may be used to improve the large language models.

In the ever-intensifying competition of the artificial intelligence landscape, where chatbot leaders like OpenAI, Microsoft, Google, and Anthropic are vying for supremacy, a significant emphasis has been placed on feature enhancements.

Tech giants are in a race to not only roll out new chatbot applications but also introduce a slew of new features, with this summer being a particularly active period. For instance, Google has unveiled a range of updates for its Bard chatbot, while Microsoft has integrated visual search capabilities into Bing.

Earlier this year, Microsoft’s additional $10 billion investment in OpenAI became the largest AI investment of the year, according to PitchBook. In April, OpenAI reportedly concluded a $300 million share sale that valued the startup at between $27 billion and $29 billion, with investments from firms such as Sequoia Capital and Andreessen Horowitz.

However, these advancements are not without their share of concerns. Experts have raised alarms about AI-generated synthetic voices, which, while providing users with a more natural experience, also hold the potential to create more convincing deepfakes. Cybersecurity experts and researchers have already begun to delve into the ways deepfakes could be exploited to infiltrate cybersecurity systems.

OpenAI, in its Monday announcement, acknowledged these concerns, emphasizing that the synthetic voices were meticulously crafted in collaboration with voice actors with whom the company had direct working relationships, rather than being sourced from unknown individuals.

Nonetheless, the release provided limited details on how OpenAI intends to use consumer voice inputs and how it will secure that data when it is used. According to the company’s terms of service, consumers maintain ownership of their inputs “to the extent permitted by applicable law.”

OpenAI is looking to roughly triple its valuation in less than a year to as much as $90 billion, The Wall Street Journal reports, citing anonymous sources. It says the artificial intelligence startup behind ChatGPT is reaching out to investors about a new share sale that would catapult OpenAI’s valuation to the $80 billion to $90 billion range. It was valued at about $29 billion in a share sale earlier this year. OpenAI is credited with intensifying interest in AI after releasing ChatGPT late last year.
