ChatGPT-Plus can read images, hear your voice and answer back

Humankind on the hunt for Artificial General Intelligence (AGI)

ChatGPT-Plus has gained voice chat abilities and can discuss images in what is a major upgrade from OpenAI. The highly anticipated upgrades allow its popular ChatGPT chatbot to interact with images and voices, not just text.

This move is part of OpenAI’s vision (or GPT-V) for artificial general intelligence (AGI) that can perceive and process information from multiple modes.

AGI could learn to perform any intellectual task that human beings can, including impersonation and visual interpretation.

“We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about,” OpenAI said in its official blog post.

OpenAI said the new ChatGPT-Plus includes voice chat that mimics human voices and has the ability to discuss images. This is thanks to integration with the company’s image generation models. ChatGPT-Plus is a subscription-based service powered by GPT-4.

OpenAI also recently unveiled DALL-E 3, its most advanced text-to-image generator yet. It can create high-fidelity images from text prompts while understanding complex contexts and concepts expressed in natural language. It will be built into ChatGPT-Plus.

DALL-E 3 and ChatGPT-Plus launch OpenAI into the next era where AI assistants simulate how humans perceive the world with multiple senses.

Multi-modality

Multimodality creates opportunities for things like analyzing a photo of a geometry problem and providing tips on how to solve it, instead of explaining the entire problem through typed text.

These new multimodal features allow users to, for example, request a dinner recipe based on nothing more than a photo of their fridge’s contents. If you like bedtime stories, why not ask ChatGPT to invent one based on your taste in authors and genres, and read it to you?

ChatGPT-Plus’s ability to understand spoken words and respond out loud is reminiscent of what Apple’s Siri and Amazon’s Alexa do, except that it is much more advanced.

OpenAI’s release came on the same day as the announcement of Amazon’s investment in Anthropic.

Amazon and AI

Large language model (LLM) developer Anthropic announced that it had raised as much as $4 billion from Amazon. Amazon Web Services will become Anthropic’s primary cloud provider. Anthropic will train and deploy its future LLMs on AWS’ Trainium and Inferentia chips, which are specialized for training and inference respectively.

Last February, Google invested $400 million in Anthropic and announced it would be Anthropic’s “preferred” cloud provider.

Google’s upcoming Gemini models are also expected to be multimodal and more accurate than other currently available models.

The Amazon investment is the latest AI and LLM bet from a major cloud provider.

Microsoft and OpenAI integration

Microsoft is the largest investor in OpenAI, having poured over $10 billion into it. The tech giant is integrating advanced generative AI capabilities into its own consumer products.

The company announced AI upgrades to Windows 11, Office, and Bing search leveraging models like DALL-E 3 and Copilot. The latter is OpenAI’s programming assistant.

OpenAI, however, is wary of potential risks with more powerful multimodal AI systems. This is especially true as these involve vision and voice generation, furthering man’s hunt for AGI.

“OpenAI’s goal is to build AGI that is safe and beneficial,” the company wrote in its announcement. “We believe in making our tools available gradually, which allows us to make improvements and refine risk mitigations over time.”

OpenAI said that Plus and Enterprise users will have access to these new functionalities over the next two weeks, with plans to later expand availability to developers.

SAP and ChatGPT-like Joule

On September 26, 2023, SAP, a German multinational software company, is set to announce Joule, the company’s ChatGPT-like generative AI assistant for customers in corporate finance and human resources, among other areas.

Joule looks similar to the EinsteinGPT product Salesforce recently launched, but it targets a wider range of applications rather than being geared only to sales teams.

To power Joule, SAP is using a multitude of LLMs from providers including OpenAI as well as companies in which SAP holds stakes, such as Aleph Alpha, Anthropic and Cohere.

The race to dominate the AI industry is just beginning.
