From AI Overviews to Gemini, what was announced?

May 15, 2024



Alphabet-owned Google, which announced several new generative artificial intelligence capabilities on Tuesday, said the new technology will not cannibalise its multi-billion-dollar Search business, but rather augment it, boosting result relevance.

Chief executive Sundar Pichai said it is unlikely that Google’s generative AI tool Gemini, previously known as Bard, will jeopardise its highly profitable existing product.

“From day one, how we have approached everything and the same thing we do going through this moment is to stay focused on users, meet them as their needs evolve … and we are seeing that people are responding and engaging with our products more,” Mr Pichai said at a media event.

“Across Search in Gemini, we are excited as we can expand the use cases, help users with more complex questions. I view all that as a net positive and feel like this is the moment of growth and opportunity and not the other way around.”

Google Search and related businesses accounted for more than 57 per cent of the company’s total sales in the first quarter of this year. They added nearly $46.2 billion to overall revenue, 14 per cent more than in the same period a year earlier.

But Gemini has yet to show any significant contribution to the company’s sales.

Sundar Pichai, chief executive of Alphabet, says he remains confident that generative AI will support growth in the group’s existing business. Bloomberg

The company announced various new AI features and products during its annual Google I/O conference at its headquarters in Mountain View, California, on Tuesday.

Following the announcements, Alphabet’s shares were slightly up, trading at $171.84 at 11.50pm in the UAE on Tuesday, giving the company a market value of $2.11 trillion.

Gemini 1.5 Flash: ‘cost-efficient’ model

Google introduced Gemini 1.5 Flash, its latest generative AI model that’s sleeker than previous versions and designed to be faster and more efficient.

The new variant is optimised for “high-volume, high-frequency tasks at scale and is more cost-efficient”, said Demis Hassabis, chief executive of Google DeepMind.

Its lighter weight allows 1.5 Flash to perform multimodal reasoning across vast amounts of information, and makes it suited to quick summarisation, chat applications, and data extraction from long documents and tables.

It was trained by its predecessor, 1.5 Pro, through a process called distillation, in which the “most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model”, Mr Hassabis said.
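Google did not detail its training recipe at the event. As a rough illustration only, the sketch below shows the textbook form of distillation, in which a small “student” model is penalised for diverging from the softened output probabilities of a larger “teacher”. The function names, the temperature value and the example scores are assumptions made for illustration, not Google’s method.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate answers.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 4.0, 1.0]    # prefers a different answer

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that agrees with the teacher receives the smaller loss, and training the smaller model to reduce this loss is what moves the larger model’s behaviour into it.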

In December, Google launched its first multimodal model, Gemini 1.0, in three sizes – Ultra, Pro and Nano.

That was followed by an enhanced version 1.5 Pro with a one million token context window. The latest 1.5 Flash is trained on up to a two million token context window.

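In natural language processing, a token is a unit of text that a model processes, such as a word, part of a word or a single character. The context window is the maximum number of tokens a model can take in at once.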
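To make the idea concrete, here is a minimal sketch that splits a sentence into tokens in two toy ways and checks the result against a context window. The splitting rules and function names are assumptions made for illustration; they are not the tokeniser Gemini uses.

```python
import re


def word_tokens(text):
    """Toy scheme: each word or punctuation mark is one token."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokens(text):
    """Toy scheme: each non-space character is one token."""
    return [ch for ch in text if not ch.isspace()]


def fits_context(tokens, context_window):
    """Return True if the token sequence fits inside the context window."""
    return len(tokens) <= context_window


sample = "Gemini 1.5 Flash is faster."

print(word_tokens(sample))
print(len(char_tokens(sample)))
print(fits_context(word_tokens(sample), 1_000_000))
```

The same sentence yields a different token count under each scheme, which is why context windows are quoted in tokens rather than in words or pages.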

“We are really pushing the frontiers here and we are making progress towards the ultimate goal of an infinite context window,” Mr Pichai said.

Next generation of open models

Google also unveiled Gemma 2, its next generation of open-source models that can be accessed by developers globally through various platforms. The company said the aim is to build AI innovation more “responsibly”.

First announced in February, Gemma is built through the same research and technology used to create Gemini, which is a closed AI model.

Google and Microsoft-backed OpenAI, the two frontrunners in the generative AI field, have largely kept their foundation models closed, expressing concern that large language models could be manipulated to spread misinformation or other potentially dangerous content.

In December, Google launched its first multimodal model, Gemini 1.0, in three sizes – Ultra, Pro and Nano. Getty

But proponents of open-source software say keeping these systems closed unfairly curtails innovation and hampers their potential to improve the world.

Gen AI for creators

Google also announced its latest video generation model Veo and Imagen 3, its “highest quality text-to-image model yet”.

Veo comes with an advanced understanding of natural language and visual semantics. It can generate video that closely represents the user’s creative vision, the company said. Google is expected to add Veo’s capabilities to YouTube Shorts soon.

Imagen 3 is capable of rendering fine detail and producing lifelike images, with fewer distracting visual artefacts than the company’s previous models, Google said.

“Imagen 3 better understands natural language, the intent behind your prompt, and incorporates small details from longer prompts … it’s also our best model yet for rendering text, which has been a challenge for image generation models,” said Eli Collins, vice president for product management.

Both Veo and Imagen 3 are available for select creators from Tuesday.

What about responsible AI?

Google has attracted backlash over its use of AI in the past.

In February, it suspended the image generation of individuals by Gemini, following criticism regarding its handling of racial issues. Google at the time apologised for “missing the mark”.

AI Overviews aims to boost the results of Search and help with complex questions. Reuters

“That’s completely unacceptable and we got it wrong,” Mr Pichai wrote in a memo to employees in February, seen by The National.

In 2015, the company had to issue an apology after its photos app categorised a black couple as “gorillas”.

On Tuesday, the company said it is taking measures to address the challenges raised by generative technologies.

“We have been working with the creative community and other external stakeholders, gathering insights and listening to feedback to help us improve and deploy our technologies in safe and responsible ways,” Ms Collins said.

“We have been conducting safety tests, applying filters, setting guardrails and putting our safety teams at the centre of development.”

AI Overviews coming to the US

From Tuesday, Google is making AI Overviews available to all users in the US. It expects to add more countries “soon” and bring the technology to more than a billion people by the end of the year.

AI Overviews adds to the results of Search and aims to help users with complex questions.

“Rather than breaking your question into multiple searches, you can ask your most complex questions, with all the nuances and caveats you have in mind, all in one go,” said Liz Reid, vice president and head of Google Search.

For example, if users are looking for a trendy cafe or a brunch spot in Dubai that is highly rated by locals and also offers outdoor seating and is pet-friendly, then they can ask, “find popular cafe or brunch spots in Dubai and show details on outdoor seating availability and pet policies”.

Google to send alerts for suspected scams during phone calls

Google said it is testing a new feature that uses Gemini Nano to offer real-time alerts during a call if it detects conversation patterns commonly associated with scams.

For example, users will receive an alert if a caller claiming to be a bank representative asks them to urgently transfer funds, make a payment through a gift card or share personal information such as card personal identification numbers or passwords, all of which are uncommon requests from a bank.

“This protection all happens on-device, so your conversation stays private to you,” said Sameer Samat, president of Android ecosystem.

Google said it will share more about this opt-in feature later this year.

Google founders Sergey Brin, left, and Larry Page at the company’s HQ in Mountain View, California, in 2003. Getty Images

Updated: May 15, 2024, 7:19 AM
