Google ushers in the “Gemini era” with AI advancements


Ryan Daws is a senior editor at TechForge Media, with a seasoned background spanning over a decade in tech journalism. His expertise lies in identifying the latest technological trends, dissecting complex topics, and weaving compelling narratives around the most cutting-edge developments. His articles and interviews with leading industry figures have gained him recognition as a key influencer by organisations such as Onalytica. Publications under his stewardship have since gained recognition from leading analyst houses like Forrester for their performance. Find him on X (@gadget_ry) or Mastodon (@[email protected])



Google has unveiled a series of updates to its AI offerings, including the introduction of Gemini 1.5 Flash, enhancements to Gemini 1.5 Pro, and progress on Project Astra, its vision for the future of AI assistants.

Gemini 1.5 Flash is a new addition to Google’s family of models, designed to be faster and more efficient to serve at scale. While lighter-weight than 1.5 Pro, it retains the ability to reason multimodally across vast amounts of information and features the same breakthrough long context window of one million tokens.

“1.5 Flash excels at summarisation, chat applications, image and video captioning, data extraction from long documents and tables, and more,” explained Demis Hassabis, CEO of Google DeepMind. “This is because it’s been trained by 1.5 Pro through a process called ‘distillation,’ where the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.”
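Google has not published the details of how 1.5 Flash was distilled, so the following is only a generic sketch of classic knowledge distillation, not the company's actual method. It assumes a "teacher" and a "student" that each output raw scores (logits) over the same set of classes; the function names, the temperature value, and the example logits are all illustrative assumptions. The student is penalised according to how far its softened probability distribution sits from the teacher's.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper for illustration: a higher temperature exposes more
    of the teacher's relative preferences among the less likely classes.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # Scaling by temperature squared is the usual convention for keeping
    # gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# Made-up logits for a three-class toy problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

A student whose outputs track the teacher's scores a loss near zero, while one that ranks the classes differently scores a much larger loss. In practice this term is typically minimised by gradient descent, often alongside a standard loss on labelled data.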

Meanwhile, Google has significantly improved the capabilities of its Gemini 1.5 Pro model, extending its context window to a groundbreaking two million tokens. Enhancements have been made to its code generation, logical reasoning, multi-turn conversation, and audio and image understanding capabilities.

The company has also integrated Gemini 1.5 Pro into Google products, including the Gemini Advanced and Workspace apps. Additionally, Gemini Nano now understands multimodal inputs, expanding beyond text-only to include images.

Google announced its next generation of open models, Gemma 2, designed for breakthrough performance and efficiency. The Gemma family is also expanding with PaliGemma, the company’s first vision-language model inspired by PaLI-3.

Finally, Google shared progress on Project Astra (advanced seeing and talking responsive agent), its vision for the future of AI assistants. The company has developed prototype agents that can process information faster, understand context better, and respond quickly in conversation.

“We’ve always wanted to build a universal agent that will be useful in everyday life. Project Astra shows multimodal understanding and real-time conversational capabilities,” explained Google CEO Sundar Pichai.

“With technology like this, it’s easy to envision a future where people could have an expert AI assistant by their side, through a phone or glasses.”

Google says that some of these capabilities will be coming to its products later this year.

