Google launches most capable AI model Gemini — All you need to know

Google Gemini AI model. (Image Credit: Google)

Google has introduced Gemini, its highly anticipated general-purpose, multimodal, generative AI model. The tech giant claims Gemini is more powerful than any other model available, including OpenAI’s GPT-4.

“Human beings have five senses, and the world we built, and the media we consume is in those different modalities,” said Demis Hassabis, CEO of Google DeepMind. Similarly, “Gemini can understand the world around us in the way that we do and can absorb any type of input and output. Not just text like most models but also code, audio, image, and video.”

What is Google Gemini?

Gemini is a new and powerful artificial intelligence (AI) model from Google that can understand not just text but also images, videos, and audio. As Google’s latest large language model, Gemini can complete complex tasks in math, physics, and other fields. It can also understand, explain, and generate high-quality code in the world’s most popular programming languages, such as Python, Java, C++, and Go.

“Gemini is the result of large-scale collaborative efforts by teams across Google, including our colleagues at Google Research,” Google CEO Sundar Pichai wrote in a blog post. “It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video.”


What are LLMs?

A large language model (LLM) is a type of AI program that can recognize and generate text, among other tasks. LLMs are trained on huge sets of data, which makes them capable of understanding and generating natural language and other types of content, and of performing a wide range of tasks across diverse functions and applications.

LLMs have become a household name because of their pivotal role in popularizing generative AI. Examples of LLMs and the chatbots built on them include ChatGPT from OpenAI, Bard by Google, Llama by Meta, and Bing Chat by Microsoft.

Google claims Gemini beats GPT-4

Google’s Gemini is among the largest and most advanced AI models to date. Compared with other leading AI chatbot models, Gemini stands out for its native multimodal capability across a wide variety of tasks.

From image, audio and video understanding to mathematical reasoning, “Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely-used academic benchmarks” used in LLM research and development. “With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities,” Google said.

Three versions of Gemini

Google has released Gemini 1.0 in three different sizes:

  • Gemini Ultra: The largest and most capable model, built for highly complex tasks and for quickly understanding and acting on different types of information, including text, images, audio, video and code. Google DeepMind claims that Gemini Ultra outmatches GPT-4 on 30 out of 32 standard measures of performance.
  • Gemini Pro: The best model for scaling across a wide range of tasks. Pro is designed to power the latest version of Google’s AI chatbot, Bard, and is capable of understanding, summarizing, reasoning, coding and planning. In six out of eight benchmarks, Gemini Pro outperformed GPT-3.5, including in massive multitask language understanding.
  • Gemini Nano: The most efficient model for on-device tasks. Nano is sized to run on smartphones, specifically the Google Pixel 8 Pro. It will power new features such as suggesting replies within chat applications and summarizing text.

How to access Google Gemini?

You can try out Gemini for free. Gemini Pro is currently available through Google’s chatbot Bard. Gemini is also available on the Pixel 8 Pro, where Google is rolling out AI-suggested text replies, starting with WhatsApp and extending to Gboard in the future.

Gemini will gradually be integrated into other Google services.
