Google Unveils Gemini: A Multimodal AI for Seamless Information Processing

Google has launched its Gemini model, available globally starting December 6, 2023. Touted by Google as its most capable and flexible AI model, Gemini will be integrated into Bard and the latest Pixel 8 Pro smartphones. If you log in to Google Bard, you will see a notification stating that Gemini Pro is Bard’s biggest upgrade.

Powerful Capabilities

According to Google, Gemini Pro has been fine-tuned to be more capable at understanding, summarizing, reasoning, coding, and planning. You can test drive Google Bard with Gemini Pro using text-based prompts. The company said that it will add support for other modalities within the next few days. As of this writing, Bard with Gemini Pro is available in more than 170 countries. Google will add support for additional languages and for regions such as Europe in the near future.

Commenting on the development, Google CEO Sundar Pichai said that Gemini 1.0 is optimized for three sizes: Ultra, Pro, and Nano. He also pointed to Gemini’s state-of-the-art performance across several leading benchmarks. He added that these are the first models of the Gemini era and the first realization of the company’s vision. Pichai described this new era of models as one of the biggest science and engineering efforts the company has undertaken.

Gemini Part of Extensive Teamwork

Google’s Gemini, a result of extensive collaboration among various teams, including Google Research, is designed to be multimodal, enabling it to “generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video.” To demonstrate its capabilities, Google showcased Gemini’s ability to perceive visual input much as a human eye would, comprehend and evaluate it in real time, and suggest the next course of action.

Gemini Variants

Gemini comes in three variants: Ultra, Pro, and Nano. Gemini Ultra is touted as the largest and most capable model, suited to highly complex tasks; Gemini Pro is designed to scale across a broad range of tasks; and Gemini Nano handles on-device tasks. As of today, Gemini Nano is integrated into Pixel 8 Pro, powering features such as Summarize in the Recorder app and Smart Reply in Gboard, initially with WhatsApp. The rollout of Gemini will extend to other Google products and services, including Search, Ads, Chrome, and Duet AI.

Upcoming Search Integration

Google is already experimenting with Gemini in Search, where it is making the Search Generative Experience (SGE) faster. The company reports a 40% reduction in latency in English in the US, along with improvements in quality.

Access for Developers and Enterprise Customers

Starting December 13, developers and enterprise customers can access Gemini Pro through the Gemini API in Google AI Studio or Google Cloud Vertex AI. Android developers can also build with Gemini Nano via AICore, a new system capability introduced in Android 14, starting with Pixel 8 Pro devices. Gemini Ultra is still undergoing trust and safety checks. It will first be made available to select customers, developers, partners, and safety experts for early experimentation and feedback, with a broader release to developers and enterprise customers planned for early 2024.
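For readers who want to try the Gemini API once access opens, here is a minimal Python sketch of what a text-only request might look like. It is an illustration under stated assumptions, not official sample code: the endpoint path, the `gemini-pro` model name, and the request and response shapes reflect our understanding of the public REST API and may change, and the helper names `build_generate_request` and `extract_text` are our own. The sketch only builds the request and parses a response; it does not send anything.

```python
import json
import urllib.request

# Assumed endpoint and model name; verify both against the current Gemini API docs.
API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-pro"


def build_generate_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{MODEL}:generateContent?key={api_key}"
    # Assumed request shape: a list of contents, each holding text parts.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(response_json: dict) -> str:
    """Join the text parts of the first candidate in a parsed response."""
    parts = response_json["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```

To actually call the service, you would create an API key in Google AI Studio, pass the request to `urllib.request.urlopen`, and feed the parsed JSON response to `extract_text`. Building the request separately from sending it keeps the sketch easy to inspect without a key.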

Additionally, Bard, another offering from Google, will receive a “specifically tuned version of Gemini Pro in English for more advanced reasoning, planning, understanding, and more.” A forthcoming release, Bard Advanced, will grant users early access to Google’s most advanced models and capabilities, beginning with Gemini Ultra.

Addressing concerns about hallucinations in AI models, Eli Collins, VP of Product at Google DeepMind, stated that while Gemini has made strides in improving factuality, the model is still capable of hallucinating. Integration with products like Bard involves additional techniques to enhance response accuracy.

Google asserts that Gemini Ultra’s performance surpasses current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development. With a score of 90.0%, Gemini Ultra is, according to Google, the first model to outperform human experts on massive multitask language understanding (MMLU). This benchmark covers 57 subjects, such as math, physics, history, law, medicine, and ethics, and tests both world knowledge and problem-solving abilities. Moreover, Google claims that Gemini can “understand, explain, and generate high-quality code” in popular programming languages like C++, Python, Go, and Java.

We tested Google Bard powered by Gemini today, but the system still gives the same old replies. Bard provided information about the person we asked about, but it did not list the top five songs performed by that person. The same prompt, however, works well with Microsoft Copilot.

Anand Narayanaswamy is the editor-in-chief of Netans. He was recognized as a Microsoft Most Valuable Professional (MVP) for 9 years (2002 to 2011) and again as a Microsoft MVP in Surface under Windows and Devices in January 2024. He worked as a Chief Technical Editor with ASPAlliance and was part of the ASPInsider program. Anand has published several articles and reviews of software and hardware products for a range of software and technology websites. He is also active on social media and participates as an influencer for various brands. Anand can be reached at admin@netans.com