Alibaba Releases Two New AI Models for Image Interpretation
The rapid evolution of A.I. tools, like Alibaba's image-reading models, underscores the need to harness A.I.'s capabilities responsibly. By releasing these models as open source, Alibaba lets users customize the tools for app development or research.
Summary: Alibaba's New AI Models
- Alibaba, the Chinese tech giant, unveiled two new open-source A.I. models: Qwen-VL and Qwen-VL-Chat.
- Unlike text-based models such as ChatGPT and Google Bard, these are vision-language models that interpret images.
- Qwen-VL-Chat's capabilities:
  - Provides directions by analyzing street signs.
  - Solves math problems from a photo.
  - Constructs narratives from multiple images.
  - Translates signs from Mandarin to English.
  - Assists in captioning photos for news agencies.
- Qwen-VL is an enhanced version of Alibaba's existing image-reading chatbot, now supporting higher-resolution images.
- Alibaba's announcement was limited to the public release; the company offered no additional comment to Fortune.
Applications and Competition
- Alibaba's image-scanning tech can aid visually impaired individuals, e.g., scanning product labels and reading them aloud.
- The models will be accessible on Alibaba Cloud's ModelScope and on Hugging Face, a startup known for its library of A.I. models.
- A day earlier, Meta introduced an A.I. model for coding, built on its open-source Llama 2 model released in July.
- Alibaba has been racing to match Meta's A.I. advancements; it recently launched Qwen-7B and Qwen-7B-Chat, which serve as the foundation for the latest releases.
- In a collaboration, Meta's Llama 2 model became available to the Chinese market via Alibaba’s cloud division in July.