Alibaba Releases Two New AI Models for Image Interpretation

Aug 28, 2023
The rapid evolution of A.I. tools such as Alibaba's image-reading models underscores the need to harness A.I.'s capabilities responsibly. By releasing these models as open source, Alibaba lets users customize them for app development or research.
Summary: Alibaba's New AI Models
  • Alibaba, a major Chinese tech giant, unveiled two new open-source A.I. models: Qwen-VL and Qwen-VL-Chat.
  • Unlike text-based models such as ChatGPT and Google Bard, these are vision-language models that interpret images.
  • Qwen-VL-Chat's capabilities:
    • Provides directions by analyzing street signs.
    • Solves math problems from a photo.
    • Constructs narratives from multiple images.
    • Translates signs from Mandarin to English.
    • Assists in captioning photos for news agencies.
  • Qwen-VL is an enhanced version of Alibaba's existing image-reading chatbot, now supporting higher resolution images.
  • Alibaba announced the models only through its public release and gave no additional comment to Fortune.
Applications and Competition
  • Alibaba's image-scanning tech can aid visually impaired individuals, e.g., scanning product labels and reading them aloud.
  • The models will be accessible on Alibaba Cloud’s ModelScope and on Hugging Face, a startup that hosts a library of A.I. models.
  • A day earlier, Meta introduced an A.I. model for coding, based on its open-source Llama 2 model released in July.
  • Alibaba has been racing to match Meta's A.I. advancements. The company recently launched Qwen-7B and Qwen-7B-Chat, which are foundational to the latest releases.
  • In a collaboration, Meta's Llama 2 model became available to the Chinese market via Alibaba’s cloud division in July.
 