Truly a historic Week in AI (Notes)

Sep 30, 2023
Riley Brown 29 Sep 2023
I mean it when I say this. This was the most insane week in AI that I’ve seen yet. Below you will find the notes that I took from this week.
These are the updates from the past week:

ChatGPT gets new multimodal features

  • OpenAI announced that ChatGPT is going multimodal: it will now support voice prompts and image uploads[1]. The features will roll out to Plus and Enterprise users over the next two weeks[1]. Voice makes interactions more conversational, while image support lets users ask questions about images they upload, similar to Google Lens[1]. ChatGPT analyzes the image in the context of the accompanying text and produces an answer, and it can carry on a back-and-forth conversation about the subject.
  • This matters because it moves OpenAI closer to its vision of artificial general intelligence, letting AI perceive and interact with the world in a more human-like manner[3]. Voice and image input make ChatGPT more intuitive and versatile, opening up new applications across industries.

ChatGPT brings back the "browse" functionality

  • OpenAI has reintroduced the "browse" functionality to ChatGPT, allowing it to access the internet and retrieve information in real time[6]. This addresses one of ChatGPT's main limitations: its reliance on training data that stopped in 2021[7]. With browsing, ChatGPT can provide more up-to-date and relevant information, which makes it more useful for research, content creation, and similar tasks. The feature uses the Bing search API and benefits from Microsoft's work on source reliability and its safe mode, which helps prevent retrieval of problematic content.

OpenAI and Jony Ive working on iPhone for AI

  • OpenAI, Jony Ive, and Masayoshi Son collaboration:
    - OpenAI, led by Sam Altman, is teaming up with former Apple designer Jony Ive and SoftBank CEO Masayoshi Son to develop an AI-powered consumer device[1][2].
    - The project aims to create a new kind of device that leverages OpenAI's generative AI technology, such as ChatGPT[2].
    - The business structure and specifics of the device remain unclear, but the collaboration has been in discussion for much of the year[1].
  • Why it matters:
    - This collaboration brings together prominent figures from the tech industry, potentially leading to groundbreaking innovations in AI-powered devices[1].
    - The device could revolutionize the way users interact with AI, similar to how Apple's touchscreen technology transformed mobile internet[2].
    - An AI-native operating system or a reimagined phone are among the speculative possibilities for the device[5].

Anthropic partners with Amazon in $4b deal

  • Who: Amazon and Anthropic
  • What: Strategic collaboration and investment in Anthropic
  • When: Announced on September 25, 2023
  • Where: Amazon Web Services (AWS) and Anthropic's AI technologies
  • Why: To advance generative AI and enhance AI capabilities across Amazon's businesses
  • How: Amazon will invest up to $4 billion in Anthropic and have a minority ownership position in the company
Why It Matters
  1. Collaboration: Amazon and Anthropic are working together to develop reliable and high-performing foundation models in the AI industry2.
  2. Investment: Amazon's investment of up to $4 billion in Anthropic demonstrates its commitment to advancing AI technologies and keeping pace with rivals like Microsoft and Google8.
  3. Integration: Amazon developers and engineers will be able to build with Anthropic models via Amazon Bedrock, incorporating generative AI capabilities into their work and enhancing existing applications1.
  4. Safety and security: Both Amazon and Anthropic are committed to the safe training and deployment of advanced foundation models, with Amazon promoting and implementing safety best practices on Amazon Bedrock2.
  5. Impact on various industries: The collaboration between Amazon and Anthropic can benefit industries by providing access to advanced AI technologies, driving innovation, and improving customer experiences1.

    Amazon Bedrock becomes widely available

    • Who: Amazon Web Services (AWS) and AI companies like AI21 Labs, Anthropic, Cohere, Stability AI, and Meta
    • What: General availability of Amazon Bedrock, a fully managed service for generative AI applications
    • When: Announced on September 28, 2023
    • Where: AWS Regions US East (N. Virginia) and US West (Oregon)
    • Why: To provide AWS customers with secure cloud access to foundation models and tools for building generative AI applications
    • How: Through an API that offers a choice of high-performing foundation models from leading AI companies
    Why It Matters
    1. Accessibility: Amazon Bedrock is now generally available to all AWS customers, allowing them to build generative AI applications using foundation models from leading AI companies15.
    2. Versatility: The service supports a wide range of use cases, including search, personalization, and Retrieval-Augmented Generation (RAG)5.
    3. Ease of use: Users can explore the capabilities of foundation models using the Amazon Bedrock console or playgrounds, which provide a conversational interface to interact with chat, text, and image models5.
    4. Collaboration: Amazon Bedrock includes models from AI21 Labs, Anthropic, Cohere, Stability AI, and Meta, showcasing the collaboration between AWS and leading AI companies1.
    5. Innovation: The general availability of Amazon Bedrock advances the development and integration of transformative AI technologies in various industries6.

    Spotify cloning podcaster voices for translation

    What happened
    • Spotify partnered with OpenAI to develop an AI-powered voice translation feature for podcasts1.
    • The feature clones podcasters' voices and translates their content into other languages, starting with Spanish, French, and German2.
    • Initial participants include Dax Shepard, Monica Padman, Lex Fridman, Bill Simmons, and Steven Bartlett1.
    Why it matters
    • The technology allows podcasters to reach a wider audience by breaking language barriers6.
    • It retains the original voices and styles of podcasters, providing a more authentic listening experience5.
    • The success of this feature could potentially lead to its application in movies and TV shows, translating content while preserving the original voices3.
    • However, concerns have been raised about the potential misuse of voice cloning technology and its impact on traditional voice actors3.

    Tesla's Optimus robot can sort autonomously

    What happened
    • Tesla released an update video demonstrating the Optimus robot's ability to autonomously sort objects2.
    • The robot uses an end-to-end trained neural network to perform tasks such as self-calibration and physically sorting colored blocks into their respective trays2.
    • Optimus can adapt to dynamic changes in its environment, such as when a human intervenes and moves the blocks around8.
    Why it matters
    • The progress in Optimus' development showcases Tesla's advancements in AI and robotics5.
    • The robot's ability to perform tasks autonomously and adapt to changing environments demonstrates its potential for various applications in the future8.
    • The success of Optimus could lead to further advancements in AI-driven humanoid robots and their integration into various industries6.

    Getty Launches AI Image Generator

    • Who: Getty Images partnered with Nvidia to develop the AI image generator1.
    • What: Generative AI by Getty Images is a new tool that allows users to create images using Getty's library of licensed photos1.
    • When: The tool was announced on September 25, 2023[7].
    • Where: The tool can be accessed on GettyImages.com and through an API7.
    • Why: The launch aims to provide a commercially safe and legally protected AI-generated image solution for customers4.
    • How: The AI tool was developed using Nvidia's Edify model from its generative AI model library Picasso and trained on Getty Images' vast library of licensed content2.
    Why It Matters
    • Commercially Safe: Getty's AI image generator is considered "commercially safe" because it uses only licensed images from Getty's library, ensuring legal rights and indemnification for commercial use4.
    • Responsibly Developed: Getty has been intentional in developing its AI tool, addressing both excitement and hesitation around generative AI2.
    • Customization: Users can write their own prompts or use a prompt builder, and later in the year, they will be able to add their own training data to generate images2.
    • Artist Compensation: Getty will pay artists who have helped train its AI system on a recurring basis, acknowledging their expertise and investment in the content10.
    • Competitive Landscape: Getty's launch follows other companies like Adobe and Shutterstock, which have also introduced AI image generators25.

    New Windows 11 rolled out this week with AI Copilot

    • Who: Microsoft1.
    • What: Windows 11 update with AI Copilot feature1.
    • When: The update was released on September 26, 2023[1].
    • Where: The update is available for Windows 11 users1.
    • Why: To enhance productivity and creativity with AI integration1.
    • How: AI Copilot is integrated into Windows 11 and works with Bing Chat and ChatGPT plugins2.
    Why It Matters
    • AI Integration: AI Copilot is a significant step towards integrating AI technology into Microsoft products, enhancing user experience1.
    • Improved Productivity: AI Copilot can assist users in various tasks across different apps, such as Word, PowerPoint, and Edge, improving productivity11.
    • Voice Control: Users can interact with AI Copilot using voice commands, making it more accessible and convenient11.
    • Cross-Platform Functionality: AI Copilot can connect to users' phones, providing seamless integration and access to information across devices11.
    • Competitive Edge: The introduction of AI Copilot in Windows 11 demonstrates Microsoft's commitment to staying ahead in the technology landscape and providing innovative solutions to its users1.

    Microsoft shares AutoGen research for customizable agents

    • Who: Microsoft2.
    • What: AutoGen, a framework for customizable and conversable AI agents1.
    • When: The research paper was published on August 16, 2023[11].
    • Where: AutoGen is available as an open-source Python package2.
    • Why: To simplify the orchestration, optimization, and automation of large language model (LLM) workflows1.
    • How: AutoGen enables multi-agent conversations, integrating AI, humans, and tools seamlessly1.
    Why It Matters
    • Streamlined LLM Workflows: AutoGen simplifies complex LLM workflows, making it easier for developers to create AI applications1.
    • Customizable Agents: Developers can create custom agents with specialized capabilities and roles, allowing for more tailored AI solutions2.
    • Multi-Agent Conversations: AutoGen facilitates conversations between multiple AI agents, enhancing reasoning abilities, task completion, and error reduction15.
    • Collaboration: The open-source nature of AutoGen encourages contributions from a diverse community, fostering innovation and improvements2.
    • Competitive Advantage: AutoGen demonstrates Microsoft's commitment to advancing AI technology and staying ahead in the competitive landscape2.
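
To make the "multi-agent conversation" idea above a bit more concrete, here is a minimal sketch in plain Python. It is not AutoGen's API: `ToyAgent`, `run_chat`, and the two reply functions are hypothetical names made up for this illustration, and the "agents" are simple deterministic functions rather than LLM calls. It only shows the general pattern of two agents taking turns until one of them signals that the task is done.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (sender name, message text)


class ToyAgent:
    """A stand-in for an agent: a name plus a function that produces replies."""

    def __init__(self, name: str, reply_fn: Callable[[List[Message]], str]):
        self.name = name
        self.reply_fn = reply_fn

    def reply(self, history: List[Message]) -> str:
        return self.reply_fn(history)


def run_chat(first: ToyAgent, second: ToyAgent, opening: str,
             max_turns: int = 6, stop_word: str = "APPROVED") -> List[Message]:
    """Alternate between two agents until the stop word appears or turns run out."""
    history: List[Message] = [(first.name, opening)]
    speaker, other = second, first
    for _ in range(max_turns):
        text = speaker.reply(history)
        history.append((speaker.name, text))
        if stop_word in text:
            break
        speaker, other = other, speaker
    return history


def solver_reply(history: List[Message]) -> str:
    # The first draft is deliberately wrong so the review loop has something to fix.
    feedback_seen = any(sender == "checker" for sender, _ in history)
    return "17 + 25 = 42" if feedback_seen else "17 + 25 = 41"


def checker_reply(history: List[Message]) -> str:
    # Parse the number claimed in the latest message and verify it.
    claimed = int(history[-1][1].split("=")[1])
    return "APPROVED" if claimed == 17 + 25 else "Please recheck the sum."


solver = ToyAgent("solver", solver_reply)
checker = ToyAgent("checker", checker_reply)
history = run_chat(solver, checker, opening=solver.reply([]))

for sender, text in history:
    print(f"{sender}: {text}")
```

In a real framework the reply functions would call a language model or a tool, but the turn-taking loop and the stop condition are the parts this sketch is meant to show.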

    The writers' strike ended and studios will be allowed to train AI on writers' work

    Hollywood Studios Can Train AI Models on Writers' Work
    • The Writers Guild of America (WGA) reached a tentative agreement with the Alliance of Motion Picture and Television Producers (AMPTP) to end a nearly five-month strike1.
    • Hollywood studios are expected to retain the right to train artificial intelligence models based on writers' work under the terms of the tentative labor agreement1.
    • The writers will receive credit and compensation for work they do on scripts, even if studios partially rely on AI tools11.
    • The WGA reserves the right to assert that exploitation of writers' material to train AI is prohibited by the Minimum Basic Agreement (MBA) or other law3.
    Why It Matters
    • AI tools can assist writers in producing content, and many companies have already utilized these tools to generate various types of content2.
    • The use of AI in writing has raised concerns about job displacement and the possibility that AI-generated content may lack a human touch8.
    • The new agreement regulates how AI can be credited or trained, ensuring that writers are protected and compensated for their work13.
    • The outcome of this agreement could set a precedent for other industries facing the impact of AI on creative work15.

    Google accidentally indexed Bard conversations

    • Google's search engine accidentally indexed conversations with its AI chatbot, Bard12.
    • The issue was discovered after an SEO consultant reported it on Twitter (now X)[1].
    • Google confirmed that the indexing was unintentional and is working to block Bard chat transcripts from appearing in search results3.
    • The indexing only occurred for conversations that users had explicitly shared using Bard's public link-sharing feature6.
    • Private conversations that were not shared publicly were not indexed1.
    Why It Matters
    • The accidental indexing raised privacy concerns, as one-on-one conversations were being surfaced publicly without users' knowledge3.
    • This incident highlights the importance of exercising caution when sharing sensitive information with AI chatbots3.
    • Google's response to the issue demonstrates its commitment to user privacy and addressing potential privacy breaches1.

    Meta shows off Emu, their new AI Image model

    Meta Introduces Emu, a New AI Image Model
    • Meta unveiled Emu, an AI image generator that can create realistic and aesthetically pleasing images from text prompts9.
    • Emu is based on a latent diffusion model pre-trained on 1.1 billion image-text pairs and fine-tuned with a few thousand carefully selected high-quality images10.
    • The model will be integrated with Meta's chat platforms, including WhatsApp, Messenger, and Instagram, to create AI-generated stickers11.
    Why It Matters
    • Emu represents a significant advancement in AI-generated content, highlighting the importance of curation and human expertise in refining AI-generated images12.
    • The integration of Emu with Meta's chat platforms can enhance user experience and enable more creative and expressive communication11.
    • As AI continues to reshape industries and how we interact with technology, Emu demonstrates the potential for combining art and science in the quest for higher-quality AI-generated images12.

    WhatsApp and Messenger to get AI stickers

    • Meta announced the introduction of AI stickers for WhatsApp, Messenger, Instagram, and Facebook Stories1.
    • The AI stickers use technology from Llama 2 and Meta's image generation model, Emu, to generate customized stickers based on text prompts1.
    • The feature will roll out to select English-language users over the next month7.
    Why It Matters
    • AI stickers provide users with a more creative and expressive way to communicate in chats and stories1.
    • The integration of AI-generated content in popular messaging apps demonstrates the growing influence of AI in enhancing user experiences8.
    • The introduction of AI stickers showcases Meta's commitment to incorporating AI technologies across its platforms8.

    Meta Announcements

    • Meta will be rolling out niche chatbots to help in specific areas
    • Meta is working on personal chatbots tailored to our individual needs
    • Meta is rolling out image editing with Segment Anything functionality
    • Meta is adding natural language chat to its sunglasses in collaboration with Ray-Ban
    • Meta's Mark Zuckerberg took part in a virtual podcast interview

    Leonardo AI just added the ability to use LoRAs

    • Leonardo AI, an AI art image generator, has added support for Low-Rank Adaptation (LoRA) models in its latest update5.
    • LoRAs are small add-on files that steer a base model toward a specific style or subject, making AI-generated images more specific without requiring users to download an entirely new model8.
    • Users can now infuse themes into their generations using LoRA models, allowing for more refined and detailed AI-generated images5.
    Why It Matters
    • The addition of LoRA support in Leonardo AI enables users to create more customized and high-quality AI-generated images8.
    • This update demonstrates the continuous improvement and innovation in AI art generation, providing users with more control over the creative process12.
    • By offering more advanced features, Leonardo AI is contributing to the growth and adoption of AI technologies in the art and design industries12.
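
For readers curious about what "Low-Rank Adaptation" means under the hood, the sketch below shows the core arithmetic in plain Python: instead of replacing a full weight matrix W, a LoRA stores two thin matrices B and A and the adapted weight is W + (alpha / r) * B·A. This is a toy illustration under assumed values (a 4x4 weight, rank 1, made-up numbers); it says nothing about how Leonardo AI actually implements the feature, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) without mutating the inputs."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


def param_count(matrix):
    """Number of stored values in a matrix."""
    return sum(len(row) for row in matrix)


# Hypothetical 4x4 base weight (identity matrix) standing in for one model layer.
base = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Hypothetical rank-1 adapter: B is 4x1, A is 1x4.
rank = 1
lora_b = [[0.5], [0.0], [0.0], [0.0]]
lora_a = [[0.0, 2.0, 0.0, 0.0]]

adapted = apply_lora(base, lora_b, lora_a, alpha=1.0, rank=rank)

print("base params:   ", param_count(base))
print("adapter params:", param_count(lora_b) + param_count(lora_a))
print("adapted row 0: ", adapted[0])
```

The adapter here stores 8 numbers against the base layer's 16; for real image models the gap is far larger, which is why a LoRA file is small compared with a full model download.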

    Pika Labs adds "Text Encrypt"

    • Pika Labs introduced a new feature called "Text Encrypt" that allows users to embed text messages within videos5.
    • The feature can be used to hide words within images, adding an extra layer of creativity and customization to video content7.
    • Pika Labs is a text-to-video platform that leverages advanced AI technology to transform text or images into engaging, dynamic content6.
    Why It Matters
    • The "Text Encrypt" feature enhances the capabilities of Pika Labs' platform, providing users with more creative options for their video content5.
    • This update demonstrates the continuous innovation in AI-powered content creation tools, making it easier for users to produce captivating and personalized content6.
    • By offering advanced features like "Text Encrypt," Pika Labs is contributing to the growth and adoption of AI technologies in the content creation industry6.

    Genmo adds "Video Effects"

    • Genmo, an AI-powered platform for generating art and videos, introduced a new "Video Effects" feature14.
    • The feature allows users to create more engaging and visually appealing videos by adding various effects to their AI-generated content14.
    Why It Matters
    • The addition of "Video Effects" enhances the capabilities of Genmo's platform, providing users with more creative options for their video content14.
    • This update demonstrates the continuous innovation in AI-powered content creation tools, making it easier for users to produce captivating and personalized content14.
    • By offering advanced features like "Video Effects," Genmo is contributing to the growth and adoption of AI technologies in the content creation industry14.

    Zapier now has AI Canvas

    Zapier Introduces AI-Powered Canvas
    • Zapier announced the launch of Canvas, an AI-powered diagramming tool designed to help users visualize, plan, and automate business-critical processes1.
    • Canvas is currently in alpha and available through Zapier Early Access1.
    • The tool allows users to map out their entire workflows, regardless of whether they are connected to Zapier or not4.
    Why It Matters
    • Canvas enhances Zapier's automation capabilities by providing a visual platform for users to plan and optimize their business processes1.
    • The AI-powered tool can help users identify areas for improvement and potential automation opportunities10.
    • By offering advanced features like Canvas, Zapier is contributing to the growth and adoption of AI technologies in the automation and business process management industries12.

    DALL-E 3 is available in Bing Chat & Image Creator

    • Who: Microsoft and OpenAI
    • What: Integration of DALL-E 3 into Bing Chat and Bing Image Creator
    • When: DALL-E 3 was released for ChatGPT's enterprise users in October 2023[1]. It became available in Bing Image Creator around September 30, 2023[10].
    • Where: Bing Chat and Bing Image Creator platforms
    • Why: To provide users with the ability to generate detailed and engaging images using AI technology
    • How: By leveraging OpenAI's DALL-E 3 text-to-image model
    Importance
    1. Improved image generation: DALL-E 3 excels in understanding and translating textual descriptions into highly detailed and accurate images3.
    2. Integration with ChatGPT: DALL-E 3 is built upon DALL-E 2 and ChatGPT, allowing users to generate images using ChatGPT prompts7.
    3. Enhanced user experience: DALL-E 3 generates higher-quality images that more accurately reflect prompts, especially when dealing with longer prompts12.
    4. Free access: Unlike ChatGPT, DALL-E 3 is free to use in Bing Chat13.
    5. Impact on various industries: DALL-E 3 can benefit industries like gaming, entertainment, and visual branding by expediting the creation of game environments, characters, assets, and more8.
     
     