
#177 - Instagram AI Bots, Noam Shazeer -> Google, FLUX.1, SAM2

August 11, 2024

Last Week in AI

What advancements were made in AI technology last week?

How is Apple enhancing AI accessibility and privacy?

What impact does the EU AI Act have on AI regulation?

Why are companies paying publishers for content access?

What challenges do humanoid robots face in real-world deployment?

Podcast Summary

AI advancements and privacy: New AI model A.I.R.I. leads in research, Apple focuses on privacy and on-device processing, open-source models allow for private infrastructure use, prioritizing privacy sets companies apart
Last week saw significant advancements in AI technology, with the release of a new AI model named A.I.R.I. This Canadian-developed model is leading the way in AI research, and its secrets are causing excitement in the industry. Apple, too, is making strides in AI accessibility and privacy, integrating advanced capabilities and private cloud compute systems. These developments are shaping the future of AI, with companies prioritizing user privacy and on-device processing to differentiate themselves. Apple's recent focus on these areas is a safe bet for prioritizing privacy, as they continue to hammer on this point to stand out from other tech giants. The release of open-source models like Llama 3.1 is also making it possible for companies to run these models on their own infrastructure, reducing the need to send queries to third parties. Overall, these advancements in AI technology and privacy are crucial steps forward in the ever-evolving landscape of artificial intelligence.
Apple's AI integration strategy: Apple's deep and specialized AI integration in iOS results in a polished user experience, even with later launches, while Meta's new AI Studio on Instagram enables users to create custom AI versions for increased engagement.
Apple's approach to AI integration in their products stands out due to its deep and specialized integration throughout iOS, rather than a general-purpose chatbot like Google and others. This strategy allows Apple to ensure a polished user experience, even if it comes later than competitors. Apple's cautious approach was highlighted by the delayed launch of Apple Intelligence and the careful testing and packaging of its features. Additionally, Meta recently launched AI Studio, enabling users in the US to create AI versions of themselves on Instagram. This tool allows creators to customize the AI's behavior and interactions, potentially leading to increased engagement for popular Instagram accounts. Overall, these companies' different approaches to AI integration reflect their unique strengths and priorities.
AI competition: Companies like Meta, OpenAI, and Google are releasing character model weights to attract talent, undercut competitors, and make AI more accessible. Text-to-video tools and updates to existing tools are advancing, offering more specificity and better results. New companies are developing AI-powered hardware products.
The release of character model weights by Meta, following in the footsteps of competitors like OpenAI and Google, is a strategic move aimed at attracting AI talent, undercutting competitors, and making generative AI more accessible. Meta's investment in creating advanced LLMs like Llama 3.1 405B is substantial, and releasing the model weights publicly helps justify the investment to shareholders. Additionally, the release of character models allows for the creation of various bots, from real people to more specialized ones, adding to the reach of Instagram's messaging interface. Another significant development is the advancement of video generation tools, such as Runway's Gen-3 model, which can now create photorealistic videos from an image combined with a text prompt. Starting from an image offers more specificity and better results than prompting with text alone, making it easier for users to achieve their desired outcomes. Furthermore, updates to existing tools, like Midjourney's v6.1, continue to improve text-to-image quality, making these tools more advanced and user-friendly. New companies are also attempting to create hardware products with built-in AI, like an AI-powered necklace that aims to be a friendly companion rather than a replacement for a phone. Overall, these developments demonstrate the ongoing advancements in generative AI and the increasing competition among companies to release innovative products and features.
AI-driven search: Google and Microsoft are integrating AI into search engines to enhance user experience, while Character.AI receives funding from Google to scale up its chatbot business
There are ongoing developments in the integration of AI technology into various industries and businesses. In the consumer sector, there are experiments with AI-powered companions like necklaces, but the reception is mixed. Microsoft is adding AI-powered summaries to Bing search results, following Google's lead. In the business world, Character.AI, a popular chatbot company, is seeing some of its leadership team, including co-founders Noam Shazeer and Daniel De Freitas, return to Google. Google is reportedly providing funding to help Character.AI continue scaling. These moves come as big tech companies explore ways to avoid antitrust scrutiny through strategic partnerships and acquisitions. In the realm of AI-driven search, Perplexity is reportedly cutting checks to publishers. These developments underscore the rapidly evolving landscape of AI technology and its impact on various sectors.
AI and Publishing: Companies are paying publishers for access to content to train AI models, and the trend is significant for future AI development as publishers close off data access. Competition in the AI space is increasing, and hardware delays are impacting companies' ability to deploy new models. Advancements in robotics continue, but challenges remain in deploying humanoid robots in real-world scenarios.
In the rapidly evolving world of AI, companies are increasingly paying publishers for access to high-quality content to train their models. This trend, seen with companies like Perplexity, OpenAI, and Google, is necessary as publishers close off access to data unless payment is made. This is significant for the future of AI development as these companies rely on vast amounts of data for model training. Another trend is the increasing competition in the AI space, with companies like Google and OpenAI emerging as major players. Additionally, the development of advanced hardware like NVIDIA's Blackwell B200 AI chip has been crucial for AI model training, but a delay in its release has left some tech companies in a bind. Microsoft, Google, and Meta, among others, have committed significant resources to acquire these chips, and the delay could impact their ability to deploy new, large-scale AI models. Furthermore, advancements in robotics, such as Neura's new humanoid robot 4NE-1, are making strides towards real-world applications. However, the development of humanoid robots remains a complex and time-consuming process, with significant challenges in deploying them in real-world scenarios. Overall, the intersection of AI, robotics, and publishing is a dynamic and evolving space, with companies constantly striving to stay ahead of the curve in terms of technology and access to data.
AI integration challenges: Despite advancements in AI and autonomous technology, significant challenges like latency and closed-loop control remain, hindering full integration into physical systems like robots. Regulatory approvals and safety concerns add to the delay in implementing driverless vehicles.
While we're making significant strides in artificial intelligence and autonomous technology, there are still challenges to be addressed before we can fully integrate AI into physical systems like robots. The latency and need for closed-loop control present significant hurdles. In the transportation sector, companies like Waymo are making progress in scaling up their driverless services, but regulatory approvals and safety concerns mean it will take time before we see these vehicles on highways. Meanwhile, tech companies continue to make acquisitions and advancements in AI and design, such as Canva's acquisition of Leonardo.ai and the launch of Black Forest Labs, a company founded by creators of Stable Diffusion. These developments demonstrate the ongoing investment and innovation in AI and related technologies.
AI advancements: Recent developments in AI include open-source 3D model generation (Stable Fast 3D) from Stability AI and safety classifiers (ShieldGemma) from Google, expanding accessibility and versatility of language models and AI applications.
There have been several recent developments in the world of AI, specifically in 3D asset generation and language models. Stability AI has released a fast model for 3D asset generation, named Stable Fast 3D, which can create a 3D model from a single image in about half a second. The model is available under Stability AI's community license, which covers non-commercial use and use by individuals and organizations with up to $1 million in annual revenue. Google, on the other hand, has released new additions to its Gemma family, including Gemma 2 2B, ShieldGemma, and Gemma Scope. ShieldGemma is a set of safety classifiers designed to detect toxic content and hate speech, while Gemma Scope is a tool that allows developers to examine specific points within the Gemma 2 model. These developments are significant as they allow for the creation of safer and more versatile language models and AI applications. The open nature of these models also enables wider access to these technologies, fostering innovation and collaboration in the AI community. These advancements are part of the larger trend of open-source AI models and tools, such as the Llama series, which have been making waves in the industry. Overall, these developments represent a major step forward in the field of AI, making it easier and more accessible for individuals and organizations to create and utilize advanced AI applications.
Agentic AI and Machine Vision: Open source efforts in agentic AI and machine vision, such as Meta's SAM 2 and MoMa research, are driving rapid advancements in these fields, with the potential to revolutionize industries that rely on video analysis and efficient AI systems.
The field of AI is rapidly advancing, particularly in the areas of agentic AI and machine vision. Agentic AI refers to AI systems that can act on their own based on instructions provided by users, making them more convenient and efficient. Open source efforts in this area are expected to drive faster progress due to the accessibility and flexibility they offer. One example of this is the development of Meta Segment Anything Model 2 (SAM 2), an extension of the original SAM model that can segment objects in videos in real time, without the need for additional training. This technology has the potential to revolutionize industries that rely on video analysis, such as medical and industrial applications. Another development from Meta is the MoMa (Mixture of Modality-Aware Experts) research, which combines early fusion and mixture of experts to make AI systems more efficient. This approach achieves significant FLOPs savings while also outperforming standard MoEs. Meanwhile, Meta's Llama 3.1 release the week before highlighted the decision not to go down the mixture-of-experts route due to efficiency concerns. This illustrates the ongoing trade-offs between efficiency and capabilities in AI research. In summary, the AI landscape is witnessing significant progress in agentic AI and machine vision, with open source efforts playing a key role in driving innovation. These developments are poised to bring about transformative changes in various industries and applications.
Mixture of experts models: Recent advancements in multimodal mixture of experts models, like Meta's MoMa, show potential benefits, but language models still face challenges in achieving human-level performance. Research continues to explore methods like C-Plan Act and collaboration between academia and industry to improve LLMs and ensure alignment.
Mixture of experts models, which involve multiple sub-experts handling different tasks, can be challenging to train. Despite this, recent advancements, such as Meta's MoMa approach that uses image and text experts, demonstrate the potential benefits of multimodality in these models. However, language models still struggle with complex tasks and achieving human-level performance. Researchers are exploring methods like C-Plan Act to improve LLM performance and ensure alignment with human preferences. Additionally, collaboration between academia and industry, as seen in the use of proprietary and open-source models, is crucial for driving advancements in AI. A new diffusion training method described in the paper "Stretching Each Dollar" also aims to make training more cost-effective, enabling the creation of high-quality models on a smaller budget. These developments underscore the ongoing progress and challenges in the field of AI research.
AI regulation and costs: The decreasing cost of AI development and deployment collides with increasing regulation, such as the EU AI Act, creating both opportunities and challenges for businesses in the AI industry.
The cost of developing and deploying AI systems is decreasing, making it more accessible to businesses, while regulation, such as the EU AI Act, is increasing, adding more compliance costs. The EU AI Act, the most impactful AI regulation currently in effect, categorizes AI systems by risk level (unacceptable, high, limited, and minimal), with varying degrees of regulation. Open source AI models are currently not restricted in the US, but the government continues to monitor potential dangers. Meanwhile, enforcing export controls on advanced AI hardware to China remains challenging. The intersection of these trends presents both opportunities and challenges for businesses in the AI industry.
Podcast hosting and investigations: Staying focused during podcast hosting and dealing with investigations requires multitasking skills. Google's ties with Anthropic face scrutiny over potential monopolistic practices, while deepfakes and misinformation pose threats to public safety and trust in information sources.
Hosting a podcast or show involves multitasking and staying focused, even for simple tasks. The UK's Competition and Markets Authority is investigating Google's ties with Anthropic, raising concerns about potential monopolistic practices. Deepfakes, like a manipulated video of Kamala Harris shared by Elon Musk, can cause confusion and potentially lead to dangerous situations, such as violence and riots. Misinformation, while not primarily driven by deepfakes, still poses a significant threat when people believe and spread false information. Kamala Harris, the presumptive Democratic nominee for President, has become a target of misinformation, including deepfakes, as the US election approaches. It's important to fact-check information and rely on trusted sources to avoid falling victim to misinformation. Stay informed and stay tuned for the latest developments in AI and technology.

Recent Episodes from Last Week in AI

#181 - Google Chatbots, Cerebras vs Nvidia, AI Doom, ElevenLabs Controversy

Our 181st episode with a summary and discussion of last week's big AI news!

With hosts Andrey Kurenkov and Jeremie Harris

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

In this episode:

- Google's AI advancements with Gemini 1.5 models and AI-generated avatars, along with Samsung's lithography progress.
- Microsoft's Inflection usage caps for Pi, new AI inference services by Cerebras Systems competing with Nvidia.
- Biases in AI, prompt leak attacks, and transparency in models and distributed training optimizations, including the 'distro' optimizer.
- AI regulation discussions including California’s SB1047, China's AI safety stance, and new export restrictions impacting Nvidia’s AI chips.

Timestamps + Links:

(00:00:00) Intro / Banter
(00:03:08) Response to listener comments / corrections
Tools & Apps
- (00:09:19) Google’s custom AI chatbots have arrived
- (00:12:52) Google releases three new experimental AI models
- (00:17:14) Google Gemini will let you create AI-generated people again
- (00:22:32) Five months after Microsoft hired its founders, Inflection adds usage caps to Pi
- (00:26:42) Plaud takes a crack at a simpler AI pin
Applications & Business
- (00:30:31) Cerebras Systems throws down gauntlet to Nvidia with launch of ‘world’s fastest’ AI inference service
- (00:41:06) Nvidia announces $50 billion stock buyback
- (00:46:24) OpenAI in talks to raise funding that would value it at more than $100 billion
- (00:50:44) OpenAI Aims to Release New AI Model, ‘Strawberry,’ in Fall
- (00:52:53) 3 Co-Founders Leave French AI Startup H Amid ‘Operational Differences’
- (00:57:29) Samsung to Adopt High-NA Lithography Alongside Intel, Ahead of TSMC
- (01:02:11) Unitree's $16,000 G1 could become the first mainstream humanoid robot
Projects & Open Source
- (01:04:59) Meta leads open-source AI boom, Llama downloads surge 10x year-over-year
- (01:09:08) A Preliminary Report on DisTrO
Research & Advancements
- (01:13:56) Diffusion Models Are Real-Time Game Engines
- (01:23:18) LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet
- (01:32:21) Interviewing AI researchers on automation of AI R&D
- (01:40:33) Anthropic releases AI model system prompts, winning praise for transparency
Policy & Safety
Synthetic Media & Art
- (02:11:13) Actors Say AI Voice-Over Generator ElevenLabs Cloned Likenesses
(02:14:06) Outro

Last Week in AI

September 15, 2024

#180 - Ideogram v2, Imagen 3, AI in 2030, Agent Q, SB 1047

Our 180th episode with a summary and discussion of last week's big AI news!

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Episode Highlights:

Ideogram AI's new features, Google's Imagen 3, Dream Machine 1.5, and Runway's Gen-3 Alpha Turbo model advancements.
Perplexity's integration of Flux image generation models and code interpreter updates for enhanced search results.
Exploration of the feasibility and investment needed for scaling advanced AI models like GPT-4 and Agent Q architecture enhancements.
Analysis of California's AI regulation bill SB1047 and legal issues related to synthetic media, copyright, and online personhood credentials.

Timestamps + Links:

(00:00:00) Intro / Banter
(00:01:08) Response to Listener Comments / Corrections
Tools & Apps
- (00:03:58) Ideogram AI expands its features with v2 model and color palette options
- (00:07:48) Google Releases Powerful AI Image Generator You Can Use for Free
- (00:11:41) Perplexity adds Flux.1 model for Pro users alongside Playground v3 update
- (00:13:58) Luma drops Dream Machine 1.5 — here’s what’s new
- (00:17:49) Runway’s Gen-3 Alpha Turbo is here and can make AI videos faster than you can type
- (00:20:21) Perplexity’s latest update improves code interpreter, charts included
Applications & Business
Projects & Open Source
Research & Advancements
- (01:12:35) Can AI Scaling Continue Through 2030?
- (01:15:35) Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
- (01:23:58) Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
- (01:31:18) Loss of plasticity in deep continual learning
Policy & Safety
Synthetic Media & Art
- (01:58:33) Authors sue Claude AI chatbot creator Anthropic for copyright infringement
- (01:59:32) Artists’ lawsuit against Stability AI and Midjourney gets more punch
(02:01:43) Outro

Last Week in AI

September 03, 2024

#179 - Grok 2, Gemini Live, Flux, FalconMamba, AI Scientist

Our 179th episode with a summary and discussion of last week's big AI news!

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Episode Highlights:

- Grok 2's beta release features new image generation using Black Forest Labs' tech.

- Google introduces Gemini Voice Chat Mode available to subscribers and integrates it into Pixel Buds Pro 2.

- Huawei's Ascend 910C AI chip aims to rival NVIDIA's H100 amidst US export controls.

- Overview of potential risks of unaligned AI models and skepticism around SingularityNet's AGI supercomputer claims.

Timestamps + Links:

(00:00:00) Intro / Banter
(00:02:15) Response to listener comments / corrections
Tools & Apps
- (00:04:24) Grok-2 is out in beta, now with added AI image generation
- (00:11:28) OpenAI reveals an updated GPT-4o model - but can't quite explain how it's better
- (00:13:48) Google Gemini’s voice chat mode is here
- (00:16:18) Google’s Pixel Buds Pro 2 bring Gemini to your ears
- (00:19:55) Google’s AI-generated search summaries change how they show their sources
- (00:23:13) Prompt Caching is Now Available on the Anthropic API for Specific Claude Models
Applications & Business
Projects & Open Source
Research & Advancements
- (01:14:40) The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
- (01:30:24) Imagen 3
- (01:32:48) The Data Addition Dilemma
- (01:37:35) LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs
Policy & Safety
Synthetic Media & Art
- (01:48:21) SAG-AFTRA Strikes Groundbreaking AI Digital Voice Replica Pact With Startup Firm Narrativ
- (01:51:52) How ‘Deepfake Elon Musk’ Became the Internet’s Biggest Scammer
(01:56:21) AI Song Outro

Last Week in AI

August 20, 2024

#178 - More Not-Acquihires, More OpenAI drama, More LLM Scaling Talk

Our 178th episode with a summary and discussion of last week's big AI news!

NOTE: this is a re-upload with fixed audio, my bad on the last one! - Andrey

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

In this episode:

- Notable personnel movements and product updates, such as Character.ai leaders joining Google and new AI features in Reddit and Audible.
- OpenAI's dramatic changes with co-founder exits, extended leaves, and new lawsuits from Elon Musk.
- Rapid advancements in humanoid robotics exemplified by new models from companies like Figure in partnership with OpenAI, achieving amateur-level human performance in tasks like table tennis.
- Research advancements such as Google's compute-efficient inference models and self-compressing neural networks, showcasing significant reductions in compute requirements while maintaining performance.

Timestamps + Links:

(00:00:00) Intro / Banter
(00:03:14) Response to listener comments / corrections
Applications & Business
Tools & Apps
- (00:55:40) OpenAI cuts GPT-4o prices, launches Structured Outputs amidst price war with Google
- (01:02:08) Apple Intelligence could get a $20 Plus version
- (01:04:05) Audible is testing an AI-powered search feature
- (01:05:53) Reddit to test AI-powered search result pages
Research & Advancements
- (01:06:35) Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
- (01:16:27) Achieving Human Level Competitive Robot Table Tennis
- (01:20:19) Self-Compressing Neural Networks
- (01:28:30) Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models
- (01:32:43) Berkeley Humanoid: A Research Platform for Learning-based Control
Policy & Safety
- (01:33:35) METR announces results of study on comparative capabilities of humans and agents
- (01:39:35) ‘The Godmother of AI’ says California’s well-intended AI bill will harm the U.S. ecosystem
- (01:49:13) Google Monopolized Search Through Illegal Deals, Judge Rules
- (01:54:56) Amazon faces UK merger probe over $4B Anthropic AI investment
- (01:55:44) GPT-4o System Card
(02:03:09) Outro

Last Week in AI

August 16, 2024

#177 - Instagram AI Bots, Noam Shazeer -> Google, FLUX.1, SAM2

Our 177th episode with a summary and discussion of last week's big AI news!

NOTE: apologies for this episode again coming out about a week late, next one will be coming out soon...

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

If you'd like to listen to the interview with Andrey, check out https://www.superdatascience.com/podcast

If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

In this episode, hosts Andrey Kurenkov and Jon Krohn dive into significant updates and discussions in the AI world, including Instagram's new AI features, Waymo's driverless cars rollout in San Francisco, and NVIDIA’s chip delays. They also review Meta's AI Studio, Character.AI CEO Noam Shazeer's return to Google, and Google's Gemini updates. Additional topics cover NVIDIA's hardware issues, advancements in humanoid robots, and new open-source AI tools like OpenDevin. Policy discussions touch on the EU AI Act, the U.S. stance on open-source AI, and investigations into Google and Anthropic. The impact of misinformation via deepfakes, particularly one of Kamala Harris posted by Elon Musk, is also highlighted, all emphasizing significant industry effects and regulatory implications.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

(00:00:00) AI Song / Intro Banter
(00:05:32) Response to listener comments / corrections
Tools & Apps
- (00:10:16) Apple Intelligence to Miss Initial Launch of Upcoming iOS 18 Overhaul
- (00:16:35) Instagram starts letting people create AI versions of themselves
- Lightning round
  - (00:22:49) Runway just dropped image-to-video in Gen3
  - (00:25:41) Midjourney drops surprise v6.1 update — now humans look more real than ever
  - (00:28:07) AI-Powered Necklace Will Be Your Friend for $99
  - (00:30:06) Microsoft is adding AI-powered summaries to Bing search results
Applications & Business
- (00:31:44) Character.AI CEO Noam Shazeer returns to Google
- (00:39:41) Perplexity is cutting checks to publishers following plagiarism accusations
- Lightning round
  - (00:43:30) Nvidia reportedly delays its next AI chip due to a design flaw
  - (00:41:08) Neura shows off humanoid robot 4NE-1
  - (00:46:0) Yes, there are more driverless Waymos in S.F. Here’s how busy they are
  - (00:57:27) Canva acquires Leonardo.ai to boost its generative AI efforts
Projects & Open Source
- (00:59:19) Black Forest Labs Open-Source FLUX.1: A 12 Billion Parameter Rectified Flow Transformer Capable of Generating Images from Text Descriptions
- (01:01:59) Google releases new ‘open’ AI models with a focus on safety
- Lightning round
  - (01:05:09) Stability AI releases super-fast model for 3D asset image generation
  - (01:09:29) OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
Research & Advancements
- (01:12:10) Meta AI Introduces Meta Segment Anything Model 2 (SAM 2): The First Unified Model for Segmenting Objects Across Images and Videos
- (01:19:20) MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
- Lightning round
Policy & Safety
- (01:33:03) World's First-Ever AI Law Now Enforced in Europe, Targeting US Tech Giants
- (01:39:12) White House says no need to restrict ‘open-source’ artificial intelligence — at least for now
- Lightning round
  - (01:41:12) With Smugglers and Front Companies, China Is Skirting American A.I. Bans
  - (01:44:03) UK antitrust body probes Google’s ties with AI rival Anthropic
  - (01:45:20) Elon Musk posts deepfake of Kamala Harris that violates X policy
(01:50:10) AI Outro

Last Week in AI

August 11, 2024

#176 - SearchGPT, Gemini 1.5 Flash, Llama 3.1 405B, Mistral Large 2

Our 176th episode with a summary and discussion of last week's big AI news!

NOTE: apologies for this episode coming out about a week late, things got in the way of editing it...

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

(00:00:00) Intro Song
(00:00:34) Intro Banter
Tools & Apps
- (00:03:39) OpenAI announces SearchGPT, its AI-powered search engine
- (00:08:03) Google gives free Gemini users access to its faster, lighter 1.5 Flash AI model
- (00:09:10) X launches underwhelming Grok-powered ‘More About This Account’ feature
- (00:11:36) Kuaishou Launches Full Beta Testing for 'Kling AI' to Global Users, Elevates Model Capabilities
- (00:13:39) Adobe rolls out more generative AI features to Illustrator and Photoshop
- (00:14:25) Meta AI gets new ‘Imagine me’ selfie feature
Projects & Open Source
Applications & Business
Research & Advancements
Policy & Safety
Synthetic Media & Art
- (01:20:58) Video game performers will go on strike over artificial intelligence concerns
(01:23:03) Outro
(01:23:58) AI Song

Last Week in AI

August 03, 2024

#175 - GPT-4o Mini, OpenAI's Strawberry, Mixture of A Million Experts

Our 175th episode with a summary and discussion of last week's big AI news!

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

In this episode of Last Week in AI, hosts Andrey Kurenkov and Jeremie Harris explore recent AI advancements including OpenAI's release of GPT-4o mini and Mistral’s open-source models, covering their impacts on affordability and performance. They delve into enterprise tools for compliance, text-to-video models like Haiper 1.5, and YouTube Music enhancements. The conversation further addresses AI research topics such as the benefits of numerous small expert models, novel benchmarking techniques, and advanced AI reasoning. Policy issues including U.S. export controls on AI technology to China and internal controversies at OpenAI are also discussed, alongside Elon Musk's supercomputer ambitions and OpenAI’s Prover-Verifier Games initiative.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Timestamps + links:

(00:00:00) AI Song Intro
(00:00:40) Intro / Banter
Tools & Apps
- (00:03:57) OpenAI unveils GPT-4o mini, a small AI model powering ChatGPT
- (00:11:38) Meet Haiper 1.5, the new AI video generation model challenging Sora, Runway
- (00:16:32) Anthropic releases Claude app for Android
- (00:18:59) Google Vids is available to test out Gemini AI-created video presentations
- (00:20:27) YouTube Music sound search rolling out, AI ‘conversational radio’ in testing
Applications & Business
Projects & Open Source
Research & Advancements
- (01:01:49) FlashAttention-3 unleashes the power of H100 GPUs for LLMs
- (01:06:38) Mixture of A Million Experts
- (01:12:51) AutoBencher: Creating Salient, Novel, Difficult Datasets for Language Models
- (01:18:23) SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Policy & Safety
(01:44:59) Outro + AI Song

Last Week in AI

July 25, 2024

#174 - Odyssey Text-to-Video, Groq LLM Engine, OpenAI Security Issues

Our 174th episode with a summary and discussion of last week's big AI news!

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

In this episode of Last Week in AI, we delve into the latest advancements and challenges in the AI industry, highlighting new features from Figma and Quora, regulatory pressures on OpenAI, and significant investments in AI infrastructure. Key topics include AMD's acquisition of Silo AI, Elon Musk's GPU cluster plans for xAI, unique AI model training methods, and the nuances of AI copying and memory constraints. We discuss developments in AI's visual perception, real-time knowledge updates, and the need for transparency and regulation in AI content labeling and licensing.

See full episode notes here.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Timestamps + links:

(00:00:00) Intro AI Song
(00:00:41) Pre News Banter
Tools & Apps
- (00:07:09) Odyssey Building 'Hollywood-Grade' AI Text-to-Video Model to Compete With Sora, Gen-3 Alpha
- (00:10:28) Anthropic’s Claude adds a prompt playground to quickly improve your AI apps
- (00:15:06) Figma pauses its new AI feature after Apple controversy
- (00:18:30) Quora’s Poe now lets users create and share web apps
- (00:20:54) Suno launches iPhone app — now you can make AI music on the go
Applications & Business
Research & Advancements
Policy & Safety
- (01:26:49) Covert Malicious Finetuning
- (01:31:23) OpenAI’s week of security issues
- (01:36:39) Here’s how OpenAI will determine how powerful its AI systems are
- (01:39:56) Me, Myself and AI: The Situational Awareness Dataset for LLMs
- (01:44:34) Exclusive: OpenAI partners with Los Alamos to study AI in the lab
- (01:47:36) Judge dismisses coders’ DMCA claims against Microsoft, OpenAI and GitHub
- (01:49:55) A former OpenAI safety employee said he quit because the company's leaders were 'building the Titanic' and wanted 'newer, shinier' things to sell
Synthetic Media & Art
- (01:52:46) Vimeo joins YouTube and TikTok in launching new AI content labels
- (01:54:50) Tech Startup Aims to Help Media License Content for AI Training
- (01:57:23) Etsy adds AI-generated item guidelines in new seller policy
- (01:59:44) Bumble users can now report profiles that use AI-generated photos
(02:02:05) Outro + AI Song

Last Week in AI

July 17, 2024

#173 - Gemini Pro, Llama 400B, Gen-3 Alpha, Moshi, Supreme Court

Our 173rd episode with a summary and discussion of last week's big AI news!

With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

See full episode notes here.

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

In this episode of Last Week in AI, we explore the latest advancements and debates in the AI field, including Google's release of Gemini 1.5, Meta's upcoming LLaMA 3, and Runway's Gen-3 Alpha video model. We discuss emerging AI features, legal disputes over data usage, and China's competition in AI. The conversation spans innovative research developments, cost considerations of AI architectures, and policy changes like the U.S. Supreme Court striking down Chevron deference. We also cover U.S. export controls on AI chips to China, workforce development in the semiconductor industry, and Bridgewater's new AI-driven financial fund, evaluating the broader financial and regulatory impacts of AI technologies.

Timestamps + links: