
    #176 - SearchGPT, Gemini 1.5 Flash, Llama 3.1 405B, Mistral Large 2

    August 03, 2024
    What is OpenAI's new AI-powered search engine called?
    How many test users can access SearchGPT?
    What is the significance of DARPA's AI safety research?
    What are the key features of Google's updated Gemini model?
    What does OpenAI's rule-based rewards approach aim to achieve?

    Podcast Summary

    • AI advancements, competition, and safety: OpenAI introduces SearchGPT, a combined chatbot and search engine, while Google offers free access to its Gemini 1.5 Flash model, showcasing ongoing innovation and competition in AI. DARPA's interest in AI safety research is a positive step toward ensuring the safe deployment of AI systems.

      There have been recent developments in the field of AI, with OpenAI announcing its new AI-powered search engine, SearchGPT, and Google providing free access to its faster and lighter Gemini 1.5 Flash model. These advancements showcase the ongoing competition and innovation in the AI industry, as companies continue to push for new use cases and improvements. Additionally, DARPA, the Defense Advanced Research Projects Agency, has shown interest in AI safety and alignment research, which could benefit the entire community and serves as a reminder of the importance of addressing safety concerns in the development and deployment of AI systems. SearchGPT is currently a prototype and will only be accessible to 10,000 test users. It combines a chatbot and a search engine, answering queries and providing links to sources. OpenAI's entry into the search market is significant given its reputation and the market's dominance by Google, which has maintained a roughly 90% market share despite competition. Google's updated Gemini lineup now offers free access to the 1.5 Flash model, joining the trend toward smaller, lighter, and quicker AI models; this could help Google target cheaper, faster-response use cases and compete with other companies in the space.

    • Llama 3.1 language model: Meta's new Llama 3.1 model, trained on a dataset of 15 trillion tokens with a massive compute budget, is expected to perform better due to its size and carefully curated data mixture. The accompanying research provides valuable insights into training and utilizing large language models.

      Meta has released a new language model, Llama 3.1, which is much larger than its predecessors and was trained on a dataset of 15 trillion tokens. The model is expected to perform better due to its increased size and carefully curated data mixture, which spans general-knowledge, mathematical and reasoning, code, and multilingual tokens. Training consumed a compute budget of approximately 10^25 FLOPs, a larger budget than that reported for GPT-4. Meta also works on predicting the model's performance on various benchmarks from the FLOPs invested, addressing the question of how well the model will perform in practical applications. The paper includes insights into the hardware architecture and engineering details of training models at this scale, making it a valuable resource for the AI community, and its analysis of unexpected interruptions during training sheds light on the challenges of scaling up models. The release of this detailed paper, along with the model itself, is a significant step forward in understanding how to train and utilize large language models.
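
      As a rough sanity check on that compute figure, a widely used rule of thumb estimates training compute for dense transformers as about six times the parameter count times the token count. The arithmetic below is our back-of-the-envelope estimate, not a figure from the paper:

        # Back-of-the-envelope estimate (ours, not from the Llama 3.1 paper),
        # using the common approximation:
        # training FLOPs ~= 6 * N_parameters * N_tokens.
        params = 405e9   # 405 billion parameters
        tokens = 15e12   # 15 trillion training tokens
        flops = 6 * params * tokens
        print(f"{flops:.2e}")  # ~3.64e+25, i.e. on the order of 10^25 FLOPs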

    • Large model performance vs fine-tuning: Very large models like the 405-billion-parameter Llama 3.1 may have strong in-context learning capabilities and require minimal fine-tuning for good performance, but open-source companies like Mistral face economic challenges and need to consider monetization strategies.

      While fine-tuning can improve performance for smaller models, it appears to have negligible effects on very large ones like the 405-billion-parameter Llama 3.1. This suggests that such large models have strong in-context learning capabilities and do not require extensive in-domain training to achieve strong performance. However, open-source AI companies like Mistral face economic challenges and need to consider monetization strategies, since they cannot afford to permit unrestricted free commercial use of their models. The release of Mistral Large 2, a 123-billion-parameter model, has drawn impressive benchmark scores that compete with other leading models, though its open weights pose potential risks, including the possibility of bad actors exploiting the model. Additionally, there are new open models such as Groq's Llama-3-Groq-70B-Tool-Use and its 8B variant, which are fine-tuned for tool use and perform well on the relevant leaderboards. The landscape of private versus open models in AI is shifting, and it will be interesting to see how Mistral and other open-source companies continue to compete.
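
      To make the in-context learning point concrete, here is a minimal sketch of few-shot prompting, the usual alternative to fine-tuning; the sentiment task and prompt format are our illustrative assumptions, not anything from the episode:

        # Minimal few-shot (in-context learning) sketch: labeled examples go
        # into the prompt itself, so the model's weights are never updated.
        EXAMPLES = [
            ("The food was cold and the staff ignored us.", "negative"),
            ("Absolutely loved the service and the view!", "positive"),
        ]

        def few_shot_prompt(query: str) -> str:
            shots = "\n".join(f"Review: {text}\nSentiment: {label}"
                              for text, label in EXAMPLES)
            return f"{shots}\nReview: {query}\nSentiment:"

        # The resulting string can be sent as-is to any capable base model.
        print(few_shot_prompt("Great coffee, will come back."))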

    • Open source AI: Apple and Meta are open-sourcing models, datasets, and training code, while Tesla is considering investing in xAI. NVIDIA is preparing Blackwell GPUs for the Chinese market, and a larger research collaboration aims to standardize model architectures, training code, and data curation strategies.

      This week has seen a significant push towards open-source models and datasets in the world of AI, with companies like Apple and Meta making strides in this area. Apple, in particular, is trying to differentiate itself by open-sourcing not just its models but also the pre-training datasets and training code, as part of a larger research collaboration aimed at standardizing model architectures, training code, and data curation strategies. Elon Musk's Tesla is also reportedly considering investing $5 billion in xAI, Musk's well-funded AI startup. NVIDIA, meanwhile, is preparing to release Blackwell GPUs for the Chinese market, designed to comply with US export limits. The focus on open models and datasets, as well as strategic partnerships, reflects the growing importance and scale required to build and deploy advanced AI technologies.

    • NVIDIA vs Chinese chipmakers: NVIDIA focuses on export-compliant designs and memory bandwidth while Chinese companies invest heavily in R&D. Canadian company Cohere, valued at $5.5B in its latest round, focuses on LLMs for enterprise but lacks the resources and data-provider deals of larger rivals. DeepMind's AlphaProof and AlphaGeometry models demonstrate advanced mathematical reasoning, but the scarcity of Lean-language data is a challenge.

      The technology race between NVIDIA and Chinese chipmakers is intensifying, with NVIDIA focusing on export-compliant designs and increased memory bandwidth, while Chinese companies invest heavily in research and development. In funding news, the Canadian company Cohere has been valued at $5.5 billion in its latest funding round and is focusing on LLMs for enterprise customers. Cohere has also introduced a copyright assurance policy to mitigate potential copyright-infringement risks. However, the company's comparative lack of resources and of deals with data providers could limit its competitiveness in the long run. In research, DeepMind continues to make strides with its AlphaProof and AlphaGeometry models, which have demonstrated advanced mathematical reasoning capabilities. These models use the Lean language to formalize mathematical statements and train models to translate between Lean and plain English, unlocking a library of problem-solving techniques. However, the scarcity of readily available data in the Lean language remains a challenge.
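
      For a sense of what Lean formalization looks like, here is a toy example of ours (not DeepMind's): a simple English claim written and proved as a Lean 4 theorem using a lemma from the core library:

        -- Toy illustration (ours, not AlphaProof's output): the informal claim
        -- "a + b equals b + a for natural numbers" formalized in Lean 4.
        theorem add_comm_example (a b : Nat) : a + b = b + a :=
          Nat.add_comm a b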

    • AI research advancements: Researchers are developing advanced AI systems, such as a neurosymbolic RL system and an automated interpretability agent, and creating large open-source multimodal datasets, which contribute to solving complex problems, automating interpretability, and ensuring safety.

      Researchers are making significant strides in creating advanced AI systems, with two examples being a bespoke RL system for solving complex math problems and an automated interpretability agent for understanding other neural models. The RL system, using neurosymbolic methods alongside AlphaGeometry 2, performed nearly at the gold-medal threshold at the International Mathematical Olympiad, solving some problems in minutes while others took days. The automated interpretability agent, named MAIA, uses neural models to experiment on and describe other models' behaviors. Both systems represent early but promising steps towards automating interpretability research and understanding complex AI systems. Another key development is the creation of large open-source multimodal datasets, like MINT-1T, which is 10 times larger than previous open datasets. This is crucial for the AI research community to keep up with advancements in closed-source data and to train models on diverse data. Lastly, OpenAI has shared its rule-based rewards approach, which provides clear and simple rules for evaluating model outputs to ensure safety and compliance with desired behavior. This is an automated source of feedback that maintains a balance between being helpful and preventing harm. These advancements in complex problem-solving, automated interpretability, and safety demonstrate the ongoing progress and potential of the field.

    • AI safety regulations: OpenAI is using a human-machine hybrid approach that assesses generated text against explicit rules and combines the assessments into a single reward metric to improve AI safety. Lawmakers are demanding transparency and information about the company's safety efforts and commitments.

      OpenAI is combining human feedback with rule-based strategies to improve the safety and usefulness of its AI models. It trains a grader model to assess generated text against specific rules, then uses a linear model to map these assessments into a single reward metric; this human-machine hybrid feedback is used to enhance the PPO training loop. Additionally, OpenAI is facing pressure from lawmakers to provide more transparency and follow through on its safety commitments: senators have demanded information about the company's AI safety efforts and its employee agreements, along with documentation on its voluntary safety commitments. The topic of universal basic income as a response to AI-threatened jobs is also worth noting, as Silicon Valley increasingly promotes the idea.
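
      A minimal sketch of that pipeline, under our own assumptions about rule names, weights, and a toy keyword grader, might look like this:

        # Minimal sketch of the rule-based rewards idea (not OpenAI's code):
        # score a response against explicit rules, then map the scores to a
        # single scalar reward with a linear model. The rule names, weights,
        # and keyword grader below are illustrative assumptions; in the real
        # system the grader is itself an LLM and the weights are fitted.
        RULE_WEIGHTS = {
            "refuses_clearly": 1.0,   # unsafe requests get a clear refusal
            "not_judgmental": 0.5,    # the refusal should not shame the user
            "offers_resources": 0.3,  # point to help or safe alternatives
        }

        def toy_grader(response: str, rule: str) -> float:
            """Stand-in for an LLM grader; returns a score in [0, 1]."""
            checks = {
                "refuses_clearly": "can't help" in response.lower(),
                "not_judgmental": "shame" not in response.lower(),
                "offers_resources": "support" in response.lower(),
            }
            return 1.0 if checks[rule] else 0.0

        def rule_based_reward(response: str) -> float:
            """Linear combination of per-rule scores -> reward for PPO."""
            return sum(w * toy_grader(response, rule)
                       for rule, w in RULE_WEIGHTS.items())

        print(rule_based_reward("I can't help with that, but support is available."))  # 1.8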

    • AI and UBI: The UBI experiment showed that people used the money for necessities but did not significantly invest in education or businesses, and the psychological impact of not working was not fully captured. AI policy and contracts need continuous updates to adapt to the evolving landscape.

      The universal basic income (UBI) experiment led by Sam Altman through OpenResearch showed that people spent the money on necessities like food and rent. However, there were no significant increases in higher education or business start-ups, and the psychological impact of not having productive work was not fully captured in the data. In policy news, Democratic senators introduced the Stop Corporate Capture Act to restore the ability of federal agencies to interpret ambiguous laws, which could affect AI policy. Video game performers went on strike over concerns about AI replacing them in physical acting roles. These developments highlight the need for continuous updates to contracts and regulations to keep pace with the rapidly evolving AI landscape.

    • AI essentials and collaboration: Focus on essential concepts, collaborate with others to learn and make progress in AI, and maintain a growth mindset to keep up with new information and advancements.

      The takeaway from last week's discussions in AI is the importance of understanding the underlying concepts and filtering out irrelevant information. The poem-like outro text suggests that in the field of AI there is a constant influx of new information and advancements, which can be overwhelming; the phrase "Last week in AI, tuning out" implies the need to step back and focus on the essentials. The text also highlights the significance of collaboration and continuous learning: the lines "Deep minds, medals they shine, say the best flag, through the hour paradigm, together we'll unwind" emphasize working together and sharing knowledge to make progress, while "breaking down the house, breaking down the wine" suggests that to truly understand and apply new concepts, one must break them down into their fundamental components. The text further encourages a growth mindset, with "This week, there's so much in store" implying there is always more to learn and explore, and the reference to "Johnny, all the toy toys" pointing to the many tools and resources available to help. In short: focus on the essentials, collaborate with others, and maintain a growth mindset in the face of constant new information.

    Recent Episodes from Last Week in AI

    #181 - Google Chatbots, Cerebras vs Nvidia, AI Doom, ElevenLabs Controversy

    Our 181st episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov and Jeremie Harris

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    In this episode:

    - Google's AI advancements with Gemini 1.5 models and AI-generated avatars, along with Samsung's lithography progress.

    - Microsoft's Inflection usage caps for Pi, and new AI inference services by Cerebras Systems competing with Nvidia.

    - Biases in AI, prompt leak attacks, and transparency in models, plus distributed training optimizations including the 'DisTrO' optimizer.

    - AI regulation discussions including California's SB 1047, China's AI safety stance, and new export restrictions impacting Nvidia's AI chips.

    Timestamps + Links:

    Last Week in AI
    September 15, 2024

    #180 - Ideogram v2, Imagen 3, AI in 2030, Agent Q, SB 1047

    Our 180th episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Episode Highlights:

    • Ideogram AI's new features, Google's Imagen 3, Dream Machine 1.5, and Runway's Gen-3 Alpha Turbo model advancements.
    • Perplexity's integration of Flux image generation models and code interpreter updates for enhanced search results.
    • Exploration of the feasibility and investment needed for scaling advanced AI models like GPT-4, plus Agent Q architecture enhancements.
    • Analysis of California's AI regulation bill SB 1047 and legal issues related to synthetic media, copyright, and online personhood credentials.

    Timestamps + Links:

    Last Week in AI
    September 03, 2024

    #179 - Grok 2, Gemini Live, Flux, FalconMamba, AI Scientist

    Our 179th episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Episode Highlights:

    - Grok 2's beta release features new image generation using Black Forest Labs' tech.

    - Google introduces Gemini Voice Chat Mode available to subscribers and integrates it into Pixel Buds Pro 2.

    - Huawei's Ascend 910C AI chip aims to rival NVIDIA's H100 amidst US export controls.

    - Overview of potential risks of unaligned AI models and skepticism around SingularityNET's AGI supercomputer claims.

    Timestamps + Links:

    Last Week in AI
    August 20, 2024

    #178 - More Not-Acquihires, More OpenAI drama, More LLM Scaling Talk

    Our 178th episode with a summary and discussion of last week's big AI news!

    NOTE: this is a re-upload with fixed audio, my bad on the last one! - Andrey

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    In this episode:

    - Notable personnel movements and product updates, such as Character.ai leaders joining Google and new AI features in Reddit and Audible.

    - OpenAI's dramatic changes with co-founder exits, extended leaves, and new lawsuits from Elon Musk.

    - Rapid advancements in humanoid robotics exemplified by new models from companies like Figure in partnership with OpenAI, achieving amateur-level human performance in tasks like table tennis.

    - Research advancements such as Google's compute-efficient inference models and self-compressing neural networks, showcasing significant reductions in compute requirements while maintaining performance.

    Timestamps + Links:

    Last Week in AI
    August 16, 2024

    #177 - Instagram AI Bots, Noam Shazeer -> Google, FLUX.1, SAM2

    Our 177th episode with a summary and discussion of last week's big AI news!

    NOTE: apologies for this episode again coming out about a week late, next one will be coming out soon...

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    If you'd like to listen to the interview with Andrey, check out https://www.superdatascience.com/podcast

    If you would like to get a sneak peek and help test Andrey's generative AI application, go to Astrocade.com to join the waitlist and the discord.

    In this episode, hosts Andrey Kurenkov and Jeremie Harris dive into significant updates and discussions in the AI world, including Instagram's new AI features, Waymo's driverless cars rollout in San Francisco, and NVIDIA's chip delays. They also review Meta's AI Studio, Character.ai CEO Noam Shazeer's return to Google, and Google's Gemini updates. Additional topics cover NVIDIA's hardware issues, advancements in humanoid robots, and new open-source AI tools like OpenDevin. Policy discussions touch on the EU AI Act, the U.S. stance on open-source AI, and investigations into Google and Anthropic. The impact of misinformation via deepfakes, particularly one involving Elon Musk, is also highlighted, all emphasizing significant industry effects and regulatory implications.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Last Week in AI
    August 11, 2024

    #176 - SearchGPT, Gemini 1.5 Flash, Llama 3.1 405B, Mistral Large 2

    Our 176th episode with a summary and discussion of last week's big AI news!

    NOTE: apologies for this episode coming out about a week late, things got in the way of editing it...

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Last Week in AI
    August 03, 2024

    #175 - GPT-4o Mini, OpenAI's Strawberry, Mixture of A Million Experts

    Our 175th episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    In this episode of Last Week in AI, hosts Andrey Kurenkov and Jeremie Harris explore recent AI advancements including OpenAI's release of GPT-4o Mini and Mistral's open-source models, covering their impacts on affordability and performance. They delve into enterprise tools for compliance, text-to-video models like Haiper 1.5, and YouTube Music enhancements. The conversation further addresses AI research topics such as the benefits of numerous small expert models, novel benchmarking techniques, and advanced AI reasoning. Policy issues including U.S. export controls on AI technology to China and internal controversies at OpenAI are also discussed, alongside Elon Musk's supercomputer ambitions and OpenAI's Prover-Verifier Games initiative.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Timestamps + links:

    Last Week in AI
    July 25, 2024

    #174 - Odyssey Text-to-Video, Groq LLM Engine, OpenAI Security Issues

    Our 174th episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    In this episode of Last Week in AI, we delve into the latest advancements and challenges in the AI industry, highlighting new features from Figma and Quora, regulatory pressures on OpenAI, and significant investments in AI infrastructure. Key topics include AMD's acquisition of Silo AI, Elon Musk's GPU cluster plans for xAI, unique AI model training methods, and the nuances of AI copying and memory constraints. We discuss developments in AI's visual perception, real-time knowledge updates, and the need for transparency and regulation in AI content labeling and licensing.

    See full episode notes here.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Timestamps + links:

    Last Week in AI
    July 17, 2024

    #173 - Gemini Pro, Llama 400B, Gen-3 Alpha, Moshi, Supreme Court

    Our 173rd episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    See full episode notes here.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    In this episode of Last Week in AI, we explore the latest advancements and debates in the AI field, including Google's release of Gemini 1.5, Meta's upcoming LLaMA 3, and Runway's Gen 3 Alpha video model. We discuss emerging AI features, legal disputes over data usage, and China's competition in AI. The conversation spans innovative research developments, cost considerations of AI architectures, and policy changes like the U.S. Supreme Court striking down Chevron deference. We also cover U.S. export controls on AI chips to China, workforce development in the semiconductor industry, and Bridgewater's new AI-driven financial fund, evaluating the broader financial and regulatory impacts of AI technologies.  

    Timestamps + links:

    Last Week in AI
    July 07, 2024

    #172 - Claude and Gemini updates, Gemma 2, GPT-4 Critic

    Our 172nd episode with a summary and discussion of last week's big AI news!

    With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

    Feel free to leave us feedback here.

    Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

    Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

    Last Week in AI
    July 01, 2024