Podcast Summary
Exploring the Future of Music Generation with AI: AI-driven music generation is gaining traction, letting users create songs in a range of styles, complete with lyrics and vocals. This follows earlier waves of text, image, chat, and video generation and could eventually extend to voice cloning for personalized songs. However, its impact on the music creation landscape remains to be seen.
We are witnessing an exciting time in content creation as advances in AI technology continue to empower individuals to generate various forms of media. Music generation in particular is gaining popularity through models like Suno and Udio, which let users specify the type of music they want and even generate lyrics and vocals. This follows the earlier waves of text-based, image-based, chat-based, and video-based content. The future could hold even more possibilities, such as voice cloning for generating personalized songs. Still, it's worth considering the typical ratio of consumers to creators on media platforms, and whether making it easy to create good music will significantly increase the number of people who engage in music creation.
Growing demand for localized and smaller language models on edge devices: Apple's entry into smaller language models signals a potential shift toward running models locally, enabling new and more integrated consumer experiences on devices; how much this will depend on specific OS companies versus a broader ecosystem is uncertain.
There's growing demand for localized, smaller language models that can fit on edge devices, offering lower latency and enabling new consumer experiences. Apple's recent entry into this space with smaller models signals a potential shift toward running models locally within its ecosystem. This could lead to more integrated and standardized user experiences on devices, although how much this will depend on specific OS companies versus a broader ecosystem remains uncertain. The history of vertical SaaS companies like Veeva, built on top of larger platforms, shows that even breakout successes remain closely tied to the platform they build on.
The Evolution of Veeva from a Third-Party App Built on Salesforce: Historically, third-party apps built on larger platforms can be subsumed by the platform or grow into major companies in their own right, as Veeva did on Salesforce. Understanding which capabilities can be device-resident versus cloud-based is crucial for building effective apps and products.
Companies often start as third-party applications on top of larger platforms and can eventually be subsumed by the platform or grow into major businesses in their own right. Veeva is an example of the latter: built on Salesforce's platform, it focused on compliance workflows for selling to doctors in the life sciences industry, a vertical where Salesforce lacked deep expertise, and grew to a market cap of roughly 20% of Salesforce's. Another takeaway concerns the limitations of small models in AI. Certain capabilities can be device-resident, such as reasoning and some synthesis, but larger knowledge bases or complex tasks may require access to the cloud. As device capabilities expand, the line between what can run on device and what requires cloud access will continue to shift. There is also historical precedent for third-party applications being integrated into larger platforms: productivity software started as separate applications from companies like Lotus before Microsoft bundled Office into its own distribution platform. The question of which capabilities can be device-resident versus cloud-based is an important one for building apps and products, and understanding these limitations helps developers decide what can run on device and what requires larger cloud-based resources.
Integration of AI into Hardware: New Possibilities for Innovation: AI in hardware will involve a combination of on-device and cloud-based models, with new form factors and capabilities like vision emerging in specific applications. The launch of Meta AI as a standalone product showcases the potential for seamless integration into various services and applications.
How AI is integrated into hardware will vary greatly depending on the application and the capabilities required, and the distribution of compute will likely involve a combination of on-device and cloud-based models. New form factors, such as passive devices for data collection, may emerge as interesting areas for consumer-focused AI hardware, and the use of new capabilities like vision will depend on the specific application and the circumstances in which the data is used. The launch of Meta AI as a standalone product shows the potential for integrating AI into various services and platforms, including chatbots and other interactive services. I was particularly impressed by the Meta AI launch: the one-click animation feature was a standout that my kids enjoyed using across different platforms. While it's unclear what Meta's plans are for integrating the product into its existing surfaces, the possibility of encapsulating meta.ai as just another line item or account in chat or other properties Meta owns is intriguing. Overall, the integration of AI into hardware and software will continue to evolve, bringing unexpected applications and new possibilities for innovation.
Meta's push for advanced AI models and efficiency: Meta's investment in training larger models is yielding performance improvements, but the importance of efficiency, creativity, and architectural approach remains debated. Data and compute platform providers are hosting models from many players, blurring competitive lines, and a long-term capital-scale question looms for smaller players.
The race for advanced AI models and efficient inference is heating up, with large tech firms like Meta pushing the boundaries of model training and efficiency. Meta's investment in training models beyond supposedly optimal points has shown improvements in performance, but the importance of efficiency, creativity, and architectural approach for various use cases remains debated. Meanwhile, the landscape of competition is becoming increasingly blended as data and compute platform providers like Snowflake and Databricks host models from various players, including their own. It's unclear if these platforms need to own the models or just demonstrate expertise in training, fine-tuning, and deploying them. However, the long-term capital scale question arises as to how long these other players can keep up with the hyperscalers' investments in building bigger and bigger models. Focusing on medium to small, more specialized use cases and providing efficient inference platforms may be a viable alternative for these players.
Hyperscalers investing $200B in AI technology: Hyperscalers are investing massive resources in AI, on a scale comparable to oil majors and broadband providers, betting that ROI will come as the technology scales and becomes more intelligent and broadly applicable.
A select few companies and entities, particularly the hyperscalers, are investing massive resources, approaching $200 billion collectively this year, in the development of AI technology. This investment is comparable to that of industries like oil majors and broadband providers. The belief is that the ROI from these investments will come later, as AI scales and becomes more intelligent and applicable across industries and applications. The hyperscalers' commitment is significant, and sovereigns may also contribute to this immense scale over the long run by customizing models for their regions or cultures. Meanwhile, companies like Magic have already pushed toward very large context windows of 5 million tokens or more. This investment in AI is notable both for its scale and for the belief in its potential future impact.
Revolutionizing fields with larger context windows in AI: Larger context windows are set to revolutionize various fields, particularly biology, but they bring new energy-consumption challenges that will require building large-scale energy infrastructure.
The use of larger context windows in AI is set to revolutionize various fields, much like how advancements in microprocessors and bandwidth transformed technology in the late 90s. This shift will be significant and long-lasting, particularly in areas like biology, where larger context windows have already shown promising results in protein folding. However, this progress comes with new challenges, specifically regarding energy consumption. As data centers continue to grow in size and power requirements, energy constraints may become a limiting factor. The construction of large-scale energy infrastructure, such as 500 megawatt or gigawatt data centers, will be necessary to keep up with the energy demands of AI development. This shift from solely software engineering challenges to physical world constraints, including permitting and infrastructure development, will slow down the pace of innovation but is a necessary step in the continued advancement of AI technology.
Physical limitations in AI development: Data wall and energy consumption: The physical limitations of AI development include the data wall and energy consumption. Microsoft's investment in Abu Dhabi highlights the need for energy-efficient AI data centers. Historical infrastructure investments provide context for the economic impact of energy and AI.
The advancement of AI technology faces both computational and physical limitations. The physical limitations include rising energy consumption and data availability. The concept of a "data wall" refers to the limited supply of cheap data tokens available on the internet, necessitating the collection of more data or the generation of synthetic data. The energy required for AI development is significant, and the siting of new AI data centers near energy sources, such as Microsoft's investment in Abu Dhabi, highlights this issue. Past policy decisions also constrained energy supply: nuclear power, for example, could have provided far greater capacity. Adopting solutions to these challenges is crucial, and the intersection of energy and AI could be a game-changer. The geopolitical implications of AI's energy dependence are also significant, with potential consequences for global power dynamics. The discussion also touched on the historical significance of large infrastructure investments and their economic impact.
Support the No Priors Podcast with a donation for exclusive merchandise: Donate $100 to the No Priors Podcast to receive a branded bottle of tequila or a mocktail and a cool hat as a thank-you gift.
The No Priors podcast is offering a branded bottle of tequila or a mocktail and a cool hat for a $100 donation. You can find them on Twitter @NoPriorsPod, subscribe to their YouTube channel, and listen on Apple Podcasts, Spotify, or any other podcast platform for a new episode every week. Don't forget to sign up for emails or check out transcripts on their website, no-priors.com. This is a unique opportunity to support the podcast and receive some exclusive merchandise in return. So, if you're a fan of the show, consider making a donation and becoming part of the No Priors community.