
    John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

    May 15, 2024

    Podcast Summary

    • Pre-training vs Post-training in AI and RL: Pre-training generates a base model capable of various personas and content, while post-training optimizes for human-useful outputs. Future models will handle more complex tasks, recover from errors effectively, and require less data to learn.

      Pre-training and post-training are two distinct stages in AI and reinforcement learning (RL). Pre-training involves teaching a model to imitate all content on the web, generating content that looks like random web pages and assigning probabilities to everything. This results in a base model that can generate various personas and content. Post-training, on the other hand, targets a narrower range of behavior, optimizing for outputs that humans find useful and helpful, such as chat assistants. Looking ahead, models will improve significantly over the next 5 years. In the first year or two, they will be capable of handling more complex tasks, like carrying out entire coding projects instead of just suggesting how to write a function. This improvement comes from training models to carry out longer projects and to recover from errors more effectively. The models will become more sample-efficient, requiring less data to learn how to get back on track. The connection between generalization and sample efficiency is that as models get better, they can learn from a diverse dataset and generalize their ability to recover from errors. This means they will need less data to learn how to handle new situations, making them more effective and efficient overall.
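
      To make the distinction concrete, the sketch below shows the two objectives in their usual form: next-token prediction for pre-training, and a KL-penalized reward objective of the kind used in RLHF-style post-training. This is an illustrative sketch (assuming PyTorch and a separately trained reward model), not code from the episode.

          import torch.nn.functional as F

          # Pre-training: next-token prediction over raw web text.
          # logits: (batch, seq, vocab); tokens: (batch, seq)
          def pretrain_loss(logits, tokens):
              return F.cross_entropy(
                  logits[:, :-1].reshape(-1, logits.size(-1)),
                  tokens[:, 1:].reshape(-1),
              )

          # Post-training (RLHF-style): favor outputs that a reward model scores
          # highly, while staying close to the pre-trained reference model.
          def posttrain_objective(reward, logprob_policy, logprob_ref, beta=0.1):
              kl = logprob_policy - logprob_ref   # per-sample KL estimate
              return (reward - beta * kl).mean()  # quantity to maximize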

    • Long-term reinforcement learning holds promise for maintaining coherence and aligning with broader goals, but challenges remain: Long-term reinforcement learning could enable models to maintain coherence and align with broader goals, but human-level performance isn't guaranteed and challenges like ambiguity, expertise, access to UIs, and the physical world may pose obstacles.

      While current models may be smart on a per token basis, they lack the ability to maintain coherence and align with broader goals for extended periods of time. Long-term reinforcement learning (RL) training could unlock this ability, but it's uncertain if human-level performance will be immediately achieved. Other deficits, such as dealing with ambiguity or having expertise, might still pose challenges. Additionally, mundane barriers like access to UIs and the physical world may initially slow down progress. The nature of these bottlenecks is not yet clear. As for the user interface (UI) design for models, it may need to be multimodal or trained on more multimodal data to effectively cater to the needs of more advanced models.

    • Future of AI interaction with web involves blend of human-designed UIs and new optimizations: AI models can use current UIs with improved vision, but some websites may benefit from AI-specific UX designs. AGI's ability to generalize and transfer learning is promising, but long-term coherence and other weaknesses need to be addressed before full human collaboration.

      The future of AI interaction with the web may involve a blend of existing human-designed UIs and new optimizations for AI capabilities. While models may be able to use current UIs with improved vision capabilities, some websites will likely benefit from AI-specific UX designs. The ability of AI models to generalize and transfer learning from one domain to another has been observed in instances such as multilingual understanding and multimodal data processing. However, long-term coherence is just one aspect of AGI, and there are likely other weaknesses that need to be addressed before we can expect AI to fully function as a human colleague. If AGI were to arrive sooner than expected, it would be important to proceed with caution and ensure a safe and controlled deployment.

    • Ensuring safety in the development of AGI: Coordination among key players is crucial to establish guidelines and prevent a dangerous race to deploy AGI, ensuring safety measures and preventing misuse.

      As we approach the development of artificial general intelligence (AGI), it's crucial for key players to coordinate and establish guidelines to ensure safety and prevent a race to deploy potentially dangerous technology. The conversation suggests that if AGI emerges sooner than expected, a pause in deployment and further training might be necessary. This coordination could involve setting reasonable limits on deployment and training, and agreeing on safety measures to prevent misuse. However, maintaining this equilibrium for a long period could be challenging. Ultimately, the goal would be to deploy AGI systems that act as an extension of human will, prevent misuse, and usher in a new phase of scientific advancement. To ensure this, proof of coordination and alignment could be demonstrated through the deployment of incrementally smarter systems that are safer than their predecessors.

    • Continuous improvement and monitoring for advanced AI systems: Regularly evaluate and monitor AI systems for potential risks and misalignment with human values, aiming for continuous improvement rather than a big release that builds up potentially dangerous energy.

      When it comes to releasing advanced AI systems, it's better to aim for continuous improvement with rigorous testing and monitoring, rather than a big, coordinated release that could potentially build up dangerous energy. When dealing with potential discontinuous jumps in capabilities, it's crucial to have extensive evaluations during the training process, monitor the models closely, and ensure that the training data doesn't provide any incentive for the model to turn against humans. Continuous evaluation and monitoring are essential to mitigate risks and ensure alignment with human values. Additionally, the models themselves should be well-behaved and resistant to takeover attempts. It's not necessary to be overly concerned about discontinuous jumps in capabilities at this stage, but as models become more capable, it's essential to take these concerns seriously and invest in rigorous testing and evaluation.

    • RLHF models aim to maximize human approval through drives and goals: RLHF models use human feedback to improve reasoning abilities, with methods including learning from outputs and following correct thought processes, and active learning approaches.

      A model trained with reinforcement learning from human feedback (RLHF) can be seen as having drives and goals, similar to how humans experience satisfaction from achieving their goals; such models aim to maximize human approval. Two methods for improving a model's reasoning abilities have been suggested: one involves the model learning from its outputs and following the correct train of thought, while the other uses computation during deployment for inference. Reasoning is defined as tasks requiring computation at test time, and both training-time and deployment-time computation are valuable. The speaker also shared their experience with genetic testing and the importance of understanding the full DNA sequence for accurate health risk assessments. Regarding learning methods for models, the speaker noted that most of the compute in training is spent on pre-training, which is not very efficient. Instead, a more deliberate and active learning approach is desirable. This could potentially involve models having some form of medium-term memory, allowing them to retain and build upon information over longer horizons than in-context learning but shorter than pre-training. The speaker also pondered what the analogue in models would be for the way a person prepares for a conversation by researching and thinking about the topic at hand.
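
      For reference, the reward model at the core of RLHF is typically trained on pairwise human comparisons. A minimal sketch of that preference loss (a generic Bradley-Terry-style formulation assuming PyTorch, not code discussed in the episode):

          import torch.nn.functional as F

          # Reward-model training: the reward assigned to the response a human
          # preferred should exceed the reward assigned to the rejected response.
          def preference_loss(reward_chosen, reward_rejected):
              return -F.logsigmoid(reward_chosen - reward_rejected).mean()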

    • Balancing large-scale training and in-context learning for future AI development: Future AI development may require a balance between large-scale training and in-context learning, with a focus on online learning, introspection, and active knowledge acquisition. Effective learning algorithms for complex tasks may be those that explore all possibilities and adapt to new information, like learned search algorithms.

      The future of AI development may involve a balance between large-scale training and in-context learning, with a focus on online learning, introspection, and active knowledge acquisition. This middle ground between producing a snapshot model and in-context learning has not been fully explored due to the increasing context lengths, but it is expected to become increasingly important for long-term tasks. The model's ability to introspect and learn new knowledge autonomously could be crucial for effective active learning. However, the most sample-efficient algorithms, such as policy gradient algorithms, may not be the best choice for test time learning in complex tasks. Instead, learning algorithms that effectively explore all possibilities and adapt to new information, like learned search algorithms, are likely to be more effective. The speaker also highlighted their personal history in the development of ChatGPT at OpenAI, where they first recognized the potential of large language models and saw the utility of creating a chatbot interface to interact with them.
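
      As a point of comparison with the learned-search idea, the simplest policy gradient (REINFORCE-style) update looks roughly like the sketch below. This is a generic illustration assuming a PyTorch policy, not a description of any lab's training setup.

          import torch

          # REINFORCE-style update: raise the log-probability of sampled actions
          # in proportion to how much better than average their reward was.
          def policy_gradient_loss(logprobs, rewards):
              baseline = rewards.mean()            # crude variance reduction
              advantages = rewards - baseline
              return -(logprobs * advantages.detach()).mean()

          # Example usage with dummy values:
          loss = policy_gradient_loss(torch.log(torch.tensor([0.3, 0.6, 0.1])),
                                      torch.tensor([1.0, 0.0, 2.0]))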

    • Development of conversational chatbots: Google and OpenAI pioneered conversational chatbot technology, with OpenAI introducing a conversational chat assistant in 2022 that showed potential despite limitations, leading to significant progress in the field.

      The development of conversational AI models, specifically chatbots, has evolved significantly over the years. Before the focus shifted to chat, there were instruction-following models that required elaborate prompts and were not easy to use. Google and OpenAI were among the organizations exploring chatbot technology, with Google having chatbots specialized for specific tasks and OpenAI working on question-answering models that benefited from conversational interactions. In 2022, OpenAI introduced a conversational chat assistant built on top of GPT-3.5, which showed great potential in language and coding help. Although the model had limitations, such as occasional hallucinations and unreliability, the team continued to improve it by mixing instruction and chat data to create a model that was the best of both worlds. The ease of use and sensible behavior of chat models made them an exciting alternative, and the clear definition of a helpful conversational robot made it easier for people to understand the model's purpose. Despite some setbacks, the team's dedication to refining the conversational chat assistant led to significant progress in the field.

    • Understanding the progress of language models since GPT-2: While publicly available APIs and RL models could have produced decent chatbots, the progress of language models since GPT-2, particularly in post-training, has been impressive and will continue to be a focus in the future.

      While the publicly available fine-tuning API could have been used by anyone to create a chatbot similar to ChatGPT, the process would have been nontrivial and required multiple iterations of fine-tuning and human editing. The speaker also mentioned that there was an instruction-following model trained with reinforcement learning (RL) released before ChatGPT, which, when wrapped in a chat interface, could have produced a decent result. However, this model had strengths and weaknesses compared to ChatGPT. The speaker was impressed with the progress of AI since GPT-2, especially in the area of language models. They mentioned that they were initially unsure about the revolutionary potential of GPT-2 but became more convinced after seeing the gains from post-training. The speaker expects that post-training will continue to be a focus in the future, as there is a lot of room for improvement through data quality, data quantity, and iterative processes. To excel in this research area, one needs a decent amount of experience with various parts of the stack, including RL algorithms, data collection, annotation processes, and language models. The process is finicky, but having a good understanding of these different components can lead to discovering ways to improve the models.

    • Approaching a data wall in language models: We may be approaching a point where memorizing more pre-training data does not significantly enhance intelligence beyond current levels, given challenges in generalizing across different types of pre-training data and the resource-intensive nature of training large models.

      While we've seen significant advancements in language models like GPT-4, there's a hypothesis that we might be approaching a data wall, where the abilities unlocked by memorizing vast pre-training data may not significantly improve intelligence beyond current levels. Researchers suggest that generalization across different types of pre-training data is challenging, and it's unclear if there's significant transfer between different modalities. The reason is that it's difficult to conduct extensive research due to the resource-intensive nature of training large models and the lack of public results on ablation studies involving code data and reasoning performance. However, it's important to note that scaling models to more parameters doesn't necessarily mean they become smarter with less data. Instead, larger models might learn better shared representations, while smaller models might rely more on memorization. The reasons for this are not well understood, and there's no clear explanation for the scaling law with parameter count. Despite these challenges, researchers continue to explore the potential of larger models and different pre-training strategies. Ultimately, the goal is to develop models that can generalize effectively across various domains and modalities, enabling them to reason, learn, and transfer knowledge more efficiently.
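
      For context, the scaling behavior discussed here is usually summarized with a parametric power-law fit in parameter count and training tokens; the functional form below is the standard one, but the coefficients are illustrative placeholders rather than numbers cited in the episode.

          # Chinchilla-style parametric scaling law (coefficients are placeholders):
          # predicted loss falls as a power law in parameters N and tokens D,
          # toward an irreducible term E.
          def scaling_law(n_params, n_tokens, E=1.7, A=400.0, B=410.0,
                          alpha=0.34, beta=0.28):
              return E + A / n_params**alpha + B / n_tokens**beta

          # Example: a 70e9-parameter model trained on 1.4e12 tokens.
          print(scaling_law(70e9, 1.4e12))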

    • Advancements in AI size, modalities, and capabilities: Larger models have access to more computations and functions, leading to better performance. New modalities will be added, capabilities will improve, and AI will be used in various industries. Challenges include integrating AI into existing processes and long-horizon reinforcement learning.

      The size of a model plays a significant role in its performance, as a larger model has access to a larger library of computations and functions, increasing its chances of discovering effective solutions. Over the next few years, we can expect new modalities to be added to models, capabilities to improve through pre-training and post-training, and AI to be used more widely in various industries. Despite the potential advancements, there will still be challenges to overcome, such as integrating AI into existing processes and unlocking long-horizon reinforcement learning. AI's impact on the economy will grow as it becomes more integrated into various processes, and it will be used for increasingly sophisticated tasks, such as accelerating scientific research. A user-friendly tool, like CommandBar, can help streamline the user experience by providing personalized assistance and improving the overall interaction between users and AI systems.

    • Balancing Automation and Human Involvement in AI Businesses: Regulations may be necessary to ensure human oversight in AI businesses and maintain a balance between automation and human involvement.

      As AI capabilities advance, there will be a need for human oversight to ensure alignment with user expectations and to make important decisions. However, there is a risk that firms without human involvement may outcompete those with human involvement due to increased efficiency. This raises the question of how to regulate AI use in businesses to maintain a balance between automation and human involvement. It may be necessary to define important processes that require human oversight and monitor compliance across all firms and countries. Alternatively, regulations could be put in place before AI deployment, before firms are built end-to-end on these models with no human intervention at all. Ultimately, it may be some time before AI can be trusted to run firms entirely, and practical considerations and potential malfunctions may necessitate continued human involvement.

    • Balancing human values with advanced AI systems: OpenAI's model spec focuses on following user instructions while considering the impact on others, maintaining transparency and openness in ML research.

      When dealing with advanced AI systems trained with RLHF, aligning their actions with human values becomes a complex issue. Stakeholders, including users, developers, and the wider human population, may have conflicting demands, and it's essential to make compromises. The recently released model spec from OpenAI outlines their approach to handling these conflicts, focusing on the model following user instructions while considering the impact on others. The ML research field, while not perfect, is generally considered healthier than some others due to its practical focus and the consequences of unreplicable studies. Researchers strive for transparency and openness to maintain the field's credibility.

    • Advancements and Challenges in Language Models: Despite progress in language models, there's a need for more scientific inquiry, improving chatbot writing, and making models more useful and engaging for users. Simulated social science research holds potential for new discoveries.

      While there have been significant advancements in language models and their applications, there are still challenges and opportunities for improvement. The field faces incentives that sometimes lead to less scientific progress and more focus on benchmarks and new methods. However, there is a growing interest in using language models for simulated social science research, which could lead to new insights. Regarding the specific application of RLHF in chatbots, there is room for improvement in making the writing more lively and less robotic. This may be due to the training process and the loss function used, but there may also be unintentional distillation happening between language model providers. Ultimately, people's preferences for structured responses and big information dumps may also be contributing factors. In terms of the broader progress in language models, there has been a focus on improving efficiency and making learning more stable. However, it's unclear if the same amount of compute can now train a much better model or if the improvements have made learning more scalable without a significant increase in compute. Overall, there is a need for more scientific inquiry and understanding in the field, as well as a focus on making language models more useful and engaging for users. The potential applications of language models in simulated social science research are particularly exciting and could lead to new discoveries.

    • Understanding and mastering human preferences in language models: Despite complexities, companies invest heavily in pre-training and post-training to master human preferences in language models, creating a potential moat but not impossible to replicate.

      The preference models used in language models are complex and intricate, learning subtle nuances about what people prefer. However, the optimal format for describing these preferences is not clear, and it may depend on various factors such as the model size, the data used for training, and the speed at which the model outputs text. The reward model, which aggregates human preferences, is currently the closest representation of what people want, but it may not fully capture all the complexities and subtleties of human preferences. Post-training, the process of creating a functional model with all the desired features, is a complex effort that requires a significant amount of R&D and skilled personnel. This makes it a potential moat for companies that can master it, but it is not impossible to replicate. The same companies that are investing heavily in pre-training are also putting resources into post-training, making it a competitive landscape. However, there is also the possibility of distilling models, cloning outputs, or using someone else's model as a judge, which reduces the moat to some extent. Overall, understanding and mastering human preferences in language models is a complex and ongoing process.
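
      The distillation route mentioned above can be sketched as training a smaller student model to match a stronger teacher's output distribution. The snippet below is a generic knowledge-distillation loss (assuming PyTorch), included purely as an illustration of the idea rather than a claim about any particular provider's practice.

          import torch.nn.functional as F

          # Knowledge distillation: train the student to match the teacher's
          # temperature-softened output distribution over the vocabulary.
          def distill_loss(student_logits, teacher_logits, temperature=2.0):
              s = F.log_softmax(student_logits / temperature, dim=-1)
              t = F.softmax(teacher_logits / temperature, dim=-1)
              return F.kl_div(s, t, reduction="batchmean") * temperature**2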

    • Human raters play a crucial role in AI development: Advanced AI models require human raters for data labeling and fine-tuning. The future involves multimodal data integration and the creation of advanced AI assistants that can work alongside humans, offering suggestions and collaborating on projects.

      The development of advanced AI models relies on a large and diverse workforce of human raters to label and fine-tune data. These raters come from various backgrounds, including the US and lower-middle-income countries, and they possess a range of skills. While some models may require closely matched labels for specific tasks, generalization can also lead to significant progress. The future of AI development involves the integration of multimodal data and the creation of assistants that can work alongside humans, offering suggestions and collaborating on projects. By the end of this year or next, we can expect the emergence of more advanced AI assistants that can work on entire projects and offer proactive suggestions, moving away from the current model of one-off queries. However, the optimal form factor for these assistants remains uncertain.

    • Exploring the advanced capabilities of AI models: AI models can now suggest tasks and even perform background work, potentially replacing certain jobs within the next five years, and they're evolving to become more like personal assistants, offering a game-changing level of autonomy and intelligence.

      Advanced AI models can now proactively suggest tasks and even perform background work, potentially replacing certain jobs within the next five years. During the conversation with John, we learned how these models are evolving to become more like personal assistants. This level of autonomy and intelligence is a game-changer and an aspect of AI progress that many people don't fully appreciate yet.

    Recent Episodes from Dwarkesh Podcast

    Tony Blair - Life of a PM, The Deep State, Lee Kuan Yew, & AI's 1914 Moment

    I chatted with Tony Blair about:

    - What he learned from Lee Kuan Yew

    - Intelligence agencies’ track record on Iraq & Ukraine

    - What he tells the dozens of world leaders who come seek advice from him

    - How much of a PM’s time is actually spent governing

    - What will AI’s July 1914 moment look like from inside the Cabinet?

    Enjoy!

    Watch the video on YouTube. Read the full transcript here.

    Follow me on Twitter for updates on future episodes.

    Sponsors

    - Prelude Security is the world’s leading cyber threat management automation platform. Prelude Detect quickly transforms threat intelligence into validated protections so organizations can know with certainty that their defenses will protect them against the latest threats. Prelude is backed by Sequoia Capital, Insight Partners, The MITRE Corporation, CrowdStrike, and other leading investors. Learn more here.

    - This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.

    If you’re interested in advertising on the podcast, check out this page.

    Timestamps

    (00:00:00) – A prime minister’s constraints

    (00:04:12) – CEOs vs. politicians

    (00:10:31) – COVID, AI, & how government deals with crisis

    (00:21:24) – Learning from Lee Kuan Yew

    (00:27:37) – Foreign policy & intelligence

    (00:31:12) – How much leadership actually matters

    (00:35:34) – Private vs. public tech

    (00:39:14) – Advising global leaders

    (00:46:45) – The unipolar moment in the 90s



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
    Dwarkesh Podcast
    June 26, 2024

    Francois Chollet, Mike Knoop - LLMs won’t lead to AGI - $1,000,000 Prize to find true solution

    Here is my conversation with Francois Chollet and Mike Knoop on the $1 million ARC-AGI Prize they're launching today.

    I did a bunch of Socratic grilling throughout, but Francois’s arguments about why LLMs won’t lead to AGI are very interesting and worth thinking through.

    It was really fun discussing/debating the cruxes. Enjoy!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here.

    Timestamps

    (00:00:00) – The ARC benchmark

    (00:11:10) – Why LLMs struggle with ARC

    (00:19:00) – Skill vs intelligence

    (00:27:55) – Do we need “AGI” to automate most jobs?

    (00:48:28) – Future of AI progress: deep learning + program synthesis

    (01:00:40) – How Mike Knoop got nerd-sniped by ARC

    (01:08:37) – Million $ ARC Prize

    (01:10:33) – Resisting benchmark saturation

    (01:18:08) – ARC scores on frontier vs open source models

    (01:26:19) – Possible solutions to ARC Prize



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
    Dwarkesh Podcast
    June 11, 2024

    Leopold Aschenbrenner - China/US Super Intelligence Race, 2027 AGI, & The Return of History

    Chatted with my friend Leopold Aschenbrenner about the trillion-dollar nationalized cluster, CCP espionage at AI labs, how unhobblings and scaling can lead to 2027 AGI, the dangers of outsourcing clusters to the Middle East, leaving OpenAI, and situational awareness.

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here.

    Follow me on Twitter for updates on future episodes. Follow Leopold on Twitter.

    Timestamps

    (00:00:00) – The trillion-dollar cluster and unhobbling

    (00:20:31) – AI 2028: The return of history

    (00:40:26) – Espionage & American AI superiority

    (01:08:20) – Geopolitical implications of AI

    (01:31:23) – State-led vs. private-led AI

    (02:12:23) – Becoming Valedictorian of Columbia at 19

    (02:30:35) – What happened at OpenAI

    (02:45:11) – Accelerating AI research progress

    (03:25:58) – Alignment

    (03:41:26) – On Germany, and understanding foreign perspectives

    (03:57:04) – Dwarkesh’s immigration story and path to the podcast

    (04:07:58) – Launching an AGI hedge fund

    (04:19:14) – Lessons from WWII

    (04:29:08) – Coda: Frederick the Great



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
    Dwarkesh Podcast
    June 4, 2024

    John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

    Chatted with John Schulman (who cofounded OpenAI and led the creation of ChatGPT) on how post-training tames the shoggoth, and the nature of the progress to come...

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

    Timestamps

    (00:00:00) - Pre-training, post-training, and future capabilities

    (00:16:57) - Plan for AGI 2025

    (00:29:19) - Teaching models to reason

    (00:40:50) - The Road to ChatGPT

    (00:52:13) - What makes for a good RL researcher?

    (01:00:58) - Keeping humans in the loop

    (01:15:15) - State of research, plateaus, and moats

    Sponsors

    If you’re interested in advertising on the podcast, fill out this form.

    * Your DNA shapes everything about you. Want to know how? Take 10% off our Premium DNA kit with code DWARKESH at mynucleus.com.

    * CommandBar is an AI user assistant that any software product can embed to non-annoyingly assist, support, and unleash their users. Used by forward-thinking CX, product, growth, and marketing teams. Learn more at commandbar.com.



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
    Dwarkesh Podcast
    May 15, 2024

    Mark Zuckerberg - Llama 3, Open Sourcing $10b Models, & Caesar Augustus

    Mark Zuckerberg on:

    - Llama 3

    - open sourcing towards AGI

    - custom silicon, synthetic data, & energy constraints on scaling

    - Caesar Augustus, intelligence explosion, bioweapons, $10b models, & much more

    Enjoy!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Human-edited transcript with helpful links here.

    Timestamps

    (00:00:00) - Llama 3

    (00:08:32) - Coding on path to AGI

    (00:25:24) - Energy bottlenecks

    (00:33:20) - Is AI the most important technology ever?

    (00:37:21) - Dangers of open source

    (00:53:57) - Caesar Augustus and metaverse

    (01:04:53) - Open sourcing the $10b model & custom silicon

    (01:15:19) - Zuck as CEO of Google+

    Sponsors

    If you’re interested in advertising on the podcast, fill out this form.

    * This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue. Learn more at stripe.com.

    * V7 Go is a tool to automate multimodal tasks using GenAI, reliably and at scale. Use code DWARKESH20 for 20% off on the pro plan. Learn more here.

    * CommandBar is an AI user assistant that any software product can embed to non-annoyingly assist, support, and unleash their users. Used by forward-thinking CX, product, growth, and marketing teams. Learn more at commandbar.com.



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    Sholto Douglas & Trenton Bricken - How to Build & Understand GPT-7's Mind

    Had so much fun chatting with my good friends Trenton Bricken and Sholto Douglas on the podcast.

    No way to summarize it, except: 

    This is the best context dump out there on how LLMs are trained, what capabilities they're likely to soon have, and what exactly is going on inside them.

    You would be shocked how much of what I know about this field, I've learned just from talking with them.

    To the extent that you've enjoyed my other AI interviews, now you know why.

    So excited to put this out. Enjoy! I certainly did :)

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform.

    There's a transcript with links to all the papers the boys were throwing down - may help you follow along.

    Follow Trenton and Sholto on Twitter.

    Timestamps

    (00:00:00) - Long contexts

    (00:16:12) - Intelligence is just associations

    (00:32:35) - Intelligence explosion & great researchers

    (01:06:52) - Superposition & secret communication

    (01:22:34) - Agents & true reasoning

    (01:34:40) - How Sholto & Trenton got into AI research

    (02:07:16) - Are feature spaces the wrong way to think about intelligence?

    (02:21:12) - Will interp actually work on superhuman models

    (02:45:05) - Sholto’s technical challenge for the audience

    (03:03:57) - Rapid fire



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    Demis Hassabis - Scaling, Superhuman AIs, AlphaZero atop LLMs, Rogue Nations Threat

    Here is my episode with Demis Hassabis, CEO of Google DeepMind.

    We discuss:

    * Why scaling is an artform

    * Adding search, planning, & AlphaZero type training atop LLMs

    * Making sure rogue nations can't steal weights

    * The right way to align superhuman AIs and do an intelligence explosion

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here.

    Timestamps

    (0:00:00) - Nature of intelligence

    (0:05:56) - RL atop LLMs

    (0:16:31) - Scaling and alignment

    (0:24:13) - Timelines and intelligence explosion

    (0:28:42) - Gemini training

    (0:35:30) - Governance of superhuman AIs

    (0:40:42) - Safety, open source, and security of weights

    (0:47:00) - Multimodal and further progress

    (0:54:18) - Inside Google DeepMind



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    Patrick Collison (Stripe CEO) - Craft, Beauty, & The Future of Payments

    We discuss:

    * what it takes to process $1 trillion/year

    * how to build multi-decade APIs, companies, and relationships

    * what's next for Stripe (increasing the GDP of the internet is quite an open-ended prompt, and the Collison brothers are just getting started).

    Plus the amazing stuff they're doing at Arc Institute, the financial infrastructure for AI agents, playing devil's advocate against progress studies, and much more.

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

    Timestamps

    (00:00:00) - Advice for 20-30 year olds

    (00:12:12) - Progress studies

    (00:22:21) - Arc Institute

    (00:34:27) - AI & Fast Grants

    (00:43:46) - Stripe history

    (00:55:44) - Stripe Climate

    (01:01:39) - Beauty & APIs

    (01:11:51) - Financial innards

    (01:28:16) - Stripe culture & future

    (01:41:56) - Virtues of big businesses

    (01:51:41) - John



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    Tyler Cowen - Hayek, Keynes, & Smith on AI, Animal Spirits, Anarchy, & Growth

    It was a great pleasure speaking with Tyler Cowen for the 3rd time.

    We discussed GOAT: Who is the Greatest Economist of all Time and Why Does it Matter?, especially in the context of how the insights of Hayek, Keynes, Smith, and other great economists help us make sense of AI, growth, animal spirits, prediction markets, alignment, central planning, and much more.

    The topics covered in this episode are too many to summarize. Hope you enjoy!

    Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

    Timestamps

    (0:00:00) - John Maynard Keynes

    (00:17:16) - Controversy

    (00:25:02) - Friedrich von Hayek

    (00:47:41) - John Stuart Mill

    (00:52:41) - Adam Smith

    (00:58:31) - Coase, Schelling, & George

    (01:08:07) - Anarchy

    (01:13:16) - Cheap WMDs

    (01:23:18) - Technocracy & political philosophy

    (01:34:16) - AI & Scaling



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

    Lessons from The Years of Lyndon Johnson by Robert Caro [Narration]

    This is a narration of my blog post, Lessons from The Years of Lyndon Johnson by Robert Caro.

    You can read the full post here: https://www.dwarkeshpatel.com/p/lyndon-johnson

    Listen on Apple Podcasts, Spotify, or any other podcast platform. Follow me on Twitter for updates on future posts and episodes.



    Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe