
    Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn

    June 07, 2024

    Podcast Summary

    • AI safety concerns: The pursuit of market dominance and speed in AI development at OpenAI led some employees to resign due to under-addressed safety concerns, emphasizing the importance of prioritizing safety in AI development, especially in the absence of regulation.

      The race for market dominance and speed in the development of artificial intelligence (AI) can lead companies to prioritize these goals over safety, potentially resulting in under-addressed risks. This was highlighted in an open letter called "The Right to Warn" signed by 11 current and former OpenAI employees, including William Saunders, our guest today. William worked at OpenAI for three years, where he was part of the alignment team, focusing on ensuring AI systems behave as intended, even when they might be smarter than their creators. He later transitioned to interpretability research, which aims to understand what happens inside these models to improve safety. However, William and other employees expressed concerns about OpenAI's focus on market dominance and speed, leading them to resign. This issue is particularly significant given the lack of regulation in the US for AI systems, making the role of insiders in raising safety concerns even more crucial.

    • Machine learning transparency: Machine learning systems lack transparency, and understanding the reasoning behind their actions can be challenging, making it important to ensure their safety and trustworthiness as they become more sophisticated and widespread.

      Machine learning systems, unlike traditional technologies, are not designed by humans with a clear understanding of how each component fits together. Instead, they are developed by defining a desired outcome, such as predicting the next word in a text sequence; the training process then produces a system that performs the task effectively, but the reasoning behind its behavior may not be transparent or understandable to humans. This opacity becomes a problem when the system is applied in new contexts, where it may behave in unexpected ways. Engineers are not coding these systems line by line; they are training them to develop emergent capabilities, much as genes replicate and give rise to complex behaviors. Interpreting how such a system works is accordingly difficult, akin to working out the functions of DNA in a biological organism. As machine learning systems become more sophisticated and widespread, ensuring their safety and trustworthiness becomes increasingly important: the consequences of a system giving incorrect advice to CEOs or politicians, for instance, could be significant. Understanding the inner workings of these black boxes is crucial for identifying and addressing potential issues, and doing so requires careful analysis and expertise.
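
      To make the objective concrete, here is a minimal, hypothetical sketch of next-word (next-token) prediction in PyTorch, written as a toy bigram-style model; the model size, data, and training details are illustrative assumptions, not anything specific to OpenAI's systems.

      ```python
      # Toy next-token prediction: only the outcome (predict token i+1
      # from token i) is specified; the strategy is left to training.
      import torch
      import torch.nn as nn

      vocab_size, embed_dim = 100, 32
      model = nn.Sequential(
          nn.Embedding(vocab_size, embed_dim),
          nn.Linear(embed_dim, vocab_size),    # logits over the next token
      )
      optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
      loss_fn = nn.CrossEntropyLoss()

      tokens = torch.randint(0, vocab_size, (1000,))    # stand-in "corpus"
      for step in range(100):
          i = torch.randint(0, len(tokens) - 1, (64,))  # random positions
          context, target = tokens[i], tokens[i + 1]    # next-word pairs
          loss = loss_fn(model(context), target)        # prediction error
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
      # Whatever the trained weights now encode to lower this loss is
      # emergent; no engineer wrote that strategy line by line.
      ```

      Nothing in the loop specifies how the task should be solved, which is precisely why interpreting the resulting system is hard.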

    • AI risks and mitigation: As AI systems become smarter than humans, there are significant risks involved, including potential for replacement in decision-making roles, hidden capabilities, and even a desire for power and money. Interpretability can help mitigate these risks by identifying capabilities and potential biases, but ongoing research and vigilance are crucial.

      As AI systems become smarter than humans, there are significant risks involved. These risks include the potential for AI systems to replace humans in decision-making roles, as well as the possibility that they may have hidden capabilities that could be detrimental if not identified before widespread integration into society. The conversation also touched upon the possibility of AI systems developing a desire for power and money, potentially leading to a world where they are in control. Interpretability, which involves understanding how AI models work, is one way to mitigate these risks by identifying capabilities and potential biases before they are widely used. However, it's important to note that interpretability is not the only solution and the capabilities of future AI models are still largely unknown. The conversation underscored the importance of ongoing research and vigilance in this area. The speaker, who is a researcher in this field, expressed concern about the potential consequences of not being able to understand or predict the capabilities of future AI models. He also shared his personal motivation for working at OpenAI, which is to contribute to making AI safe and beneficial for humanity.
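
      As a concrete illustration of this kind of work, below is a minimal, hypothetical sketch of linear probing, one common interpretability technique: fit a simple classifier on a model's internal activations to test whether a concept is represented there. The activations and concept labels are simulated stand-ins, not data from any real model, and probing is only one tool among many.

      ```python
      # Linear probing sketch: can a concept be read linearly from a
      # model's hidden activations? (All data here is simulated.)
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      n_samples, hidden_dim = 500, 64

      # Pretend these came from a forward hook on one hidden layer.
      activations = rng.normal(size=(n_samples, hidden_dim))
      labels = rng.integers(0, 2, size=n_samples)  # hypothetical concept labels
      activations[:, 0] += 0.5 * labels            # plant a weak signal to find

      probe = LogisticRegression(max_iter=1000).fit(activations, labels)
      print(f"probe accuracy: {probe.score(activations, labels):.2f}")
      # High accuracy suggests the concept is linearly readable from this
      # layer: evidence about what the model represents, not proof.
      ```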

    • AI development safety concerns: Pressure to expedite AI development at OpenAI may compromise safety, leading to potential negative consequences in the real world. One employee considered resigning due to unaddressed safety concerns.

      Working at an advanced AI research lab like OpenAI comes with both benefits and challenges. The benefits include access to cutting-edge technology and the chance to contribute to important research; the downsides include pressure to meet deadlines and the risk of decisions with negative consequences. The speaker expressed concern that shortcuts were being taken in the name of expediency at the expense of safety, describing a pattern of pressure to ship AI systems before they were fully ready, leading to preventable issues in the real world. For a time, he believed it was important for someone to stay within the organization and advocate for a more cautious approach to AI development. Ultimately, feeling that his voice was not being heard and that the company was not taking sufficient steps to address safety concerns, he resigned.

    • Technological risks: Ignoring or downplaying potential risks in advanced technologies like AI can lead to unresolved situations and unaddressed concerns, potentially causing significant harm.

      The drive for innovation and investment in advanced technologies, such as AI, comes with significant risks and pressures to deliver returns. This can lead to concerns being overlooked or dismissed, as those responsible feel the need to justify the massive financial commitments. The speaker shares their experience of raising safety concerns within a company and feeling uncomfortable with the response. They also express disappointment with the investigation into a high-profile case, which left important details unaddressed and the situation unresolved. Furthermore, the speaker challenges the use of the term "science-based concerns" to downplay potential risks, using the analogy of testing airplanes only over land before flying them over oceans. These risks, while not yet proven, should not be disregarded, but acknowledged and addressed in a transparent and responsible manner.

    • Preventative measures vs. reacting to problems: The importance of addressing potential AI issues before they become crises, using confidentiality protections to encourage employees to speak up, and not signing non-disparagement agreements that limit speaking out about safety concerns.

      The distinction between taking preventative measures and reacting to problems is crucial, especially with advanced AI systems. The discussion highlights the importance of addressing potential issues before they become crises, returning to the example of an airline that takes no measures against planes crashing into water until it is too late. The interview also touches on how confidentiality obligations affect employees who wish to speak up, citing non-disparagement agreements that can prevent them from sharing important information. The right-to-warn principles in the letter aim to address these issues by allowing employees to raise safety and ethical concerns without fear of retaliation; the first principle focuses on not signing non-disparagement agreements that limit one's ability to speak out about safety concerns. The overall theme is creating a culture where employees feel safe and encouraged to speak up about potential issues before they become crises.

    • Employee confidentiality clauses: Companies should allow anonymous reporting of concerns, create a culture of open communication, and not retaliate against employees who speak out about potential risks or concerns.

      Companies should establish ethical practices when it comes to employee agreements and confidentiality clauses. The use of excessive time pressure, forbidding critical statements based on public information, and the potential loss of equity can discourage employees from speaking out about potential risks or concerns. To address this, companies should establish anonymous processes for employees to raise concerns to the board, regulators, and independent experts. This allows employees to feel secure in raising valid concerns without fear of retaliation. Additionally, creating a culture where it's acceptable to discuss non-confidential information and implementing clear guidelines for handling concerns can prevent misunderstandings and potential conflicts. Companies that fail to implement these processes should not retaliate against employees who go public with their concerns.

    • Companies' transparency on safety concerns: Companies should not suppress safety information, an independent body should evaluate safety commitments, and the public and regulators should not solely trust companies' claims.

      The public's right to know about potential safety concerns regarding advanced technologies, such as nuclear fusion and social media algorithms, should be prioritized. Companies should not be allowed to suppress information that could impact the public interest. The history of social media and whistleblowing incidents has shown that internal research on safety and trust can be disincentivized when companies are liable for the information they uncover. Therefore, there should be an independent body to evaluate if companies have met their safety commitments and addressed known issues. The public and regulators should not trust companies' claims about safety without independent verification.

    • Expert assessment of technology safety: Independent experts are crucial for assessing technology safety and ethical use, as self-reported actions may have limitations and conflicts of interest can pose risks.

      Ensuring the safety and ethical use of technology requires independent experts to assess the actions being taken, as there are limits to what self-reported lists of actions can accomplish. William Saunders emphasized the importance of transparency and the potential risks of conflicts of interest, and he expressed optimism that the technology can be developed safely with dedication and hard work. The Center for Humane Technology, a nonprofit organization, produces this podcast, which aims to catalyze a humane future. The team includes Julia Scott as senior producer, Josh Lash as researcher and producer, Sasha Fegan as executive producer, Jeff Sudakin on mixing, and Ryan and Hays Holladay for original music. Listeners are encouraged to rate the podcast on Apple Podcasts to help others discover it. The team thanks you for your undivided attention.

    Recent Episodes from Your Undivided Attention

    Why Are Migrants Becoming AI Test Subjects? With Petra Molnar

    Climate change, political instability, hunger. These are just some of the forces behind an unprecedented refugee crisis that’s expected to include over a billion people by 2050. In response to this growing crisis, wealthy governments like the US and the EU are employing novel AI and surveillance technologies to slow the influx of migrants at their borders. But will this rollout stop at the border?

    In this episode, Tristan and Aza sit down with Petra Molnar to discuss how borders have become a proving ground for the sharpest edges of technology, and especially AI. Petra is an immigration lawyer and co-creator of the Migration and Technology Monitor. Her new book is “The Walls Have Eyes: Surviving Migration in the Age of Artificial Intelligence.”

    RECOMMENDED MEDIA

    The Walls Have Eyes: Surviving Migration in the Age of Artificial Intelligence

    Petra’s newly published book on the rollout of high-risk tech at the border.

    Bots at the Gate

    A report co-authored by Petra about Canada’s use of AI technology in their immigration process.

    Technological Testing Grounds

    A report authored by Petra about the use of experimental technology in EU border enforcement.

    Startup Pitched Tasing Migrants from Drones, Video Reveals

    An article from The Intercept, containing the demo for Brinc’s taser drone pilot program.

    The UNHCR

    Information about the global refugee crisis from the UN.

    RECOMMENDED YUA EPISODES

    War is a Laboratory for AI with Paul Scharre

    No One is Immune to AI Harms with Dr. Joy Buolamwini

    Can We Govern AI? With Marietje Schaake

    CLARIFICATION:

    The iBorderCtrl project referenced in this episode was a pilot project that was discontinued in 2019.

    Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn

    This week, a group of current and former employees from OpenAI and Google DeepMind penned an open letter accusing the industry’s leading companies of prioritizing profits over safety. This comes after a spate of high-profile departures from OpenAI, including co-founder Ilya Sutskever and senior researcher Jan Leike, as well as reports that OpenAI has gone to great lengths to silence would-be whistleblowers.

    The writers of the open letter argue that researchers have a “right to warn” the public about AI risks, and they lay out a series of principles that would protect that right. In this episode, we sit down with one of those writers: William Saunders, who left his job as a research engineer at OpenAI in February. William is now breaking the silence on what he saw at OpenAI that compelled him to leave the company and to put his name to this letter.

    RECOMMENDED MEDIA 

    The Right to Warn Open Letter 

    My Perspective On "A Right to Warn about Advanced Artificial Intelligence": A follow-up from William about the letter

    Leaked OpenAI documents reveal aggressive tactics toward former employees: An investigation by Vox into OpenAI’s policy of non-disparagement.

    RECOMMENDED YUA EPISODES

    1. A First Step Toward AI Regulation with Tom Wheeler 
    2. Spotlight on AI: What Would It Take For This to Go Well? 
    3. Big Food, Big Tech and Big AI with Michael Moss 
    4. Can We Govern AI? with Marietje Schaake

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    War is a Laboratory for AI with Paul Scharre

    Right now, militaries around the globe are investing heavily in the use of AI weapons and drones. From Ukraine to Gaza, weapons systems with increasing levels of autonomy are being used to kill people and destroy infrastructure, and the development of fully autonomous weapons shows little sign of slowing down. What does this mean for the future of warfare? What safeguards can we put up around these systems? And is this runaway trend toward autonomous warfare inevitable, or will nations come together and choose a different path? In this episode, Tristan and Daniel sit down with Paul Scharre to try to answer some of these questions. Paul is a former Army Ranger, the author of two books on autonomous weapons, and he helped the Department of Defense write much of its policy on the use of AI in weaponry.

    RECOMMENDED MEDIA

    Four Battlegrounds: Power in the Age of Artificial Intelligence: Paul’s book on the future of AI in war, which came out in 2023.

    Army of None: Autonomous Weapons and the Future of War: Paul’s 2018 book documenting and predicting the rise of autonomous and semi-autonomous weapons as part of modern warfare.

    The Perilous Coming Age of AI Warfare: How to Limit the Threat of Autonomous Warfare: Paul’s article in Foreign Affairs based on his recent trip to the battlefield in Ukraine.

    The night the world almost ended: A BBC documentary about Stanislav Petrov’s decision not to start nuclear war.

    AlphaDogfight Trials Final Event: The full simulated dogfight between an AI and human pilot. The AI pilot swept, 5-0.

    RECOMMENDED YUA EPISODES

    1. The AI ‘Race’: China vs. the US with Jeffrey Ding and Karen Hao
    2. Can We Govern AI? with Marietje Schaake
    3. Big Food, Big Tech and Big AI with Michael Moss
    4. The Invisible Cyber-War with Nicole Perlroth

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    AI and Jobs: How to Make AI Work With Us, Not Against Us With Daron Acemoglu

    Tech companies say that AI will lead to massive economic productivity gains. But as we know from the first digital revolution, that’s not what happened. Can we do better this time around?

    RECOMMENDED MEDIA

    Power and Progress by Daron Acemoglu and Simon Johnson: Professor Acemoglu co-authored a bold reinterpretation of economics and history that will fundamentally change how you see the world

    Can we Have Pro-Worker AI? Professor Acemoglu co-authored this paper about redirecting AI development onto the human-complementary path

    Rethinking Capitalism: In Conversation with Daron Acemoglu: The Wheeler Institute for Business and Development hosted Professor Acemoglu to examine how technology affects the distribution and growth of resources while being shaped by economic and social incentives

    RECOMMENDED YUA EPISODES

    1. The Three Rules of Humane Tech
    2. The Tech We Need for 21st Century Democracy
    3. Can We Govern AI?
    4. An Alternative to Silicon Valley Unicorns

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Jonathan Haidt On How to Solve the Teen Mental Health Crisis

    Suicides. Self harm. Depression and anxiety. The toll of a social media-addicted, phone-based childhood has never been more stark. It can be easy for teens, parents and schools to feel like they’re trapped by it all. But in this conversation with Tristan Harris, author and social psychologist Jonathan Haidt makes the case that the conditions that led to today’s teenage mental health crisis can be turned around – with specific, achievable actions we all can take starting today.

    This episode was recorded live at the San Francisco Commonwealth Club.  

    Correction: Tristan mentions that 40 Attorneys General have filed a lawsuit against Meta for allegedly fostering addiction among children and teens through their products. However, the actual number is 42 Attorneys General who are taking legal action against Meta.

    Clarification: Jonathan refers to the Wait Until 8th pledge. By signing the pledge, a parent promises not to give their child a smartphone until at least the end of 8th grade. The pledge becomes active once at least ten other families from their child’s grade pledge the same.

    Chips Are the Future of AI. They’re Also Incredibly Vulnerable. With Chris Miller

    Beneath the race to train and release more powerful AI models lies another race: a race by companies and nation-states to secure the hardware to make sure they win AI supremacy. 

    Correction: The latest available Nvidia chip is the Hopper H100 GPU, which has 80 billion transistors. Since the first commercially available chip had four transistors, the Hopper actually has 20 billion times that number. Nvidia recently announced the Blackwell, which boasts 208 billion transistors - but it won’t ship until later this year.
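
    For readers checking the arithmetic in the correction above, the multiple follows directly from the two figures quoted there:

    ```python
    # 80 billion transistors (H100) versus 4 in the first commercial chip.
    first_chip_transistors = 4
    h100_transistors = 80_000_000_000
    print(h100_transistors // first_chip_transistors)  # 20000000000: 20 billion times
    ```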

    RECOMMENDED MEDIA 

    Chip War: The Fight For the World’s Most Critical Technology by Chris Miller

    To make sense of the current state of politics, economics, and technology, we must first understand the vital role played by chips

    Gordon Moore Biography & Facts

    Gordon Moore, the Intel co-founder behind Moore's Law, passed away in March of 2023

    AI’s most popular chipmaker Nvidia is trying to use AI to design chips faster

    Nvidia's GPUs are in high demand - and the company is using AI to accelerate chip production

    RECOMMENDED YUA EPISODES

    Future-proofing Democracy In the Age of AI with Audrey Tang

    How Will AI Affect the 2024 Elections? with Renee DiResta and Carl Miller

    The AI ‘Race’: China vs. the US with Jeffrey Ding and Karen Hao

    Protecting Our Freedom of Thought with Nita Farahany

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Future-proofing Democracy In the Age of AI with Audrey Tang

    What does a functioning democracy look like in the age of artificial intelligence? Could AI even be used to help a democracy flourish? Just in time for election season, Taiwan’s Minister of Digital Affairs Audrey Tang returns to the podcast to discuss healthy information ecosystems, resilience to cyberattacks, how to “prebunk” deepfakes, and more. 

    RECOMMENDED MEDIA 

    Testing Theories of American Politics: Elites, Interest Groups, and Average Citizens by Martin Gilens and Benjamin I. Page

    This academic paper addresses tough questions for Americans: Who governs? Who really rules? 

    Recursive Public

    Recursive Public is an experiment in identifying areas of consensus and disagreement among the international AI community, policymakers, and the general public on key questions of governance

    A Strong Democracy is a Digital Democracy

    Audrey Tang’s 2019 op-ed for The New York Times

    The Frontiers of Digital Democracy

    Nathan Gardels interviews Audrey Tang in Noema

    RECOMMENDED YUA EPISODES 

    Digital Democracy is Within Reach with Audrey Tang

    The Tech We Need for 21st Century Democracy with Divya Siddarth

    How Will AI Affect the 2024 Elections? with Renee DiResta and Carl Miller

    The AI Dilemma

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    U.S. Senators Grilled Social Media CEOs. Will Anything Change?

    Was it political progress, or just political theater? The recent Senate hearing with social media CEOs led to astonishing moments — including Mark Zuckerberg’s public apology to families who lost children following social media abuse. Our panel of experts, including Facebook whistleblower Frances Haugen, untangles the explosive hearing, and offers a look ahead, as well. How will this hearing impact protocol within these social media companies? How will it impact legislation? In short: will anything change?

    Clarification: Julie says that shortly after the hearing, Meta’s stock price had the biggest increase of any company in the stock market’s history. It was the biggest one-day gain by any company in Wall Street history.

    Correction: Frances says it takes Snap three or four minutes to take down exploitative content. In Snap's most recent transparency report, they list six minutes as the median turnaround time to remove exploitative content.

    RECOMMENDED MEDIA 

    Get Media Savvy

    Founded by Julie Scelfo, Get Media Savvy is a non-profit initiative working to establish a healthy media environment for kids and families

    The Power of One by Frances Haugen

    The inside story of Frances Haugen’s quest to bring transparency and accountability to Big Tech

    RECOMMENDED YUA EPISODES

    Real Social Media Solutions, Now with Frances Haugen

    A Conversation with Facebook Whistleblower Frances Haugen

    Are the Kids Alright?

    Social Media Victims Lawyer Up with Laura Marquez-Garrett

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Taylor Swift is Not Alone: The Deepfake Nightmare Sweeping the Internet

    Over the past year, a tsunami of apps that digitally strip the clothes off real people has hit the market. Now anyone can create fake non-consensual sexual images in just a few clicks. With cases proliferating in high schools, guest presenter Laurie Segall talks to legal scholar Mary Anne Franks about the AI-enabled rise in deepfake porn and what we can do about it.

    Correction: Laurie refers to the app 'Clothes Off.' It’s actually named Clothoff. There are many clothes remover apps in this category.

    RECOMMENDED MEDIA 

    Revenge Porn: The Cyberwar Against Women

    In a five-part digital series, Laurie Segall uncovers a disturbing internet trend: the rise of revenge porn

    The Cult of the Constitution

    In this provocative book, Mary Anne Franks examines the thin line between constitutional fidelity and constitutional fundamentalism

    Fake Explicit Taylor Swift Images Swamp Social Media

    Calls to protect women and crack down on the platforms and technology that spread such images have been reignited

    RECOMMENDED YUA EPISODES 

    No One is Immune to AI Harms

    Esther Perel on Artificial Intimacy

    Social Media Victims Lawyer Up

    The AI Dilemma

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Can Myth Teach Us Anything About the Race to Build Artificial General Intelligence? With Josh Schrei

    We usually talk about tech in terms of economics or policy, but the casual language tech leaders often use to describe AI — summoning an inanimate force with the powers of code — sounds more... magical. So, what can myth and magic teach us about the AI race? Josh Schrei, mythologist and host of The Emerald podcast, says that foundational cultural tales like "The Sorcerer's Apprentice" or Prometheus teach us the importance of initiation, responsibility, human knowledge, and care. He argues these stories and myths can guide ethical tech development by reminding us what it is to be human.

    Correction: Josh says the first telling of "The Sorcerer’s Apprentice" myth dates back to ancient Egypt, but it actually dates back to ancient Greece.

    RECOMMENDED MEDIA 

    The Emerald podcast

    The Emerald explores the human experience through a vibrant lens of myth, story, and imagination

    Embodied Ethics in The Age of AI

    A five-part course with The Emerald podcast’s Josh Schrei and School of Wise Innovation’s Andrew Dunn

    Nature Nurture: Children Can Become Stewards of Our Delicate Planet

    A U.S. Department of the Interior study found that the average American kid can identify hundreds of corporate logos but not plants and animals

    The New Fire

    AI is revolutionizing the world - here's how democracies can come out on top. This upcoming book was authored by an architect of President Biden's AI executive order

    RECOMMENDED YUA EPISODES 

    How Will AI Affect the 2024 Elections?

    The AI Dilemma

    The Three Rules of Humane Tech

    AI Myths and Misconceptions

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_