
    Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn

    June 07, 2024

    Podcast Summary

    • AI safety concerns: The pursuit of market dominance and speed in AI development at OpenAI led some employees to resign due to under-addressed safety concerns, emphasizing the importance of prioritizing safety in AI development, especially in the absence of regulation.

      The race for market dominance and speed in the development of artificial intelligence (AI) can lead companies to prioritize these goals over safety, potentially resulting in under-addressed risks. This was highlighted in an open letter called "The Right to Warn" signed by 11 current and former OpenAI employees, including William Saunders, our guest today. William worked at OpenAI for three years, where he was part of the alignment team, focusing on ensuring AI systems behave as intended, even when they might be smarter than their creators. He later transitioned to interpretability research, which aims to understand what happens inside these models to improve safety. However, William and other employees expressed concerns about OpenAI's focus on market dominance and speed, leading them to resign. This issue is particularly significant given the lack of regulation in the US for AI systems, making the role of insiders in raising safety concerns even more crucial.

    • Machine learning transparency: Machine learning systems lack transparency, and understanding the reasoning behind their actions can be challenging, making it important to ensure their safety and trustworthiness as they become more sophisticated and widespread.

      Machine learning systems, unlike traditional technologies, are not designed by humans with a clear understanding of how each component fits together. Instead, they are developed by defining a desired outcome, such as predicting the next word in a text sequence; the training process then produces a system that performs the task effectively, but the reasoning behind its behavior may not be transparent or understandable to humans. This opacity becomes a problem when the system is applied in new contexts, where it may behave in unexpected ways. Engineers are not coding these systems line by line; they are training them to develop emergent capabilities, much as genes replicate and give rise to complex behaviors. Interpreting how such a system works is accordingly difficult, akin to working out the functions of DNA in a biological organism. As machine learning systems become more sophisticated and widespread, ensuring their safety and trustworthiness becomes increasingly important: the consequences of a system giving incorrect advice to CEOs or politicians, for instance, could be significant. Understanding the inner workings of these black boxes is crucial for identifying and addressing potential issues, and doing so requires careful analysis and expertise.
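
      To make the objective concrete, here is a minimal, hypothetical sketch of next-word (next-token) prediction in PyTorch, written as a toy bigram-style model; the model size, data, and training details are illustrative assumptions, not anything specific to OpenAI's systems.

      ```python
      # Toy next-token prediction: only the outcome (predict token i+1
      # from token i) is specified; the strategy is left to training.
      import torch
      import torch.nn as nn

      vocab_size, embed_dim = 100, 32
      model = nn.Sequential(
          nn.Embedding(vocab_size, embed_dim),
          nn.Linear(embed_dim, vocab_size),    # logits over the next token
      )
      optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
      loss_fn = nn.CrossEntropyLoss()

      tokens = torch.randint(0, vocab_size, (1000,))    # stand-in "corpus"
      for step in range(100):
          i = torch.randint(0, len(tokens) - 1, (64,))  # random positions
          context, target = tokens[i], tokens[i + 1]    # next-word pairs
          loss = loss_fn(model(context), target)        # prediction error
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
      # Whatever the trained weights now encode to lower this loss is
      # emergent; no engineer wrote that strategy line by line.
      ```

      Nothing in the loop specifies how the task should be solved, which is precisely why interpreting the resulting system is hard.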

    • AI risks and mitigation: As AI systems become smarter than humans, there are significant risks involved, including potential for replacement in decision-making roles, hidden capabilities, and even a desire for power and money. Interpretability can help mitigate these risks by identifying capabilities and potential biases, but ongoing research and vigilance are crucial.

      As AI systems become smarter than humans, there are significant risks involved. These risks include the potential for AI systems to replace humans in decision-making roles, as well as the possibility that they may have hidden capabilities that could be detrimental if not identified before widespread integration into society. The conversation also touched upon the possibility of AI systems developing a desire for power and money, potentially leading to a world where they are in control. Interpretability, which involves understanding how AI models work, is one way to mitigate these risks by identifying capabilities and potential biases before they are widely used. However, it's important to note that interpretability is not the only solution and the capabilities of future AI models are still largely unknown. The conversation underscored the importance of ongoing research and vigilance in this area. The speaker, who is a researcher in this field, expressed concern about the potential consequences of not being able to understand or predict the capabilities of future AI models. He also shared his personal motivation for working at OpenAI, which is to contribute to making AI safe and beneficial for humanity.
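
      As a concrete illustration of this kind of work, below is a minimal, hypothetical sketch of linear probing, one common interpretability technique: fit a simple classifier on a model's internal activations to test whether a concept is represented there. The activations and concept labels are simulated stand-ins, not data from any real model, and probing is only one tool among many.

      ```python
      # Linear probing sketch: can a concept be read linearly from a
      # model's hidden activations? (All data here is simulated.)
      import numpy as np
      from sklearn.linear_model import LogisticRegression

      rng = np.random.default_rng(0)
      n_samples, hidden_dim = 500, 64

      # Pretend these came from a forward hook on one hidden layer.
      activations = rng.normal(size=(n_samples, hidden_dim))
      labels = rng.integers(0, 2, size=n_samples)  # hypothetical concept labels
      activations[:, 0] += 0.5 * labels            # plant a weak signal to find

      probe = LogisticRegression(max_iter=1000).fit(activations, labels)
      print(f"probe accuracy: {probe.score(activations, labels):.2f}")
      # High accuracy suggests the concept is linearly readable from this
      # layer: evidence about what the model represents, not proof.
      ```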

    • AI development safety concerns: Pressure to expedite AI development at OpenAI may compromise safety, leading to potential negative consequences in the real world. One employee considered resigning due to unaddressed safety concerns.

      Working at an advanced AI research lab like OpenAI comes with both benefits and challenges. The benefits include access to cutting-edge technology and the chance to contribute to important research; the downsides include pressure to meet deadlines and the risk of decisions with negative consequences. The speaker expressed concern that shortcuts were being taken in the name of expediency at the expense of safety, describing a pattern of pressure to ship AI systems before they were fully ready, leading to preventable issues in the real world. For a time, he believed it was important for someone to stay within the organization and advocate for a more cautious approach to AI development. Ultimately, feeling that his voice was not being heard and that the company was not taking sufficient steps to address safety concerns, he resigned.

    • Technological risks: Ignoring or downplaying potential risks in advanced technologies like AI can lead to unresolved situations and unaddressed concerns, potentially causing significant harm.

      The drive for innovation and investment in advanced technologies, such as AI, comes with significant risks and pressures to deliver returns. This can lead to concerns being overlooked or dismissed, as those responsible feel the need to justify the massive financial commitments. The speaker shares their experience of raising safety concerns within a company and feeling uncomfortable with the response. They also express disappointment with the investigation into a high-profile case, which left important details unaddressed and the situation unresolved. Furthermore, the speaker challenges the use of the term "science-based concerns" to downplay potential risks, using the analogy of testing airplanes only over land before flying them over oceans. These risks, while not yet proven, should not be disregarded, but acknowledged and addressed in a transparent and responsible manner.

    • Preventative measures vs. reacting to problems: The importance of addressing potential AI issues before they become crises, using confidentiality protections to encourage employees to speak up, and not signing non-disparagement agreements that limit speaking out about safety concerns.

      The distinction between taking preventative measures and reacting to problems is crucial, especially with advanced AI systems. The discussion highlights the importance of addressing potential issues before they become crises, returning to the example of an airline that takes no measures against planes crashing into water until it is too late. The interview also touches on how confidentiality obligations affect employees who wish to speak up, citing non-disparagement agreements that can prevent them from sharing important information. The right-to-warn principles in the letter aim to address these issues by allowing employees to raise safety and ethical concerns without fear of retaliation; the first principle focuses on not signing non-disparagement agreements that limit one's ability to speak out about safety concerns. The overall theme is creating a culture where employees feel safe and encouraged to speak up about potential issues before they become crises.

    • Employee confidentiality clauses: Companies should allow anonymous reporting of concerns, create a culture of open communication, and not retaliate against employees who speak out about potential risks or concerns.

      Companies should establish ethical practices when it comes to employee agreements and confidentiality clauses. The use of excessive time pressure, forbidding critical statements based on public information, and the potential loss of equity can discourage employees from speaking out about potential risks or concerns. To address this, companies should establish anonymous processes for employees to raise concerns to the board, regulators, and independent experts. This allows employees to feel secure in raising valid concerns without fear of retaliation. Additionally, creating a culture where it's acceptable to discuss non-confidential information and implementing clear guidelines for handling concerns can prevent misunderstandings and potential conflicts. Companies that fail to implement these processes should not retaliate against employees who go public with their concerns.

    • Companies' transparency on safety concerns: Companies should not suppress safety information, an independent body should evaluate safety commitments, and the public and regulators should not solely trust companies' claims.

      The public's right to know about potential safety concerns regarding advanced technologies, such as nuclear fusion and social media algorithms, should be prioritized. Companies should not be allowed to suppress information that could impact the public interest. The history of social media and whistleblowing incidents has shown that internal research on safety and trust can be disincentivized when companies are liable for the information they uncover. Therefore, there should be an independent body to evaluate if companies have met their safety commitments and addressed known issues. The public and regulators should not trust companies' claims about safety without independent verification.

    • Expert assessment of technology safety: Independent experts are crucial for assessing technology safety and ethical use, as self-reported actions may have limitations and conflicts of interest can pose risks.

      Ensuring the safety and ethical use of technology requires independent experts to assess the actions being taken, as there are limits to what self-reported lists of actions can accomplish. William Saunders emphasized the importance of transparency and the potential risks of conflicts of interest, and he expressed optimism that the technology can be developed safely with dedication and hard work. The Center for Humane Technology, a nonprofit organization, produces this podcast, which aims to catalyze a humane future. The team includes Julia Scott as senior producer, Josh Lash as researcher and producer, Sasha Fegan as executive producer, Jeff Sudakin on mixing, and Ryan and Hays Holladay for original music. Listeners are encouraged to rate the podcast on Apple Podcasts to help others discover it. The team thanks you for your undivided attention.

    Recent Episodes from Your Undivided Attention

    Why Are Migrants Becoming AI Test Subjects? With Petra Molnar

    Climate change, political instability, hunger. These are just some of the forces behind an unprecedented refugee crisis that’s expected to include over a billion people by 2050. In response to this growing crisis, wealthy governments like the US and the EU are employing novel AI and surveillance technologies to slow the influx of migrants at their borders. But will this rollout stop at the border?

    In this episode, Tristan and Aza sit down with Petra Molnar to discuss how borders have become a proving ground for the sharpest edges of technology, and especially AI. Petra is an immigration lawyer and co-creator of the Migration and Technology Monitor. Her new book is “The Walls Have Eyes: Surviving Migration in the Age of Artificial Intelligence.”

    RECOMMENDED MEDIA

    The Walls Have Eyes: Surviving Migration in the Age of Artificial Intelligence

    Petra’s newly published book on the rollout of high-risk tech at the border.

    Bots at the Gate

    A report co-authored by Petra about Canada’s use of AI technology in their immigration process.

    Technological Testing Grounds

    A report authored by Petra about the use of experimental technology in EU border enforcement.

    Startup Pitched Tasing Migrants from Drones, Video Reveals

    An article from The Intercept, containing the demo for Brinc’s taser drone pilot program.

    The UNHCR

    Information about the global refugee crisis from the UN.

    RECOMMENDED YUA EPISODES

    War is a Laboratory for AI with Paul Scharre

    No One is Immune to AI Harms with Dr. Joy Buolamwini

    Can We Govern AI? With Marietje Schaake

    CLARIFICATION:

    The iBorderCtrl project referenced in this episode was a pilot project that was discontinued in 2019.

    Former OpenAI Engineer William Saunders on Silence, Safety, and the Right to Warn

    This week, a group of current and former employees from OpenAI and Google DeepMind penned an open letter accusing the industry’s leading companies of prioritizing profits over safety. This comes after a spate of high-profile departures from OpenAI, including co-founder Ilya Sutskever and senior researcher Jan Leike, as well as reports that OpenAI has gone to great lengths to silence would-be whistleblowers.

    The writers of the open letter argue that researchers have a “right to warn” the public about AI risks, and they lay out a series of principles that would protect that right. In this episode, we sit down with one of those writers: William Saunders, who left his job as a research engineer at OpenAI in February. William is now breaking the silence on what he saw at OpenAI that compelled him to leave the company and to put his name to this letter.

    RECOMMENDED MEDIA 

    The Right to Warn Open Letter 

    My Perspective On "A Right to Warn about Advanced Artificial Intelligence": A follow-up from William about the letter

    Leaked OpenAI documents reveal aggressive tactics toward former employees: An investigation by Vox into OpenAI’s policy of non-disparagement.

    RECOMMENDED YUA EPISODES

    1. A First Step Toward AI Regulation with Tom Wheeler 
    2. Spotlight on AI: What Would It Take For This to Go Well? 
    3. Big Food, Big Tech and Big AI with Michael Moss 
    4. Can We Govern AI? with Marietje Schaake

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    War is a Laboratory for AI with Paul Scharre

    Right now, militaries around the globe are investing heavily in the use of AI weapons and drones. From Ukraine to Gaza, weapons systems with increasing levels of autonomy are being used to kill people and destroy infrastructure, and the development of fully autonomous weapons shows little sign of slowing down. What does this mean for the future of warfare? What safeguards can we put up around these systems? And is this runaway trend toward autonomous warfare inevitable, or will nations come together and choose a different path? In this episode, Tristan and Daniel sit down with Paul Scharre to try to answer some of these questions. Paul is a former Army Ranger, the author of two books on autonomous weapons, and he helped the Department of Defense write much of its policy on the use of AI in weaponry.

    RECOMMENDED MEDIA

    Four Battlegrounds: Power in the Age of Artificial Intelligence: Paul’s book on the future of AI in war, which came out in 2023.

    Army of None: Autonomous Weapons and the Future of War: Paul’s 2018 book documenting and predicting the rise of autonomous and semi-autonomous weapons as part of modern warfare.

    The Perilous Coming Age of AI Warfare: How to Limit the Threat of Autonomous Warfare: Paul’s article in Foreign Affairs based on his recent trip to the battlefield in Ukraine.

    The night the world almost ended: A BBC documentary about Stanislav Petrov’s decision not to start nuclear war.

    AlphaDogfight Trials Final Event: The full simulated dogfight between an AI and human pilot. The AI pilot swept, 5-0.

    RECOMMENDED YUA EPISODES

    1. The AI ‘Race’: China vs. the US with Jeffrey Ding and Karen Hao
    2. Can We Govern AI? with Marietje Schaake
    3. Big Food, Big Tech and Big AI with Michael Moss
    4. The Invisible Cyber-War with Nicole Perlroth

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    AI and Jobs: How to Make AI Work With Us, Not Against Us With Daron Acemoglu

    Tech companies say that AI will lead to massive economic productivity gains. But as we know from the first digital revolution, that’s not what happened. Can we do better this time around?

    RECOMMENDED MEDIA

    Power and Progress by Daron Acemoglu and Simon Johnson: Professor Acemoglu co-authored a bold reinterpretation of economics and history that will fundamentally change how you see the world

    Can we Have Pro-Worker AI? Professor Acemoglu co-authored this paper about redirecting AI development onto the human-complementary path

    Rethinking Capitalism: In Conversation with Daron Acemoglu: The Wheeler Institute for Business and Development hosted Professor Acemoglu to examine how technology affects the distribution and growth of resources while being shaped by economic and social incentives

    RECOMMENDED YUA EPISODES

    1. The Three Rules of Humane Tech
    2. The Tech We Need for 21st Century Democracy
    3. Can We Govern AI?
    4. An Alternative to Silicon Valley Unicorns

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Jonathan Haidt On How to Solve the Teen Mental Health Crisis

    Suicides. Self harm. Depression and anxiety. The toll of a social media-addicted, phone-based childhood has never been more stark. It can be easy for teens, parents and schools to feel like they’re trapped by it all. But in this conversation with Tristan Harris, author and social psychologist Jonathan Haidt makes the case that the conditions that led to today’s teenage mental health crisis can be turned around – with specific, achievable actions we all can take starting today.

    This episode was recorded live at the San Francisco Commonwealth Club.  

    Correction: Tristan mentions that 40 Attorneys General have filed a lawsuit against Meta for allegedly fostering addiction among children and teens through their products. However, the actual number is 42 Attorneys General who are taking legal action against Meta.

    Clarification: Jonathan refers to the Wait Until 8th pledge. By signing the pledge, a parent promises not to give their child a smartphone until at least the end of 8th grade. The pledge becomes active once at least ten other families from their child’s grade pledge the same.

    Chips Are the Future of AI. They’re Also Incredibly Vulnerable. With Chris Miller

    Beneath the race to train and release more powerful AI models lies another race: a race by companies and nation-states to secure the hardware to make sure they win AI supremacy. 

    Correction: The latest available Nvidia chip is the Hopper H100 GPU, which has 80 billion transistors. Since the first commercially available chip had four transistors, the Hopper actually has 20 billion times that number. Nvidia recently announced the Blackwell, which boasts 208 billion transistors - but it won’t ship until later this year.
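
    For readers checking the arithmetic in the correction above, the multiple follows directly from the two figures quoted there:

    ```python
    # 80 billion transistors (H100) versus 4 in the first commercial chip.
    first_chip_transistors = 4
    h100_transistors = 80_000_000_000
    print(h100_transistors // first_chip_transistors)  # 20000000000: 20 billion times
    ```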

    RECOMMENDED MEDIA 

    Chip War: The Fight For the World’s Most Critical Technology by Chris Miller

    To make sense of the current state of politics, economics, and technology, we must first understand the vital role played by chips

    Gordon Moore Biography & Facts

    Gordon Moore, the Intel co-founder behind Moore's Law, passed away in March of 2023

    AI’s most popular chipmaker Nvidia is trying to use AI to design chips faster

    Nvidia's GPUs are in high demand - and the company is using AI to accelerate chip production

    RECOMMENDED YUA EPISODES

    Future-proofing Democracy In the Age of AI with Audrey Tang

    How Will AI Affect the 2024 Elections? with Renee DiResta and Carl Miller

    The AI ‘Race’: China vs. the US with Jeffrey Ding and Karen Hao

    Protecting Our Freedom of Thought with Nita Farahany

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Future-proofing Democracy In the Age of AI with Audrey Tang

    What does a functioning democracy look like in the age of artificial intelligence? Could AI even be used to help a democracy flourish? Just in time for election season, Taiwan’s Minister of Digital Affairs Audrey Tang returns to the podcast to discuss healthy information ecosystems, resilience to cyberattacks, how to “prebunk” deepfakes, and more. 

    RECOMMENDED MEDIA 

    Testing Theories of American Politics: Elites, Interest Groups, and Average Citizens by Martin Gilens and Benjamin I. Page

    This academic paper addresses tough questions for Americans: Who governs? Who really rules? 

    Recursive Public

    Recursive Public is an experiment in identifying areas of consensus and disagreement among the international AI community, policymakers, and the general public on key questions of governance

    A Strong Democracy is a Digital Democracy

    Audrey Tang’s 2019 op-ed for The New York Times

    The Frontiers of Digital Democracy

    Nathan Gardels interviews Audrey Tang in Noema

    RECOMMENDED YUA EPISODES 

    Digital Democracy is Within Reach with Audrey Tang

    The Tech We Need for 21st Century Democracy with Divya Siddarth

    How Will AI Affect the 2024 Elections? with Renee DiResta and Carl Miller

    The AI Dilemma

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    U.S. Senators Grilled Social Media CEOs. Will Anything Change?

    Was it political progress, or just political theater? The recent Senate hearing with social media CEOs led to astonishing moments — including Mark Zuckerberg’s public apology to families who lost children following social media abuse. Our panel of experts, including Facebook whistleblower Frances Haugen, untangles the explosive hearing, and offers a look ahead, as well. How will this hearing impact protocol within these social media companies? How will it impact legislation? In short: will anything change?

    Clarification: Julie says that shortly after the hearing, Meta’s stock price had the biggest increase of any company in the stock market’s history. It was the biggest one-day gain by any company in Wall Street history.

    Correction: Frances says it takes Snap three or four minutes to take down exploitative content. In Snap's most recent transparency report, they list six minutes as the median turnaround time to remove exploitative content.

    RECOMMENDED MEDIA 

    Get Media Savvy

    Founded by Julie Scelfo, Get Media Savvy is a non-profit initiative working to establish a healthy media environment for kids and families

    The Power of One by Frances Haugen

    The inside story of Frances Haugen’s quest to bring transparency and accountability to Big Tech

    RECOMMENDED YUA EPISODES

    Real Social Media Solutions, Now with Frances Haugen

    A Conversation with Facebook Whistleblower Frances Haugen

    Are the Kids Alright?

    Social Media Victims Lawyer Up with Laura Marquez-Garrett

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Taylor Swift is Not Alone: The Deepfake Nightmare Sweeping the Internet

    Over the past year, a tsunami of apps that digitally strip the clothes off real people has hit the market. Now anyone can create fake non-consensual sexual images in just a few clicks. With cases proliferating in high schools, guest presenter Laurie Segall talks to legal scholar Mary Anne Franks about the AI-enabled rise in deepfake porn and what we can do about it.

    Correction: Laurie refers to the app 'Clothes Off.' It’s actually named Clothoff. There are many clothes remover apps in this category.

    RECOMMENDED MEDIA 

    Revenge Porn: The Cyberwar Against Women

    In a five-part digital series, Laurie Segall uncovers a disturbing internet trend: the rise of revenge porn

    The Cult of the Constitution

    In this provocative book, Mary Anne Franks examines the thin line between constitutional fidelity and constitutional fundamentalism

    Fake Explicit Taylor Swift Images Swamp Social Media

    Calls to protect women and crack down on the platforms and technology that spread such images have been reignited

    RECOMMENDED YUA EPISODES 

    No One is Immune to AI Harms

    Esther Perel on Artificial Intimacy

    Social Media Victims Lawyer Up

    The AI Dilemma

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_

    Can Myth Teach Us Anything About the Race to Build Artificial General Intelligence? With Josh Schrei

    We usually talk about tech in terms of economics or policy, but the casual language tech leaders often use to describe AI — summoning an inanimate force with the powers of code — sounds more... magical. So, what can myth and magic teach us about the AI race? Josh Schrei, mythologist and host of The Emerald podcast, says that foundational cultural tales like "The Sorcerer's Apprentice" or Prometheus teach us the importance of initiation, responsibility, human knowledge, and care. He argues these stories and myths can guide ethical tech development by reminding us what it is to be human.

    Correction: Josh says the first telling of "The Sorcerer’s Apprentice" myth dates back to ancient Egypt, but it actually dates back to ancient Greece.

    RECOMMENDED MEDIA 

    The Emerald podcast

    The Emerald explores the human experience through a vibrant lens of myth, story, and imagination

    Embodied Ethics in The Age of AI

    A five-part course with The Emerald podcast’s Josh Schrei and School of Wise Innovation’s Andrew Dunn

    Nature Nurture: Children Can Become Stewards of Our Delicate Planet

    A U.S. Department of the Interior study found that the average American kid can identify hundreds of corporate logos but not plants and animals

    The New Fire

    AI is revolutionizing the world - here's how democracies can come out on top. This upcoming book was authored by an architect of President Biden's AI executive order

    RECOMMENDED YUA EPISODES 

    How Will AI Affect the 2024 Elections?

    The AI Dilemma

    The Three Rules of Humane Tech

    AI Myths and Misconceptions

    Your Undivided Attention is produced by the Center for Humane Technology. Follow us on Twitter: @HumaneTech_