    Podcast Summary

    • Understanding Implications of AI Correlations for People: While AI can support and automate tasks, human oversight is crucial for critical decision-making processes. Monitoring and controlling AI behavior is essential to mitigate failures, which can manifest as data issues or behavioral problems.

      While technology, including AI, is designed to support and automate human tasks, it should not replace human judgment entirely. The challenge is ensuring human oversight in critical decision-making processes, because it doesn't scale to have a person in every situation. In this episode of Practical AI, hosts Daniel and Chris discuss the importance of understanding what AI's correlations imply for people, rather than focusing only on the mathematical aspects. Yaron Singer, CEO of Robust Intelligence, joins the conversation to share insights on how to mitigate AI model failures. Failures can manifest in various ways, such as data issues or behavioral problems. Microsoft's chatbot is a well-known example: the bot began making inappropriate and offensive statements, highlighting the importance of monitoring and controlling AI behavior. The episode explores these challenges and possible solutions in depth.

    • AI failures can have serious consequences: Despite their advanced capabilities, AI models can fail unexpectedly, causing harm to people and businesses. It's crucial for organizations to understand potential risks and limitations, and take measures to mitigate them.

      AI models, despite their advanced capabilities, can still fail in unexpected ways. The Microsoft chatbot example showed how a bot trained on human conversation data could be manipulated into spewing racist slurs. More recently, Zillow's AI-based home-pricing model failed when market conditions shifted during the pandemic. These failures highlight the risks AI systems pose, including potential harm to people and businesses. As AI adoption grows at an exponential rate, the stakes get higher. AI models are increasingly used, for instance, to set insurance rates and to support healthcare diagnoses, applications where failures or inaccurate results carry significant consequences. It's crucial for organizations to understand the potential risks and limitations of AI models and take appropriate measures to mitigate them.

    • Understanding AI Risks (Adversarial vs. Distributional Change): Robust Intelligence takes an agnostic approach to AI risks, focusing on reducing risk from both adversarial attacks and distributional change, and using the appropriate algorithms to protect models in each case.

      While AI is being used with good intentions in sectors like lending and policing, the risks associated with AI failures are enormous. These failures can be intentional or unintentional: adversaries can exploit AI models, or environmental shifts, as in the Zillow example, can cause unintended consequences. Robust Intelligence takes an agnostic approach to these failure situations. In their view, the root cause of the risk does not matter; the important thing is to protect the models from failure. However, the algorithms used to protect models differ by failure type. Protecting against distributional drift due to changing conditions requires different techniques than protecting against adversarial input (a simple drift check is sketched below). When engaging with clients, it's essential to understand which category of risk they perceive as primary: adversarial or distributional change. Some companies may face a higher adversarial risk vector because of their industry, but it's crucial to have a clear picture of the actual risks involved. Robust Intelligence's approach is to reduce the risk from a technical perspective, regardless of the root cause.
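
      As one illustration of the distributional side, a drift monitor can compare the distribution of a feature in recent traffic against the training data and alert when they diverge. The following is a minimal sketch of that idea, not Robust Intelligence's actual method; the feature name, threshold, and data are illustrative.

      ```python
      import numpy as np
      from scipy.stats import ks_2samp

      def drift_alert(train_values, recent_values, p_threshold=0.01):
          """Two-sample Kolmogorov-Smirnov test: flag drift when recent feature
          values are unlikely to come from the training-time distribution."""
          result = ks_2samp(train_values, recent_values)
          return result.pvalue < p_threshold, result.statistic, result.pvalue

      rng = np.random.default_rng(0)
      train_prices = rng.normal(loc=300_000, scale=50_000, size=5_000)   # e.g. home prices at training time
      recent_prices = rng.normal(loc=360_000, scale=80_000, size=1_000)  # market conditions have shifted

      drifted, stat, p_value = drift_alert(train_prices, recent_prices)
      if drifted:
          print(f"Distribution drift detected (KS statistic {stat:.3f}, p = {p_value:.1e})")
      ```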

    • Balancing automation and human intervention in AI systems: Companies must build AI systems that prioritize human oversight while also automating retraining tasks to maintain safety, security, and ethics.

      The role of humans in the loop in AI systems is a crucial consideration for managing risk, especially as the industry moves towards increased automation. While some teams may prioritize protecting their models from adversarial input, others may focus on adapting to changing conditions. Companies must build systems that address both types of concerns. OpenAI, for instance, emphasizes the importance of human oversight in automated applications. However, industry experts predict that in a few years, most retraining tasks will be done automatically. It's essential to strike a balance between automation and human intervention, ensuring that AI systems remain safe, secure, and ethical.

    • Managing Risks in AI Systems: To ensure safety and minimize risks in AI systems, organizations must prioritize regular retraining, eliminate potential risks, and support human decision making through debiasing judgment and collaboration.

      As AI technology advances and automation becomes more prevalent, organizations must prioritize risk management. This includes ensuring automated AI systems undergo regular retraining and eliminating potential risks. The world of AI is moving towards automation, and companies will likely have hundreds or even thousands of models in the future. However, there are risks associated with full automation, human-automation collaboration, and human-only decision making. The goal should be to support human decision making and debias judgment, while ensuring safety and minimizing risks. It may not be feasible to have a human in every critical decision junction, so it's essential to make every effort to make AI as safe and risk-free as possible. This involves understanding the unique risks associated with each scenario and making informed decisions based on those assessments.

    • Theoretical limitations of machine learning for good algorithmic decisions: Machine learning models have fundamental limits as inputs to decision making; mathematical definitions of learning break down, and the data required for good decisions can grow exponentially with the input dimension.

      While machine learning models have revolutionized technology and industry, there are significant theoretical limitations to making good algorithmic decisions based on machine learning outputs. Singer, a professor whose machine learning research spans Berkeley, Google, and Harvard, has dedicated much of his career to studying these limitations. He has found that mathematical definitions of learning and decision making break down when more complex decisions are layered on top of machine learning results. Moreover, even small model errors can lead to arbitrarily large decision errors (a toy illustration follows below), and the data required for good decisions can depend exponentially on the input dimension. Despite these challenges, he has published influential papers on the topic and developed algorithms for noise-robust decision making. The journey began in academia, where he realized that the theoretical foundations for decision making on top of machine learning were lacking, and his research has since focused on addressing these issues.
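
      A toy illustration, not from the episode, of how a small model error can flip a downstream decision: when a decision thresholds the model's score, the decision error is not bounded by the prediction error. The threshold and numbers below are made up.

      ```python
      APPROVAL_THRESHOLD = 0.50  # hypothetical cutoff for an automated decision

      def decide(score: float) -> str:
          """Turn a model score into a binary decision."""
          return "approve" if score >= APPROVAL_THRESHOLD else "deny"

      true_score = 0.501       # what a perfect model would output
      predicted_score = 0.499  # the model is off by only 0.002

      print(decide(true_score))       # approve
      print(decide(predicted_score))  # deny: a tiny prediction error flips the decision entirely
      ```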

    • Decoupling model building and security in AI: Decoupling model building and security in AI allows for more effective approaches to each consideration, with AI firewalls acting as protective layers that monitor and test incoming data.

      In the field of AI and machine learning, it's crucial to decouple the process of building and training models from the process of securing them. This decoupling allows each concern to be addressed more effectively on its own. Mathematically, making a model robust to adversarial inputs can significantly decrease its accuracy, and from a product or engineering perspective it's unrealistic for the same team to be responsible for both building and securing the model. The analogy is separation of concerns in software engineering: developers focus on solving the primary problem while security measures are implemented as separate components, an approach that has served software development well. In the context of AI, an AI firewall acts as a protective layer around the model (a minimal sketch of this wrapper pattern follows below). It monitors and tests incoming data, preventing it from causing mistakes or bad predictions. By focusing on catching bad data instead of building a better model, the decoupling simplifies the task and reduces the risk of errors. To identify risky data, the firewall draws on stress testing of the model's performance, which can be done implicitly or explicitly, keeping the system secure and effective.
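
      A minimal sketch of the wrapper pattern described above, assuming a generic scikit-learn-style model with a predict method. The class and validator names are hypothetical illustrations of the concept, not Robust Intelligence's product or API.

      ```python
      from dataclasses import dataclass
      from typing import Any, Callable, Dict, List

      @dataclass
      class FirewallDecision:
          allowed: bool
          reason: str = ""

      class FirewalledModel:
          """Wraps a trained model with independent input checks (an "AI firewall")."""

          def __init__(self, model: Any, validators: List[Callable[[Dict[str, float]], FirewallDecision]]):
              self.model = model            # built and trained by the modeling team
              self.validators = validators  # maintained separately, by the protection layer

          def predict(self, features: Dict[str, float]) -> Any:
              # Every validator inspects the input before it reaches the model.
              for validate in self.validators:
                  decision = validate(features)
                  if not decision.allowed:
                      # Block the request (or route it to a human / fallback) instead of predicting.
                      raise ValueError(f"Input rejected by AI firewall: {decision.reason}")
              return self.model.predict([list(features.values())])[0]

      # Example validator: reject obviously implausible ages before they reach the model.
      def plausible_age(features: Dict[str, float]) -> FirewallDecision:
          age = features.get("age", 0)
          if not 0 < age < 120:
              return FirewallDecision(False, f"age={age} is outside a plausible range")
          return FirewallDecision(True)
      ```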

    • AI firewalls learn implicitly from model responses and logs: AI firewalls can continuously test model predictions from logs, ensuring accurate results and minimal disruption to existing MLOps and DevOps processes.

      AI firewalls can be trained implicitly by testing and monitoring how the model responds to various inputs. This lets the firewall learn how different inputs affect model predictions and identify potential errors or distribution shifts. For instance, if a model predicts whether someone will earn above $100,000 next year, the firewall can detect when a feature like age is entered as a birth year rather than an age. By learning the typical distribution of the age feature, the firewall can alert on and block such errors, leading to more accurate predictions (a sketch of this kind of check follows below). To add this to an existing pipeline, the best approach is minimal integration: the AI firewall can continuously test model predictions from logs without sitting in the critical path, so it does not disrupt existing MLOps and DevOps processes. This method, called continuous testing, enables effective monitoring and improvement of model performance without extensive integration effort.
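
      A sketch of the kind of implicit check described above: learn the typical range of a feature from logged data, then flag new values that fall far outside it, such as an age entered as the year 1985. The thresholds and data are illustrative, not the product's actual logic.

      ```python
      import numpy as np

      class LearnedRangeCheck:
          """Flags feature values that are implausible given the logged distribution."""

          def __init__(self, n_std: float = 5.0):
              self.n_std = n_std

          def fit(self, logged_values: np.ndarray) -> "LearnedRangeCheck":
              # Learn the typical range implicitly from historical logs.
              self.mean_ = float(np.mean(logged_values))
              self.std_ = float(np.std(logged_values)) or 1.0
              return self

          def is_anomalous(self, value: float) -> bool:
              return abs(value - self.mean_) > self.n_std * self.std_

      # Ages seen in historical prediction logs (illustrative data).
      age_check = LearnedRangeCheck().fit(np.array([23, 31, 45, 52, 38, 60, 29, 41]))
      print(age_check.is_anomalous(47))    # False: a plausible age
      print(age_check.is_anomalous(1985))  # True: almost certainly a year, not an age
      ```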

    • Integrating AI solutions involves syncing code and testing for biases: Ensure code syncing for AI solutions and test for biases in models to maintain ethical and compliant AI implementation.

      Integrating an AI solution like an AI firewall involves syncing code on the model server while keeping data inside the organization for compliance reasons. Continuous testing, described above, is essential for identifying potential biases and issues in models, particularly for sensitive categories. Testing for bias is crucial for AI practitioners, and automated testing for various forms of bias across different categories is a key feature of Robust Intelligence's platform (a simple example of such a test is sketched below). Looking ahead, AI adoption is expected to expand rapidly in organizations, and within the specific area of AI risk management there are two developments to anticipate: first, the growing importance of AI risk management as more organizations adopt AI; second, the advancement of AI risk management solutions to address complex issues such as adversarial attacks, explainability, and ethics. These developments will be crucial for the safe and ethical implementation of AI across industries.
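
      One simple automated bias test of the kind mentioned above compares a model's positive-prediction rate across groups of a sensitive category and alerts on large gaps. This is a hedged sketch of the idea; the column names, data, and threshold are illustrative, not the platform's actual tests.

      ```python
      import pandas as pd

      def demographic_parity_gap(df: pd.DataFrame, group_col: str, pred_col: str) -> float:
          """Largest difference in positive-prediction rate between any two groups."""
          rates = df.groupby(group_col)[pred_col].mean()
          return float(rates.max() - rates.min())

      # Illustrative predictions for two groups of a sensitive attribute.
      preds = pd.DataFrame({
          "group":    ["A", "A", "A", "B", "B", "B"],
          "approved": [1,   1,   0,   0,   0,   1],
      })

      gap = demographic_parity_gap(preds, "group", "approved")
      if gap > 0.2:  # arbitrary alerting threshold for this example
          print(f"Potential bias: approval-rate gap of {gap:.2f} across groups")
      ```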

    • Regulations and best practices for AI security will become standard in the next few years: Organizations must prioritize AI security with regulations and best practices to protect models and data, including third-party stress testing and AI firewalls.

      As the use of AI in organizations continues to grow and affect people, regulations are expected to mandate third-party stress testing and the implementation of AI firewalls to protect models and data. These regulations and best practices will likely become standard within a few years, and in hindsight, deploying AI models without proper security measures will seem unthinkable. Organizations should keep this in mind as they plan ahead, and it serves as an invitation to revisit this conversation in a few years to reflect on the progress made. As we head into 2025, it will be interesting to revisit these predictions and see how far things have come. It has been a pleasure to discuss these topics, and we appreciate the guest's work and perspective in the AI community. We look forward to continuing the conversation and pushing these ideas forward. Thank you for listening to Practical AI; please subscribe, recommend the show to a friend, and check out our partners Fastly, LaunchDarkly, and Linode for their support. Special thanks to Breakmaster Cylinder for providing the show's music. We'll talk to you again next week.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Vectoring in on Pinecone
    Daniel & Chris explore the advantages of vector databases with Roie Schwaber-Cohen of Pinecone. Roie starts with a very lucid explanation of why you need a vector database in your machine learning pipeline, and then goes on to discuss Pinecone’s vector database, designed to facilitate efficient storage, retrieval, and management of vector data.

    Stanford's AI Index Report 2024
    We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways, including how AI makes workers more productive, how the US is sharply increasing regulation, and how industry continues to dominate frontier AI research.

    Apple Intelligence & Advanced RAG
    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval
    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data
    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they deploy edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs
    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress
    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents
    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach, from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Related Episodes

    When data leakage turns into a flood of trouble
    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)
    The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things stable diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology
    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)
    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into the T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of the possible with the T0 family. They finish up with a couple of new learning resources.