
    Podcast Summary

    • Starting simple with large language models
      To effectively use large language models for product development, start with simple applications, optimize for learning, and build open-source projects to share knowledge publicly.

      Travis Fisher, founder and CEO of a stealth AI startup, emphasizes the importance of starting simple when using large language models to build products. He became an advocate for this approach after the mainstream adoption of ChatGPT and the rapid progress in AI that followed. Travis believes in optimizing for learning by building open-source projects and sharing knowledge publicly, and he shared a diagram on Twitter illustrating how to use large language models effectively, moving from simple to complex applications. His experience demonstrates that understanding the basics and starting with simple applications can lead to successful product development in the ever-evolving field of AI.

    • Start simple with hosted foundational models for business use cases
      Hosted foundational models can help validate business use cases quickly and cost-effectively, providing 95% of the solution for various domains, and the community around language model prompting offers techniques to enhance results.

      Using large language models (LLMs) effectively for business applications doesn't always require building a team of ML engineers or creating custom models from scratch. Andrej Karpathy's recent insights suggest that starting simple with hosted foundational models can help validate business use cases quickly and cost-effectively. This approach can get you 95% of the way to solving many problems in various domains, a capability that was previously locked behind proprietary data providers. This is a democratizing point in the industry, especially for those who are new to AI and want to build applications. Many people go wrong by jumping into too much complexity, and it's essential to start simple and build from there. The community around language model prompting has a hacking culture, and techniques like multistep prompting, information retrieval, and chaining models can go a long way. However, privacy and domain-specific concerns may arise in enterprise use cases. Surprisingly, combining a hosted model with pre-training and retrieval methods can achieve results that were previously unimaginable with just this layer of technology.
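Multistep prompting and chaining of the kind mentioned above can be sketched in a few lines. This is a hypothetical illustration, not any particular library's API: `callModel` is a stand-in for a hosted LLM call (real calls would be asynchronous network requests), stubbed here so the example is self-contained.

```typescript
// Minimal sketch of multistep prompt chaining. `ModelCall` stands in for a
// hosted LLM API; real integrations would make asynchronous network requests.
type ModelCall = (prompt: string) => string;

// Two chained steps: extract facts first, then answer grounded in those facts.
function answerWithChain(callModel: ModelCall, document: string, question: string): string {
  // Step 1: distill the raw input into key facts (a crude retrieval-style step).
  const facts = callModel(`Extract the key facts from:\n${document}`);
  // Step 2: answer the question using only the extracted facts.
  return callModel(`Using only these facts:\n${facts}\nAnswer: ${question}`);
}

// A deterministic stub model, for demonstration only.
const stubModel: ModelCall = (prompt) => `[output for: ${prompt.split("\n")[0]}]`;

console.log(answerWithChain(stubModel, "Some document text...", "What happened?"));
```

The point of chaining is that each step receives a narrower, better-grounded prompt than a single monolithic request would.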

    • Integrating LLMs into products: Opportunities and Challenges
      Ensure consistency, relevance, factual accuracy, scalability, performance, security, and ethical considerations when integrating large language models into products.

      The integration of large language models (LLMs) into products brings new opportunities but also introduces unique challenges. Surprising applications of LLMs include personal finance management and even hacking unofficial APIs. For instance, an ex-hedge fund manager uses LLMs to extract structured data from his bank's website. However, these models' ease of use and accessibility can lead to unexpected vulnerabilities, as demonstrated by OpenAI's unofficial API wrapper, which led to a "cat and mouse game" and the infamous "meows" incident. As developers and data scientists consider taking LLM integrations from demos to products, they must focus on crucial trade-offs. Quality is the most apparent concern. While LLMs can generate impressive text, ensuring consistency, relevance, and factual accuracy is essential. Additionally, consider scalability and performance. LLMs require significant computational resources, so optimizing for latency and throughput is crucial. Security is another essential aspect, as demonstrated by the OpenAI API wrapper incident. Developers must ensure their LLM integrations are secure and resilient against potential attacks. Lastly, ethical considerations are increasingly important. Ensuring that LLMs are used responsibly, with respect for user privacy and data protection, is a must. In summary, understanding and effectively communicating these trade-offs is crucial when integrating LLMs into products.

    • Considering trade-offs for integrating large language models
      Hosted or local language models each have advantages and disadvantages. Hosted models offer quick validation and minimal resources, while local models provide ultra-low latency, cost efficiency, and fine-tuning. The choice depends on specific use cases and trade-offs.

      Integrating large language models into applications involves considering various trade-offs, including cost, quality, latency, and reliability. For some use cases, a hosted model may be suitable for quick validation with minimal resources. However, for others, a local model might be more appropriate for factors like ultra-low latency, cost efficiency, or fine-tuning. The landscape of open-source and proprietary language models is evolving rapidly, with open-source models becoming increasingly powerful due to low switching costs and competition. However, as the hype around AI applications grows, it's crucial to focus on the last mile of productionization and address the fundamental trade-offs between hosted and local models. Additionally, it's important to remember that applied AI is not just about the AI itself, but also about the software, systems, and cloud infrastructure that support it.

    • Focus on the business problem and apply engineering rigor to make the most of AI tools
      To effectively navigate the hype cycle of AI, focus on the job to be done and evaluate solutions based on their ability to solve specific business problems. Apply engineering rigor to improve model quality, pricing, latency, and other trade-offs.

      While the advancements in AI models and their applications may be the focus of attention, it's essential to remember that AI is a tool to solve specific business use cases and problems for humans. The hype around AI can sometimes lead organizations to overlook the importance of the underlying engineering rigor and other considerations necessary to make AI solutions productive and valuable. To effectively navigate the hype cycle and allocate resources appropriately, it's crucial to keep the focus on the job to be done and evaluate AI solutions based on their ability to solve specific business problems. Additionally, applying fundamental engineering rigor at the evaluation stage is essential, as it provides a grounded north star for improving model quality, pricing, latency, and other trade-offs. Furthermore, understanding the ladder of complexity in AI, from using hosted models to building your own, can help organizations make informed decisions about the level of investment and expertise required for their particular use case. By focusing on the business problem and applying engineering rigor, organizations can make the most of the AI tools available while ensuring they deliver value to their users.

    • Break down complex problems into smaller subproblems
      Start with simple solutions, evaluate objectively, and understand the job to be done to effectively work with language models.

      When working with language models, it's essential to start with a simple solution and only add complexity when necessary. Breaking down complex problems into smaller, more manageable subproblems is a practical approach to ensure reliability and maintainability. Additionally, the evaluation of language model outputs can be challenging, and it's crucial to have objective evaluation methods. The Auto Evaluator project by Lance Martin, which focuses on question answering, is an excellent example of an objective evaluation method. Remember that sentiment analysis is often just a part of a larger job to be done, and language models can do much more than just sentiment analysis. Therefore, it's essential to understand the job to be done and structure the problem accordingly.
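As a sketch of what an objective evaluation method can look like for question answering (an illustrative stand-in, not the Auto Evaluator's actual implementation, with made-up data), predictions can be scored against reference answers using a normalized exact-match metric:

```typescript
// Normalized exact-match scoring for QA outputs: lowercase, strip
// punctuation, collapse whitespace, then compare prediction to reference.
interface QAPair { question: string; reference: string; predicted: string; }

function normalize(s: string): string {
  return s.toLowerCase().replace(/[^\w\s]/g, " ").replace(/\s+/g, " ").trim();
}

function exactMatchScore(pairs: QAPair[]): number {
  if (pairs.length === 0) return 0;
  const hits = pairs.filter(p => normalize(p.predicted) === normalize(p.reference)).length;
  return hits / pairs.length;
}

// Illustrative data: one hit ("paris." matches "Paris") and one miss.
const score = exactMatchScore([
  { question: "Capital of France?", reference: "Paris", predicted: "paris." },
  { question: "2 + 2?", reference: "4", predicted: "five" },
]);
console.log(score);
```

Even a crude metric like this turns "does the model seem good?" into a number that can be tracked as prompts and models change.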

    • Working with large language models: Clear tasks, structured guards, typed outputs, and validation techniques
      To build reliable applications with large language models, focus on clear tasks, structured guards, typed outputs, and validation techniques. Stay informed about the latest developments, follow reliable sources, and engage with the community. Narrow the scope by focusing on specific use cases and domains.

      Focusing on a clear, articulated, and structured task when working with large language models (LLMs) leads to better reliability and easier testing using traditional software engineering practices. This can involve invoking LLMs with structured guards, having typed outputs in languages like TypeScript, and implementing techniques to validate and self-heal any issues with the output. Libraries like LangChain and other open-source frameworks can help abstract some of this complexity. However, the best practices and techniques for working with LLMs are constantly evolving, making it challenging for developers to keep up. To manage this, it's essential to stay informed about the latest developments, follow reliable sources, and engage with the community. Additionally, focusing on specific use cases and domains can help narrow the scope and make the learning process more manageable. It's also crucial to remember that LLMs can do almost anything, but their unconstrained nature can make it difficult to approach a problem. By following established examples and guidelines, developers can build reliable applications and keep up with the rapid progress in the field.
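The structured-guard-plus-self-healing pattern described here can be sketched without any framework. Everything below is hypothetical: `callModel` stands in for an LLM invocation, and the runtime type guard is hand-rolled rather than taken from a validation library.

```typescript
// Sketch of invoking an LLM with a structured guard: request JSON, validate
// it against a typed shape, and retry with feedback ("self-heal") on failure.
interface Sentiment { label: "positive" | "negative" | "neutral"; confidence: number; }

// Hand-rolled runtime type guard for the expected output shape.
function parseSentiment(raw: string): Sentiment | null {
  try {
    const obj = JSON.parse(raw);
    if (["positive", "negative", "neutral"].includes(obj.label) &&
        typeof obj.confidence === "number") {
      return obj as Sentiment;
    }
  } catch { /* invalid JSON falls through to null */ }
  return null;
}

function classifyWithRetry(callModel: (p: string) => string, text: string, maxAttempts = 3): Sentiment {
  let prompt = `Classify the sentiment of: "${text}". ` +
    `Reply with JSON like {"label": "positive", "confidence": 0.9}.`;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const parsed = parseSentiment(callModel(prompt));
    if (parsed) return parsed;
    // Self-heal: tell the model its last reply was malformed and try again.
    prompt += " Your previous reply was not valid JSON of that shape; try again.";
  }
  throw new Error("model never produced valid structured output");
}

// Demo with a stub that fails once, then returns valid JSON.
let attempts = 0;
const flakyModel = (_prompt: string) =>
  ++attempts === 1 ? "Sure! The sentiment is positive." : '{"label": "positive", "confidence": 0.8}';
console.log(classifyWithRetry(flakyModel, "I love this product"));
```

A validation library could replace the hand-rolled guard, but the essential structure is the typed boundary plus the retry loop.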

    • Applying AI to your own problems
      Start by using AI tools for personal projects to build a mental muscle and stay updated with the latest technology. Hosted APIs make AI more accessible to a broader audience, especially in the JavaScript community.

      To effectively utilize AI tools like GPT and other language models, it's essential to start by applying them to your own problems. This approach helps build a mental muscle for thinking creatively about AI solutions and keeps you updated with the latest technology. Various developer communities, such as the TypeScript, Node, and JavaScript communities, are actively embracing AI technologies through hosted APIs. While the data science community may be focused on model training, the JavaScript community is leveraging these tools to build innovative products. The rise of hosted APIs is a significant unlock for application developers, making AI technology more accessible to a broader audience. TypeScript, with one of the largest developer communities in the world, is poised to play a significant role in this evolution. However, it's important to note that both communities have unique challenges and opportunities in adopting AI technologies.

    • Building Agents with TypeScript: Bridging the Gap in Machine Learning
      Daniel is developing a reliable TypeScript framework for building agents, focusing on use cases and taking a TypeScript-first approach for application developers.

      There's a dynamic between the application layer and machine learning models, with developers who excel at building full-stack apps that can easily integrate with hosted models pushing the envelope in terms of unlocking people's imaginations and improving user experience. However, there's a gap in the TypeScript world when it comes to fundamental machine learning libraries and frameworks, which are primarily available in the Python ecosystem. Daniel, who has experience in this area, is working on an open-source, reliable TypeScript framework for building agents, viewing agents as the new reasoning engines in the machine learning paradigm. He aims to focus on reliable use cases that can be built today, while taking a TypeScript-first approach due to his affinity for the community and the audience of application developers who are more likely to use TypeScript. Daniel also emphasized the importance of considering the use of multiple languages depending on the specific use case and performance requirements.

    • Exploring the Future of Machine Learning with WebAssembly
      WebAssembly offers potential for more efficient and multifaceted machine learning solutions, allowing for the use of various programming languages and eliminating the need for containerization technologies like Docker.

      The speaker expresses a desire for a more seamless experience when working with machine learning models, particularly in the context of WebAssembly (WASM) and various programming languages. They have been frustrated by having to use Python as the primary language for machine learning, but are excited about the potential of WASM to provide a more efficient and multifaceted solution, and see it having a significant impact on the mainstream adoption of AI, particularly where low latency and hardware access are important. TypeScript remains their current language of choice for application development, but they view WASM as the ultimate runtime target, one that could even eliminate the need for containerization technologies like Docker by providing similar isolation and portability. They are also intrigued by ongoing efforts to port machine learning libraries to WebAssembly, such as the Pyodide project, which allows running a subset of scikit-learn in WebAssembly environments, including Node.js and the browser. Overall, the speaker expresses a strong belief in the potential of WASM to revolutionize the way machine learning models are developed and deployed.

    • The future of AI and natural language processing
      The intersection of diversity and AI development offers potential for significant positive impact, with natural language becoming the basis for higher-level abstractions. Challenges in adding reliability and structure to these systems remain, but the speaker is optimistic about the future of AI and the role of natural language processing in its advancement.

      The intersection of diversity in the tech industry and the development of advanced AI systems is a promising area with the potential for significant positive impact. As AI agents become more reliable and autonomous, they may represent a new programming paradigm, with natural language as the basis for higher-level abstractions. The speaker expresses excitement about this potential future, but also acknowledges the challenges that come with adding reliability and structure to these complex systems. The speaker also suggests that current programming languages and abstractions may become less relevant over time as we move towards more efficient and approachable natural language-based solutions. While the exact timeline for these developments is uncertain, the speaker expresses optimism about the future of AI and the role of natural language processing in its advancement. The speaker also encourages listeners to subscribe to Practical AI, share the podcast with others, and check out Fastly and Fly for their partnership in bringing the show to listeners.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Stanford's AI Index Report 2024

    We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways, including how AI makes workers more productive, how the US is sharply increasing regulation, and how industry continues to dominate frontier AI research.

    Apple Intelligence & Advanced RAG

    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval

    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data

    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they deploy edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs

    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress

    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents

    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach; from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Private, open source chat UIs

    We recently gathered some Practical AI listeners for a live webinar with Danny from LibreChat to discuss the future of private, open source chat UIs. During the discussion we hear about the motivations behind LibreChat, why enterprise users are hosting their own chat UIs, and how Danny (and the LibreChat community) is creating amazing features (like RAG and plugins).

    Related Episodes

    When data leakage turns into a flood of trouble

    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)

    The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things stable diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology

    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)

    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into the T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of the possible with the T0 family. They finish up with a couple of new learning resources.