
    Podcast Summary

    • MLOps community's growth and shift towards practical implementations of LLMs
      The MLOps community, with 37 global cities, is thriving and moving from theory to practice, sharing workflows and implementing LLMs, with recent events attracting over 80 speakers and both virtual and in-person components. Grassroots initiatives are driving growth, and clearer use cases and a stack are emerging.

      The MLOps community is thriving and making significant strides in the application of large language models (LLMs) and machine learning operations. The community, which now spans 37 cities globally, has seen a shift from theoretical discussions to practical implementations and sharing of workflows. Recent events, including the LLMs in Production conference, featured over 80 speakers and both virtual and in-person components. The community's growth is driven by the unbelievable power of grassroots initiatives, with new chapters forming in various cities. The use cases for LLMs are becoming clearer, and a stack is forming around their implementation. A recent survey conducted by the community further highlights the progress being made in this field.

    • Evaluating Large Language Models for specific use cases
      The effectiveness of Large Language Models for specific use cases is complex and unclear, requiring evaluation beyond provided tools and best practices.

      The use of Large Language Models (LLMs) is not a one-size-fits-all solution, and evaluating their effectiveness for specific use cases can be a complex and confusing process. According to a recent survey, people are using LLMs for various reasons, and a stack of related technologies, including foundational models, vector databases, developer SDKs, and monitoring tools, is emerging to support these models. However, the evaluation of these models is still unclear, and it's not guaranteed that the best-performing model on a leaderboard will yield the best results for a specific use case. Additionally, evaluating models for toxicity and specific use-case requirements can be challenging, and the constant release of new SOTA (State of the Art) models can sometimes feel like marketing hype. It's important to remember that LLMs are not complete applications and should be seen as a component in a larger stack. The survey results also revealed that people are doing extra evaluation work on top of the provided tools, but there's a lack of clarity on best practices and what exactly to evaluate. Therefore, a new survey is being conducted to gather more information on this topic. In summary, the evaluation of LLMs for specific use cases is a complex and evolving process, and it's crucial to understand the limitations and requirements of these models to achieve the desired outcomes.
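
The extra, use-case-specific evaluation work the survey describes can be as simple as scoring a model against a handful of representative prompts. The sketch below is illustrative only: `toy_model` stands in for any callable that returns text, and substring matching is just one possible metric (real evaluations often use semantic similarity or human review).

```python
def evaluate(model, eval_set):
    """Return the fraction of eval cases where the model's output
    contains the expected answer (a deliberately simple metric)."""
    hits = 0
    for case in eval_set:
        output = model(case["prompt"])
        if case["expected"].lower() in output.lower():
            hits += 1
    return hits / len(eval_set)

# A tiny use-case-specific eval set; real ones would mirror production prompts.
eval_set = [
    {"prompt": "Capital of France?", "expected": "Paris"},
    {"prompt": "2 + 2 = ?", "expected": "4"},
]

# Toy stand-in for an LLM call: answers one question, shrugs at the other.
toy_model = lambda p: "Paris" if "France" in p else "unsure"
print(evaluate(toy_model, eval_set))  # 0.5: one hit out of two cases
```

Swapping in two different models and comparing their scores on the same eval set is exactly the kind of check a leaderboard rank cannot give you for your own use case.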

    • Debugging LLMs involves complex processes like prompt engineering and vector embeddings
      Debugging LLMs requires a deep understanding of prompt engineering, retrieval-based augmentation, alignment, and vector embeddings to optimize performance.

      While Large Language Models (LLMs) like ChatGPT are powerful tools, implementing and optimizing them for specific use cases involves a complex process that goes beyond just using the model. This process includes prompt engineering, retrieval-based augmentation, alignment, evaluation, and debugging. These concepts are often confused, and it can be overwhelming for those new to the field. The debugging process can be particularly challenging, as it involves isolating issues in the prompt, retrieval, or vector embeddings. Tools like Langchain, Llama Index, and others can help with orchestration, but their rapid advancement can lead to unexpected issues. A simpler approach, such as writing out the chain of reasoning in Python logic, can sometimes be more effective for debugging and optimizing performance. The field is advancing quickly, and solutions to common issues are likely to emerge. However, it's important to remember the KISS (Keep It Simple, Stupid) principle and avoid overengineering solutions. Despite the challenges, the potential benefits of LLMs make the effort worthwhile.
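
The "plain Python" alternative mentioned above can look like the following sketch, where every step of the chain is an explicit function whose inputs and outputs can be printed or asserted on. The retriever and model here are toy stand-ins, not real services.

```python
def retrieve(question, documents):
    """Naive keyword retrieval: rank docs by word overlap with the question."""
    terms = set(question.lower().split())
    scored = [(len(terms & set(d.lower().split())), d) for d in documents]
    scored.sort(reverse=True)
    return [d for score, d in scored if score > 0]

def build_prompt(question, context):
    """Prompt construction is a plain string template you can print and check."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def answer(question, documents, llm):
    # Each intermediate value is a normal variable you can log, inspect,
    # or assert on -- the debugging benefit over opaque orchestration.
    context = "\n".join(retrieve(question, documents))
    prompt = build_prompt(question, context)
    return llm(prompt)

docs = ["The conference is on October 3rd.", "Tickets include merchandise."]
fake_llm = lambda prompt: prompt.splitlines()[-1]  # echoes the question line
print(answer("When is the conference?", docs, fake_llm))
```

Because retrieval, prompt building, and the model call are separate functions, you can isolate whether a bad answer came from the retrieval step or the prompt, which is the KISS approach the discussion recommends before reaching for a framework.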

    • Understanding the limitations of fine-tuning large language models
      Fine-tuning large language models is not always necessary and may not significantly improve understanding or writing style. It's primarily used to teach new functions or outputs and requires clean, structured data and resources.

      While fine-tuning large language models (LLMs) can be a popular topic, it's important to understand its limitations and when to use it effectively. Not all cases require fine-tuning, and sometimes simpler solutions like retrieval-augmented generation may be more suitable. The misconception arises when people assume that fine-tuning an LLM will make it understand them better or write in their unique style. However, fine-tuning is primarily used to teach the model new functions or outputs that it doesn't already know how to produce. It's essential to evaluate the need for fine-tuning based on the specific use case and the availability of clean, structured data. Fine-tuning can be resource-intensive and time-consuming, so it's crucial to consider its necessity before embarking on the process. Additionally, the success of fine-tuning depends on the original base model's exposure to the relevant data, and some models may not benefit significantly from fine-tuning if they lack sufficient examples. Overall, it's essential to approach fine-tuning with a clear understanding of its purpose and limitations to maximize its potential benefits.

    • Fine-tuning LLMs requires significant effort and resources, especially data cleaning
      Fine-tuning LLMs for specific tasks requires a large dataset of instruction prompts and answers, and retrieval augmented generation is an important approach to consider for handling complex queries and generating accurate answers.

      Fine-tuning large language models (LLMs) requires a significant amount of effort and resources, especially when it comes to collecting, labeling, and cleaning data. Daniel, in the discussion, shared his nostalgia for the data cleaning process and reminded listeners that fine-tuning an LLM on raw text data may only result in a better autocomplete model, not a better question answering model. He emphasized that creating a large dataset of instruction prompts and answers is necessary for fine-tuning a model to perform specific tasks effectively. Demetrios added that retrieval augmented generation is an important approach to consider when working with LLMs. Retrieval augmented generation involves using a retrieved text snippet as input to generate a response, making it a valuable technique for handling complex queries and generating accurate and contextually relevant answers. Raul, an expert in the field, will be leading a course on this topic to help those interested in learning more about the implementation and benefits of retrieval augmented generation. It's important to remember that working with LLMs and fine-tuning them for specific tasks is not a simple process. It requires a solid understanding of the underlying systems and a significant investment in data collection, labeling, and cleaning. The rewards, however, can be substantial, as seen in the success of companies like Mosaic that have capitalized on the demand for more advanced and accurate language models.
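
Daniel's point about needing instruction prompts and answers (rather than raw text) can be illustrated with a small sketch. The instruction/input/output record layout below is a common pattern for such datasets, not any specific vendor's required format.

```python
import json

# Raw corpus text like this only teaches the model to continue text
# (autocomplete), not to answer questions:
raw_text = "MLOps combines ML and DevOps practices."

# Instruction tuning instead pairs an explicit prompt with the desired
# answer, so the model learns the question-answering behavior itself.
instruction_records = [
    {
        "instruction": "Answer the question using the MLOps community docs.",
        "input": "What does MLOps combine?",
        "output": "MLOps combines machine learning and DevOps practices.",
    },
]

# Serialize to JSON Lines, a common on-disk layout for such datasets.
jsonl = "\n".join(json.dumps(r) for r in instruction_records)
print(jsonl)
```

Building thousands of records like this, and cleaning them, is where most of the fine-tuning effort described above actually goes.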

    • Retrieval Augmented Generation (RAG) for building Q&A bots or chatbots
      RAG involves creating a data pipeline, preprocessing data, ingesting it into a vector database, and semantically searching for answers to user questions. It enhances LLM performance and is a valuable tool for the ML and MLOps fields.

      Retrieval Augmented Generation (RAG) is a valuable addition to an LLM (Large Language Model) system for building Q&A bots or chatbots, especially for those looking to level up their skills quickly. This approach involves creating a data pipeline, preprocessing data, ingesting it into a vector database, and semantically searching for answers to user questions. RAG was showcased in a hackathon where participants built QA bots using the MLOps community Slack data, and the most accurate responses were those that provided not only the answer but also relevant citations and threads from the Slack conversation. The MLOps community is offering a course on this topic, which includes go-at-your-own-pace and cohort-based learning options. The learning platform can be found at learn.mlops.community. Overall, RAG is an effective way to enhance the performance of LLMs and is a valuable tool for those in the ML and MLOps fields.
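
The pipeline described above (preprocess, ingest into a vector database, semantically search) can be sketched end to end with a toy in-memory store. A real system would swap in learned embeddings and an actual vector database; the word-count cosine similarity here is only for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (embedding, original text)

    def ingest(self, docs):
        # Ingestion step: embed each preprocessed document and store it.
        for d in docs:
            self.items.append((embed(d), d))

    def search(self, query, k=1):
        # Semantic search step: rank stored docs by similarity to the query.
        ranked = sorted(self.items,
                        key=lambda it: cosine(embed(query), it[0]),
                        reverse=True)
        return [text for _, text in ranked[:k]]

store = ToyVectorStore()
store.ingest([
    "Use the Slack archive to answer community questions.",
    "The course has go-at-your-own-pace and cohort options.",
])
print(store.search("Which options does the course have?"))
```

In a full RAG system the retrieved snippet would then be placed into the LLM prompt, and, as the hackathon showed, returned alongside the answer as a citation.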

    • Exploring Various Use Cases of Large Language Models in Organizations
      Text generation and summarization are popular applications of LLMs, but organizations also use them for data enrichment, labeling augmentation, and generating content for experts. However, high costs and uncertain ROI hinder their widespread adoption.

      The survey on company use cases of large language models (LLMs) revealed that text generation and summarization are popular applications, but participants are also exploring other ways to use LLMs, such as data enrichment, data labeling augmentation, and generating content for subject matter experts. However, the use of LLMs in organizations is still unclear due to high costs and uncertain ROI. The survey also highlighted the challenges of hallucinations and the speed of inference with LLMs, as well as the need for consistent models and infrastructure. During the creation of this report, the speaker acknowledged that they are not an expert report-generating organization and that the process was time-consuming due to the open-ended nature of the survey responses. The speaker emphasized the importance of getting diverse perspectives and incorporated feedback from multiple rounds of reviews to minimize bias. Moving forward, the speaker plans to include more structured survey responses, such as multiple choice and checkboxes, to make data analysis easier and more efficient. Despite the challenges, the speaker remains committed to providing a comprehensive and unbiased report for the community.

    • Survey results on LLM usage in business
      A majority of respondents set up LLMs for business use; smaller companies were less likely to use OpenAI, larger companies also avoided it, and middle-sized companies were the most likely to use OpenAI.

      While Large Language Models (LLMs) like ChatGPT can provide valuable insights, the time spent prompting and tuning them may equal or even exceed the time spent generating the insights on your own. The survey results showed that a majority of respondents were setting up systems with LLMs, rather than just using them casually. The survey respondents were most curious about the usage of open source versus OpenAI in the upcoming survey. The visual representation of OpenAI usage and company size was met with criticism, as it did not clearly convey the intended information. Preliminary data suggested that smaller companies (1-50 employees) were less likely to use OpenAI, while larger companies (1000+ employees) also avoided it. Middle-sized companies (500-1000 employees) were the most likely to use OpenAI. Possible theories for this trend include startups trying to differentiate themselves by not using OpenAI, or larger companies having the resources to develop their own LLMs.

    • Deciding Between Single Model Family or Model Agnostic Approach for Large Language Models
      Larger companies must weigh the benefits of vendor lock-in, data security, and model landscape evolution against the need for quick implementation and access to advanced features when deciding between committing to a single model family or maintaining a more model-agnostic approach for large language models.

      As companies consider implementing large language models like OpenAI's ChatGPT for their operations, they face a decision between committing to a single model family or maintaining a more model-agnostic approach. For smaller companies, the speed of implementation may outweigh concerns about vendor lock-in or data security. However, for larger companies with more resources and legal departments, there is a healthy skepticism towards allowing their data to leave their "walled garden." The model landscape is also evolving rapidly, adding another layer of complexity to the decision-making process. Some companies may prioritize getting as many features and functions up and running as possible, even if it means being locked into a single model family. Others may prefer a more flexible approach, allowing them to pivot between different models and maintain privacy. While there is immense value in the capabilities offered by large language models, the decision to commit to a single model family or maintain a more agnostic approach is an important one that requires careful consideration.

    • Startups vs. larger organizations approach to AI and ML implementation
      Startups experiment quickly, while larger organizations are more cautious due to data security and potential lock-in concerns. The graph shows an increase in experimentation at the start, followed by a slowdown as organizations scale up. The barrier to entry for AI and ML has been significantly lowered, leading to increased innovation.

      There's a noticeable difference in approach to implementing AI and machine learning between startups and larger organizations. Startups tend to move quickly and experiment with various models and tools, while larger organizations are more cautious due to concerns over data security and potential lock-in. The graph discussed in the conversation illustrates this trend, showing an increase in experimentation at the beginning of projects, followed by a slowdown as organizations scale up. Looking ahead, Demetrios is excited about the accessibility of AI and machine learning for anyone to experiment with, leading to increased creativity and value creation for companies. The barrier to entry has been significantly lowered, making it an exciting time for innovation in the industry.

    • Exploring LLMs at the LLMs in Production Conference
      Learn about economical LLM solutions, prioritizing use cases, and putting LLMs into products from diverse speakers at the LLMs in Production conference, with technical details, live music, and special merchandise.

      The LLMs in Production conference on October 3rd is an exciting event for product owners and engineers, offering valuable insights into building economical LLM solutions, prioritizing use cases, and putting LLMs into products. The conference stands out for its technical details, live music interludes, and diverse field of speakers, with underrepresented groups being a priority. The event also features a sponsor who has rented a studio in Amsterdam and special merchandise for sale during the conference. Demetrios, the organizer, is dedicated to showcasing a wide range of speakers and is proud of the conference's diversity.

    • Emphasizing the importance of staying engaged and informed in the MLOps community
      Stay updated on the latest MLOps developments, events, and initiatives by subscribing to Practical AI, engaging with the community, and collaborating on projects.

      The MLOps community is continuously evolving, and there's always something new to look forward to. The importance of staying engaged and informed about the latest developments, events, and initiatives was emphasized during the discussion. The speaker expressed excitement about the various projects and initiatives in the MLOps community and appreciated the opportunity to share his insights. He encouraged listeners to subscribe to Practical AI, share it with their networks, and learn more about Fastly and Fly, the podcast's partners. Overall, the conversation highlighted the importance of collaboration, continuous learning, and staying up-to-date in the ever-evolving field of MLOps.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Apple Intelligence & Advanced RAG
    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval
    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data
    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they deploy edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs
    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress
    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents
    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach; from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Private, open source chat UIs
    We recently gathered some Practical AI listeners for a live webinar with Danny from LibreChat to discuss the future of private, open source chat UIs. During the discussion we hear about the motivations behind LibreChat, why enterprise users are hosting their own chat UIs, and how Danny (and the LibreChat community) is creating amazing features (like RAG and plugins).

    Mamba & Jamba
    First there was Mamba… now there is Jamba from AI21. This is a model that combines the best non-transformer goodness of Mamba with good ‘ol attention layers. This results in a highly performant and efficient model that AI21 has open sourced! We hear all about it (along with a variety of other LLM things) from AI21’s co-founder Yoav.

    Related Episodes

    When data leakage turns into a flood of trouble
    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)
    The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things stable diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology
    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)
    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into the T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of the possible with the T0 family. They finish up with a couple of new learning resources.