    Podcast Summary

    • Generative models as assistants in AI
      Generative models are excelling in areas like transcription and code generation, while traditional ML models continue to serve specific functions like fraud detection and churn prediction. Both serve distinct purposes, and flexibility is key.

      Generative models are increasingly being used as assistants and automators rather than as predictors, especially in the context of "RAG (retrieval-augmented generation) as a service." This shift in use cases is becoming clearer as the industry matures: traditional machine learning workloads continue to serve specific functions, such as fraud detection and churn prediction, while generative AI workloads, typically built on large language models, excel in areas like transcription and code generation. The two are not expected to replace each other entirely; rather, they serve distinct purposes. This was a key theme in a recent Practical AI podcast episode featuring Demetrios from the MLOps community, where the panel discussed the evolving roles of generative models and traditional machine learning models. The episode also touched on the importance of understanding the strengths and limitations of each approach and using each appropriately. Overall, the conversation highlighted the importance of flexibility and adaptability in the ever-evolving world of artificial intelligence.

    • ML and AI Intersection in MLOps: Budgets and Use Cases
      A survey of 322 MLOps professionals revealed growing budgets for AI, a focus on valuable use cases, and increasing interest in LLMs and RAG.

      There's a significant push towards exploring the intersection of traditional machine learning (ML) and generative artificial intelligence (AI) in the MLOps community. This was evident in a recent survey conducted during a virtual conference, which drew a record-breaking 322 responses. The survey revealed a growing allocation of budget towards AI, with 45% of respondents using existing budgets and 43% using new ones. There is also a strong focus on identifying the most valuable use cases for these technologies, with many companies open to exploration. The majority of participants identified as having some experience with large language models (LLMs) and RAG, with only 6% at the frontier of LLM and RAG innovation. Overall, the survey highlights the excitement and potential of these technologies, with many organizations eager to understand their applications and benefits.

    • Choosing Between Pre-trained Models and Fine-tuning
      Pre-trained models used with RAG offer quick validation for general use cases, while fine-tuning is better for specialized fields and specific output formats, though it requires significant resources and expertise.

      The choice between using a pre-trained model with retrieval-augmented generation (RAG) and fine-tuning depends on the required level of expertise and the desired output format. RAG is suitable for general use cases and those requiring some domain knowledge, while fine-tuning is better for highly specialized fields and specific output forms, such as function calls or unique formats. However, running a small, fine-tuned model can be challenging, requiring a dedicated team and significant resources, whereas using pre-trained models through APIs allows for quick validation of use cases before committing to more complex solutions. The broad availability of APIs like OpenAI's has made it easier for users to explore their ideas and determine the appropriate level of model complexity for their needs. Ultimately, the decision between RAG and fine-tuning comes down to trade-offs in expertise, resources, and desired output.
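
      As a rough illustration of that quick-validation path, here is a minimal sketch that retrieves context and stuffs it into a prompt for a hosted model. It assumes the openai Python package (v1+) with an OPENAI_API_KEY in the environment; the documents, toy keyword retriever, and model name are illustrative placeholders rather than anything specified in the episode.

```python
# Quick validation of a RAG-style use case through a hosted API, before
# committing to fine-tuning. Assumes `pip install openai` (v1+) and
# OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm EST, Monday through Friday.",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    # Toy keyword-overlap retriever; a real system would use embeddings.
    words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(words & set(d.lower().split())), reverse=True)
    return ranked[:k]

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(answer("How long do I have to return an item?"))
```

      If a stitched-together prompt like this answers real questions acceptably, that is a strong signal the use case may not need fine-tuning at all.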

    • Enterprises adopt multi-model approach for generative AI
      Enterprises move beyond relying on one model provider, instead opting for a multi-model approach to leverage unique capabilities and reduce risk.

      Enterprises are moving towards a multi-model approach to building and buying generative AI, driven by the flexibility and control offered by open models. This trend reflects not only security and privacy concerns but also the distinct behaviors and capabilities different models exhibit on specific tasks. The ability to use multiple models and build reasoning chains across them is proving to be an effective alternative to fine-tuning. However, not all organizations have reached this level of understanding and implementation, and the fear of a single point of failure, such as relying on one model provider's API, remains a concern. To mitigate this risk, organizations are keeping multiple provider options available and developing prompt suites or templates that can move between models. This shift towards multi-model capabilities is a sign of maturity in the field, but the majority of enterprises are still figuring out how to make it work for them.
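
      To make the single-point-of-failure concern concrete, here is a minimal sketch of a shared prompt template with provider fallback. The provider functions are hypothetical stubs standing in for real SDK calls; none of the names come from the episode.

```python
# Shared prompt template plus ordered provider fallback, so one provider's
# outage does not take the whole feature down. Providers are stubs here.
from typing import Callable

PROMPT_TEMPLATE = "Summarize the following support ticket in one sentence:\n{ticket}"

def call_provider_a(prompt: str) -> str:
    raise TimeoutError("provider A is unreachable")  # simulate an outage

def call_provider_b(prompt: str) -> str:
    return "Customer requests a refund for a damaged item."  # stub response

PROVIDERS: list[Callable[[str], str]] = [call_provider_a, call_provider_b]

def complete(ticket: str) -> str:
    prompt = PROMPT_TEMPLATE.format(ticket=ticket)
    last_error = None
    for provider in PROVIDERS:
        try:
            return provider(prompt)
        except Exception as err:  # fall through to the next provider
            last_error = err
    raise RuntimeError("all providers failed") from last_error

print(complete("My package arrived crushed and I want my money back."))
```

      Keeping the template separate from the provider call is what lets the same prompt suite move between models with minimal rework.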

    • Challenges in machine learning and AI model evaluation
      Evaluation of machine learning and AI models faces challenges from non-standardized data, long iteration times, and a fragmented community, making it expensive, time-consuming, and frustrating.

      The evaluation process for machine learning and AI models currently faces several challenges. The data used for evaluation is not standardized, with many teams creating their own datasets, which makes evaluation difficult and expensive at scale. Iteration time for testing and evaluating models is also a significant issue, with long wait times and the need for human curation of testing datasets. Additionally, there is a lack of clear guidance on best practices, leading to a fragmented community with multiple, often platform-dependent, groups. These challenges make evaluation time-consuming, costly, and frustrating, and the reliance on human-labeled ground-truth data adds further complexity and cost. Overall, the industry needs a more standardized and streamlined approach to evaluation to enable faster iteration, more efficient use of resources, and clearer guidance for the community.
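
      For a sense of what the bespoke harnesses being built today look like, here is a minimal sketch that scores model outputs against a small human-labeled ground-truth set; the model_answer function is a hypothetical stand-in for a real model call.

```python
# Tiny evaluation loop: exact-match accuracy against human-labeled data.

def model_answer(question: str) -> str:
    # Hypothetical stand-in for a real model call.
    canned = {
        "What is the capital of France?": "Paris",
        "Who wrote Hamlet?": "Shakespear",  # deliberate mistake
    }
    return canned.get(question, "")

ground_truth = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "Who wrote Hamlet?", "answer": "Shakespeare"},
]

def normalize(text: str) -> str:
    return text.strip().lower()

correct = sum(
    normalize(model_answer(case["question"])) == normalize(case["answer"])
    for case in ground_truth
)
print(f"exact-match accuracy: {correct}/{len(ground_truth)}")  # 1/2
```

      Even this toy version shows why the process is expensive: the ground-truth list has to be curated by humans, and every model change means re-running and re-inspecting it.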

    • Diverse and fragmented AI/ML community
      Each sub-community within AI/ML has its own focuses and challenges, from productionization to keeping data in vector databases accurate and up to date, and best practices rarely generalize across them.

      The AI/ML (artificial intelligence and machine learning) community is diverse and fragmented, with each sub-community focusing on specific tools and areas of exploration. This leads to a lack of generalized best practices and a focus on community-specific outcomes. For instance, the MLOps community is industry-focused and prioritizes practical applications and productionization, while the LlamaIndex community emphasizes retrieval evaluation and keeping the data in vector databases up to date. Ensuring data accuracy and freshness in vector databases is a crucial concern in both communities. Ultimately, the fragmentation of the AI/ML community requires an understanding of the unique focuses and challenges within each sub-community.
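
      On the data-freshness point, here is one common pattern, sketched minimally: re-embed a document only when its content hash changes. The in-memory index and placeholder embed function are illustrative assumptions, not a reference to any particular vector database.

```python
# Keep a vector index fresh without re-embedding unchanged documents.
import hashlib

def embed(text: str) -> list[float]:
    return [float(len(text))]  # placeholder embedding

index: dict[str, dict] = {}  # doc_id -> {"hash": ..., "vector": ...}

def upsert(doc_id: str, text: str) -> bool:
    digest = hashlib.sha256(text.encode()).hexdigest()
    entry = index.get(doc_id)
    if entry and entry["hash"] == digest:
        return False  # content unchanged: skip the re-embed
    index[doc_id] = {"hash": digest, "vector": embed(text)}
    return True

print(upsert("faq-1", "Returns accepted within 30 days."))  # True: new doc
print(upsert("faq-1", "Returns accepted within 30 days."))  # False: unchanged
print(upsert("faq-1", "Returns accepted within 60 days."))  # True: updated
```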

    • Managing complex data challenges in various systems and databases
      Managing data across systems and databases, especially for large language models, means keeping data open, reproducible, and secure while maintaining privacy and access control.

      Managing data and access across databases and systems presents complex challenges. For instance, in the context of Slack, there can be discrepancies between the data displayed and the actual underlying data, which can lead to misunderstandings. When updating or syncing data in databases, questions arise about the best approach, especially when dealing with large amounts of data. Another issue is role-based access control (RBAC). While some tools and databases offer some level of RBAC, it is rarely a seamless process, especially with vector databases, where managing metadata and ensuring that the correct users have access to the correct data can be a challenge. Moreover, the size and complexity of large training datasets like Common Corpus, recently released on Hugging Face, add another layer of complexity: ensuring that models are trained on open and reproducible data while maintaining privacy and security is a significant challenge. Managing data and access in this context is an ongoing process that requires careful planning, attention to detail, and a solid understanding of the tools and databases being used.
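
      As one way to picture RBAC layered over a vector store, here is a minimal sketch that filters records by role metadata before similarity ranking; the one-dimensional vectors and role names are illustrative assumptions.

```python
# Role-based filtering over vector search: restrict to records the caller's
# roles permit, then rank the survivors by similarity.
records = [
    {"text": "Q3 revenue figures", "vector": [0.9], "allowed_roles": {"finance"}},
    {"text": "Public product FAQ", "vector": [0.8], "allowed_roles": {"finance", "support"}},
]

def similarity(a: list[float], b: list[float]) -> float:
    return -abs(a[0] - b[0])  # toy 1-D "similarity" (closer is higher)

def search(query_vector: list[float], user_roles: set[str], k: int = 5) -> list[str]:
    visible = [r for r in records if r["allowed_roles"] & user_roles]
    ranked = sorted(visible, key=lambda r: similarity(r["vector"], query_vector), reverse=True)
    return [r["text"] for r in ranked[:k]]

print(search([0.88], {"support"}))  # only the public FAQ
print(search([0.88], {"finance"}))  # both documents
```

      In practice the filter is usually pushed down into the database's own metadata query rather than applied in application code, but the principle is the same.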

    • Exploring beyond transformers in AI
      Transformers are currently dominant in AI, but there's excitement about new approaches like neuromorphic computing and alternative architectures that could offer new solutions.

      While transformers have been the dominant architecture in AI for some time, there's a growing sense that they may be a stopgap. The Common Corpus, a large multilingual dataset, demonstrates the viability of training large language models without copyright concerns, and yet the resulting models still require significant post-processing to meet the needs of specific applications. Demetrios and the podcast hosts discuss the possibility that transformers are a Band-Aid solution and that future advances in AI might involve new architectures or approaches. They acknowledge that research is ongoing, but it remains to be seen when a viable alternative to transformers will emerge. The hosts express excitement about the potential of new technologies like neuromorphic computing, which could offer different approaches to both hardware and software. Overall, the conversation underscores the ongoing exploration and experimentation in the field of AI, and the understanding that transformers may not be the final word.

    • Neuromorphic Computing: The Future of AI
      Neuromorphic computing, inspired by the brain, could lead to more efficient and effective AI systems. Intel is a leader in this field and works closely with host Daniel through Prediction Guard. The future holds exciting potential for neuromorphic computing in both software and hardware.

      Neuromorphic computing, an approach to artificial intelligence inspired by the structure and function of the human brain, is gaining significant attention and investment from industry leaders like Intel. This form of computing, which mimics the way neurons communicate in the brain, could potentially lead to more efficient and effective AI systems. The speaker expressed excitement about the results that will emerge from this research in the coming years, as it applies not only to software but also to hardware architectures. Despite not being an expert on the topic, the speaker emphasized the importance of neuromorphic computing and its potential to reshape the field of AI. Intel, reportedly a leader in this space, was mentioned as having a strong relationship with the show's host, Daniel, through Prediction Guard. The speaker also announced plans for a future episode dedicated to neuromorphic computing and encouraged listeners to explore the MLOps community podcast for more AI-related topics. Additionally, the speaker announced an upcoming in-person conference focused on AI quality, featuring a variety of speakers and activities.

    • Emphasis on balance of learning and entertainment at the upcoming conference
      The upcoming conference offers a unique balance of valuable insights and unexpected, entertaining elements, promising an unforgettable experience for attendees.

      A key takeaway from this episode of Practical AI is the emphasis on making the upcoming conference an unforgettable experience. The speakers will provide valuable insights, but there will also be unexpected and entertaining elements, as Demetrios, known for his humor in the AI world, will be present. Audience members are encouraged to attend not just for the educational content but also for the enjoyment of the random and fun elements. The conference promises a balance of learning and entertainment, so mark your calendars and get ready for a unique experience. If you're not already following Demetrios on social media, it's highly recommended for a dose of his hilarious content. Don't miss this opportunity to learn, be entertained, and connect with the community. Subscribe to Practical AI for more updates and join the free Slack team at practicalai.fm/community to be part of the conversation.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Stanford's AI Index Report 2024
    We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways including how AI makes workers more productive, the US is increasing regulations sharply, and industry continues to dominate frontier AI research.

    Apple Intelligence & Advanced RAG
    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval
    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data
    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they deploy edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs
    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress
    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents
    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach; from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Private, open source chat UIs
    We recently gathered some Practical AI listeners for a live webinar with Danny from LibreChat to discuss the future of private, open source chat UIs. During the discussion we hear about the motivations behind LibreChat, why enterprise users are hosting their own chat UIs, and how Danny (and the LibreChat community) is creating amazing features (like RAG and plugins).

    Related Episodes

    When data leakage turns into a flood of trouble
    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)
    The new Stable Diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on Twitter, Reddit, Discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things Stable Diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology
    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)
    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of what's possible with the T0 family. They finish up with a couple of new learning resources.