
    Podcast Summary

• AI voices in robocalls under FCC scrutiny. The FCC has taken action against the use of AI voices in robocalls over ethical concerns and the potential for misrepresentation, highlighting the need for continued regulation and ethical use of AI technology.

  The use of AI voices in robocalls became a regulatory issue following an incident involving an AI voice clone of President Biden. The Federal Communications Commission (FCC) has taken action against the use of such technology in automated phone calls, citing ethical concerns and the potential for misrepresentation. The ruling responds to the increasing capabilities of AI and the potential for fraudulent activity, and it's a reminder of the ethical implications and consequences of using AI in deceptive ways. Because not all actors in this space adhere to ethical standards, regulation is a necessary step, and continued vigilance around AI voices and other advanced technologies will be needed. Robocalls are just one example of the many areas where AI use is being regulated across industries. Stay tuned to Practical AI for more updates and insights on the latest developments in AI.

• FCC's ruling on AI voices in telemarketing. The FCC's ruling aims to prevent misrepresentation and fraud in telemarketing using AI voices, but it raises questions about legitimate use cases.

      The FCC's recent ruling on the use of artificial intelligence (AI) voices in telemarketing and robocalls is a step forward in combating fraud and misrepresentation. The use of conversational AI to keep spammers on the line and prevent them from targeting individuals is an innovative solution. However, the ruling raises questions about the legality of using AI voices in legitimate situations, such as making reservations or ordering pizzas. It seems the FCC's primary concern is preventing misrepresentation and fraud. The ruling could potentially create a gray area, as there are numerous legitimate use cases for AI voices. The line between legitimate and illegitimate use will likely depend on the intent and representation of the voice. As technology advances, government regulation of generated content will continue to be a trending issue.

• Regulatory Discussions and Google's Gemini: AI's Evolving Landscape. Regulations for AI are complex, especially internationally, and Google's Gemini, a more advanced AI chatbot, adds to the evolving AI landscape.

  There are ongoing regulatory discussions and actions regarding AI, with the FCC being one of the agencies involved in the US. The complexity of these regulations, particularly in a transnational context, will create challenges for organizations operating internationally. Google is one company making strides in AI, having rebranded its chatbot Bard as Gemini and introduced various subscription tiers, with Ultra being the most advanced model. The competition between Google's Gemini Advanced (powered by the Ultra model) and OpenAI's GPT-4 has been a topic of much discussion. While I haven't tried Gemini Advanced yet due to its $20 monthly fee, numerous comparisons between the two models are available online. The regulatory landscape for AI and advancements such as Google's Gemini continue to evolve and will be topics of ongoing interest.

• Google's Bard: Rough Edges and Lack of Polish. Google's new language model, Bard, faces criticism for its unpolished performance and complexities, hindering its adoption and potentially requiring Google to catch up with competitors.

      Google's new language model, Bard (or Gemini), has received mixed reviews due to its rough edges and lack of polish compared to competitors like GPT-4 from OpenAI. The speaker, who has used Google's ecosystem extensively, had a disappointing experience with Bard when attempting a simple example prompt. He likens the experience of working directly with the model to taking a drone out of autopilot mode, where developers must deal with various complexities and behaviors that are typically handled by well-designed products. The speaker believes that Google could have benefited from more extensive testing and refinement before releasing the model publicly. Despite Google's reputation for powerful AI technologies, the underdeveloped state of Bard may hinder its adoption and may lead Google to play catch-up with competitors in the near future.

• Exploring the ecosystems supporting large language models. Anthropic and Cohere offer new features and upgrades, Unbabel and others explore the boundary between open and closed models, and MetaVoice, trending on Hugging Face, showcases the field's diverse applications.

  The landscape of large language models is not just about the models themselves, but also about the ecosystems that support them. The conversation has primarily focused on Anthropic and Cohere, which are often overshadowed by closed proprietary models from Google and OpenAI. However, these models are on different release cycles, and we can expect new features and upgrades from Anthropic and Cohere in the coming months. It's essential to remember that the software and hardware form one big system, and improvements in the ecosystem are just as important as the models themselves. There are also other players in the space, such as Unbabel, exploring the boundary between open and closed models, releasing open-source models with usage restrictions or focusing on multimodality. The text-to-speech model MetaVoice, currently trending on Hugging Face, showcases the diverse range of applications and advancements in this field.

• Exploring AI trends: text-to-speech, image-to-image, and data analytics. Apple's MGIE allows image editing using natural language instructions, data analytics chat interfaces are gaining popularity, and AI is improving workflows by analyzing CSV files.

  The field of artificial intelligence (AI) is rapidly advancing, with a focus on multimodality and workflow-related applications. During our discussion, we touched on several trends, including text-to-speech, image-to-image, and text-to-image transformations. One intriguing development is Apple's MGIE (MLLM-Guided Image Editing), which allows users to edit images using natural language instructions. This technology could eventually compete with companies like Adobe in the image editing space. There is also a growing trend toward data analytics use cases in AI. Companies like Defog offer chat interfaces that let users ask natural language questions and receive data analytics answers or charts. This approach is gaining popularity, but a better understanding is needed of how these systems actually process and analyze data. We also explored using AI to upload and analyze CSV files. While the results are not yet perfect, this application holds significant potential for improving workflows and making data more accessible. Overall, these advancements are leading to new and innovative applications, making it an exciting time for the field.
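As a sketch of the CSV-analysis use case above, here is the kind of aggregation code such a tool might run behind a chat interface; the column names and data are hypothetical, not from the episode:

```python
import csv
import io
import statistics

# A hypothetical uploaded CSV, as a chat-analytics tool might receive it.
uploaded = io.StringIO(
    "region,revenue\n"
    "north,120\n"
    "south,95\n"
    "north,130\n"
    "south,110\n"
)

def summarize_csv(file_obj, group_col, value_col):
    """Group rows by one column and average another -- the kind of
    aggregation a data-analytics chat interface runs under the hood."""
    groups = {}
    for row in csv.DictReader(file_obj):
        groups.setdefault(row[group_col], []).append(float(row[value_col]))
    return {key: statistics.mean(vals) for key, vals in groups.items()}

summary = summarize_csv(uploaded, "region", "revenue")
print(summary)  # {'north': 125.0, 'south': 102.5}
```

A production system would wrap this behind a natural language front end, but the underlying work is ordinary data wrangling like the above.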

• Misconceptions about AI-driven conversational analytics. AI models don't perform math directly; instead they generate code to analyze data, which lets them handle a wide range of data types and use cases effectively.

  AI-driven conversational analytics is becoming accessible to everyone, and it's a valuable tool for handling complex data connections in many industries. However, there's a common misunderstanding about how generative AI models analyze data. Contrary to popular belief, these models don't excel at performing mathematical calculations or aggregations directly. Instead, they generate code to analyze the data, which is then executed under the hood. This approach lets the models handle a wide range of data types and use cases effectively, even though they struggle with basic math. The conversation also highlighted the benefits of using a graph database like Neo4j for handling complex data connections and real-time analytics. Understanding the strengths and limitations of these AI models and databases is essential to making the most of their capabilities and to tackling significant data challenges across industries.
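The generate-code-instead-of-doing-math pattern can be sketched as follows; `fake_model_response` is a stand-in for an actual LLM call, and a real system would sandbox the execution step:

```python
# Sketch of the "generate code, don't do the math" pattern.
# fake_model_response stands in for the code an LLM might return
# for the question "what is the average sale amount?".
fake_model_response = "result = sum(sales) / len(sales)"

def answer_with_generated_code(data, generated_code):
    """Execute model-generated analysis code against the data and return
    the computed result, rather than asking the model to do arithmetic."""
    namespace = {"sales": data}
    exec(generated_code, {}, namespace)  # real systems sandbox this step
    return namespace["result"]

avg = answer_with_generated_code([100, 250, 175, 75], fake_model_response)
print(avg)  # 150.0
```

The key point is that the arithmetic happens in a deterministic interpreter, not inside the model, so the answer is exact even for data the model has never seen.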

• Shifting from code generation to SQL generation for data analysis. In 2024, expect a surge in the adoption of hybrid methods for data analysis, combining neuro-symbolic approaches with natural language interfaces built on SQL generation.

  In enterprise use cases, there's a shift from code generation to SQL generation for data analysis. Instead of generating Python code, tools like Defog's SQLCoder and Vanna.AI generate SQL queries from natural language questions. This approach is effective because SQL handles aggregations, groupings, and joins well, and the generated queries can be executed with ordinary programming code. It allows flexible data analytics without requiring a model that excels at any one narrow task: a hybrid approach combining traditional data analytics methods with a natural language interface driven by a large language model. In the coming year, we can expect more tools and ecosystems that let users generate intermediate representations, like SQL, that perform tasks reliably. This development signals the maturity of the field and a recognition that there may be better ways to approach data analysis than relying solely on the latest and greatest models. My prediction for 2024 is a surge in the adoption of hybrid methods, combining neuro-symbolic approaches and natural language interfaces to enhance data analytics and processing.
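A minimal sketch of the text-to-SQL pattern described above, with `generated_sql` standing in for a model's output; the table, data, and question are invented for illustration:

```python
import sqlite3

# Toy database a text-to-SQL system might query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 120), ("south", 95), ("north", 130), ("south", 110)],
)

question = "What is the total order amount per region?"
# What a text-to-SQL model might return for the question above;
# here it is hard-coded rather than produced by a real model call.
generated_sql = (
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
)

# Ordinary programming code executes the generated query.
rows = conn.execute(generated_sql).fetchall()
print(rows)  # [('north', 250.0), ('south', 205.0)]
```

Because the model emits SQL rather than an answer, the database engine does the aggregations and joins it was built for, and the query itself can be inspected before execution.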

• Merging old data science methods with new LLMs. Large language models can extract necessary parameters and generate SQL queries, while traditional methods like ARIMA handle the forecasting. This fusion is driving innovation and new applications across industries, with a growing focus on using smaller models effectively.

  Large language models (LLMs) can be used to extract necessary parameters, and possibly generate SQL queries, for traditional data science tasks such as forecasting, while the forecasting itself is still done with classical statistical methods like ARIMA. This merging of old data science methods with new, flexible front-end interfaces is expected to drive much more innovation across industries. There's also growing recognition that smaller LLMs have significant utility in edge computing and local applications. The focus is shifting from the race to build the largest models toward new ways of using models effectively. Even in areas where AI may not yet be a perfect solution, like connecting to printers, LLMs are being integrated to enhance the user experience. Overall, the future of LLMs lies in complementing and enhancing traditional data science methods, rather than replacing them entirely.
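The hybrid pattern can be sketched as follows; the parameter-extraction step is a hard-coded stand-in for an LLM call, and a simple moving-average forecaster substitutes for ARIMA to keep the example self-contained:

```python
# Sketch of the hybrid pattern: a language model extracts structured
# parameters from a request, and a classical method does the forecasting.
def extract_parameters(request):
    """Pretend-LLM step: pull the forecast horizon out of a natural
    language request. A real system would prompt a model; here the
    result for the given request is hard-coded."""
    return {"horizon": 2}

def moving_average_forecast(series, horizon, window=3):
    """Classical step: forecast by extending the trailing-window mean.
    A deliberately simple substitute for ARIMA."""
    history = list(series)
    forecasts = []
    for _ in range(horizon):
        next_val = sum(history[-window:]) / window
        forecasts.append(next_val)
        history.append(next_val)
    return forecasts

params = extract_parameters("forecast the next 2 months of demand")
preds = moving_average_forecast([10, 12, 14], params["horizon"])
print(preds)
```

The division of labor is the point: the model handles the flexible language interface, while a deterministic, well-understood method produces the numbers.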

• A teacher's concerns about AI implementation in schools. Despite institutional challenges, teachers are encouraged to advocate for AI in classrooms to enhance learning experiences, and they can reach out to Daniel and Chris for help making the case.

      The integration of AI in various aspects of life, from commercials during major events like the Super Bowl to local PC usage, is a growing trend. However, it's important to note that not everyone has the same level of control or freedom to utilize these technologies, especially in educational settings. A teacher reached out to express concerns about the limitations schools have in implementing AI tools for homework, and it's crucial to acknowledge and respect these challenges. Despite these obstacles, teachers are encouraged to continue advocating for the use of AI in classrooms to enhance learning experiences for students. If teachers need support in convincing their school systems to adopt these tools, they can reach out to Daniel and Chris for assistance. The complexities of implementing new technologies in educational settings should not discourage us from striving to provide students with the best available resources for their learning journey.

• Effective Prompting Strategies for Multimodal Models. The Prompt Engineering Guide from DAIR.AI offers valuable strategies for prompting multimodal models like ChatGPT, Codex, Gemini, and Gemini Advanced, improving outcomes and saving time and effort.

  For those experimenting with multimodal models like ChatGPT, Codex, Gemini, and Gemini Advanced, the Prompt Engineering Guide from DAIR.AI is a valuable resource. The guide covers different models and walks through techniques for optimizing prompts for specific tasks. If you're not getting the outcomes you want from these models, it can help you understand and apply effective prompting strategies. The guide is available at promptingguide.ai, and based on the speaker's experience, it's the best resource of its kind so far, saving considerable time and effort when working with these models on multimodal tasks.
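As one illustration of the kind of strategy such guides cover, here is a minimal few-shot prompt template; the sentiment-classification task and examples are invented for illustration, not taken from the guide:

```python
# Minimal few-shot prompting: show the model a couple of worked
# examples, then append the new query in the same format.
def build_few_shot_prompt(examples, query):
    """Assemble demonstration pairs plus the new query into one prompt."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    [
        ("Great sound quality.", "positive"),
        ("Battery died in a day.", "negative"),
    ],
    "Setup was painless.",
)
print(prompt)
```

The prompt ends mid-pattern ("Sentiment:"), nudging the model to complete it with a label in the same format as the demonstrations.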

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Apple Intelligence & Advanced RAG

    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval

    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data

    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they have deployed edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs

    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress

    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents

    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach; from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Private, open source chat UIs

    We recently gathered some Practical AI listeners for a live webinar with Danny from LibreChat to discuss the future of private, open source chat UIs. During the discussion we hear about the motivations behind LibreChat, why enterprise users are hosting their own chat UIs, and how Danny (and the LibreChat community) is creating amazing features (like RAG and plugins).

    Mamba & Jamba

    First there was Mamba… now there is Jamba from AI21. This is a model that combines the best non-transformer goodness of Mamba with good ‘ol attention layers. This results in a highly performant and efficient model that AI21 has open sourced! We hear all about it (along with a variety of other LLM things) from AI21’s co-founder Yoav.

    Related Episodes

    When data leakage turns into a flood of trouble

    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)

    The new Stable Diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on Twitter, Reddit, Discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things Stable Diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology

    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)

    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into the T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of the possible with the T0 family. They finish up with a couple of new learning resources.