
    Podcast Summary

    • Image captioning competition for underrepresented languages using AI: 172 teams from 36 universities built image captioning solutions for underrepresented languages using AI, demonstrating the power of data analytics for positive impact.

      The data analytics community came together at Purdue University for a case competition focused on using artificial intelligence for good, specifically image captioning for underrepresented languages. Sponsored by Purdue University, Microsoft, SIL International, and INFORMS, the competition attracted 172 teams from 36 universities across the nation, with 2 teams from outside the United States. The problem presented to the students was to use natural language processing for image captioning, which is not a typical problem in traditional NLP courses and so led to valuable learning experiences. SIL International's recent release of an image captioning dataset provided the perfect opportunity for this collaboration. Matthew Lanham, the academic director of the MS Business Analytics and Information Management Program at Purdue, shared that the goal was to create a national data analytics competition focused on making a positive impact rather than just making money. The competition showcased the power of AI in tackling real-world problems and brought together a diverse group of students to collaborate and innovate. The event was a testament to the potential of data analytics for good and to the importance of collaboration and knowledge sharing in the field.

    • Holistic approach to complex problems in AI, analytics, and data science: The INFORMS Certified Analytics Professional (CAP) program emphasizes a structured approach to solving business problems, including problem framing, data understanding, methodology selection, and implementation, to ensure comprehensive and effective solutions.

      Working on complex problems involving AI, analytics, or data science requires a holistic approach that goes beyond the technical aspects. This was highlighted during a competition focused on natural language processing and image captioning in various languages, where the challenge of handling languages written without spaces between words led to interesting discoveries for students. The Institute for Operations Research and the Management Sciences (INFORMS) was introduced as a vibrant community that promotes this holistic approach through its Certified Analytics Professional (CAP) program. This program identifies key tasks involved in solving business problems, including business problem framing, analytical problem framing, knowing your data, methodology selection, model building, deployment, and life cycle management. By following a structured process like this, students and professionals are better equipped to tackle real-world problems in a comprehensive manner. This helps in understanding the audience and their needs, and it also ensures that the technical solution is effectively implemented and maintained over time.
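
      To illustrate why languages written without spaces are awkward for standard NLP tooling, here is a minimal sketch. It is not the method any competition team used; the function names and the choice of character n-grams as a fallback are assumptions made purely for illustration.

```python
from typing import List


def whitespace_tokens(text: str) -> List[str]:
    """Split on whitespace, the usual first step for space-delimited languages."""
    return text.split()


def char_ngrams(text: str, n: int = 2) -> List[str]:
    """Hypothetical fallback: overlapping character n-grams.

    Works on Unicode code points, not grapheme clusters, so combining
    marks (common in Thai) are treated as separate characters.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return []
    if len(chars) < n:
        return ["".join(chars)]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


if __name__ == "__main__":
    english = "I love you"
    thai = "ฉันรักคุณ"  # written without spaces between words
    print(len(whitespace_tokens(english)))  # several tokens
    print(len(whitespace_tokens(thai)))     # a single undivided token
    print(char_ngrams(thai, 2)[:3])         # sub-word units instead
```

      In practice a dedicated word segmenter or a subword tokenizer trained on the target language would usually be preferable; character n-grams are only the simplest language-agnostic stand-in.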

    • Understanding complex business problems with INFORMS CAP: The INFORMS CAP program defines seven domains for solving business problems through data science, and the competition paired it with free Azure AI training and certification vouchers for students.

      The INFORMS Certified Analytics Professional (CAP) program provides a framework for solving complex business problems through data science. It consists of seven domains: business problem framing, analytics problem framing, data, methodology selection, model building, deployment, and life cycle management. This process ensures that all aspects of a problem are considered, from understanding the business issue to deploying and managing the solution. Additionally, the competition, designed in collaboration with Microsoft, offered students free training on Azure AI and certification vouchers to help them gain the skills needed to apply these web services to real-world problems. The competition's three phases allowed students to receive training, work on the problem, and present their solutions, demonstrating the importance of both theoretical knowledge and practical application. Microsoft's involvement in the competition further emphasizes the significance of cloud services in data science solutions.

    • Exploring Cloud Technologies with Microsoft's Free Resources: Microsoft offers free resources for learning cloud technologies, including YouTube tutorials and a free Azure subscription, providing hands-on experience and practical knowledge.

      For students and individuals looking to explore the cloud and expand their horizons in technology without needing extensive knowledge of specific tools like Docker and Kubernetes, there are numerous resources available online for self-learning. Microsoft's ecosystem, for instance, offers a wealth of free content on YouTube, including short sessions and longer tutorials on various topics. The Azure platform, which includes Azure Active Directory for authentication, acts as a unifying factor for many Microsoft services. A free subscription to Azure can provide hands-on experience, and the Azure Machine Learning Studio is a flagship technology for machine learning that can run on regular CPUs or GPUs within the subscription. By utilizing these resources and gaining practical experience, individuals can deepen their understanding of cloud technologies and their potential applications.

    • Microsoft Azure caters to diverse industries and use cases with various solutions, including machine learning and cognitive services: Microsoft Azure offers machine learning solutions through cognitive services and MLflow for infrastructure understanding and model management.

      Microsoft Azure offers various solutions for different industries and use cases, including federal spaces, sovereign clouds, and specialty clouds. Machine learning is a significant focus, with cognitive services as pre-trained models and APIs. Microsoft has adopted MLflow as the primary method for organizing machine learning experiments, training, models, and deployment within Azure. This open-source technology allows for better infrastructure understanding and model life cycle management. For those not able to attend institutions like Purdue, Microsoft's AI Business School serves as an alternative entry point into the industry, providing valuable hands-on experience.

    • Microsoft's AI Business School: A Comprehensive Resource for Implementing AI: Microsoft's AI Business School offers a range of courses, tutorials, samples, and case studies to help individuals understand how AI is used in a business context. Microsoft's commitment to sharing knowledge and experiences is an invaluable resource for those looking to integrate AI into their operations.

      Microsoft's AI Business School and resources offer valuable insights into implementing AI in a business context. The school consists of a series of courses demonstrating how Microsoft uses AI in its own business, covering various aspects from data modeling to evaluation. It serves as a starting point for individuals with different roles, and Microsoft is continually expanding its offerings to cater to various personas. Additionally, Microsoft provides tutorials, samples, and case studies to help users get started with their technologies. They also collaborate with organizations like Purdue University and the Metropolitan Museum of Art to share success stories. The Azure Architecture Center is another resource, showcasing various architecture designs and use cases. These resources provide a wealth of information on how to effectively utilize different AI resources. Overall, Microsoft's commitment to sharing knowledge and experiences is an invaluable resource for individuals and organizations looking to integrate AI into their operations.

    • Overcoming challenges in image captioning for underrepresented languages: The team used dataset augmentation and new images to tackle small datasets and limited image variety. They envisioned an app to help small businesses and preserve underrepresented languages, and learned about AI's impact on language accessibility.

      The challenge of image captioning for underrepresented languages presented significant hurdles due to small datasets and limited image variety. The team addressed this by artificially augmenting the dataset or adding new pictures. They also envisioned a potential impact on small businesses and their communities through a web or mobile app for image captioning in underrepresented languages. Such an app could help small businesses attract customers by providing captions for their images, benefiting both the businesses and the small language communities. The team also recognized the importance of preserving cultural heritage through this technology, since, by a commonly cited estimate, a language is lost about every two weeks. During the competition, team members gained new insights, particularly into the application of AI and machine learning to language preservation. Sean opened their eyes to this issue, and they learned that technology can make a difference in the world by directly improving language accessibility. Overall, the team's experience showcased the potential for data science and analytics to address both technical challenges and social issues.
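
      As a rough illustration of what "artificially augmenting the dataset" can mean, the sketch below doubles a small captioned image set by adding a horizontally flipped copy of each image. The summary does not say which augmentations the team applied, so the flip, the nested-list image representation, and the function names are all assumptions.

```python
from typing import List, Tuple

# Hypothetical representation: an image is a grid of pixel values.
Image = List[List[int]]
Example = Tuple[Image, str]


def horizontal_flip(image: Image) -> Image:
    """Mirror the image left to right by reversing every pixel row."""
    return [row[::-1] for row in image]


def augment(dataset: List[Example]) -> List[Example]:
    """Return each original example followed by a flipped copy with the same caption."""
    augmented: List[Example] = []
    for image, caption in dataset:
        augmented.append((image, caption))
        augmented.append((horizontal_flip(image), caption))
    return augmented


if __name__ == "__main__":
    tiny = [([[1, 2], [3, 4]], "a caption in the target language")]
    for image, caption in augment(tiny):
        print(image, caption)
```

      One caveat with this choice: a flip keeps the caption valid only when the caption does not mention left or right, so captions with directional words would need to be filtered or rewritten.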

    • Creating a multilingual image captioning model with a multistage solution: The team used a multistage solution, including a state-of-the-art multilingual CLIP model and checking for existing captions in a database, to create a successful image captioning model for diverse languages like Thai, Kyrgyz, and Hausa.

      The team, consisting of Harsha, Varun, Ravi, and Sanchita, made history by creating the first solution in a competition of around 170 teams to develop an image captioning model that performs effectively on diverse languages: Thai, Kyrgyz, and Hausa. Their approach combined state-of-the-art models like CLIP with a multistage solution that checked for existing captions in a database before generating new ones. Initially, the team considered using a classification model to select the best-matching sentence from a corpus. However, they soon realized how hard it is to create a good zero-shot captioning model for such diverse and complex prediction tasks. After researching, they discovered the multilingual CLIP model from Hugging Face, which significantly improved their overall solution. When implementing the idea of using existing captions, they found that only about 20-30% of the images in the training dataset had captions that matched under the multilingual CLIP model. The team decided to focus on this percentage and set a threshold to limit false positives. Applying the multilingual CLIP model led to a significant jump in their test set scores. Despite using industry-standard tools like CLIP and Hugging Face, the team faced challenges in the beginning, including being stumped by the dataset and dealing with computational issues. Their perseverance and innovative approach ultimately led to their success in the competition.
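
      The retrieve-then-generate idea described above can be sketched as follows. This is not the team's code: the embeddings are toy vectors standing in for multilingual CLIP outputs, and `caption_image`, `caption_bank`, `generate`, and the 0.9 threshold are hypothetical names and values chosen for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def caption_image(
    image_vec: Vector,
    caption_bank: List[Tuple[str, Vector]],
    generate: Callable[[Vector], str],
    threshold: float = 0.9,  # assumed value; tuned to limit false positives
) -> Tuple[str, str]:
    """Stage 1: reuse the closest existing caption if it is similar enough.
    Stage 2: otherwise fall back to a caption generator."""
    best_caption, best_score = None, -1.0
    for caption, caption_vec in caption_bank:
        score = cosine_similarity(image_vec, caption_vec)
        if score > best_score:
            best_caption, best_score = caption, score
    if best_caption is not None and best_score >= threshold:
        return best_caption, "retrieved"
    return generate(image_vec), "generated"


if __name__ == "__main__":
    bank = [("a dog", [1.0, 0.0]), ("a cat", [0.0, 1.0])]
    fallback = lambda vec: "generated caption"
    print(caption_image([0.9, 0.1], bank, fallback))  # close match in the bank
    print(caption_image([1.0, 1.0], bank, fallback))  # no match above threshold
```

      Raising the threshold trades coverage for precision: fewer images reuse an existing caption, but those that do are less likely to be false positives.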

    • The importance of deep analysis in complex data projects: Initial EDA may not be sufficient for complex data projects, requiring advanced models for meaningful insights. AI and data science have the potential to improve language proficiency and increase educational rates.

      When working on complex data science or AI projects, initial exploratory data analysis (EDA) can provide valuable insights but may not be sufficient. The team in this discussion encountered this challenge when working on a project involving text data that contained poems and other contextual information. They initially used Microsoft Azure's computation API and translator model, but found that it only provided reasonable guesses and they needed to delve deeper. They considered other ideas such as clustering common images and looking for models that could generate context. This experience highlighted the importance of considering the depth and complexity of the data, and the need for advanced models to extract meaningful insights. Furthermore, the potential positive impact of AI and data science on the real world was emphasized in the discussion. The team noted how AI and data are improving language proficiency and increasing educational rates. This experience will influence their future thinking about AI and data science problems, as they recognize the far-reaching implications and potential benefits for people's lives. The team expressed their commitment to contributing to this field and making a positive impact wherever they can. Overall, this project served as a reminder of the importance of deep analysis and the potential for AI and data science to make a significant impact on the world.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Stanford's AI Index Report 2024

    We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways including how AI makes workers more productive, the US is increasing regulations sharply, and industry continues to dominate frontier AI research.

    Apple Intelligence & Advanced RAG

    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval

    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data

    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they have deployed edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs

    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress

    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents

    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach, from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Private, open source chat UIs

    We recently gathered some Practical AI listeners for a live webinar with Danny from LibreChat to discuss the future of private, open source chat UIs. During the discussion we hear about the motivations behind LibreChat, why enterprise users are hosting their own chat UIs, and how Danny (and the LibreChat community) is creating amazing features (like RAG and plugins).

    Related Episodes

    When data leakage turns into a flood of trouble

    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)

    The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things stable diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology

    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)

    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into the T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of the possible with the T0 family. They finish up with a couple of new learning resources.