
    Podcast Summary

    • Exploring the Power of Go Language for AI Models
      Go language's simplicity, productivity, and small size make it an excellent choice for creating and implementing AI models, while the OpenAI ambassador program offers opportunities to collaborate and share knowledge with the community.

      Go language, with its simplicity and productivity, is an excellent choice for creating and implementing AI models, especially when these models need to be integrated into software. Go's small size, consistent approach, and easy-to-remember syntax contribute to increased productivity and fewer errors. Natalie Pistunovich, an OpenAI ambassador and developer advocate at Aerospike, shared her experiences as an ambassador, which include weekly syncs with fellow ambassadors, offering help to those using OpenAI's engines, and trialing new engines before they're released. While she couldn't reveal specific details, she mentioned that the interesting ideas and use cases she's encountered through this role demonstrate the versatility and potential of AI technologies. She also emphasized the importance of collaboration and knowledge sharing among the ambassadors and the OpenAI team. Overall, the conversation highlighted the importance of choosing the right programming language and embracing the opportunities that come with being part of a supportive and innovative community.

    • Exploring GPT-3's versatility: City analysis, contradictory translations, and code generation
      GPT-3 and Codex have diverse applications, from city analysis and contradictory translations to content creation and code generation in programming languages like Python, Go, and Shell using Copilot.

      GPT-3 and its related technologies like Codex are versatile tools with various applications. One team used GPT-3 to create a knowledge base about cities and analyze them, while another team used it to generate contradicting translations for creating a labeled dataset. The most common use cases include creating content for marketing and writing code with the help of Codex in Copilot. Codex, specifically, is designed to translate between natural language and code, and it performs exceptionally well in languages like Python, Go, and even unexpected ones like Shell. It's primarily used through the Copilot plugin in Visual Studio Code, where users can give commands to complete or write code.
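      As a sketch of the comment-driven workflow described above, a developer types a descriptive comment and the plugin suggests a body beneath it. The prompt comment and the completion here are illustrative, not actual Copilot output:

```go
package main

import "fmt"

// reverseWords returns the words of s in reverse order.
// In Copilot, a comment like the one above is often enough
// of a prompt for the plugin to suggest a body like this one.
func reverseWords(words []string) []string {
	out := make([]string, len(words))
	for i, w := range words {
		out[len(words)-1-i] = w
	}
	return out
}

func main() {
	fmt.Println(reverseWords([]string{"hello", "from", "go"}))
}
```

      The suggestion still has to be reviewed like any other code; the comment only constrains what the model is likely to propose.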

    • Revolutionizing software development with AI-powered coding tools
      AI-powered coding tools like Copilot save time, provide suggestions, and even propose new solutions, improving workflow and development times in various programming languages.

      AI-powered coding tools, such as Copilot, are revolutionizing the software development process by significantly reducing the time it takes to write code and providing suggestions that can improve workflow and even propose new solutions. These tools can help developers by generating examples, writing unit tests, refactoring code, and even suggesting alternative implementation methods. They can learn from existing code on platforms like GitHub and provide suggestions in various programming languages. In the short term, these tools will be used within IDEs, but in the future, they may become even more accessible through a graphical user interface that translates user actions into code. The benefits of these tools include faster development times, improved accuracy, and the ability to automate repetitive tasks. For data scientists, who may be hesitant to write tests, these tools can help streamline the testing process. The future of coding may involve less writing of code and more use of these powerful AI-assisted tools.

    • GitHub Copilot: An AI-powered coding assistant with limitations
      While GitHub Copilot can generate syntactically correct code, it may not produce valid or meaningful code for specific contexts. NLP advancements like OpenAI's GPT-3 have broader applications beyond customer support and chatbots.

      GitHub Copilot, an AI-powered coding assistant, can generate syntactically correct code but may not produce valid or meaningful code for specific contexts. This was a topic of discussion when Copilot was first released, along with questions surrounding the licensing of the content it was trained on. While the code generated is grammatically equivalent, it may not make sense or be suitable for certain use cases. For instance, it can generate an SSH key with the right syntax, but it won't be a valid one. Copilot automates development, but it doesn't yet handle DevOps infrastructure or configurations. In the realm of Natural Language Processing (NLP), there have been significant advancements, most notably with OpenAI's GPT-3, which can mimic human language. NLP is one of the most mature branches of AI, and its applications extend beyond customer support and chatbots to industries like law, healthcare, and finance. At the upcoming ML DataOps Summit, experts will discuss these developments and their implications. The event, hosted by iMerit, is free and virtual, and registration is open at imerit.net/dataops. As a speaker at the event, I have found the trial version of GitHub Copilot intriguing. It offers tab-completion suggestions and prompts, allowing users to start typing something and then auto-complete it. However, it's important to note that the generated code may not always be suitable for specific use cases.

    • Automating Coding Tasks with Codex from OpenAI
      Codex generates code based on user input, saving time and making coding more efficient. However, code quality depends on input and training data, and adapting to different styles can be challenging, especially with open source code.

      Codex, a model from OpenAI, can help automate coding tasks by generating code based on user input. This can save time and make the coding process more efficient, especially for those who are learning new programming languages or tools. The model can understand natural language instructions and generate corresponding code snippets. However, it's important to note that the quality of the generated code depends on the quality of the input and the training data. Open source code, which is publicly available, can serve as a valuable resource for training such models. However, there are concerns about the code quality and stylistic differences in open source code. Go, as a programming language, has a consistent style, making it easier to maintain a consistent codebase. The model can adapt to different styles, but it may stick to the initial style used in the prompt. The model is trained on a large dataset, which includes both good and bad code, though the ratio of good to bad code in open source versus closed source is unclear. Open source code tends to be of higher quality due to its public nature, as developers are less likely to publish poor-quality code. Overall, Codex and similar models have the potential to revolutionize the coding process, but it's important to consider the quality and consistency of the generated code, especially when dealing with open source code.

    • Go: Google's Statically-Typed Language for Back-End, DevOps, and AI Infrastructure
      Go, developed by Google, is a popular choice for back-end development, DevOps, infrastructure, and AI infrastructure due to its built-in concurrency, ease of cross-compilation, and strong community support. It's used in various tools and systems, including Docker, Kubernetes, and observability tools like Prometheus and Jaeger.

      Go, also known as Golang, is a statically typed programming language that was developed by Google and is now widely used beyond Google. It's known for its built-in concurrency and parallelism, making it a great choice for back-end development, DevOps, infrastructure, and even machine learning. Go is easy to cross-compile and run on multiple platforms, making it popular for teams with diverse systems. Go's benefits include its speed, safety, and community support. It's used in various tools and systems, such as Docker, Kubernetes, Prometheus, and Jaeger, and even at SpaceX and CERN. While Python is popular for AI experimentation, Go is a good fit for serving AI models and integrating them with APIs, streaming servers, or batch-processing infrastructure. Additionally, Go has an ecosystem that's well suited to the infrastructure needs of AI systems, including monitoring and security. A 2015 paper from Google highlighted the hidden technical debt of AI systems and the importance of accounting for concerns beyond just training and running models. Go's fast serving capabilities and useful infrastructure make it an even better choice for AI systems.
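      The built-in concurrency mentioned above can be sketched in a few lines. Here a pool of goroutines fans hypothetical model scoring out over a channel of job indices; the score function is just a stand-in for a real model call so the example stays self-contained:

```go
package main

import (
	"fmt"
	"sync"
)

// score stands in for a call to a real model; here it is
// a toy function so the example is self-contained.
func score(x float64) float64 { return 2*x + 1 }

// scoreAll fans work out to nWorkers goroutines and
// collects the results, preserving input order by index.
func scoreAll(inputs []float64, nWorkers int) []float64 {
	out := make([]float64, len(inputs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < nWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				out[i] = score(inputs[i]) // each index written by exactly one goroutine
			}
		}()
	}
	for i := range inputs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	fmt.Println(scoreAll([]float64{1, 2, 3}, 2)) // [3 5 7]
}
```

      Channels and sync.WaitGroup are part of the standard library, which is why this kind of fan-out needs no external dependencies.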

    • Go's consistency and productivity make it effective for AI models
      Go's small size, limited ways of doing things, and uniformity lead to fewer errors, quicker productivity, and easier integration into existing codebases for AI projects.

      Go is an effective language for implementing AI models due to its consistency and productivity. Go's small size and limited ways of doing things make it easier for developers to remember and use, leading to fewer errors and quicker productivity. Additionally, if AI generates Go code, it will look identical to human-written code due to the language's uniformity, avoiding the "uncanny valley" effect. In the context of MLOps or AI projects, this consistency makes it easier to integrate generated code into existing codebases and reduces the variability often seen in other languages. Overall, Go's unique characteristics make it a valuable choice for AI development.

    • MLOps Components and Go's Role in Feature Engineering
      Go's speed and ease of use make it a valuable tool for automating repetitive tasks and handling features in MLOps projects, contributing to streamlined development processes.

      For an MLOps project, there are essential components that are crucial for making things work in production. These include data processing, data governance, model serving, and the feedback loop for retraining models. A growing trend in MLOps is the importance of feature extraction and feature engineering, which is where Go comes in due to its speed and ease of use for handling features. Go's benefits for MLOps include automating repetitive tasks, offering code documentation, and providing a clear understanding of how AI is integrating into developer workflows. For those new to Go and interested in incorporating it into their AI projects, they can start by familiarizing themselves with its advantages and exploring resources like GopherCon talks for more information. Ultimately, MLOps is about streamlining the machine learning development process, and Go can be a valuable tool for achieving that.
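      To make the feature-engineering point concrete, here is a small, hypothetical sketch of computing per-record features in Go; the Record shape and the feature choices are invented for illustration:

```go
package main

import (
	"fmt"
	"math"
)

// Record is a hypothetical raw input row.
type Record struct {
	Amount  float64
	Country string
}

// Features holds the engineered values a model would consume.
type Features struct {
	LogAmount  float64
	IsDomestic float64
}

// extract turns a raw record into model-ready features:
// a log-transformed amount and a binary domestic flag.
func extract(r Record, home string) Features {
	f := Features{LogAmount: math.Log1p(r.Amount)}
	if r.Country == home {
		f.IsDomestic = 1
	}
	return f
}

func main() {
	fmt.Printf("%+v\n", extract(Record{Amount: 100, Country: "DE"}, "DE"))
}
```

      A function like this compiles to a single fast binary, which is part of why Go is attractive for the feature-extraction step in a serving pipeline.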

    • Rewriting Python code in Go for machine learning projects
      Start by experimenting with rewriting Python code into Go for machine learning projects to explore productivity and educational benefits.

      Exploring the use of Go for machine learning projects, particularly as a way to rewrite Python code, can be a productive and educational experience. This was emphasized during a discussion about using Go for infrastructure in machine learning, with the suggestion to start by rewriting Python code into Go and experimenting with the results. Additionally, resources like the Go tour and upcoming workshops at events like GopherCon can provide valuable insights and learning opportunities. The conversation also touched on the evolution of AI and machine learning discussions in the tech community, with a shift from fear and speculation to a more practical focus on integration and automation. This trend reflects the growing importance of AI and machine learning in the tech industry and the need for infrastructure and DevOps professionals to adapt and support these technologies.
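      The "rewrite a bit of Python in Go" exercise suggested above might look like this: a typical Python one-liner such as math.sqrt(sum(x**2 for x in xs)) — a vector's Euclidean norm — becomes an explicit, typed loop in Go:

```go
package main

import (
	"fmt"
	"math"
)

// norm is a Go rewrite of the Python one-liner
//   math.sqrt(sum(x**2 for x in xs))
// More verbose, but explicit about types and iteration.
func norm(xs []float64) float64 {
	var s float64
	for _, x := range xs {
		s += x * x
	}
	return math.Sqrt(s)
}

func main() {
	fmt.Println(norm([]float64{3, 4})) // 5
}
```

      The Go version is longer, but that explicitness is exactly what makes the exercise educational when porting numeric code from Python.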

    • AI tools like Copilot and Codex will revolutionize software development
      AI tools will increase developer productivity by generating and compiling code from English commands, creating two branches of productivity: code generation and infrastructure monitoring.

      The integration of AI tools like Copilot and Codex into software development workflows will significantly increase efficiency and productivity for developers, acting as an extension of Integrated Development Environments (IDEs). This development marks a new level of abstraction, allowing developers to write code in English and have the AI generate and compile the code for them. This will lead to two branches of developer productivity: one focused on code generation and the other on infrastructure and monitoring, which will still require manual intervention. The rise of these AI tools also opens up opportunities for non-coders to create tech solutions using no-code tools that translate their English commands into code. However, it's important to note that the impact on infrastructure and monitoring might not be as significant as in other areas. Overall, this development represents an exciting new chapter in how we write code and automate processes.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Vectoring in on Pinecone

    Daniel & Chris explore the advantages of vector databases with Roie Schwaber-Cohen of Pinecone. Roie starts with a very lucid explanation of why you need a vector database in your machine learning pipeline, and then goes on to discuss Pinecone’s vector database, designed to facilitate efficient storage, retrieval, and management of vector data.

    Stanford's AI Index Report 2024

    We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways including how AI makes workers more productive, the US is increasing regulations sharply, and industry continues to dominate frontier AI research.

    Apple Intelligence & Advanced RAG

    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval

    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data

    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they deploy edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs

    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress

    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents

    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach, from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Related Episodes

    When data leakage turns into a flood of trouble

    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that information from our test set does not leak into our training set.

    Stable Diffusion (Practical AI #193)

    The new Stable Diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on Twitter, Reddit, Discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things Stable Diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology

    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)

    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically they dive into the T0, a series of natural language processing (NLP) AI models specifically trained for researching zero-shot multitask learning. Daniel provides a brief tour of the possible with the T0 family. They finish up with a couple of new learning resources.