
    Podcast Summary

    • The Importance of Understanding Data's Role in Our Lives
      Data, collected through everyday tech use, can significantly impact our lives, with benefits and risks. Awareness and informed decisions are crucial as technology advances.

      We often underestimate the value and implications of the data we willingly share through our everyday use of technology, particularly our mobile devices. Author John Thompson, in his new book "Data For All," discusses the importance of understanding the motivations behind data collection, storage, and control, as well as the potential benefits and risks. Thompson, a seasoned data expert with a long career in analytics and business intelligence, has seen firsthand the power of data and how it impacts our lives. He believes that as technology advances and data becomes even more ubiquitous, it's crucial to be aware of the potential consequences and make informed decisions. The conversation on the Practical AI podcast with Daniel Whitenack and Chris Benson delves deeper into Thompson's perspectives and the significance of data in our modern world.

    • The Evolution of the Data Industry and the Misunderstanding of Data Ownership
      The data industry's history shows that individuals have the right to control and potentially earn from their data, but many feel their data is being abused without consent. EU legal frameworks are starting to address this issue, allowing individuals to manage and monetize their data.

      Over the last century, the data industry has evolved into a complex system in which companies collect, manage, and monetize user data, often without users' full understanding or consent. This history matters as we grapple with the implications of data privacy and ownership in the modern world. Arthur C. Nielsen, a pioneer of the industry, built one of the first data businesses in the Midwest, and the precedent his era set has fed a lasting misunderstanding that our data is not our own. With the EU leading the charge, however, legal frameworks are emerging that allow individuals to manage, delete, and monetize their data. For example, an average user on three platforms could potentially earn $2 a year from their data; experts may dismiss that figure, but many people would welcome the income. It's crucial to recognize that our data is valuable and that we have the right to control it. The opacity of data collection and monetization has left many people feeling their data is being abused, and understanding this history can help us navigate the complexities of data privacy and ownership in the digital age.

    • Individuals deserve fair compensation for their data
      In today's digital age, individuals generate vast amounts of data and should be fairly compensated for it, as demonstrated by the EU's GDPR law.

      Individuals need to recognize the value of their data and demand fair compensation for it. The precedent set a century ago by companies like Nielsen, which obtained data for free, no longer applies in a digital age where individuals generate vast amounts of data constantly. The EU's GDPR is a leading example of how individuals can be given control over their data, and its success in Europe demonstrates the benefits of such a system; it's high time other parts of the world, including the US, followed suit. Individuals own their data, and companies should not continue to profit from it for free.

    • New data intermediaries enabling individual data control
      The EU's new data legislation creates data intermediaries like pools and exchanges, allowing individuals to control their data and even charge fees for its use, using a data monetization model similar to music royalties.

      The EU's recent data legislation, including the Data Act, the Data Governance Act, and the Digital Markets Act, is creating new structures like data pools and data exchanges that will serve as intermediaries between individuals and the companies that use their data. These intermediaries will allow individuals to control their data, including the ability to withdraw it or even charge fees for its use; the music royalty system is a useful analogy for this monetization model. As these laws roll out in places like California and Europe, data exchanges will become the middlemen, enabling individuals to retain ownership of their data while companies continue to use it. Examples of such exchanges already exist, particularly in the UK and EU, and they have been successful in areas like healthcare. Data ownership and exchange are moving toward a decentralized model in which individuals maintain control over their data and intermediaries facilitate the exchange.
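
      To make the royalty analogy concrete, here is a minimal Python sketch of how such an intermediary might meter usage and route payment; the `DataListing` and `DataExchange` names, fields, and fee amounts are hypothetical illustrations, not structures defined in the EU acts.

```python
from dataclasses import dataclass, field

@dataclass
class DataListing:
    """One person's data registered with an exchange (hypothetical model)."""
    owner: str
    fee_per_use: float       # price the owner sets per access, like a per-play royalty
    withdrawn: bool = False  # the owner can pull the data back at any time

@dataclass
class DataExchange:
    """Intermediary that brokers access but never owns or stores the data itself."""
    listings: dict = field(default_factory=dict)
    balances: dict = field(default_factory=dict)

    def register(self, listing: DataListing) -> None:
        self.listings[listing.owner] = listing

    def request_access(self, owner: str, company: str) -> bool:
        """A company asks to use someone's data; credit a royalty if allowed."""
        listing = self.listings.get(owner)
        if listing is None or listing.withdrawn:
            return False  # no listing, or consent was withdrawn
        self.balances[owner] = self.balances.get(owner, 0.0) + listing.fee_per_use
        return True

exchange = DataExchange()
exchange.register(DataListing(owner="alice", fee_per_use=0.05))
exchange.request_access("alice", company="acme-analytics")
print(exchange.balances)  # {'alice': 0.05}
```

      As with music royalties, the owner keeps the rights; the exchange only meters usage and routes payment.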

    • Understanding Data Types and Characteristics for Effective Data Management
      Data exchanges help individuals monetize their data, while understanding whether data is fresh or stale, repetitive or infrequent, and episodic or continuous is crucial for effective data management.

      Data exchanges are third-party entities that enable individuals to control and monetize their data by setting prices and usage policies. These exchanges cannot monetize or store data themselves but can suggest optimal data monetization strategies. The EU has already established data exchanges as legal entities, and the US is expected to follow suit. Data exchanges function like a marketplace, allowing individuals to set objectives such as donating earnings to charities or reducing data usage by climate offenders. However, beyond the monetization aspect, there are various types and characteristics of data that people might not consider enough. Data can be fresh or stale, repetitive or infrequent, and episodic or continuous. Understanding these characteristics can help individuals and businesses make informed decisions about how to collect, store, and utilize data effectively. For instance, fresh data might be crucial for real-time analytics, while stale data could be sufficient for historical analysis. Repetitive data, on the other hand, might require different handling than infrequent data, and episodic data could have unique implications for data analysis and interpretation. By acknowledging and addressing these data types and characteristics, individuals and businesses can make the most of their data assets and avoid potential pitfalls.
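
      As a rough illustration of how those characteristics might drive handling decisions, here is a small sketch; the category labels and strategy strings are illustrative assumptions, not a scheme from the episode or the book.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class DataAsset:
    name: str
    last_updated: datetime
    arrival: str  # "repetitive" or "infrequent"
    pattern: str  # "episodic" or "continuous"

def suggest_handling(asset: DataAsset, now: datetime,
                     stale_after: timedelta = timedelta(days=30)) -> str:
    """Map freshness, arrival rate, and continuity onto a rough strategy."""
    fresh = (now - asset.last_updated) <= stale_after
    if not fresh:
        return "archive for historical analysis"   # stale data
    if asset.pattern == "continuous":
        return "stream into real-time analytics"   # fresh and continuous
    if asset.arrival == "infrequent":
        return "batch-process on arrival"          # fresh but rare
    return "aggregate on a fixed schedule"         # fresh, repetitive, episodic

now = datetime(2024, 7, 1)
gps = DataAsset("gps-trace", now - timedelta(days=1), "repetitive", "continuous")
print(suggest_handling(gps, now))  # stream into real-time analytics
```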

    • The use of technology provides valuable data about our daily lives but comes at a cost to privacy
      Technology collects data about our location, voice, browsing, commerce, and driving habits, creating a comprehensive picture of our behavior. While this data can be useful, it's important to consider privacy implications and make informed decisions about what data to share.

      Our reliance on technology, particularly mobile devices with location services enabled, produces a vast amount of data about our daily lives that can be used in ways we may not fully understand or intend. This data is valuable and can be used to make accurate predictions about our behavior, but it comes at a cost to our privacy. Combining multiple sources of data, such as location, voice, browsing, commerce, and driving data, can create a comprehensive picture of who we are and what we do, and such integration is becoming increasingly important in analytics. While this data can be useful in certain situations, such as staying reachable in emergencies, it's important to weigh the potential implications and make informed decisions about what we're willing to share. It's also worth remembering that this data reflects our actions and may not always align with how we perceive ourselves. Ultimately, we should consider the trade-offs between convenience and privacy and make conscious choices about how we use technology in our lives.
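
      A toy sketch of that kind of integration: each source holds a partial view keyed by user ID, and merging them yields the comprehensive picture described above. The source names and fields are invented for illustration.

```python
# Each source contributes a partial view of a user, keyed by user ID.
location = {"u1": {"home_city": "Chicago"}}
browsing = {"u1": {"top_category": "outdoor gear"}}
commerce = {"u1": {"monthly_spend": 240.0}}
driving  = {"u1": {"avg_daily_miles": 18.5}}

def build_profile(user_id: str, *sources: dict) -> dict:
    """Merge whatever each source knows about one user into a single record."""
    profile = {"user_id": user_id}
    for source in sources:
        profile.update(source.get(user_id, {}))
    return profile

print(build_profile("u1", location, browsing, commerce, driving))
# {'user_id': 'u1', 'home_city': 'Chicago', 'top_category': 'outdoor gear',
#  'monthly_spend': 240.0, 'avg_daily_miles': 18.5}
```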

    • Balancing Data Usage and Privacy
      Professionals must prioritize privacy and trust in data usage by understanding and adhering to regulations, being transparent about data usage, and giving individuals control over their data.

      As data becomes more valuable and regulations evolve, individuals and organizations must consider the implications of data usage on privacy and trust. The speaker, an analytics professional, shares his personal experience of prioritizing privacy by turning off location services and limiting mobile phone use at night. This approach, while unconventional for someone in his field, highlights the importance of individual control over data. Looking at the broader landscape, the EU and Australia are expected to fully implement data privacy regulations within the next few years, with the US following suit in certain states. These regulations aim to give individuals more control over their data and the ability to monetize it. However, not all countries share this perspective, with some autocratic regimes prioritizing data control over individual privacy and transparency. As professionals in the field, it's crucial to keep trust and privacy in mind when developing AI-enabled applications or conducting data analysis. This includes understanding and adhering to regulations, being transparent about data usage, and giving individuals control over their data. By prioritizing trust and privacy, we can build more ethical and effective data-driven solutions.
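
      One concrete way to give individuals control in application code is to gate every read behind a consent check. This is only a sketch under assumed data structures; a real implementation would hang off a consent-management platform and the specific lawful-basis rules of the applicable regulation.

```python
# Hypothetical consent registry: user -> data type -> permission.
CONSENT = {
    "u1": {"location": False, "browsing": True},  # u1 turned location sharing off
}

def fetch_for_analysis(user_id: str, data_type: str, store: dict):
    """Refuse to read any data the user has not explicitly consented to share."""
    if not CONSENT.get(user_id, {}).get(data_type, False):
        raise PermissionError(f"{user_id} has not consented to {data_type} use")
    return store[data_type][user_id]

store = {"browsing": {"u1": ["news", "weather"]}}
print(fetch_for_analysis("u1", "browsing", store))  # ok: ['news', 'weather']
# fetch_for_analysis("u1", "location", store)       # raises PermissionError
```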

    • Understanding privacy and trust in data analytics
      Experts in data analytics should consult with government officials to help formulate laws that protect privacy and prevent data misuse, addressing the knowledge gap and the need for a more tech-savvy approach.

      As data analytics professionals have advanced in their field, many have come to recognize the need for government regulation around trust and privacy, given the increasing power and potential misuse of data. Thompson emphasizes the importance of understanding the nuances of these concepts and the need for a better-educated public and government to navigate this complex issue. He highlights the knowledge gap between technology experts and government officials as a significant challenge, and suggests that experts in the field be consulted to help formulate laws and regulations that protect individuals' privacy and prevent misuse of data. He also acknowledges the slow response from current, largely older, government officials and the need for a more tech-savvy approach to these issues.

    • Data as a Valuable Asset: GDPR and Future Acts
      GDPR and future acts will give individuals more control over their data and require businesses to pay for it, impacting how data is accessed, exchanged, and used. Data professionals must adapt to these changes and consider data as a valuable commodity when budgeting and designing systems.

      Data is a valuable asset, much like money, and regulations like GDPR and the acts that follow it will significantly impact how data is accessed, exchanged, and used. For individuals, these regulations mean more control over their data and the potential to profit from it. Businesses and organizations, meanwhile, will need to pay for the data they use, whether it comes from within their own company or from external sources. This shift in thinking about data as a valuable commodity should influence how analytics and AI professionals approach their work, particularly when budgeting for data and architecting systems. The new rules will require a change in mindset and potentially increased costs, but the fundamental principles of data analysis and use will remain the same. The concept of data as a form of currency is also becoming increasingly relevant, with companies like Google already profiting significantly from the sale and use of data.
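
      Once data carries a price, it becomes a budget line like compute or storage. A back-of-the-envelope sketch, with invented rates, of what that line item might look like:

```python
def annual_data_cost(records_per_day: int, fee_per_record: float,
                     days: int = 365) -> float:
    """Rough yearly budget for licensed data once usage is paid for, not free."""
    return records_per_day * fee_per_record * days

# Assume 100k records/day at a hypothetical $0.0001 per record.
print(f"${annual_data_cost(100_000, 0.0001):,.2f} per year")  # $3,650.00 per year
```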

    • The Value and Complexity of Data Generation and Monetization
      Data is a valuable commodity, derived from real and synthetic sources. Monetization can be achieved through various means, but regulation and potential implications must be considered.

      Data is becoming a valuable commodity in today's world, with the potential for monetization through various means. The discussion touched on the complexity of data generation and acquisition, distinguishing real data from synthetic data. Real data is derived from existing sources and can be combined to create new, proprietary datasets; synthetic data is created from indirect measures when access to proprietary data is limited. The conversation also explored investing in and monetizing data, drawing an analogy to cryptocurrencies, though questions were raised about the regulation and implications of monetizing synthetic data, which could be compared to printing money. The discussion closed with an interesting fact: market research organizations in the US have produced more millionaire entrepreneurs than any other kind of business, underscoring the value of data across industries.
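
      As an illustration of synthesizing data from indirect measures, the sketch below samples plausible records from published aggregate statistics when the underlying proprietary records are out of reach; the figures and the normality assumption are invented for the example.

```python
import random

def synthesize(mean: float, std: float, n: int, seed: int = 0) -> list:
    """Sample synthetic records from aggregate statistics (an indirect measure)
    when the proprietary record-level data cannot be accessed."""
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

# Suppose only an industry report's average basket size ($42, sd $9) is public.
baskets = synthesize(mean=42.0, std=9.0, n=1_000)
print(round(sum(baskets) / len(baskets), 2))  # close to 42, by construction
```

      Because such records are manufactured rather than observed, the "printing money" concern raised in the episode is precisely about data like this being sold as if it were real.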

    • Navigating the New Era of Data and Analytics
      Embrace the future of data and analytics, and look for ways to monetize data and build the education and skills to thrive in the field.

      We're living in a new era in which data and analytics are increasingly important, and we need to adapt and monetize our data in beneficial ways. The future of professions like analytics and AI is exciting, demand for data scientists is massive, and the EU is putting structures and frameworks in place to help us navigate the transition; the field is a bright spot for employment. Thompson encourages everyone to embrace this change and look for ways to leverage data to improve their lives, drawing on his own experience working manual labor jobs to underline the importance of education and skills development. His book, Data For All, is now available, and listeners can use the discount code "pod practical AI 19" for a 40% discount. He expressed his appreciation for the conversation and looks forward to sharing his future work on the show. Overall, the message is one of optimism and the importance of adapting to changing times.

    Recent Episodes from Practical AI: Machine Learning, Data Science

    Stanford's AI Index Report 2024

    We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways, including how AI makes workers more productive, how the US is sharply increasing regulation, and how industry continues to dominate frontier AI research.

    Apple Intelligence & Advanced RAG

    Daniel & Chris engage in an impromptu discussion of the state of AI in the enterprise. Then they dive into the recent Apple Intelligence announcement to explore its implications. Finally, Daniel leads a deep dive into a new topic - Advanced RAG - covering everything you need to know to be practical & productive.

    The perplexities of information retrieval

    Daniel & Chris sit down with Denis Yarats, Co-founder & CTO at Perplexity, to discuss Perplexity’s sophisticated AI-driven answer engine. Denis outlines some of the deficiencies in search engines, and how Perplexity’s approach to information retrieval improves on traditional search engine systems, with a focus on accuracy and validation of the information provided.

    Using edge models to find sensitive data

    We’ve all heard about breaches of privacy and leaks of private health information (PHI). For healthcare providers and those storing this data, knowing where all the sensitive data is stored is non-trivial. Ramin, from Tausight, joins us to discuss how they deploy edge AI models to help companies search through billions of records for PHI.

    Rise of the AI PC & local LLMs

    We’ve seen a rise in interest recently and a number of major announcements related to local LLMs and AI PCs. NVIDIA, Apple, and Intel are getting into this along with models like the Phi family from Microsoft. In this episode, we dig into local AI tooling, frameworks, and optimizations to help you navigate this AI niche, and we talk about how this might impact AI adoption in the longer term.

    AI in the U.S. Congress

    At the age of 72, U.S. Representative Don Beyer of Virginia enrolled at GMU to pursue a Master’s degree in C.S. with a concentration in Machine Learning. Rep. Beyer is Vice Chair of the bipartisan Artificial Intelligence Caucus & Vice Chair of the NDC’s AI Working Group. He is the author of the AI Foundation Model Transparency Act & a lead cosponsor of the CREATE AI Act, the Federal Artificial Intelligence Risk Management Act & the Artificial Intelligence Environmental Impacts Act. We hope you tune into this inspiring, nonpartisan conversation with Rep. Beyer about his decision to dive into the deep end of the AI pool & his leadership in bringing that expertise to Capitol Hill.

    Full-stack approach for effective AI agents

    There’s a lot of hype about AI agents right now, but developing robust agents isn’t yet a reality in general. Imbue is leading the way towards more robust agents by taking a full-stack approach; from hardware innovations through to user interface. In this episode, Josh, Imbue’s CTO, tells us more about their approach and some of what they have learned along the way.

    Private, open source chat UIs

    We recently gathered some Practical AI listeners for a live webinar with Danny from LibreChat to discuss the future of private, open source chat UIs. During the discussion we hear about the motivations behind LibreChat, why enterprise users are hosting their own chat UIs, and how Danny (and the LibreChat community) is creating amazing features (like RAG and plugins).

    Related Episodes

    When data leakage turns into a flood of trouble

    Rajiv Shah teaches Daniel and Chris about data leakage, and its major impact upon machine learning models. It’s the kind of topic that we don’t often think about, but which can ruin our results. Raj discusses how to use activation maps and image embedding to find leakage, so that leaking information in our test set does not find its way into our training set.

    Stable Diffusion (Practical AI #193)

    The new stable diffusion model is everywhere! Of course you can use this model to quickly and easily create amazing, dream-like images to post on twitter, reddit, discord, etc., but this technology is also poised to be used in very pragmatic ways across industry. In this episode, Chris and Daniel take a deep dive into all things stable diffusion. They discuss the motivations for the work, the model architecture, and the differences between this model and other related releases (e.g., DALL·E 2). (Image from stability.ai)

    AlphaFold is revolutionizing biology

    AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment, and is accelerating research in nearly every field of biology. Daniel and Chris delve into protein folding, and explore the implications of this revolutionary and hugely impactful application of AI.

    Zero-shot multitask learning (Practical AI #158)

    In this Fully-Connected episode, Daniel and Chris ponder whether in-person AI conferences are on the verge of making a post-pandemic comeback. Then it's on to BigScience from Hugging Face, a year-long research workshop on large multilingual models and datasets. Specifically, they dive into T0, a series of natural language processing (NLP) AI models trained for researching zero-shot multitask learning. Daniel provides a brief tour of what's possible with the T0 family. They finish up with a couple of new learning resources.