
    dataops

    Explore "dataops" with insightful episodes like "DataOps: Datenversorgung für Self-Service Analytics – mit Xuanpu Sun, LBBW", "What are AI ProductOps and why do I need them?", "#09. Pioneering DataOps in the Age of AI with Audrey Smith", "Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh" and "Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service" from podcasts like "Data Culture Podcast", "Industrial AI Podcast", "Humans of AI" and "Data Engineering Podcast", and more!

    Episodes (20)

    What are AI ProductOps and why do I need them?
    Attention, dear product managers from industry: this episode is just for you. Anna Maria Brunnhofer-Pedemonte explains her AI ProductOps approach and why the industrial sector needs it. Thanks for listening. We welcome suggestions for topics, criticism, and a few stars on Apple, Spotify and co. We thank our partner Siemens. **OUR EVENT IN JANUARY** https://www.hannovermesse.de/de/rahmenprogramm/special-events/ki-in-der-industrie/ Contact Anna Maria: https://www.linkedin.com/in/anna-maria-brunnhofer/ We thank our team: Barbara, Anne and Michael!

    #09. Pioneering DataOps in the Age of AI with Audrey Smith

    In this episode of Humans of AI, host Sheikh Shuvo engages with Audrey Smith, the Chief Operating Officer of MLTwist. They delve into the fascinating world of automating data pipelines and the crucial role of DataOps in AI development.


    Key highlights of this episode include:

    • Audrey's Unique Perspective: Coming from a non-technical background, Audrey shares her unique viewpoint on machine learning and data operations, emphasizing the importance of diverse perspectives in tackling data bias.
    • The Evolution of DataOps: Audrey discusses her journey from law to leading operations at MLTwist, shedding light on the growing complexity and significance of DataOps in the AI industry.
    • Future Trends and Challenges: The conversation explores future trends in AI, such as the rise of synthetic data and the impact of regulatory frameworks like the EU AI Act on data management and ethical AI development.


    Join us for an enlightening discussion that uncovers the layers of DataOps and its integral role in shaping the AI landscape.

    Seamless SQL And Python Transformations For Data Engineers And Analysts With SQLMesh
    Summary
    Data transformation is a key activity for all of the organizational roles that interact with data. Because of its importance and outsized impact on what is possible for downstream data consumers, it is critical that everyone is able to collaborate seamlessly. SQLMesh was designed as a unifying tool that is simple to work with but powerful enough for large-scale transformations and complex projects. In this episode Toby Mao explains how it works, the importance of automatic column-level lineage tracking, and how you can start using it today.

    Announcements
    Hello and welcome to the Data Engineering Podcast, the show about modern data management.
    RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack).
    Your host is Tobias Macey and today I'm interviewing Toby Mao about SQLMesh, an open source DataOps framework designed to scale data transformations with ease of collaboration and validation built in.

    Interview
    Introduction
    How did you get involved in the area of data management?
    Can you describe what SQLMesh is and the story behind it?
    DataOps is a term that has been co-opted and overloaded. What are the concepts that you are trying to convey with that term in the context of SQLMesh?
    What are the rough edges in existing toolchains/workflows that you are trying to address with SQLMesh? How do those rough edges impact the productivity and effectiveness of the teams using them?
    Can you describe how SQLMesh is implemented?
    How have the design and goals evolved since you first started working on it?
    What are the lessons that you have learned from dbt which have informed the design and functionality of SQLMesh?
    For teams who have already invested in dbt, what is the migration path from or integration with dbt?
    You have some built-in integration with/awareness of orchestrators (currently Airflow). What are the benefits of making the transformation tool aware of the orchestrator?
    What do you see as the potential benefits of integration with e.g. data-diff?
    What are the second-order benefits of using a tool such as SQLMesh that addresses the more mechanical aspects of managing transformation workflows and the associated dependency chains?
    What are the most interesting, innovative, or unexpected ways that you have seen SQLMesh used?
    What are the most interesting, unexpected, or challenging lessons that you have learned while working on SQLMesh?
    When is SQLMesh the wrong choice?
    What do you have planned for the future of SQLMesh?

    Contact Info
    tobymao (https://github.com/tobymao) on GitHub
    @captaintobs (https://twitter.com/captaintobs) on Twitter
    Website (http://tobymao.com/)

    Parting Question
    From your perspective, what is the biggest gap in the tooling or technology for data management today?

    Closing Announcements
    Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning.
    Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
    If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story.
    To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers.

    Links
    SQLMesh (https://github.com/TobikoData/sqlmesh)
    Tobiko Data (https://tobikodata.com/)
    SAS (https://www.sas.com/en_us/home.html)
    AirBnB Minerva (https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70)
    SQLGlot (https://github.com/tobymao/sqlglot)
    Cron (https://man.freebsd.org/cgi/man.cgi?query=cron&sektion=8&n=1)
    AST == Abstract Syntax Tree (https://en.wikipedia.org/wiki/Abstract_syntax_tree)
    Pandas (https://pandas.pydata.org/)
    Terraform (https://www.terraform.io/)
    dbt (https://www.getdbt.com/)
    Podcast Episode (https://www.dataengineeringpodcast.com/dbt-data-analytics-episode-81/)
    SQLFluff (https://github.com/sqlfluff/sqlfluff)
    Podcast.__init__ Episode (https://www.pythonpodcast.com/sqlfluff-sql-linter-episode-318/)

    The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
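
    The automatic column-level lineage tracking mentioned in the summary above is easier to picture with a small example. The sketch below uses SQLGlot (the parser listed in the links) to map each output column of a query to the source columns it reads from; it is a toy illustration of the idea, not SQLMesh's own lineage API, and the table and column names are made up.

```python
# Toy column-level lineage: map each projected column to the source
# columns referenced in its expression. Illustrative only; SQLMesh's
# real lineage tracking is more sophisticated than this.
import sqlglot
from sqlglot import exp

sql = """
SELECT
    o.order_id,
    o.amount * fx.rate AS amount_usd
FROM orders AS o
JOIN fx_rates AS fx
    ON o.currency = fx.currency
"""

query = sqlglot.parse_one(sql)

for projection in query.selects:
    # Collect every source column that feeds this output column.
    upstream = sorted({col.sql() for col in projection.find_all(exp.Column)})
    print(f"{projection.alias_or_name} <- {upstream}")
# order_id   <- ['o.order_id']
# amount_usd <- ['fx.rate', 'o.amount']
```

    Knowing this mapping for every model is what lets a tool warn you, before deployment, which downstream columns a change will touch.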

    Reduce The Overhead In Your Pipelines With Agile Data Engine's DataOps Service
    Summary
    A significant portion of the time spent by data engineering teams is on managing the workflows and operations of their pipelines. DataOps has arisen as a parallel set of practices to that of DevOps teams as a means of reducing wasted effort. Agile Data Engine is a platform designed to handle the infrastructure side of the DataOps equation, as well as providing the insights that you need to manage the human side of the workflow. In this episode Tevje Olin explains how the platform is implemented, the features that it provides to reduce the amount of effort required to keep your pipelines running, and how you can start using it in your own team.

    Announcements
    Hello and welcome to the Data Engineering Podcast, the show about modern data management.
    RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack).
    Your host is Tobias Macey and today I'm interviewing Tevje Olin about Agile Data Engine, a platform that combines data modeling, transformations, continuous delivery and workload orchestration to help you manage your data products and the whole lifecycle of your warehouse.

    Interview
    Introduction
    How did you get involved in the area of data management?
    Can you describe what Agile Data Engine is and the story behind it?
    What are some of the tools and architectures that an organization might be able to replace with Agile Data Engine?
    How does the unified experience of Agile Data Engine change the way that teams think about the lifecycle of their data?
    What are some of the types of experiments that are enabled by reduced operational overhead?
    What does CI/CD look like for a data warehouse? How is it different from CI/CD for software applications?
    Can you describe how Agile Data Engine is architected?
    How have the design and goals of the system changed since you first started working on it?
    What are the components that you needed to develop in-house to enable your platform goals?
    What are the changes in the broader data ecosystem that have had the most influence on your product goals and customer adoption?
    Can you describe the workflow for a team that is using Agile Data Engine to power their business analytics?
    What are some of the insights that you generate to help your customers understand how to improve their processes or identify new opportunities?
    In your "about" page it mentions the unique approaches that you take for warehouse automation. How do your practices differ from the rest of the industry?
    How have changes in the adoption/implementation of ML and AI impacted the ways that your customers exercise your platform?
    What are the most interesting, innovative, or unexpected ways that you have seen the Agile Data Engine platform used?
    What are the most interesting, unexpected, or challenging lessons that you have learned while working on Agile Data Engine?
    When is Agile Data Engine the wrong choice?
    What do you have planned for the future of Agile Data Engine?

    Guest Contact Info
    LinkedIn (https://www.linkedin.com/in/tevjeolin/?originalSubdomain=fi)

    Parting Question
    From your perspective, what is the biggest gap in the tooling or technology for data management today?

    About Agile Data Engine
    Agile Data Engine unlocks the potential of your data to drive business value in a rapidly changing world. It is a DataOps management platform for designing, deploying, operating and managing data products, and for managing the whole lifecycle of a data warehouse. It combines data modeling, transformations, continuous delivery and workload orchestration into the same platform.

    Links
    Agile Data Engine (https://www.agiledataengine.com/agile-data-engine-x-data-engineering-podcast)
    Bill Inmon (https://en.wikipedia.org/wiki/Bill_Inmon)
    Ralph Kimball (https://en.wikipedia.org/wiki/Ralph_Kimball)
    Snowflake (https://www.snowflake.com/en/)
    Redshift (https://aws.amazon.com/redshift/)
    BigQuery (https://cloud.google.com/bigquery)
    Azure Synapse (https://azure.microsoft.com/en-us/products/synapse-analytics/)
    Airflow (https://airflow.apache.org/)

    The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
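
    One of the interview questions above asks what CI/CD looks like for a data warehouse. A concrete, if simplified, answer is a test that runs before a deployment is promoted. The sketch below is a hypothetical CI check, not a feature of Agile Data Engine: it compares a staged table's schema against what downstream models expect; the database, table, and column names are invented, and SQLite stands in for a real warehouse driver.

```python
# Hypothetical warehouse CI check: fail the pipeline if the staged table's
# schema drifts from what downstream models expect. SQLite is only a
# stand-in for a real warehouse connection.
import sqlite3

EXPECTED_COLUMNS = {
    "order_id": "INTEGER",
    "order_date": "TEXT",
    "amount_usd": "REAL",
}

def table_schema(conn: sqlite3.Connection, table: str) -> dict[str, str]:
    """Return {column_name: declared_type} for the given table."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[2] for row in rows}

def check_staged_schema(db_path: str = "staging.db", table: str = "orders_v2") -> None:
    conn = sqlite3.connect(db_path)
    actual = table_schema(conn, table)
    if actual != EXPECTED_COLUMNS:
        raise SystemExit(f"schema drift in {table}: {actual} != {EXPECTED_COLUMNS}")
    print(f"{table} schema OK")

if __name__ == "__main__":
    check_staged_schema()
```

    The point of running a check like this inside the deployment pipeline is that a failure blocks promotion, rather than surfacing as a broken dashboard after the fact.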

    The New Data Engineering Landscape: DataOps, VectorOps, and LangChain

    This story was originally published on HackerNoon at: https://hackernoon.com/the-new-data-engineering-landscape-dataops-vectorops-and-langchain.
    Integrating DataOps, VectorOps, and LangChain creates powerful applications that combine efficient data management with high-dimensional data processing.
    Check more stories related to data-science at: https://hackernoon.com/c/data-science. You can also check exclusive content about #data, #dataops, #devops, #vectorops, #vector-search, #langchain, #gpt, #bert, and more.

    This story was written by: @epappas. Learn more about this writer by checking @epappas's about page, and for more stories, please visit hackernoon.com.

    As large language models (LLMs) like GPT-4 emerge, managing high-dimensional data structures becomes increasingly important. LangChain, an LLM-powered application development framework, integrates with DataOps and VectorOps processes and utilizes vector databases to create data-aware, interactive applications.
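
    To make the "high-dimensional data" part concrete, here is the minimal retrieval step that a vector database performs: represent documents as vectors, then return the ones nearest to a query by cosine similarity. The sketch uses plain NumPy with random stand-in embeddings rather than LangChain or a real vector store, so treat it as an illustration of the idea only.

```python
# Minimal vector-search sketch: cosine similarity over stand-in embeddings.
# Real systems use a trained embedding model and an indexed vector database.
import numpy as np

rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(1000, 384))   # pretend document embeddings
query_vector = rng.normal(size=384)          # pretend query embedding

def top_k(query: np.ndarray, docs: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k documents most similar to the query."""
    docs_norm = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)
    scores = docs_norm @ query_norm          # cosine similarity per document
    return np.argsort(scores)[::-1][:k]      # highest scores first

print(top_k(query_vector, doc_vectors))
```

    Frameworks like LangChain wrap this retrieval step so that the nearest documents can be fed into an LLM prompt, which is what makes the resulting application "data-aware".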

    Season 4 Episode 3 - The Year Ahead in Federal Health IT

    COVID-19 catalyzed federal health IT to expand innovation around health equity, data management and improved customer experience. In 2022, federal health agencies focused on interoperability, cybersecurity, DataOps and more to ensure equitable, accessible and secure health services. Our researchers reflect on the top news and trends of 2022 including the PACT Act, electronic health records and data modernization, and how these efforts will drive momentum in 2023.

    DataOps and the Data Catalog with Guest Speaker Michele Goetz, Vice President and Principal Analyst, Forrester

    DataOps is having a moment. Where does it sit in the data lifecycle? And how is this emerging trend changing data management today? To find out, Satyen sits down with guest speaker Michele Goetz, author of The Forrester Wave: Enterprise Data Catalogs for DataOps.

    --------

    “DataOps is really the engineering and practices of designing and developing data capabilities, launching them out to production and ensuring that they're providing value and delivering on the outcomes that businesses expect in being able to use that data.” — Guest Speaker Michele Goetz

    --------

    Time Stamps

    * (0:00) The birth of DataOps

    * (2:43) What is DataOps?

    * (11:18) DataOps and the Data Mesh

    * (18:41) Diving into data prep

    * (22:09) Tackling data governance for your data catalog

    * (31:17) The future of the data cataloging landscape

    --------

    Sponsor

    This podcast is presented by Alation.

    Learn more:

    * Data Radicals: https://www.alation.com/podcast/

    * Alation’s LinkedIn Profile: https://www.linkedin.com/company/alation/

    * Satyen’s LinkedIn Profile: https://www.linkedin.com/in/ssangani/

    --------

    Links

    Connect with guest speaker Michele on LinkedIn

    Check out Forrester

    Newscast 07/2022
    Data Lakehouse - what is it, do you need one, and what do you need to know about it? Andreas and Carsten also discuss the M&A news and the latest BARC studies, what has been going on in the BI market over the past two months, and which important events are still coming up this year!
    ⪧ Studies
    • The Planning Survey 22
    • Driving Innovation with AI. Getting Ahead with DataOps and MLOps
    • BARC Score Enterprise BI & Analytics Platforms
    • BARC Score Analytics for Business Users
    • CFO study (New Value for the CFO): consolidation makes way for integrated group financial reporting
    ⪧ Events
    • BI or DIE Level Up - Part II
    • BI or DIE on tour
    • Data Festival
    • Big Data & AI World
    • BARC Future of SAP Data & Analytics
    • DATA Festival #online

    Prepare to Scale - How Data Ops Enables Growth

    We all know that data drives the information economy, but the efficiency of that engine makes all the difference. That is why DataOps came to be. Savvy practitioners realized that taking a quality manufacturing approach to data management could yield wonders for agility, awareness, and growth.

    Find out more on this episode of #DMRadio as Eric Kavanagh interviews Yves Mulkers of 7wData and Christopher Bergh of DataKitchen, a pioneer of DataOps.
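
    The "quality manufacturing approach" described above usually means putting statistical checks on the data flowing through a pipeline, much like control charts on a production line. Below is a small, hypothetical example of such a check, not a feature of DataKitchen's product: it flags a daily row count that falls more than three standard deviations from the recent baseline, using made-up numbers.

```python
# Hypothetical statistical process control check for a pipeline:
# alert when today's row count deviates from the recent baseline.
from statistics import mean, stdev

recent_row_counts = [10_120, 9_980, 10_240, 10_050, 9_910, 10_190, 10_075]  # made-up history
todays_row_count = 6_430                                                    # made-up observation

baseline = mean(recent_row_counts)
spread = stdev(recent_row_counts)

if abs(todays_row_count - baseline) > 3 * spread:
    print(f"ALERT: row count {todays_row_count} is outside {baseline:.0f} ± {3 * spread:.0f}")
else:
    print("row count within expected range")
```

    Catching this kind of drift at the checkpoint where it occurs, instead of in a downstream report, is the efficiency gain the episode is about.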

    Interoperability, Governance, and Divergent Teams with Prukalpa Sankar

    This episode features an interview with Prukalpa Sankar, Co-Founder of Atlan. Atlan is a venture-backed startup building a modern data workspace. Prukalpa also co-founded SocialCops, a data for good company behind landmark projects such as India’s National Data Platform. Prukalpa is a recognized industry leader, landing on the Forbes 30 Under 30 list and Fortune’s 40 Under 40.

    In this episode, Prukalpa and Sam discuss how diversity is a data team’s biggest strength, why governance isn’t always a bad thing, and what they hope the modern data stack will look like in 5 years.

    -------------------

    “Diversity is our biggest strength but our biggest weakness, because it's really hard to make that team collaborate. Because most of the teams in the world are very uniform. So when every single person in the room is a subject matter expert on something, nobody else actually can have oversight on each other's work because they've never done it before. Then how do you create true trust? How do you create trust when things are breaking? If you're able to create a way for these diverse people to collaborate really effectively, to be a dream team, a dream data team where they trust each other and they can collaborate effectively, then magic can happen.” – Prukalpa Sankar

    -------------------

    Episode Timestamps:

    [01:55]: What open source data means to Prukalpa

    [05:38]: Prukalpa’s journey to data for good movement

    [04:51]: How Prukalpa and her team provided gas to 80 million Indian women

    [06:33]: How diversity can help a data team succeed

    [15:10]: What gives Atlan its magic

    [18:58]: How being open by default influenced Atlan’s architecture choices

    [22:45]: The reality of the modern data stack in 5 years

    [27:36]: Advice for people getting started with DataOps

    -------------------

    Links:

    LinkedIn - Connect with Prukalpa

    LinkedIn - Connect with Atlan

    Twitter - Follow Prukalpa

    Twitter - Follow Atlan

    Visit Atlan

    Introduction to Data Operations: Ryan Gross

    The Disappearing DBA: Embracing Automation to Advance Your Career

    A good DBA is hard to find. This is because a good DBA wears many hats, often equating to 3 – 4 full-time jobs. How can this be achieved or sustained? The secret is automation, and automation is also the reason the DBA is disappearing. In this final episode of our three-part series exploring the disappearing DBA, Head Geeks Thomas LaRock and Kevin Kline, along with GitHub Star and dual Microsoft MVP Chrissy LeMaire, offer advice to data professionals looking to advance their careers by embracing automation.

    This podcast is provided for informational purposes only.
    © 2021 SolarWinds Worldwide, LLC. All rights reserved.

    Data Operations in the Era of AI

    In this episode, Jarah Euston discusses:

    1. How Data Operations is being redefined: 2015 vs. 2017 and beyond
    2. Top challenges most data-driven executives face when it comes to gaining insights from their data
    3. An opportunity that most data-driven organizations are unaware of, one that is going to blow up in the next couple of years
    4. Entrepreneur / Executive Special Edition:
      1. How to achieve unstoppable momentum for your startup (whether you work in a scrappy startup environment or a large org, this applies to you!)
      2. How to get better at making presentations
      3. Women in tech entrepreneurship: challenges and the path to success
    5. Top stories she uses when communicating with the C-suite about Data Science

    About Jarah Euston:

    Jarah Euston is a growth and analytics leader who has propelled several Silicon Valley startups to the top.

    Jarah is one of Business Insider's "28 Most Powerful Women in Mobile Advertising."

    Jarah served as a growth and analytics leader at Flurry, Yahoo, and most recently Nexla, an early-stage, next-gen data ops startup and winner of several startup awards, including Strata Hadoop and TechCrunch Disrupt.

    Jarah has a background in Economics and an MBA from Wharton.

    To learn more about Jarah and Nexla, please visit nexla.com or reach out to Jarah “at” nexla.com

    About Nexla:

    Nexla automates DataOps so companies can quickly derive value from their data, with minimal engineering required.

    Nexla's secure platform runs in the cloud or on-premises. It allows business users to send, receive, transform, and monitor data in their preferred format via an easy-to-use web interface.
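
    As a rough illustration of what "send, receive, transform, and monitor data in their preferred format" involves under the hood, the sketch below converts a batch of CSV records into JSON while counting rows that fail a basic validation. It is a generic, hypothetical example and says nothing about how Nexla itself is implemented.

```python
# Generic format-transformation sketch: CSV in, JSON out, with a simple
# validity count for monitoring. Not Nexla's implementation.
import csv
import io
import json

raw_csv = """order_id,amount
1001,19.99
1002,not-a-number
1003,42.50
"""

valid, invalid = [], 0
for row in csv.DictReader(io.StringIO(raw_csv)):
    try:
        valid.append({"order_id": int(row["order_id"]), "amount": float(row["amount"])})
    except ValueError:
        invalid += 1  # a real pipeline would route these rows to an error stream

print(json.dumps(valid, indent=2))
print(f"monitoring: {len(valid)} valid rows, {invalid} invalid rows")
```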