
    The Emergence of Warehouse-native Product Analytics | Vijay Ganesan, CEO at NetSpring

    March 23, 2023

    About this Episode

    šŸ„ This episode is brought to you by NetSpring, a Warehouse-native Product Analytics tool. šŸ„

    The future looks bright ā€” one where the long-lasting conflict between Business Intelligence (BI) and Product Analytics is finally resolved!

    For data-forward companies, there is a strong need for both types of analysis tools as they were built to work with fundamentally different types of data ā€” BI for relational data and Product Analytics for event data ā€” and serve fundamentally different purposes.

    BI is an analysis interface for the entire organization, enabling stakeholders to derive insights from the analyses performed by data teams.

    Product Analytics, on the other hand, has proven to be non-negotiable for teams to better understand product usage and identify points of friction in the user journey.

    In reality though, the user journey extends well beyond the core product (web and mobile apps) — it includes interactions a user has with a brand across engagement, advertising, and support channels.

    Teams need to combine product-usage data with data from third-party tools to get a complete picture of the user journey. But doing so has been rather challenging using first-generation Product Analytics tools, and even more so using BI tools.

    Enter Warehouse-native Product Analytics.

    A Warehouse-native Product Analytics tool sits on top of the customer's data warehouse, allowing teams to perform end-to-end analyses that combine first-party behavioral data with data from third-party sources used for email engagement, advertising, and support.

    In this episode, Vijay Ganesan, CEO of NetSpring, walks us through the ins and outs of this technology (Warehouse-native Product Analytics) via answers to questions like:

    * Why has it been challenging to offer both BI and Product Analytics capabilities in a single product? How is that changing with the rise of the cloud data warehouse?

    * What are the organizational shifts contributing to the adoption of warehouse-native apps?

    * How does the warehouse-native approach better equip organizations to comply with privacy regulations like the GDPR?

    If you work in Product, Growth, or Data, I'm certain that you'll find this conversation insightful.

    You can also tune in on Apple, Spotify, Google, and YouTube, or read the key takeaways from the conversation below (slightly edited for clarity).

    Key takeaways from this conversation

    Arpit (02:06):

    BI was built to explore relational data, whereas Product Analytics relies on event data or behavioral data. From a technical point of view, why is it so challenging to offer both BI and Product Analytics capabilities in a single product?

    Vijay (02:20):

    Historically, the two worlds have been very distinct at all levels of the stack.

    * The way you collect and store data is very different for BI and Product Analytics: High-velocity event data typically never reached the data warehouse — it wasn't feasible to store petabyte-scale data in traditional data warehouses anyway.

    * The nature of computation is very different: In BI, you're doing dimensional slice and dice — you take a metric and slice it by different dimensions. The way you structure the data for BI is very different from what you do for Product Analytics, where you're not studying the final state — you're studying the sequence of events that lead to a final state.

    * The way you express the analytical computations is very different too: In BI, it's SQL-oriented, whereas in the event data world of Product Analytics, SQL is not the best language to express the analytical intent.

    Therefore, at all levels, BI and Product Analytics are very different types of systems, making it very difficult to do one inside the other.
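
    To make the contrast concrete, here is a minimal SQL sketch over a hypothetical events table (the user_id, event_name, event_time, and country columns are assumed for illustration): the first query is a classic BI-style slice of a metric by a dimension, while the second reconstructs a per-user sequence of events, the kind of funnel logic Product Analytics specializes in.

        -- BI-style: take a metric (signups) and slice it by a dimension (country)
        SELECT country, COUNT(*) AS signups
        FROM events
        WHERE event_name = 'signup'
        GROUP BY country;

        -- Product-analytics-style: a two-step funnel (signup followed later by
        -- a purchase), which depends on the order of events per user
        SELECT
          COUNT(DISTINCT s.user_id) AS signed_up,
          COUNT(DISTINCT p.user_id) AS converted
        FROM events s
        LEFT JOIN events p
          ON  p.user_id    = s.user_id
          AND p.event_name = 'purchase'
          AND p.event_time > s.event_time
        WHERE s.event_name = 'signup';

    Even this two-step funnel needs a self-join; real funnels with more steps, time windows, and drop-off analysis quickly become unwieldy in plain SQL, which is the point being made here.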

    Arpit (06:50):

    NetSpring doesn't ingest any data, but are there any prerequisites for it to work? Do companies need to model their event data as per a specific schema?

    Vijay (07:06):

    We don't do any instrumentation as we believe in the concept of decoupled instrumentation — the idea that you use best-of-breed, unopinionated instrumentation systems (CDIs) like RudderStack, Segment, or Snowplow to land the data in the data warehouse in a form that is consumable by anyone.

    Secondly, in terms of schemas and data models, and this is one of our key differentiators, we can consume any arbitrary schema. Unlike first-generation tools (like Mixpanel and Amplitude), we don't require you to force your data model into some pre-canned user-event schema — we can work off a generic data model.

    And this goes back to our BI DNA where we're similar to BI tools in the sense that they can work off arbitrary schemas.

    We're fundamentally relational in nature with events layered on top, which we refer to as Relational Event Streams — you point NetSpring to whatever schema you have in your data warehouse, do some decorations on certain data sets to turn them into event streams, and then you have the full glory of product analytics and sophisticated BI-type analytics on top of the data warehouse.
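
    To give a feel for what such a decoration might amount to (a hypothetical sketch with invented column names, not NetSpring's actual syntax), turning a relational table into an event stream essentially means declaring which columns carry the actor, the event, and the timestamp:

        -- Hypothetical: expose a raw Zendesk tickets table as an event stream
        -- by mapping its columns onto actor / event / time semantics
        CREATE VIEW ticket_events AS
        SELECT
          requester_id     AS actor_id,    -- who
          'ticket_created' AS event_name,  -- what
          created_at       AS event_time   -- when
        FROM zendesk.tickets;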

    Arpit (08:31):

    Today, all first-gen product analytics tools support the data warehouse as a data source, but they still need to store the data in their own environment. Besides the lack of ingestion, is there anything else a warehouse-native product like NetSpring does differently?

    Vijay (09:36):

    When we talk about working off the data model in your data warehouse, we talk about consuming those data models in their native form. On NetSpring, if you're looking at Salesforce data, for example, you'll see first-class entities for accounts, contacts, and opportunities. Similarly, with Zendesk, it's tickets. The ability to consume this business context in a native form is very powerful, and that's what lends itself to very rich analytics.

    Arpit (10:42):

    And what are the key factors that make warehouse-native products more affordable than their traditional counterparts?

    Vijay (11:38):

    Besides the fact that you don't pay twice for your data with a copy in the warehouse and a copy elsewhere,

    * You need not pay for Reverse ETL jobs to move data from the warehouse to the Product Analytics tool

    * And there's a cost associated even with figuring out what to send, what not to send, what to delete, etc. — you don't have to worry about any of that.

    So there's process cost, operational cost, and then there is a large opportunity cost. If your analytics is siloed and not impactful, the opportunity cost is huge.

    With warehouse-native tools, you only pay when someone queries the data.

    Therefore, the overall difference in cost between the first-generation approach to ...

    Recent Episodes from databeats

    Make Your Data Warehouse Your Growth Engine | Boris Jabes, CEO at Census

    In this episode, our host Arpit Choudhury talks with Boris Jabes, CEO of Census, about how you can transform your data warehouse into your growth engine.

    While many of us say that using the existing data in your warehouse is the best way forward, the real question is, how do we ensure that the growth team can effectively leverage the data that is already in the warehouse?

    If you're looking for insights into making your data warehouse a powerhouse for growth, this episode is a must-watch. In just under 15 minutes, you'll receive answers to the following questions:

    • What does it take for a growth team to effectively leverage the data that's already in your warehouse?
    • What role does the data team play in empowering the growth team to utilize the available data?
    • How can data analysts and engineers engage in more impactful work and understand how their efforts drive business outcomes?

    Happy watching! 🥁

    You can learn more about Census here: https://www.getcensus.com/

    _______________

    Come say hi 👋 on our socials!

    🧳 LinkedIn: https://www.linkedin.com/company/data-beats/
    📱 Instagram: https://www.instagram.com/datafreakinbeats/
    🐦 Twitter: https://twitter.com/databeatsnow

    You can also check out our ✨ free newsletter ✨ https://databeats.community/

    The Rapid Evolution of Reverse ETL | Boris Jabes, CEO of Census

    Reverse ETL has been gaining traction over the last few years. In this episode, our host Arpit Choudhury talks to Boris Jabes, CEO at Census, about Reverse ETL and how it can improve customer experiences, especially given the increasingly complex user journeys spanning multiple touchpoints across various channels.


    With data permeating every aspect of businesses, the conversation moves to how people in GTM (go-to-market) roles can leverage available customer data to improve campaigns via privacy-friendly personalization, and how modern tooling is enabling GTM folks to move even faster.


    Arpit also shares his take on the term "non-technical" and Boris describes the factors leading to the rapid adoption of Reverse ETL as well as the pros and cons of centralizing all the data in the warehouse.


    Happy listening!


    _______________


    Come say hi 👋 on our socials!


    🧳 LinkedIn: https://www.linkedin.com/company/data-beats/

    📱 Instagram: https://www.instagram.com/datafreakinbeats/

    🐦 Twitter: https://twitter.com/databeatsnow


    Data Minimization | Siobhan Solberg, Privacy Expert and Creator

    What is data minimization?


    As per the GDPR, data minimization implies that "data controllers should collect only the personal data they really need, and should keep it only for as long as they need it."


    Organizations that collect data about their users and customers are essentially data controllers. Organizations control the data they collect and store and are responsible for the consequences of that data being misused.


    But that's not all.


    To stay compliant with privacy regulations such as the GDPR, organizations need to ensure the following (a sketch of what enforcing this might look like follows the list):

    • They only collect and store customer data that they have received consent for
    • They do not continue storing any data that they're supposed to delete from all their systems
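
    As a loose illustration (the table and column names are invented, the 13-month retention window is an assumption, and interval syntax varies by warehouse), enforcing both rules inside a data warehouse could be as simple as a scheduled purge:

        -- Hypothetical scheduled purge: drop events from users who revoked
        -- consent, plus anything older than the retention window
        DELETE FROM events
        WHERE user_id IN (SELECT user_id FROM consent WHERE status = 'revoked')
           OR event_time < CURRENT_DATE - INTERVAL '13 months';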


    The practice of Data Minimization ensures that organizations only collect and store data that they have an identified need for – they know why they're collecting the data and how they're going to use that data to improve the customer experience. Knowing the purpose of the collected data enables organizations to easily keep customers and regulators informed about what data is being collected, how it's being collected, and where it is being used.


    It also makes it easy for customers to opt out of certain data collection practices because they know exactly what they will be losing out on – they need not continue sharing data in fear of losing access to a service or being subject to a degraded customer experience.


    It's becoming the norm for organizations to collect ALL the data from ALL the sources and dump it ALL in the data warehouse. And this practice of collecting and dumping all the data is fueling the rise of "data swamps".


    There's a massive disconnect between data teams that implement data collection initiatives and non-data teams that need the data in the tools they use every day. And that is the biggest cause of a data swamp – too much raw, unusable data that not only increases storage costs but also increases risk for the organization.


    Therefore, organizations that are serious about adopting privacy-friendly personalization practices must embrace the practice of Data Minimization — sooner rather than later.

    CDP Rapid Fire - Round 2

    Welcome to Round 2 of the CDP Rapid Fire!

    In this round, our host, Glenn Vanderlinden, asked the guests follow-up questions based on their responses to the statements from Round 1.


    This one is packed with good advice and plenty of laughs, leaving you no reason to miss it.


    In fact, there was so much goodness in this episode that I had to cut it short. In the coming weeks, we'll release the rest as short snippets, so stay tuned (and subscribe if you haven't already).


    P.S. If you're a recent subscriber and are wondering what's with CDPs being all the rage, please have a quick look at our campaign, Let's End The CDP Battle: https://databeats.community/p/lets-end-the-cdp-battle-a-campaign

    If you prefer to read, here you go.

    The Evolution of the CDP | Kevin Niparko, VP of Product at Twilio Segment

    It's been 10 years since the term "Customer Data Platform" was coined by David Raab, Founder of the CDP Institute. Needless to say, the definition of a CDP has evolved a lot, and slowly but surely, the beast that is the CDP has grown new heads – or components – each of which serves a specific purpose.


    Part of the confusion regarding what a CDP even means stems from the fact that companies that recognized the opportunity early have been pushing the CDP envelope by building or buying complementary solutions, while others are selling CDP components but calling themselves a CDP nonetheless.


    Segment, which was acquired by Twilio in late 2020, has been around since the early days. And so has their VP of Product, Kevin Niparko, who's been with Segment since 2015 and has had a front-row seat to how the CDP space has evolved over the last 8 years.


    In this episode, our host, Arpit Choudhury, and our guest, Kevin Niparko, rapidly discussed his early days as a growth analyst at Segment, followed by the key innovations that put Segment on the map. We concluded the episode by discussing two underrated but extremely important components of the CDP – data quality and privacy.


    I learned a lot while researching for this episode and my key takeaway was that the industry needs to focus less on what's new and flashy and take a moment to acknowledge the innovations that enable most of what's new and flashy.


    Fun fact: In late 2015, Segment launched its Warehouses product that let customers sync data to their own Redshift or Postgres database – long before the rise of the cloud data warehouse.


    P.S. A conversation about CDPs in 2023 is incomplete without shedding some light on the Composable vs Packaged CDP debate.


    If you prefer to read, here you go: https://databeats.community/p/the-evolution-of-the-cdp


    CDP Rapid Fire - Round 1

    We recently managed to bring together some leading minds in the CDP arena — not to fight or argue, but to find some common ground and put an end to the debate-turned-battle between the Composable and the Packaged CDP camps.


    Here's the guest list:

    • Boris Jabes from Census represents the Composable CDP camp
    • Michael Katz from mParticle represents the Packaged CDP camp
    • David Raab from the CDP Institute represents the neutral party that cares deeply about the category (he coined the term, Customer Data Platform, after all)
    • Jacques Corby-Tuech, a RevOps practitioner, represents the end user or the beneficiary of a CDP
    • Matthew Niederberger, a CDP consultant, represents folks who implement CDPs of all types

    And some context on how we landed here:

    At Human37, Glenn implements CDPs of all types for companies in Europe. And in my quest to grow this community (thank you for being a part), I talk to a lot of people — all types of stakeholders essentially.

    And Glenn and I found one thing in common:

    Everybody in the CDP space was confused.

    People building CDPs, people selling CDPs, people buying CDPs. Even people using CDPs and those implementing CDPs — everyone was confused and many were frustrated.

    And we just wanted to change that.

    We also think that this battle between Composable and Packaged CDPs is fruitless — it's not helping anybody or adding much value. And we wanted to get people together who want to discuss more pressing problems in the data space.

    Needless to say, we're far from achieving that goal, but this is a good start and we're optimistic.


    So, without further ado, welcome to Round 1 of the CDP Rapid Fire! 🥁🥁

    In this round, I'll deliver one statement at a time and each guest will respond with "I agree" or "I disagree", along with some quick thoughts to support their stance.

    What's really valuable here is that together, these five individuals represent all the stakeholders involved in buying, deploying, and deriving value from a CDP.

    Let's get into it.

    Building a Warehouse-native App | Abhishek Rai, Co-Founder at NetSpring

    If you're planning to build a warehouse-native app or support this growing architecture for your existing SaaS, then you definitely don't want to miss this conversation between two leading minds in the warehouse-native domain.

    In this episode of the data beats show, Luke Ambrosetti hosted Abhishek Rai, the Co-Founder and Head of Product at NetSpring.

    🥁 This episode is brought to you by NetSpring, a Warehouse-native Product Analytics tool. 🥁

    Luke has spent more time working on warehouse-native solutions than anyone else I know. He was formerly at MessageGears and is now at Snowflake where he helps partner organizations adopt the warehouse-native architecture for their joint customers.

    Luke is also a two-time guest and now a two-time host on the data beats show.

    Abhishek, who was also a Co-founder at ThoughtSpot and has spent over a decade building analytics products, shares some hard-won lessons that are super valuable for companies considering the warehouse-native approach to building B2B apps. In just 12 short minutes, you'll get answers to questions like:

    * What does the architecture of a warehouse-native app look like?

    * Whatā€™s the biggest engineering challenge in building a warehouse-native app?

    * What are the benefits of going warehouse-native only instead of hybrid?

    You can also tune in on Apple, Spotify, Google, and YouTube, or read the key takeaways from the conversation below (slightly edited for clarity).

    Key takeaways from this conversation

    Luke:

    Let's get into the specifics of warehouse-native apps and their architecture on a cloud data platform or, as some say, a cloud data warehouse.

    In the simplest terms, what does the architecture of a warehouse-native app look like?

    Abhishek:

    The architecture of a warehouse-native app starts at the highest level with the translation of user intent into workflows or SQL to access the data in the warehouse (cloud data platform).

    And the most fundamental property of a warehouse-native architecture is that you never copy data out of the warehouse in order to operate on it — all the data stays in the warehouse and all the computation that you do on the data takes place in the warehouse.
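
    As a simplified sketch of what that translation might produce (illustrative only, not NetSpring's actual generated SQL, and the events table is assumed), a user asking for weekly active users in the app UI could result in a query like this being shipped to the warehouse, with only the small aggregated result coming back:

        -- Runs entirely inside the warehouse; the raw event data never leaves it
        SELECT DATE_TRUNC('week', event_time) AS week,
               COUNT(DISTINCT user_id)        AS weekly_active_users
        FROM events
        GROUP BY 1
        ORDER BY 1;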

    Luke:

    From an engineering point of view, what has been the biggest challenge for your team in adopting this deployment model for the way B2B software is built?

    Abhishek:

    The warehouse-native architecture is very promising but the biggest challenge that we faced from an engineering point of view is the lack of a standard data model.

    In the pre-warehouse-native days when most of these analytical applications were full-stack, there was a standard data model that these applications would enforce at the time of data collection.

    However, a warehouse-native architecture allows you to bring data from a whole bunch of disparate data sources, join the data, and draw insights from it — but this architecture also leaves you with the problem of a missing standard data model. A related challenge has been that while SQL is a great standard to work across data warehousing solutions like Snowflake, BigQuery, Redshift, and Databricks, there is much less standardization when it comes to data science workflows. Therefore, providing a single application experience across different warehouses becomes more of a challenge as you start leaning more on the data science side of things.
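
    As one concrete illustration of that divergence (the dataset, model, and column names below are invented), a plain aggregate query runs nearly unchanged across Snowflake, BigQuery, Redshift, and Databricks, but a machine learning statement like BigQuery ML's CREATE MODEL has no direct equivalent on the others, which expose ML through their own interfaces instead:

        -- Valid only on BigQuery (BigQuery ML); other warehouses offer ML
        -- through their own, non-standard mechanisms
        CREATE OR REPLACE MODEL mydataset.churn_model
        OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
        SELECT usage_days, support_tickets, churned
        FROM mydataset.training_data;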

    Luke:

    In my experience at Snowflake, and even before Snowflake at a company offering a warehouse-native product (MessageGears), many customers aren't ready for this deployment model — some may not even have a data warehouse or a concept of a data warehouse internally.

    And oftentimes those who do, don't have their data together — they don't have it modeled properly or don't have the right schema for a warehouse-native app.

    What are your thoughts on this and why did NetSpring decide to be warehouse-native only?

    Abhishek:

    That's a great question and if you would've asked me that a few years back, I would've probably said that warehouse-native doesn't make sense.

    However, what I've seen over the past couple of years of building NetSpring is that even though not everyone has fully embraced the cloud data warehouse, there are enough organizations that already have embraced it and have put their mission-critical data in the cloud, making it feasible to build a completely warehouse-native solution.

    In terms of building out the company, there are a couple of additional things that supported the decision to be completely warehouse-native:

    * We have been able to build a significant competitive moat around this architecture, which has brought in a lot of focus in terms of execution, messaging, and our customers knowing exactly what to expect from us.

    * From a product perspective, we've been able to infuse the analytical power of business intelligence (BI) into product analytics. BI thrives on a warehouse-native architecture, and we're able to offer a solution that combines the best of product analytics with the power of BI.

    Luke:

    Is NetSpring an evolution of ThoughtSpot or can they both coexist for a customer?

    Abhishek:

    NetSpring is actually completely complementary to ThoughtSpot and both can absolutely coexist.

    ThoughtSpot, as you know, is BI on the warehouse and is powered by NLP (natural language processing) — there's a lot of goodness there. Whereas with NetSpring, the primary emphasis is on product analytics in a warehouse-native architecture.

    There's a lot of depth in the product analytics space and in fact, we're realizing that the first-generation product analytics tools — by being these fully integrated vertical silos — have left a lot of value on the table, which one can harness with a warehouse-native architecture.

    And that's what we are excited about — product analytics with the power of BI.

    Luke:

    What is the one piece of advice you have for startups as they consider taking this new approach to building B2B apps?

    Abhishek:

    My biggest recommendation is to go full warehouse-native — this is our foremost learning.

    By building a fully warehouse-native application, you'll be surprised how much it simplifies your product, your stack, and your messaging — and how much it allows you to focus on your core competitive moat instead of moving data around, dealing with data duplication, ETL pipelines, and so on.

    Just go full warehouse-native. That is the one piece of ad...

    The CDP Battle is Not a Real Battle | Luke Ambrosetti and Glenn Vanderlinden

    Hey there,

    You might already know that we're doubling down on our latest campaign, Let's End The CDP Battle — check out the campaign trailer in case you missed it.

    Our goal here is to clear the air and make the CDP space a little less divided. And we hope to do so by bringing people together who believe that the Composable vs Packaged CDP battle is pointless.

    In today's episode, I was joined by Luke and Glenn, who work with vendors from both camps and have a deep understanding of what it takes for organizations to implement a CDP-like solution successfully.

    Luke was at MessageGears and is now at Snowflake, and Glenn runs Human37, where they implement CDPs of all shapes for companies of all sizes.

    They both offered some valuable insights based on their experience working with CDP vendors as well as CDP customers. And to keep things fun, I also had Luke respond to the following statements with his very personal opinion:

    * The Composable CDP will beat the Packaged CDP

    * Composable CDP is largely a marketing term propagated by Reverse ETL vendors

    * Snowflake and the other cloud providers will make ETL and Reverse ETL obsolete in the next 5 years

    * There's an opportunity for both CDP camps to come together and solve more pressing problems related to data governance and privacy compliance

    All in all, here's what's been established so far:

    Without organizational context — needs, goals, and priorities, as well as resources, culture, and philosophy — one cannot decide which approach between Composable and Packaged is better, cheaper, or faster to implement.

    Moreover, knowing which of the two approaches is more suitable requires organizations to look inwards and assess what might work best for them.

    We had a lot of fun recording this conversation; hope you enjoy watching it. It's only 13 minutes, and I'm sure that, if nothing else, it'll leave you entertained!

    You can tune in on Apple, Spotify, Google, or YouTube, or watch the full thing on LinkedIn and share your thoughts with us.




    The Emergence of Warehouse-native Product Analytics | Vijay Ganesan, CEO at NetSpring

    Understanding Data Clean Rooms | Roopak Gupta, CTO at Habu

    If you haven't heard of Data Clean Rooms, well, you have now!

    In this episode, Roopak Gupta walks us through this new technology, the factors leading to its rise, its impact on privacy compliance and data governance, as well as the top use cases.

    Roopak also offers some tips for those evaluating a clean room solution.

    Don't miss the conversation, especially if you like to stay on top of the innovations in data privacy tech.

    Listen now on Apple, Spotify, Google, or YouTube.

    Key takeaways from this conversation:

    Q. What is a data clean room?

    Roopak (00:36)

    Data clean room technology enables partners to collaborate in a secure manner — where their (first-party) data never leaves the source — and generate insights and outcomes that were not possible earlier (without moving the data out of their environment).

    Think of it as a distributed data platform that allows the connection and analysis of data across multiple platforms and partners.

    I've read about clean rooms being referred to as the Switzerland of data, and I think that makes it easy to understand the core promise of this new technology.

    You bring your data, I bring mine, and the neutral party performs joins behind closed doors (in a room that has been swept clean to prevent any leaks) and delivers data points about our common audience. We both receive additional insights from each other's data while remaining fully compliant with privacy regulations — now that's a win-win!

    Q. How does a clean room solution help adhere to privacy regulations such as the GDPR?

    Roopak (02:44):

    A clean room supports both consumer privacy and data governance via the following (a sketch after this list shows how one such guarantee might be enforced):

    * Provides technical guarantees that data is only used for approved purposes, which is huge for compliance

    * Ensures that data never leaves the data source

    * Provides complete control of the analysis that can be run with built-in approval flows

    * Follows the principles of data minimization

    * Supports data encryption and differential privacy via k-anonymity and noise injection, ensuring that no individual-level data is ever leaked
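
    For a flavor of how one of those guarantees can surface in practice (a generic sketch with invented table names and an assumed threshold, not Habu's actual product), clean rooms commonly enforce k-anonymity by suppressing any aggregate computed over too few individuals:

        -- Audience overlap between two partners; segments matched by fewer
        -- than k = 50 users are suppressed so no individual can be singled out
        SELECT a.segment, COUNT(DISTINCT a.user_id) AS overlap_size
        FROM partner_a.audience a
        JOIN partner_b.audience b ON a.user_id = b.user_id
        GROUP BY a.segment
        HAVING COUNT(DISTINCT a.user_id) >= 50;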

    Q. Can you describe the top use cases for clean rooms?

    Roopak (07:28):

    Besides analysis and activation, the enrichment of data and the data models is a big one.

    Companies don't want to limit themselves to query-based analysis of structured data — they expect clean rooms to support machine learning workloads and protect inputs that go beyond data. We protect proprietary code, and we have customers who enrich their data and their models by enabling secure access to new inputs — something they never had access to earlier.

    Q. What should companies look for when evaluating data clean room solutions?

    Roopak (08:16):

    I would say there are three key things you need to look for in a clean room partner:

    * Interoperability and automation: It should be tech stack-agnostic and support end-to-end automation.

    * Flexibility: You should be able to create and customize use cases based on your specific business needs.

    * Built for privacy, governance, and data decentralization from the ground up: Beware of rebranded legacy offerings that claim to be clean rooms but are not, as they take the same old approach of ingesting your data.

    Additional Resources

    * Connect with Roopak on LinkedIn

    * An in-depth guide on data clean rooms

    * An in-depth guide on first-party behavioral data collection

    Thanks for reading/listening — let's beat the gap! 🤝


