    How developer experience can escape the spreadsheet

    August 02, 2024
    What inspired Anish Dhar to co-found Cortex?
    How did Uber's culture impact service development?
    What challenges did Uber engineers face regarding documentation?
    How does Cortex utilize AI for developer experience?
    What metrics can enhance collaboration in development teams?

    Podcast Summary

    • Uber experience: The challenges of maintaining massive in-house infrastructure at a hyperscale company can inspire external solutions, as seen in the creation of Cortex by engineers with a background at Uber.

      Working at a hyperscale company like Uber can provide insights that lead to the creation of successful external companies. Anish Dhar, co-founder and CEO of Cortex, shared his journey from engineer at Uber to founder. Growing up in the Bay Area surrounded by technology, with a fascination for computer science, led him to pursue a career in engineering. When he joined Uber in 2013, he was drawn to the company's impact and rapid growth. However, he soon saw how hard it was to maintain the massive infrastructure built in-house: services multiplied, and documentation was lost when engineers left the company. This issue was not unique to Uber; his engineer friends at other companies described the same problem. These experiences inspired him to co-found Cortex with Ganesh Datta to help manage and scale complex systems. Their backgrounds and the lessons learned at Uber have been instrumental in the creation and success of Cortex.

    • Internal tools challenges: Creating and maintaining internal tools for managing services and developer productivity can lead to significant challenges due to ad hoc, scattered, and inconsistent solutions. Focusing on a microservice catalog as a first step towards a comprehensive developer portal can help, but the real value comes from getting engineers to care about data quality and maintain it.

      Attempting to build and maintain internal tools for managing services and developer productivity can lead to significant challenges. This was the experience of Cortex's founder, who faced issues with onboarding new engineers, enforcing best practices, and scaling data tracking. The problem is not unique: many companies, including Uber, have tried and failed to create their own solutions. These tools are often ad hoc, scattered across different teams, and lacking consistent documentation or quality standards. As a result, companies eventually realize they need a dedicated solution to manage their services and documentation effectively. This realization led Cortex to focus on building a microservice catalog as the first step towards a comprehensive developer portal. The bigger challenge, however, was getting engineers to care about the quality of the data and keep it current. Solving this second problem is where the real value and ROI come from, because it enables continuous, meaningful improvement of the development process.

    • Microservice ownership and quality: Tools like scorecards and developer portals can help enforce best practices and make it clear what 'good' looks like for engineers, addressing the challenge of managing the complexity of many microservices and ensuring up-to-date information.

      Creating a culture of operational excellence in engineering teams, particularly around reliability and security, is a common challenge for many tech leaders, and it is especially relevant in today's economic climate. One solution is the use of tools like scorecards and developer portals, which help enforce best practices and make it clear what "good" looks like for engineers. Ganesh, who has experience as a CTO outside of Uber, shared his perspective. He described a similar journey at his previous FinTech startup, where the ease of spinning up new microservices led to a proliferation of services that were difficult to track and manage. His experience mirrored Anish's at Uber: both faced the challenge of managing the complexity of many microservices, and both recognized the need for a solution in the microservice ownership and quality space. One of the biggest pain points was getting everyone to keep information up-to-date. Ganesh had to email service owners every quarter to check whether the information was still correct. This experience, along with the observation that companies mature by renaming the cute names on their services to something more usable, led them to start working on a solution.

    • Centralized data platform: A centralized data platform like Cortex saves time and improves productivity by collecting, connecting, and automating data from various systems, defining ownership, and providing a comprehensive view of the engineering ecosystem.

      Having an accurate and up-to-date centralized data platform like Cortex is crucial for effectively managing an engineering ecosystem. The power of Cortex lies in its ability to collect and connect data from various systems, acting as a pointer to where more information can be found. By defining a core data model and automating data collection, Cortex eliminates the need for constant chasing of team members to update information. This not only saves time but also ensures that ownership is clearly defined and easily tracked. With Cortex, engineers can easily access relevant tickets, documentation, and other important information, all connected to the catalog. Additionally, by piping in data from various tools, Cortex provides a comprehensive view of the engineering ecosystem, enhancing visibility and ownership. The incentive for engineers to engage with this information is that it streamlines their work and provides them with the necessary context to make informed decisions, ultimately improving productivity and efficiency.
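      The idea of a core data model that points to where more information lives can be illustrated with a small sketch. This is not Cortex's actual data model or API; the `CatalogEntry` class, its fields, the `collect` helper, and the example URLs are all hypothetical, chosen only to show ownership plus links gathered automatically from several sources.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CatalogEntry:
    """Hypothetical catalog record: one service, one owner, many pointers."""
    name: str
    owner_team: str
    # Maps a kind of resource ("docs", "tickets", ...) to where it lives.
    links: Dict[str, str] = field(default_factory=dict)


def collect(name: str, owner_team: str, *sources: Dict[str, str]) -> CatalogEntry:
    """Merge link data pulled from several tools into a single entry.

    Each source is a plain dict standing in for an automated integration;
    later sources overwrite earlier ones when they share a key.
    """
    entry = CatalogEntry(name=name, owner_team=owner_team)
    for source in sources:
        entry.links.update(source)
    return entry


# Example data standing in for what integrations might return (made up).
from_wiki = {"docs": "https://wiki.example.com/payments"}
from_tracker = {"tickets": "https://tracker.example.com/projects/PAY"}

entry = collect("payments", "team-billing", from_wiki, from_tracker)
print(entry.owner_team, sorted(entry.links))
```

      The entry stores pointers rather than copies of the underlying data, so the source systems stay authoritative and nobody has to be chased to update a spreadsheet by hand.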

    • Software update gamification: Cortex uses gamification and a scorecard system to encourage engineers to adopt best practices, foster competition and collaboration, and maintain high-quality software.

      Promoting software updates and maintaining code quality is not just a technological issue but also a cultural one. Cortex, a tool mentioned in the discussion, addresses both aspects by implementing gamification and a scorecard system to encourage engineers to adopt best practices and maintain high-quality software. This system fosters a sense of competition and collaboration among teams, helping to establish a shared language of what good looks like and raising the overall quality of the codebase. The wisdom of the crowd plays a crucial role in this process by helping to define and iterate on the shared language and guardrails, enabling teams to move in the same direction towards maintaining and improving the software. The discussion also emphasizes the importance of autonomy within guardrails, allowing teams to work independently while ensuring they adhere to the established standards. The hard-earned lessons from experience highlight the need for a balance between autonomy and shared guidance to effectively drive software maintenance and improvement.

    • Uber's decentralized approach: Uber's emphasis on creating novel solutions led to over 4,000 services, but lack of communication and coordination resulted in productivity losses. Prioritizing collaboration and communication through a scorecard can increase productivity and reduce redundant services.

      Uber's in-house bias and culture of creating novel solutions led to an explosion of over 4,000 services. This was driven by a heavy emphasis on building internal tools and a promotion system that rewarded engineers for creating new services. However, this decentralized approach also resulted in a lack of communication and coordination between teams, leading to losses in development productivity. To address this, companies can establish a scorecard that prioritizes collaboration and communication between teams, which can increase development productivity and reduce the number of redundant services. Progress towards this goal can be tracked with metrics such as the number of shared services, the frequency of cross-team collaboration, and the number of services deprecated to save resources. Ultimately, implementing a scorecard requires a cultural shift towards openness and collaboration, but the benefits can be significant in terms of cost savings and improved development productivity.

    • Software Development Scorecards: Scorecards help organizations define high-level metrics for continuous improvement and identify inputs that impact these metrics, allowing for visibility into the current state and areas for improvement.

      Scorecards are a tool used to define a set of criteria for continuous improvement in a culture of software development. These criteria are typically high-level metrics that organizations care about, such as developer productivity or mean time to repair (MTTR). To effectively use scorecards, organizations must first identify the inputs that impact these metrics, such as good on-call practices, monitoring, and alerting. Once these inputs are identified, they can be codified into a scorecard system, which provides visibility into an organization's current state and areas for improvement. The process of creating and refining scorecards involves defining the basics, setting next-level goals, and incentivizing continuous improvement. However, implementing scorecards requires alignment on what metrics matter and ongoing effort to measure and adjust inputs.
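      Codifying inputs into a scorecard can be sketched in a few lines. This is a minimal illustration, not how Cortex implements scorecards: the `Service` fields, the `Rule` class, the "basic" and "next" level labels, and the `evaluate` function are assumptions made for the example, with the rules taken from the inputs named above (on-call practices, monitoring, alerting).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Service:
    """Hypothetical facts about a service, as a catalog might record them."""
    name: str
    owner: Optional[str] = None
    has_oncall_rotation: bool = False
    has_monitoring: bool = False
    has_alerting: bool = False


@dataclass
class Rule:
    """One scorecard criterion: a description, a check, and a maturity level."""
    description: str
    check: Callable[[Service], bool]
    level: str  # "basic" or "next" (assumed labels)


RULES: List[Rule] = [
    Rule("Has a named owner", lambda s: s.owner is not None, "basic"),
    Rule("Has an on-call rotation", lambda s: s.has_oncall_rotation, "basic"),
    Rule("Has monitoring", lambda s: s.has_monitoring, "next"),
    Rule("Has alerting", lambda s: s.has_alerting, "next"),
]


def evaluate(service: Service, rules: List[Rule] = RULES) -> dict:
    """Return the fraction of rules passed and the list of failed rules."""
    failed = [r.description for r in rules if not r.check(service)]
    score = (len(rules) - len(failed)) / len(rules)
    return {"service": service.name, "score": score, "failed": failed}


report = evaluate(Service("payments", owner="team-billing", has_oncall_rotation=True))
print(report)
```

      Here the service passes both basic rules and fails both next-level rules, so it scores 0.5, and the `failed` list tells the owning team what to work on next.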

    • Cortex AI strategy: Cortex AI simplifies complex tasks, like identifying info on deployed services and their status, and encourages collaboration within the tech community.

      Cortex is using AI to simplify information discovery within engineering organizations. The product lets users ask questions about their data and introspect it, making it easier to identify where things are deployed and what their current status is. Cortex's AI strategy focuses on making complex tasks, like identifying who to talk to about a specific service or component, simpler and more efficient. Additionally, the team at Cortex values knowledge sharing and encourages collaboration within the tech community. They can be found at various conferences and online platforms, and are always open to hiring new team members to join their mission of simplifying developer experience and service complexity.

    Recent Episodes from The Stack Overflow Podcast

    The world’s largest open-source business has plans for enhancing LLMs

    Red Hat Enterprise Linux may be the world’s largest open-source software business. You can dive into the docs here.

    Created by IBM and Red Hat, InstructLab is an open-source project for enhancing LLMs. Learn more here or join the community on GitHub.

    Connect with Scott on LinkedIn.  

    User AffluentOwl earned a Great Question badge by wondering How to force JavaScript to deep copy a string?

    The evolution of full stack engineers

    From her early days coding on a TI-84 calculator, to working as an engineer at IBM, to pivoting over to her new role in DevRel, speaking, and community, Mrina has seen the world of coding from many angles. 

    You can follow her on Twitter here and on LinkedIn here.

    You can learn more about CK editor here and TinyMCE here.

    Congrats to Stack Overflow user NYI for earning a great question badge by asking: 

    How do I convert a bare git repository into a normal one (in-place)?

    The Stack Overflow Podcast
    September 10, 2024

    At scale, anything that could fail definitely will

    Pradeep talks about building at global scale and preparing for inevitable system failures. He talks about extra layers of security, including viewing your own VMs as untrustworthy. And he lays out where he thinks the world of cloud computing is headed as GenAI becomes a bigger piece of many companies' tech stacks.

    You can find Pradeep on LinkedIn. He also writes a blog and hosts a podcast over at Oracle First Principles.

    Congrats to Stack Overflow user shantanu, who earned a Great Question badge for asking: 

    Which shell I am using in mac?

     Over 100,000 people have benefited from your curiosity.

    The Stack Overflow Podcast
    September 03, 2024

    Mobile Observability: monitoring performance through cracked screens, old batteries, and crappy Wi-Fi

    You can learn more about Austin on LinkedIn and check out a blog he wrote on building the SDK for OpenTelemetry here.

    You can find Austin at the CNCF Slack community, in the OTel SIG channel, or the client-side SIG channels. The calendar is public on opentelemetry.io. Embrace has its own Slack community to talk all things Embrace or all things mobile observability. You can join that by going to embrace.io as well.

    Congrats to Stack Overflow user Cottentail for earning an Illuminator badge, awarded when a user edits and answers 500 questions, both actions within 12 hours.

    Where does Postgres fit in a world of GenAI and vector databases?

    For the last two years, Postgres has been the most popular database among respondents to our Annual Developer Survey. 

    Timescale is a startup working on an open-source PostgreSQL stack for AI applications. You can follow the company on X and check out their work on GitHub.

    You can learn more about Avthar on his website and on LinkedIn.

    Congrats to Stack Overflow user Haymaker for earning a Great Question badge. They asked: 

    How Can I Override the Default SQLConnection Timeout? Nearly 250,000 other people have been curious about this same question.

    Ryan Dahl explains why Deno had to evolve with version 2.0

    If you’ve never seen it, check out Ryan’s classic talk, 10 Things I Regret About Node.js, which gives a great overview of the reasons he felt compelled to create Deno.

    You can learn more about Ryan on Wikipedia, his website, and his GitHub page.

    To learn more about Deno 2.0, listen to Ryan talk about it here and check out the project’s GitHub page here.

    Congrats to Hugo G, who earned a Great Answer Badge for his input on the following question: 

    How can I declare and use Boolean variables in a shell script?

    Battling ticket bots and untangling taxes at the frontiers of e-commerce

    You can find Ilya on LinkedIn here.

    You can listen to Ilya talk about Commerce Components here, a system he describes as a "modern way to approach your commerce architecture without reducing it to a (false) binary choice between microservices and monoliths."

    As Ilya notes, “there are a lot of interesting implications for runtime and how we're solving it at Shopify. There is a direct bridge there to a performance conversation as well: moving untrusted scripts off the main thread, sandboxing UI extensions, and more.” 

    No badge winner today. Instead, user Kaizen has a question about Shopify that still needs an answer. Maybe you can help! 

    How to Activate Shopify Web Pixel Extension on Production Store?

    Scaling systems to manage the data about the data

    Coalesce is a solution to transform data at scale. 

    You can find Satish on LinkedIn.

    We previously spoke to Satish for a Q&A on the blog: AI is only as good as the data: Q&A with Satish Jayanthi of Coalesce

    We previously covered metadata on the blog: Metadata, not data, is what drags your database down

    Congrats to Lifeboat winner nwinkler for saving this question with a great answer: Docker run hello-world not working