
    Are long context windows the end of RAG?

    April 02, 2024
    What is Google Gemini 1.5's context window capability?
    How does Gemini 1.5 impact retrieval augmented generation?
    What are potential drawbacks of large context windows?
    Why is regulation important for emerging technologies?
    How do bad actors affect innovation in technology?

    Podcast Summary

    • Longer context windows in AI: Google's Gemini 1.5 AI model can handle a context window of 700,000 words, enabling less precise filtering and more exploration of entire texts, but may also lead to overwhelming the model with irrelevant information.

      Google's latest AI model, Gemini 1.5, is pushing the boundaries of context windows, potentially making retrieval augmented generation (RAG) obsolete. The model can handle an unprecedented 700,000 words, equivalent to War and Peace 10 times or 3 hours of video or 22 hours of audio. This enormous context window allows for less precise filtering and more exploration of entire texts, such as the whole of Les Mis, without having to narrow down to specific scenes or information. However, this comes with potential drawbacks: too much irrelevant information can overwhelm the model and make it harder for it to focus on the right answer. The paper from Google is a signal of where the market is shifting: towards larger context windows and more exploratory AI models.
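      The trade-off described above can be sketched in a few lines. This is only an illustration, not how Gemini or any particular RAG system works: the 700,000-word limit is the figure quoted in the summary, while the function names, the chunk size, and the word-overlap relevance score are all hypothetical stand-ins chosen for the example.

```python
# Toy sketch: pass the whole text when it fits the context window,
# otherwise fall back to retrieving only the most relevant chunks.
# All names and defaults here are illustrative assumptions.

CONTEXT_LIMIT_WORDS = 700_000  # figure quoted in the summary


def word_count(text: str) -> int:
    return len(text.split())


def split_into_chunks(text: str, chunk_words: int) -> list[str]:
    """Split text into consecutive chunks of at most chunk_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def score(chunk: str, question: str) -> int:
    """Toy relevance: how many distinct question words appear in the chunk."""
    chunk_words = set(chunk.lower().split())
    return sum(1 for w in set(question.lower().split()) if w in chunk_words)


def build_context(
    text: str,
    question: str,
    limit_words: int = CONTEXT_LIMIT_WORDS,
    chunk_words: int = 200,
    top_k: int = 3,
) -> str:
    # Long-context path: no filtering, the model sees everything.
    if word_count(text) <= limit_words:
        return text
    # Retrieval path: keep only the top_k highest-scoring chunks.
    chunks = split_into_chunks(text, chunk_words)
    ranked = sorted(chunks, key=lambda c: score(c, question), reverse=True)
    return "\n\n".join(ranked[:top_k])
```

      With a large limit the function returns the text untouched, which is the "less precise filtering" case; with a small limit it narrows the input first, which is the job retrieval does in a RAG pipeline. A real system would likely use embeddings rather than word overlap for the scoring step.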

    • Machine understanding context: Machines lack the ability to understand context beyond their programming and require human intervention for creativity and exploring new ideas.

      While machines like Google's Bard or the AlphaGo model have the ability to know specific information deeply, they lack the ability to understand context beyond what they've been programmed with. They don't forget or get distracted like humans do, but they also don't possess the creativity and ability to branch out and explore new ideas on their own. The AlphaGo example shows that machines can come up with innovative strategies, but they require human intervention to do so. Additionally, machines are limited to the data they've been given and aren't capable of understanding the world beyond that. So, while they may excel in specific areas, they can't replicate the depth and breadth of human knowledge and understanding.

    • Human-Machine Collaboration: Human experts are crucial in guiding machines and ensuring accurate and effective results through a collaborative process, as machines lack the ability to understand context and intent behind questions.

      While machines can exhibit creativity and think outside the box within their specific context, they don't truly understand or humanize concepts like we do. They generate output based on what they've been taught, and sometimes they may produce unexpected results due to limitations or misunderstandings. Human experts are crucial in guiding machines and ensuring they're on the right path to achieving the desired outcome. The back-and-forth interaction between humans and machines, where humans question and machines provide answers, can lead to more accurate and effective results. However, machines don't have the ability to understand the context or intent behind a question like humans do, and they may miss important nuances or assumptions. Therefore, it's essential to leverage the strengths of both humans and machines, with humans guiding the process and machines providing quick answers, to achieve the best possible results.

    • Databricks LLM competition with OpenAI: Databricks introduces an open language model to compete with OpenAI. It is not entirely open source, but the base and fine-tuned model weights are published under an open license. The open label serves as a marketing strategy to attract users, and the competition drives innovation in the field of language models.

      Databricks, a data platform, is introducing an open large language model (LLM) as a way to compete with market leader OpenAI and its models like GPT-3.5 and GPT-4. This open LLM is not entirely open source, as the specifics about the data used for training and transformations are not publicly available. However, the base and fine-tuned model weights are published under an open license. The open label is used as a marketing strategy to attract users to the Databricks platform, encouraging them to use its software and pay for its ecosystem. The competition in the industry is leading to debates about the true meaning of open source, and it remains to be seen how this will unfold in the next few years. Despite the confusion, the competition is beneficial for consumers as it drives innovation and improvement in the field of language models.

    • Business practices in ML industry: Databricks offers a convenient solution for model deployment and fine-tuning, but users need to pay for this service. Ethical business practices are crucial in the tech industry, and fraudulent activities will not be tolerated and can lead to severe consequences.

      While individuals can run machine learning inference on their preferred cloud provider (Google, AWS, or Azure) or on their own computers with their own GPUs, Databricks offers an easier way to deploy and fine-tune models for those who value ease and support. That convenience comes at a cost. These companies, including Databricks, have models that may not be open source, but they offer consulting services to help users set up their own models and pay later. This business model has proven successful, as people continue to seek AI solutions. The discussion also touched upon the recent sentencing of FTX cofounder Sam Bankman-Fried to 25 years in prison for fraud in the crypto industry. The sentence serves as a reminder that ethical business practices matter in the tech industry and that fraudulent activities will not be tolerated and can have severe consequences. The comparison to the Bernie Madoff sentence emphasizes the significance of this message and its potential impact on the industry's future.

    • Blockchain Regulation: Proper regulation is crucial for the blockchain industry's success despite the negative narrative surrounding bad actors and past technology boom and bust cycles.

      While blockchain technology and cryptocurrencies like Bitcoin and Ethereum have gained significant attention and hype, much of the conversation has been overshadowed by those trying to make a quick profit or engage in scams. This negative narrative can discourage genuine exploration and innovation in the field. The comparison to the boom and bust of AI and LLMs highlights the presence of bad actors in emerging technologies, but also the potential for valuable innovation to surface over time with proper regulation. The Internet's past boom and bust cycle serves as a reminder that not all companies and ideas will succeed, but the underlying technology can still be beneficial and transformative. The next five years are expected to bring significant changes due to LLMs, and proper regulation will be crucial for the industry's success.

    • Blockchain Regulation: The need for regulation separates legitimate uses of blockchain technology from speculative or fraudulent activities, while also allowing for innovation with reduced financial risk.

      The recent legal action against Sam Bankman-Fried serves as a reminder that regulation is necessary to separate legitimate uses of technology like blockchain from speculative or fraudulent activities. The discussion also highlighted the potential benefits of using blockchain technology for legitimate purposes, such as cryptographic signature verification, which can lead to innovation without the high financial risk. The importance of good actors in the tech industry to outshine the bad ones was also emphasized. Additionally, the team answered a question about calculating the decimal max value in SQL Server during the show. Overall, the conversation underscored the importance of regulation and the potential value of emerging technologies when used responsibly.
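      The summary does not record how the SQL Server question was answered on the show. One common reading of "decimal max value" is the largest value a DECIMAL(p, s) column can hold, which is p nines with the last s of them after the decimal point, i.e. (10^p - 1) / 10^s. The helper below is a hypothetical sketch of that arithmetic, assuming SQL Server's documented limits of precision 1 to 38 and scale 0 to p.

```python
from decimal import Decimal


def decimal_max(precision: int, scale: int = 0) -> Decimal:
    """Largest value representable as DECIMAL(precision, scale).

    Hypothetical helper for illustration; it assumes SQL Server's limits
    of 1 <= precision <= 38 and 0 <= scale <= precision.
    """
    if not 1 <= precision <= 38:
        raise ValueError("precision must be between 1 and 38")
    if not 0 <= scale <= precision:
        raise ValueError("scale must be between 0 and precision")

    digits = "9" * precision
    if scale == 0:
        return Decimal(digits)
    # Build the value from a string so no context rounding is applied.
    integer_part = digits[:precision - scale] or "0"
    fraction_part = digits[precision - scale:]
    return Decimal(f"{integer_part}.{fraction_part}")
```

      For example, decimal_max(5, 2) gives 999.99, and decimal_max(38) gives a 38-digit run of nines, the largest whole number a DECIMAL(38, 0) column accepts.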

    Recent Episodes from The Stack Overflow Podcast

    The world’s largest open-source business has plans for enhancing LLMs


    Red Hat Enterprise Linux may be the world’s largest open-source software business. You can dive into the docs here.

    Created by IBM and Red Hat, InstructLab is an open-source project for enhancing LLMs. Learn more here or join the community on GitHub.

    Connect with Scott on LinkedIn.  

    User AffluentOwl earned a Great Question badge by wondering How to force JavaScript to deep copy a string?

    The evolution of full stack engineers


    From her early days coding on a TI-84 calculator, to working as an engineer at IBM, to pivoting over to her new role in DevRel, speaking, and community, Mrina has seen the world of coding from many angles. 

    You can follow her on Twitter here and on LinkedIn here.

    You can learn more about CKEditor here and TinyMCE here.

    Congrats to Stack Overflow user NYI for earning a Great Question badge by asking: 

    How do I convert a bare git repository into a normal one (in-place)?

    The Stack Overflow Podcast
    September 10, 2024

    At scale, anything that could fail definitely will


    Pradeep talks about building at global scale and preparing for inevitable system failures. He talks about extra layers of security, including viewing your own VMs as untrustworthy. And he lays out where he thinks the world of cloud computing is headed as GenAI becomes a bigger piece of many companies’ tech stacks. 

    You can find Pradeep on LinkedIn. He also writes a blog and hosts a podcast over at Oracle First Principles.

    Congrats to Stack Overflow user shantanu, who earned a Great Question badge for asking: 

    Which shell I am using in mac?

     Over 100,000 people have benefited from your curiosity.

    The Stack Overflow Podcast
    September 03, 2024

    Mobile Observability: monitoring performance through cracked screens, old batteries, and crappy Wi-Fi


    You can learn more about Austin on LinkedIn and check out a blog he wrote on building the SDK for OpenTelemetry here.

    You can find Austin at the CNCF Slack community, in the OTel SIG channel, or the client-side SIG channels. The calendar is public on opentelemetry.io. Embrace has its own Slack community to talk all things Embrace or all things mobile observability. You can join that by going to embrace.io as well.

    Congrats to Stack Overflow user Cottentail for earning an Illuminator badge, awarded when a user edits and answers 500 questions, both actions within 12 hours.

    Where does Postgres fit in a world of GenAI and vector databases?


    For the last two years, Postgres has been the most popular database among respondents to our Annual Developer Survey. 

    Timescale is a startup working on an open-source PostgreSQL stack for AI applications. You can follow the company on X and check out their work on GitHub.

    You can learn more about Avthar on his website and on LinkedIn.

    Congrats to Stack Overflow user Haymaker for earning a Great Question badge. They asked: 

    How Can I Override the Default SQLConnection Timeout? Nearly 250,000 other people have been curious about this same question.

    Ryan Dahl explains why Deno had to evolve with version 2.0


    If you’ve never seen it, check out Ryan’s classic talk, 10 Things I Regret About Node.js, which gives a great overview of the reasons he felt compelled to create Deno.

    You can learn more about Ryan on Wikipedia, his website, and his GitHub page.

    To learn more about Deno 2.0, listen to Ryan talk about it here and check out the project’s GitHub page here.

    Congrats to Hugo G, who earned a Great Answer Badge for his input on the following question: 

    How can I declare and use Boolean variables in a shell script?

    Battling ticket bots and untangling taxes at the frontiers of e-commerce


    You can find Ilya on LinkedIn here.

    You can listen to Ilya talk about Commerce Components here, a system he describes as a "modern way to approach your commerce architecture without reducing it to a (false) binary choice between microservices and monoliths."

    As Ilya notes, “there are a lot of interesting implications for runtime and how we're solving it at Shopify. There is a direct bridge there to a performance conversation as well: moving untrusted scripts off the main thread, sandboxing UI extensions, and more.” 

    No badge winner today. Instead, user Kaizen has a question about Shopify that still needs an answer. Maybe you can help! 

    How to Activate Shopify Web Pixel Extension on Production Store?

    Scaling systems to manage the data about the data


    Coalesce is a solution to transform data at scale. 

    You can find Satish on LinkedIn.

    We previously spoke to Satish for a Q&A on the blog: AI is only as good as the data: Q&A with Satish Jayanthi of Coalesce

    We previously covered metadata on the blog: Metadata, not data, is what drags your database down

    Congrats to Lifeboat winner nwinkler for saving this question with a great answer: Docker run hello-world not working

     
