
ISPs

Explore "ISPs" with insightful episodes like "The Facebook Outage, Explained (10/4/21) | Outage Deep Dive", "When BGP Routes Accidentally Get Hijacked: A Lesson In Internet Vulnerability | Outage Deep Dive", "The Akamai DNS Outage and the Case for CDN Redundancy (July 1-23, 2021) | Outage Deep Dive", "BGP Routing Incident Shows Why the Shortest Path Isn’t Always the Chosen Path | Outage Deep Dive", and "Akamai Prolexic Outage Analysis + Takeaways (Week of June 9-17, 2021) | Outage Deep Dive", all from the podcast "The Internet Report"—and more!

    Episodes (67)

    The Facebook Outage, Explained (10/4/21) | Outage Deep Dive
00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:15 Headlines: Today we’re going to do a thorough analysis of the major Facebook outage that took place yesterday, Monday, October 4. I’m joined by Gustavo Ramos, ThousandEyes’ in-house expert on network engineering. ThousandEyes Blog: https://www.thousandeyes.com/blog/facebook-outage-analysis Analysis from Facebook: https://engineering.fb.com/2021/10/05/networking-traffic/outage-details/ 01:17 Under the Hood: We walk through the sequence of events that led to this outage, what went wrong (and which actions may have made the situation worse), and what lessons we can all learn from it. 25:40 Outro: We’ve been on a break for the past several months, as things have been relatively quiet on the Internet front. For the foreseeable future we’ll be a bit more reactive in our episodes—when something major happens, trust that we’ll be here. Questions? Feedback? Have an idea for a guest? Send us an email at internetreport@thousandeyes.com

    When BGP Routes Accidentally Get Hijacked: A Lesson In Internet Vulnerability | Outage Deep Dive
00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:08 Headlines: Today, Mike Hicks (Principal Solutions Analyst, ThousandEyes) and I discuss a recent BGP routing incident that had intermittent impacts on Amazon’s services, including Amazon.com and AWS compute resources, during a five-hour period on July 12. 01:04 Under the Hood: Looking at BGP routing at the time, we can see multiple BGP path changes caused by a service provider erroneously inserting itself into the path for a large number of Amazon routes. Watch this episode to see how the incident led to significant packet loss, resulting in service disruption for some Amazon and AWS users. We also discuss why enterprises need continuous oversight of the paths their traffic takes over the Internet. 17:58 Outro: Questions? Feedback? Have an idea for a guest? Send us an email at internetreport@thousandeyes.com
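The general pattern behind catching this kind of hijack can be sketched in a few lines of Python: given the AS paths observed for a prefix, flag any path that transits a network outside an expected set. This is only an illustrative sketch—the ASNs and paths below are hypothetical stand-ins, not data from the actual incident.

```python
# Sketch: flag BGP AS paths that transit an unexpected ASN.
# ASNs and paths are hypothetical examples (64496 is from the reserved
# documentation range), not data from the incident discussed above.

EXPECTED_ASES = {16509, 3356, 1299, 2914}  # origin plus known upstream transits

def unexpected_hops(as_path: list[int]) -> set[int]:
    """Return ASNs in the path that are neither the origin nor a known transit."""
    return {asn for asn in as_path if asn not in EXPECTED_ASES}

observed_paths = [
    [2914, 3356, 16509],    # normal: known transits leading to the origin
    [2914, 64496, 16509],   # suspicious: 64496 has inserted itself into the path
]

for path in observed_paths:
    odd = unexpected_hops(path)
    if odd:
        print(f"ALERT: path {path} transits unexpected ASN(s) {odd}")
```

In practice the expected set would come from your routing policy and provider contracts, and the observed paths from BGP monitors or public route collectors.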

    The Akamai DNS Outage and the Case for CDN Redundancy (July 1-23, 2021) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined today by Mike Hicks, Principal Solutions Analyst here at ThousandEyes, to cover the outage of Akamai’s DNS service. The outage, which began on July 22nd around 3:38 PM UTC (8:38 AM PT), struck during business hours in Europe and North America, resulting in widespread impacts to applications and services hosted on Akamai infrastructure. The outage itself was short-lived, resolved roughly one hour after it began. In this episode, we examine the customer impact, the relationship between DNS and CDNs, and what enterprises should take away from the incident. We also discuss when it might make sense to invest in DNS or CDN redundancy—and when it is, frankly, overkill. Watch this week’s episode to hear our take, and as always let us know on Twitter what you think.
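The DNS-to-CDN relationship at the heart of this outage is easy to observe yourself: a CDN-fronted hostname typically CNAMEs to the CDN’s edge, so if the CDN’s DNS layer is down, the chain dead-ends even though the origin servers are healthy. A minimal sketch, assuming the third-party dnspython package is installed; www.example.com is a placeholder for any real CDN-fronted site:

```python
# Sketch: walk a hostname's CNAME chain to see which CDN actually serves it.
# Requires dnspython (pip install dnspython). "www.example.com" is a
# placeholder; the chain you see depends on the site you query.

import dns.resolver

def cname_chain(name: str, max_depth: int = 10) -> list[str]:
    chain = [name]
    for _ in range(max_depth):
        try:
            answer = dns.resolver.resolve(chain[-1], "CNAME")
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            break  # reached the terminal name (or nothing resolvable)
        chain.append(str(answer[0].target).rstrip("."))
    return chain

# Akamai-fronted sites often chain through names like "*.edgekey.net" and
# "*.akamaiedge.net"; an outage of that DNS layer breaks resolution entirely.
print(" -> ".join(cname_chain("www.example.com")))
```

CDN redundancy, where it is worth the cost, usually means being able to repoint this chain at a second provider.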

    BGP Routing Incident Shows Why the Shortest Path Isn’t Always the Chosen Path | Outage Deep Dive
00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:13 Headlines: Today, Kemal and I unpack an interesting BGP incident, in which a large-scale route leak briefly altered traffic patterns across the Internet. 00:58 Under the Hood: The incident began on Thursday, June 3rd at around 10:24 UTC and resulted in a significant spike in packet loss that was noticeable in ThousandEyes tests. While the packet loss resolved within the hour (at around 10:48 UTC), we observed some interesting routing changes during this window, as traffic was diverted to a Russian telecom provider that had not previously been in the path. Watch this episode as we explore how this network provider managed to get itself into the routing paths of many major services, and why network visibility is so important for recognizing incidents in which your site may still be reachable but your traffic is being sent through an unexpected network. 20:45 Outro: Questions? Feedback? Have an idea for a guest? Send us an email at internetreport@thousandeyes.com
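The episode title turns on a detail of BGP’s best-path algorithm: AS-path length is only consulted after higher-priority attributes such as local preference, so a route with a longer path can still win. A toy sketch of that ordering (routes and values are illustrative, and real BGP has several further tie-breakers):

```python
# Sketch: why the shortest AS path isn't always the chosen path.
# BGP compares LOCAL_PREF before AS-path length; values here are illustrative.

from dataclasses import dataclass

@dataclass
class Route:
    as_path: list[int]
    local_pref: int = 100  # common default LOCAL_PREF

def best_route(candidates: list[Route]) -> Route:
    # Step 1: highest LOCAL_PREF wins. Step 2: shortest AS path breaks ties.
    # (Real BGP continues with origin type, MED, eBGP vs. iBGP, IGP cost...)
    return max(candidates, key=lambda r: (r.local_pref, -len(r.as_path)))

short_route = Route(as_path=[3356, 64500])                # 2 hops, default pref
long_route  = Route(as_path=[2914, 64496, 64511, 64500],  # 4 hops...
                    local_pref=200)                       # ...but preferred by policy

print(best_route([short_route, long_route]).as_path)  # the longer path is chosen
```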

    Akamai Prolexic Outage Analysis + Takeaways (Week of June 9-17, 2021) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined by ThousandEyes’ BGP expert, Kemal Sanjta, to review the June 16th outage of Prolexic Routed, a DDoS mitigation service operated by Akamai. According to a statement from Akamai, the outage was not due to a DDoS attack or system update, but rather to a routing table limit that was inadvertently exceeded. In this episode, Kemal and I analyze what happened and how Akamai Prolexic customers with automated failover mechanisms in place were able to recover more quickly than those that had to manually switch over to other providers. Watch this episode to learn more about the outage, and how different operational processes resulted in very different service outcomes.
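The automated failover that made the difference here can be approximated with a simple health-check loop. This is a hedged sketch only: the provider names, health-check URLs, and the steer_traffic() hook are hypothetical placeholders for whatever DNS, BGP, or vendor API an enterprise actually uses.

```python
# Sketch: health-check-driven failover between DDoS mitigation providers.
# Endpoints and the steer_traffic() hook are hypothetical placeholders.

import urllib.request

PROVIDERS = [
    ("primary-scrubbing", "https://via-primary.example.com/health"),
    ("backup-scrubbing",  "https://via-backup.example.com/health"),
]

def is_healthy(url: str, timeout: float = 3.0) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers timeouts, connection errors, and HTTP errors
        return False

def steer_traffic(provider: str) -> None:
    # Placeholder: in practice this would re-announce prefixes or update
    # DNS records via your provider's API to shift traffic.
    print(f"steering traffic through {provider}")

def failover_check() -> None:
    for name, health_url in PROVIDERS:
        if is_healthy(health_url):
            steer_traffic(name)
            return
    print("no mitigation provider healthy; page the on-call engineer")

failover_check()
```

Run on a schedule, this kind of loop is the difference between recovering in minutes and waiting on a manual switchover.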

    Fastly’s Outage and Why CDN Redundancy Matters (Week of June 3-8) | Outage Deep Dive
00:00 Welcome: This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. 00:12 Headlines: Today, I’m joined by Hans Ashlock, Director of Technology & Innovation at ThousandEyes, to unpack today’s major outage at Fastly, a popular CDN provider. 3:46 Under the Hood: The widespread outage occurred around 9:50 UTC (about 5:50 AM ET) and, due to the timing, mostly impacted users across Europe and Asia. The outage lasted approximately one hour, until 10:50 UTC, yet residual impacts were felt beyond that. Today’s outage is a good example of the importance of having outside-in visibility not just across your app, but also into your app’s edge and all its dependent services. 39:05 Outro: Questions? Feedback? Have an idea for a guest? Send us an email at internetreport@thousandeyes.com
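That kind of outside-in visibility boils down to probing your app and its edge dependencies from the user’s side of the network, not just checking the origin. A minimal, stdlib-only sketch; all URLs are hypothetical placeholders:

```python
# Sketch: an "outside-in" probe covering the app plus its edge dependencies.
# URLs are placeholders; run this from vantage points outside your own network.

import time
import urllib.request

TARGETS = {
    "app":       "https://www.example.com/",
    "cdn-asset": "https://static.example.com/app.js",  # served via the CDN edge
    "api-edge":  "https://api.example.com/healthz",
}

for name, url in TARGETS.items():
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            elapsed_ms = (time.monotonic() - start) * 1000
            print(f"{name:10s} HTTP {resp.status} in {elapsed_ms:.0f} ms")
    except OSError as err:  # a 503 from the CDN edge lands here as an HTTPError
        print(f"{name:10s} FAILED: {err}")
```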

    Bitcoin Dive Sparks Outage at a Popular Crypto Exchange (Weeks of May 17-June 2) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. I’m joined today by Mike Hicks, Principal Solutions Analyst at ThousandEyes, to cover two recent application-related outages. The first occurred on May 19th around 12:50 UTC at Coinbase—a well-known cryptocurrency exchange. Around the time news broke that the Chinese government would be imposing strict regulations on cryptocurrencies, users attempting to execute transactions were unable to access the application. From the ThousandEyes platform we were able to see a drop in availability around this time, as well as increased load times (which in some cases resulted in timeout errors). The second outage happened on May 20th around 17:35 UTC at Slack—an enterprise collaboration platform. While the outage was resolved within 90 minutes, it occurred during normal US business hours, making it particularly disruptive to users attempting to reach the application. These incidents remind us that applications, much like the underlying networks they run on, can experience outages, and effective troubleshooting requires end-to-end visibility into both.

    DNS and BGP and DDoS Attacks—Oh, My! (May 11-17, 2021) | Outage Deep Dive
00:00 Welcome 00:14 Headlines: DNS and BGP and DDoS Attacks—Oh, My! This week we cover a couple of recent service degradation incidents involving DNS providers. 2:19 Under the Hood: Kemal Sanjta, ThousandEyes’ resident BGP expert, joins us to discuss the May 6th disruption to Neustar’s UltraDNS service, which lasted nearly four hours. We discuss the BGP routing changes we observed during the incident and what they can tell us about the cause of the disruption. We also cover a separate incident involving Quad9, a public recursive resolver service, which the company says was caused by a DDoS attack on May 3rd. 16:19 Expert Spotlight: Michael Batchelder (a.k.a. Binky) is here to discuss the two “Ds” of the Internet—DDoS attacks and the DNS. Questions for Binky? Contact him at binky@thousandeyes.com 31:49 Outro: Questions? Feedback? Have an idea for a guest? Send us an email at internetreport@thousandeyes.com

    Even Magic Can't Stop Internet Outages (April 28-May 3, 2021) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Today, we focus on an interesting outage that impacted Cloudflare Magic Transit, a relatively new offering from the CDN provider that aims to efficiently route and protect its customers’ network traffic. On May 3rd at approximately 3:00 PM PDT (10:00 PM UTC), ThousandEyes vantage points connecting to sites using Magic Transit began to detect significant packet loss at Cloudflare’s network edge, with the loss continuing at varying levels for approximately two hours. While the outage impacted some Magic Transit customers more significantly than others, we also observed mitigation actions by at least one customer to avoid the outage and restore the availability of their service to their users. This outage reminds us that no provider is immune to outages—even cloud and global CDN providers. However, with proactive visibility, you can respond quickly to reduce an outage’s impact on your users. Watch this week’s episode to hear more about the outage from the ThousandEyes perspective.

    Microsoft Teams Outage Highlights: Need to See Beyond App Front Door (Week of April 20-27, 2021) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re joined this week by Hans Ashlock, Director of Technology & Innovation at ThousandEyes, to discuss Tuesday’s Microsoft Teams outage. On Tuesday, April 27th, ThousandEyes tests began to detect an outage affecting the Teams service starting around 3 AM (PT) and lasting approximately 1.5 hours. While the outage occurred in the overnight hours for much of the Americas, its global nature resulted in service disruption for users connecting from Asia and Europe. Transaction views within the ThousandEyes platform show that Microsoft’s authentication service appeared to be available; however, the Teams application was unable to initialize, resulting in error responses. Watch this week’s episode to hear more about what ThousandEyes revealed about the nature of this outage—and what we can all learn from the incident.

    Major BGP Route Leak Disrupts Internet Traffic Globally (April 13-19, 2021) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. On today’s episode, we’re thrilled to be joined by Kemal Sanjta, ThousandEyes’ resident expert on BGP. This week, we’re going under the hood on the April 16th BGP leak at Vodafone Idea in India, which leaked more than 30,000 prefixes, causing a major disruption of Internet traffic to some services. While some news outlets reported that the incident lasted approximately 10 minutes (starting around 1:50 PM UTC, or 9:50 AM ET), we found that it lasted quite a bit longer—more than an hour in the case of some prefixes. Watch this week’s show to see how it impacted a major CDN provider.

    Facebook Outage Analysis; Plus, Why Cross-Layer Visibility Is a Must for App Experience | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. We’re back from a short sabbatical to cover an interesting incident at Facebook: what appears to be an application outage compounded by a series of routing issues. On April 8th, for roughly 40 minutes, the Facebook application became unavailable for users around the globe attempting to connect to the service. Despite the short-lived nature of the outage, we observed prolonged performance degradation even after the application came back online. Suboptimal page load and response times, both of which can impact the user experience, were observed alongside a series of routing changes. This outage reminds us all of the importance of having visibility across network and application layers when troubleshooting and prioritizing issues that impact user experience. Catch this week’s episode to hear about the outage from the ThousandEyes perspective.

    Northern Connectivity

    We do everything online—shopping, school, health care. So what happens when our communities don’t have reliable internet? In Episode 13 of No Little Plans, we look at the rapidly evolving digital divide in Canada’s north.

    The pandemic has made it clear that access to a reliable internet connection is necessary to live, work and engage meaningfully in civic life. But for many remote communities, internet isn’t a reliable resource. Canada has pledged to provide high-speed internet access to its hardest-to-reach areas by 2030. But the way we engage online is quickly evolving, along with our networks—and the chasm between the digital haves and have-nots is only growing wider. 

According to the Canadian Radio-television and Telecommunications Commission, less than half of rural households have the internet speed required for online learning tools. Meanwhile, the majority of Canada’s north depends on satellite internet, which can be unreliable (the service is slow and spotty) and expensive (monthly bills can soar up to $1,200). Bad weather can knock a community offline completely. This presents a huge challenge for northern communities that need to access education, conduct business and stay connected to friends and family.

“It’s the total loss of connection, which can last several hours or even several days. And you never really know when this is going to happen.” —Mark Brazeau

In this episode, host Tokunbo Adegbuyi interviews Andrea Brazeau, a fourth-year student at McGill University’s Faculty of Education. Andrea is originally from Kangiqsualujjuaq, in Nunavik, Quebec, and last fall she wrote an open letter to the premier of Quebec to draw attention to the internet gaps her northern community faces. Unlike some of her classmates, Andrea stayed in Montreal for the fall semester because she knew she wouldn’t be able to access online learning from her home in northern Quebec. “It was difficult because Montreal is the coronavirus hotspot. The one thing I thought about was my mental health—being alone,” she says. “My family is up north and I thought, how am I going to do this? How am I going to make it through the semester?”

In Kangiqsualujjuaq, connectivity is so unreliable that sometimes Andrea’s family loses internet for days at a time. While making the episode, Andrea asks her dad, Mark, to send a voice memo to the podcast team, and they discover that he’s working with a download speed of 91 kilobytes per second—roughly 0.7 megabits per second. For context: the government considers 1 megabit per second insufficient for meaningful online engagement, and Canada’s official broadband target is 50 megabits per second; Mark Brazeau—who works as a school principal—falls far short of both. And, looking beyond the bare minimum of being able to work and learn online, Andrea wonders what else might be possible with better connectivity: “There’s this big Indigenous community online,” she says. “Imagine how much more connected we could be as Indigenous peoples across Canada if we had a high-functioning internet in the north?”

“It opens up a world of opportunity for youth in the north to be able to access the same services that we all take for granted in the south.” —Mark Buell

Later in the episode, we hear from Mark Buell, the regional vice-president for North America at the Internet Society, a non-profit with the goal of ensuring safe and secure internet access for everyone in the world. There’s a lot of discussion around how to improve telecommunications in the north. Low-earth-orbit satellites, or LEOs, are one option that shows promise, delivering up to 50 megabits per second. But, according to Buell, the gold standard of connectivity is fibre-optic internet, which delivers 20 times that speed—roughly a gigabit per second. The problem? Fibre needs physical infrastructure to operate, and if that infrastructure doesn’t exist, it can cost millions of dollars to build from scratch.

    “We tended to rely on the private sector to deploy internet access for the first 20 years of the internet. We did a really good job connecting a lot of people to the internet, but it was based on market forces,” he explains. “Canada has some of the highest internet penetration rates in the world. But that's simply because of our geography. The vast majority of Canadians live within 100 kilometres of the U.S. border. Where the market-based approach fails is in those communities where there may not be a return on investment for the private sector to deploy access.”

Buell speaks about community-led solutions that could help bridge the gap for northern Indigenous populations. He organizes the Indigenous Connectivity Summit, which works to empower Indigenous networkers. After the annual summit, they publish a set of key policy recommendations on how to undertake connectivity projects with Indigenous communities. They argue, “Indigenous voices are critical to conversations about connectivity, especially when the policy outcomes of those conversations will affect Indigenous communities.”

In our interview, Buell describes how Ulukhaktok, a small community in the Northwest Territories, is on its way to building its own internet network. Residents completed the Internet Society’s training program and plan to launch their internet service provider as a non-profit. “Indigenous people around the globe have all suffered from the effects of colonialism,” he says. “By connecting to each other via the internet, you create this global community of support to share knowledge and stories.”


    What Happened With Verizon’s Recent Outage (Week of Jan. 25-Feb. 1, 2021) | Outage Deep Dive
On today’s episode, we discuss the recent outage on Verizon’s network that had widespread impacts on users in the US. ThousandEyes Broadband Agents detected an outage starting around 11:30 AM EST that manifested as packet loss across multiple locations concentrated along Verizon’s backbone on the US East Coast and in the Midwest. While the outage was resolved approximately an hour later, users connecting from the Verizon network across the US experienced varying degrees of impact, depending on the services they were connecting to. This serves as yet another reminder that the context around an outage directly affects the scope of the disruption. Watch this week’s episode to see what this outage looked like from ThousandEyes vantage points.

    DNS security with Quad9

    What are you doing to make the internet a safer and more private place?

    This episode, Robby welcomes John Todd, Executive Director of the non-profit organisation Quad9. Quad9 is a free, recursive DNS solution that partners with threat intelligence providers from all over the world to block websites that try to harm our computers (through things like malware, spyware, botnets, phishing sites, etc.).

John chats with Robby about their DNS system, how they differ from most paid services, and their charter of making the internet a safer and more private place. He also shares some war stories and discusses what effects they’re seeing from COVID-19 in their threat feeds.
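Quad9’s filtering is straightforward to observe from code: it answers NXDOMAIN for domains on its threat-intelligence blocklists, so a name that resolves on a non-filtering resolver but not via 9.9.9.9 is likely blocked. A sketch assuming the dnspython package is installed; example.com is a stand-in for whatever domain you want to test:

```python
# Sketch: compare Quad9 (9.9.9.9) against a non-filtering resolver.
# Quad9 returns NXDOMAIN for blocklisted names. Requires dnspython.

import dns.resolver

def resolves_via(server: str, name: str) -> bool:
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [server]
    try:
        resolver.resolve(name, "A")
        return True
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return False

name = "example.com"  # stand-in: substitute a domain you want to check
if resolves_via("8.8.8.8", name) and not resolves_via("9.9.9.9", name):
    print(f"{name} appears to be on Quad9's blocklist")
else:
    print(f"{name} resolves the same way on both resolvers")
```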

    https://www.quad9.net/

    Technical level: 2/5

    Host: Robby Peralta

    Producer: Paul Jæger

    https://mnemonic.no/podcast 

What Happened with Slack’s Outage; Plus, Talking Cloud Resiliency with Forrest Brazeal of A Cloud Guru (Week of 12/28/20-01/04/21) | Outage Deep Dive
This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. Despite a quiet last couple of weeks on the Internet, we started off our new year with quite the bang. As droves of mildly caffeinated workers returned to their home offices on Monday after the holiday break, many were surprised to find that Slack was not available. On today’s episode, we go under the hood of Slack’s Monday outage to see what went wrong and how it was resolved. We’re also excited to be joined by Forrest Brazeal, a cloud architect, writer, speaker and cartoonist, to talk about everyone’s favorite subject: cloud resiliency. Watch this week’s episode to see the interview and hear our outage analysis. Show links: https://forrestbrazeal.com https://acloudguru.com https://cloudirregular.substack.com https://cloudirregular.substack.com/p/the-cold-reality-of-the-kinesis-incident

    About Monday’s Google Outage; Plus, Talking Holiday Internet Traffic Trends with Fastly (Week of Dec. 7-14) | Outage Deep Dive
In this week's episode of #TheInternetReport... 00:00 Welcome 00:16 Headlines: About Monday’s Google Outage; Plus, Talking Holiday Internet Traffic Trends with Fastly 00:43 Under the Hood: This week, we go under the hood on a recent outage that took down several Google applications, including YouTube, Gmail and Google Calendar. Yesterday morning at approximately 6:50 AM EST, users around the world were unable to access several Google services for a span of around 40 minutes. While short-lived, the outage was notable in that it occurred during business hours in Europe and toward the beginning of the school day on the US east coast—so, people noticed, to put it bluntly. Catch this week’s episode to hear about the official RCA and what we saw from a network perspective. 10:18 Expert Spotlight: We’re thrilled to be joined by David Belson, Senior Director of Data Insights at Fastly, to talk about Internet traffic trends related to holiday online shopping and charitable giving. Cyber Five: what we saw during ecommerce's big week - https://www.fastly.com/blog/cyber-five-what-we-saw-during-ecommerces-big-week Decoding the digital divide - https://www.fastly.com/blog/digital-divide 19:14 Outro: We're taking a break for the rest of 2020, but join us on Jan. 5, 2021 when we kick off the New Year with Forrest Brazeal: https://forrestbrazeal.com https://cloudirregular.substack.com

    Major AWS Outage Highlights Dependencies within Cloud Providers (Week of Nov. 23-30) | Outage Deep Dive
If you’re an AWS customer or rely on services that use AWS, you might have noticed the major, hours-long outage last week. On November 25th, at approximately 5:15 AM PST, users of Kinesis, a real-time processor of streaming data, began to experience service interruptions. The issue was not network-related, and AWS later issued a detailed incident post-mortem identifying an existing operating system configuration limit that was exceeded during a maintenance event that added server capacity. Over the course of the day, Amazon attempted several mitigation measures, but the outage was not completely resolved until approximately 10:23 PM PST. What was notable about this outage was its blast radius, which extended far beyond AWS’s direct customers. Several AWS services that use Kinesis, including Cognito and CloudWatch, were affected, as were users of applications consuming those services (e.g., Ring, iRobot, Adobe). This is a good reminder of the risk of hidden service dependencies, as well as the need for visibility to understand and communicate with customers when something’s gone wrong.
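The blast-radius point generalizes: given a service-dependency graph, one failure impacts everything that transitively depends on the failed component. A sketch with a small hypothetical graph loosely mirroring the services named above (the edges are illustrative, not AWS’s real architecture):

```python
# Sketch: computing the blast radius of a failure in a dependency graph.
# The graph is hypothetical, loosely mirroring the Kinesis incident.

from collections import deque

DEPENDS_ON = {  # service -> services it depends on
    "cognito":    ["kinesis"],
    "cloudwatch": ["kinesis"],
    "ring-app":   ["cognito"],
    "irobot-app": ["cognito"],
    "dashboards": ["cloudwatch"],
}

def blast_radius(failed: str) -> set[str]:
    # Invert the edges, then breadth-first search from the failed service.
    dependents: dict[str, list[str]] = {}
    for svc, deps in DEPENDS_ON.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(svc)
    impacted: set[str] = set()
    queue = deque([failed])
    while queue:
        for svc in dependents.get(queue.popleft(), []):
            if svc not in impacted:
                impacted.add(svc)
                queue.append(svc)
    return impacted

print(blast_radius("kinesis"))  # every service transitively affected
```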

    2020 Election—The Internet Held Strong With a Few App Performance Glitches (Week of Nov. 2-8)
    This is The Internet Report, where we uncover what’s working and what’s breaking on the Internet—and why. This week, we’re pleasantly surprised to say that the network did not break, and there were no major election-night outages to report. However, that’s not to say we didn’t catch performance glitches in the days and weeks around the big night. Watch this week’s episode, as we cover performance issues at a Secretary of State website as well as why CNN’s election map website was so slow to load for many.

    2020 Election Special: Going Under the Hood on State Election Websites (Week of Oct. 19-25)
    We’ve got an election coming up here in the US, and over the last several weeks, we have been analyzing a dozen or so state election websites to take a closer look at how they’re hosted (e.g., do they use a CDN or are they self-hosted?) and to monitor them for outages. In this episode, we discuss the pros and cons of each hosting method and dive into some examples we’ve seen where election websites have had unexpected performance degradation. Catch this week’s episode to go under the hood on the websites powering the upcoming presidential election—and don’t forget to get out there and vote!
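One rough way to answer the episode’s “CDN or self-hosted?” question from the outside is to look for telltale response headers. This is a sketch only: the header-to-CDN map below is illustrative and far from exhaustive, and some CDNs (Akamai in particular) expose little by default.

```python
# Sketch: guess whether a site sits behind a well-known CDN from its
# response headers. The hint table is illustrative, not exhaustive.

import urllib.request

CDN_HINTS = {
    "x-amz-cf-id": "Amazon CloudFront",
    "cf-ray":      "Cloudflare",
    "x-served-by": "Fastly (commonly)",
}

def guess_cdn(url: str) -> str:
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=5) as resp:
        for header, cdn in CDN_HINTS.items():
            if resp.headers.get(header):
                return cdn
    return "no obvious CDN header (possibly self-hosted or a quieter CDN)"

print(guess_cdn("https://www.example.com/"))  # placeholder URL
```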