Reading view

There are new articles available, click to refresh the page.

Massive Cloudflare outage was triggered by file that suddenly doubled in size

When a Cloudflare outage disrupted large numbers of websites and online services yesterday, the company initially thought it was hit by a “hyper-scale” DDoS (distributed denial-of-service) attack.

“I worry this is the big botnet flexing,” Cloudflare co-founder and CEO Matthew Prince wrote in an internal chat room yesterday, while he and others discussed whether Cloudflare was being hit by attacks from the prolific Aisuru botnet. But upon further investigation, Cloudflare staff realized the problem had an internal cause: an important file had unexpectedly doubled in size and propagated across the network.

This caused trouble for software that needs to read the file to maintain the Cloudflare bot management system that uses a machine learning model to protect against security threats. Cloudflare’s core CDN, security services, and several other services were affected.

Read full article

Comments

© Getty Images | NurPhoto

Widespread Cloudflare outage blamed on mysterious traffic spike

A Cloudflare outage caused large chunks of the Internet to go dark Tuesday morning, temporarily impacting big platforms like X and ChatGPT.

“A fix has been implemented and we believe the incident is now resolved. We are continuing to monitor for errors to ensure all services are back to normal,” Cloudflare’s status page said. “Some customers may be still experiencing issues logging into or using the Cloudflare dashboard.”

The company initially attributed the widespread outages to “an internal service degradation” and provided updates as it sought a fix over the past two hours.

Read full article

Comments

© NurPhoto / Contributor | NurPhoto

Hackaday Links: November 16, 2025

Hackaday Links Column Banner

We make no claims to be an expert on anything, but we do know that rule number one of working with big, expensive, mission-critical equipment is: Don’t break the big, expensive, mission-critical equipment. Unfortunately, though, that’s just what happened to the Deep Space Network’s 70-meter dish antenna at Goldstone, California. NASA announced the outage this week, but the accident that damaged the dish occurred much earlier, in mid-September. DSS-14, as the antenna is known, is a vital part of the Deep Space Network, which uses huge antennas at three sites (Goldstone, Madrid, and Canberra) to stay in touch with satellites and probes from the Moon to the edge of the solar system. The three sites are located roughly 120 degrees apart on the globe, which gives the network full coverage of the sky regardless of the local time.

Losing the “Mars Antenna,” as DSS-14 is informally known, is a blow to the DSN, a network that was already stretched to the limit of its capabilities, and is likely to be further challenged as the race back to the Moon heats up. As for the cause of the accident, NASA explains that the antenna was “over-rotated, causing stress on the cabling and piping in the center of the structure.” It’s not clear which axis was over-rotated, but based on some specs we found that say the azimuth travel range is ±265 degrees “from wrap center,” we suspect it was the vertical axis in the base. It sounds like the azimuth went past that limit, which wrapped the swags of cables and hoses that run the antenna tightly, causing the damage. We’d have thought there would be a physical stop of some sort to prevent over-rotation, but then again, running a structure that big up against a stop would be very much an “irresistible force, immovable object” scenario. Here’s hoping they can get DSS-14 patched up quickly and back in service.

Speaking of having a bad day on the job, we have to take pity on these Russian engineers for the “demo hell” they went through while revealing the country’s first AI-powered humanoid robot. AIdol, as the bot is known, seemed to struggle from the start, doddering from behind some curtains like a nursing home patient with a couple of nervous-looking fellows flanking it. The bot paused briefly before continuing its drunk-walk, pausing again to deliver a somewhat feeble wave to the crowd before entering the terminal stumble and face-plant part of the demo. The bot’s attendants quickly dragged it away, leaving a pile of parts on the stage while more helpers tried — and failed — to deploy a curtain to hide the scene. It was a pretty sad scene to behold, made worse by the choice of walk-out music (Bill Conti’s iconic “Gonna Fly Now,” better known as the theme from Rocky).

We just noticed that pretty much everything we have to write about this week has a “bad day at work” vibe to it, so to continue on with that theme, witness this absolutely disgusting restoration of a GPU that spent way too many years in a smoker’s house. The card, an Asus 9800GT Matrix, is from 2008, so it may have spent the last 17 years getting caked with tar and nicotine, along with a fair amount of dust and perhaps cat hair, from the look of it. Having spent way too much time cleaning TVs similarly caked with grossness most foul, we couldn’t stomach watching the video of the restoration process, but it’s available in the article if you dare.

And the final entry in our “So you think your job sucks?” roundup, behold the poor saps who have to generate training data for AI-powered domestic robots. The story details the travails of Naveen Kumar, who spends his workday on simple chores such as folding towels, with the twist of doing it with a GoPro strapped to his forehead to capture all the action. The videos are then sent to a U.S. client, who uses them to develop a training model so that humanoid robots can eventually copy the surprisingly complex physical movements needed to perform such a mundane task. Training a robot is all well and good, but how about training them how to move around inside a house made for humans? That’s where it gets really creepy, as an AI startup has partnered with a big real estate company to share video footage captured from those “walk-through” videos real estate agents are so fond of. So if your house has recently been on the market, there’s a non-zero chance that it’s being used to train an army of domestic robots.

And finally, we guess this one fits the rough-day-at-work theme, but only if your job is being a European astronaut, who may someday be chowing down on protein powder made from their own urine. The product is known as Solein — sorry, but have they never seen the movie Soylent Green? — and is made via a gas fermentation process using microbes, electricity, and air. The Earth-based process uses ammonia as a nitrogen source, but in orbit or on long-duration deep-space missions, urea harvested from astronaut pee would be used instead. There’s no word on what Solein tastes like, but from the look of it, and considering the source, we’d be a bit reluctant to dig in.

After another major outage, Alaska Airlines taps Accenture to audit technology systems

An Alaska Airlines plane at Seattle-Tacoma International Airport. (GeekWire File Photo / Kurt Schlosser)

Alaska Airlines said Friday it has hired global consulting firm Accenture to conduct a full audit of its technology systems, part of a broader push to improve reliability after two major IT outages in recent months. The review will include a top-to-bottom examination of the airline’s systems, standards, and processes.

The move follows a major outage last week that grounded flights for eight hours. The Seattle-based company said more than 49,000 passengers had their travel plans disrupted and more than 400 flights were canceled across Alaska Airlines and its subsidiary Horizon Air. The outage was severe enough to postpone the company’s scheduled quarterly earnings call. 

Alaska said the outage was due to a failure at its primary data center and was not related to a cybersecurity incident.

In a new regulatory filing, the airline said it does not plan on rescheduling its third quarter call and will provide updated guidance for its fourth quarter in early December, “once the full financial impact of the recent IT disruptions is understood.”

A separate July outage, caused by a failure of a “critical piece of hardware” at Alaska’s data centers, was expected to reduce earnings by about $0.10 per share, or roughly $12 million.

Alaska said it has boosted IT infrastructure spending by nearly 80% since 2019, investing in redundant data centers and migrating more guest-facing systems to the cloud.

The airline operates a hybrid infrastructure, blending its own data centers with third-party cloud platforms, according to an interview last year with Vikram Baskaran, Alaska’s vice president of IT.

Alaska began migrating workloads to Microsoft Azure around 2015 and continues to maintain its own data centers for critical workloads, according to the interview.

Earlier this week, Alaska had another IT disruption, but this time blamed Microsoft Azure, which itself had an outage that temporarily disrupted operations for customers worldwide. The disruption impacted Alaska’s subsidiary Hawaiian Airlines.

Microsoft’s Azure reports cloud outage, disrupting global customers including Alaska Airlines

Microsoft logo. (GeekWire Photo)

An outage on Microsoft’s Azure cloud services Wednesday morning disrupted operations for customers worldwide including Alaska Airlines, Xbox users and Microsoft 365 subscribers.

The incident came just ahead of Microsoft’s quarterly earnings call today and follows last week’s outage at Amazon Web Services and a failure of Alaska Airlines’ own data center technology.

The latest outage struck at 9 a.m. PT, according to Microsoft, when the system “began experiencing Azure Front Door (AFD) issues resulting in a loss of availability of some services. We suspect that an inadvertent configuration change as the trigger event for this issue.

“We are taking several concurrent actions: Firstly, where we are blocking all changes to the AFD services, this includes customer configuration changes as well. At the same time, we are rolling back our AFD configuration to our last known good state,” the company stated. “As we rollback we want to ensure that the problematic configuration doesn’t re-initiate upon recovery.”

Alaska Airlines posted on X at 10:33 a.m., explaining that the Azure outage was disrupting systems including their website function. Passengers flying on Alaska and Hawaiian airlines who were unable to check-in online were directed to airline agents to receive their boarding passes.

“We apologize for the inconvenience and appreciate your patience as we navigate this issue,” the post said.

Microsoft did not indicate when the outage would be resolved.

“We do not have an ETA for when the rollback will be completed, but we will update this communication within 30 minutes or when we have an update,” the company posted at 10:51 a.m.

UPDATE: At 12:22 p.m. the company shared an update stating it had deployed the “last known good” configuration of the impacted system and customers should start seeing improvements. “[W]e anticipate full mitigation within the next four hours as we continue to recover nodes …. We will provide another update on our progress within two hours, or sooner if warranted,” Microsoft added.

Days after its outage last week, AWS offered a detailed explanation of the event, which was caused by a cascading failure triggered by a rare software bug in one of the company’s most critical systems. The disruption impacted sites and online services around the world.

Alaska Airlines attributed its recent outage to a failure at its primary data center. The company operates a hybrid infrastructure, blending its own data centers with third-party cloud platforms. The incident disrupted travel for more than 49,000 passengers.

Alaska Airlines will ‘diagnose our entire IT infrastructure’ after latest outage disrupts 49,000 passengers

(Alaska Airlines Photo)

Alaska Airlines already tried to shore up its IT infrastructure after an outage in July forced the Seattle-based company to ground flights across the country.

Apparently, it wasn’t enough.

Alaska was hit with another major outage on Thursday, leading to a ground stop that lasted eight hours and resulted in more than 400 flights canceled across Alaska Airlines and its subsidiary Horizon Air.

In a new update Friday afternoon, the company said more than 49,000 passengers had their travel plans disrupted.

The outage was severe enough to postpone the company’s scheduled quarterly earnings call Friday. Shares were down more than 6%.

Alaska said it was still working to normalize operations.

The company has blamed the outage on a failure at its primary data center. It was not due to a cybersecurity incident.

“Following a similar disruption earlier this year, we took action to harden our systems, but this failure underscores the work that remains to be done to ensure system stability,” the company said in its latest update. “We are immediately bringing in outside technical experts to diagnose our entire IT infrastructure to ensure we are as resilient as we need to be. ”

It added: “The reliability of our technology is fundamental to our ability to serve guests and get them to where they need to be.”

Alaska said its July outage was caused by a failure of a “critical piece of hardware” at its data centers.

The airline operates a hybrid infrastructure, blending its own data centers with third-party cloud platforms, according to an interview last year with Vikram Baskaran, Alaska’s vice president of IT.

Alaska began migrating workloads to Microsoft Azure around 2015 and continues to maintain its own data centers for critical workloads, according to the interview.

The company last year partnered with Google Cloud on a generative AI-powered search experience.

The impact of this week’s outage was evident at Sea-Tac Airport on Thursday evening, where long lines wrapped around the concourse and a maze of suitcases piled up in the baggage claim area.

Alaska said Friday it does not have an estimate of the financial impact of the outage. The company’s Hawaiian Airlines subsidiary was not affected.

Alaska said the July outage was expected to reduce earnings by about $0.10 per share, or roughly $12 million.

The company on Thursday reported third quarter revenue of $3.8 billion, up 1.4% year-over-year, while profit dropped 69% to $123 million.

Alaska Airlines cancels 360 flights, says significant IT outage was due to ‘failure’ at a data center

Travelers at Sea-Tac Airport try to find their luggage following a major outage at Alaska Airlines that began Thursday afternoon. (GeekWire Photo / Todd Bishop)

Follow-up: Alaska Airlines will ‘diagnose our entire IT infrastructure’ after latest outage disrupts 49,000 passengers

Alaska Airlines is still working to restore operations following a major outage that forced the Seattle-based company to cancel more than 360 flights on Alaska and its subsidiary Horizon Air.

The outage began Thursday around 3:30 p.m. PT. Alaska grounded planes across the U.S. as it addressed what it described as a “significant IT outage.”

In a statement, Alaska said a “failure occurred at our primary data center.” The outage was not a cybersecurity incident, according to the company.

“The IT outage has impacted several of our key systems that enable us to run various operations, necessitating the implementation of the ground stop to keep our aircraft in position,” Alaska said. “The safety of our flights was never compromised.”

The ground stop was lifted at 11:30 p.m. PT Thursday, but the company is still actively addressing operational impacts that resulted from the disruption.

The company canceled its planned third quarter earnings call on Friday. “We do not yet have an estimate of the financial impact of the operational disruption on our fourth quarter results,” Alaska said in a regulatory filing. The company reported revenue of $3.8 billion, up 1.4% year-over-year, while profit dropped 69% to $123 million.

The impact of the outage was evident at Sea-Tac Airport on Thursday evening, where long lines wrapped around the concourse and a maze of suitcases piled up in the baggage claim area.

The company’s Hawaiian Airlines subsidiary was not affected.

Alaska encouraged customers to check their flight status before heading to the airport, and flagged its flexible travel policy.

It’s Alaska’s second outage in three months. The Seattle-based airline grounded flights after an IT outage in July that lasted about three hours.

How to Monitor Your Email Services

How To Monitor Email Services

Verifying email performance is more than the basic understanding of message flow. Outbound mail in the form of Simple Mail Transfer Protocol (SMTP) and inbound mail through MAPI or Microsoft’s Graph API only parts of email systems to monitor, usually through pings or basic delivery confirmations. Often, once email is moved to Exchange Online, even…

The post How to Monitor Your Email Services appeared first on Exoprise.

Microsoft Outlook Outage on 6th February EX512238

Unable to send, receive, or search email through Exchange Online? Microsoft Outlook suffered an outage for several hours last night, disrupting North America and worldwide email services. Proactive and Early Outage Detection Exoprise sensors first detected and confirmed the outage at 11.03 pm in our London region last night. There was a second outage at…

The post Microsoft Outlook Outage on 6th February EX512238 appeared first on Exoprise.

Microsoft Outage on 25th Jan 2023 MO502273

Microsoft had its corporate earnings call yesterday and posted weaker guidance. But guess what? Several hours later, the tech giant was hit by a networking outage that took down Azure and other services like Teams and Outlook, affecting millions of users globally. Early Detection of Microsoft Teams and Outlook Outage Here’s an email our customers…

The post Microsoft Outage on 25th Jan 2023 MO502273 appeared first on Exoprise.

❌