The Infrastructure We Forgot We Built

“A friend sent over an interesting article by Ross Haleluik that opened with ‘Why it’s not just power grid and water, but also tools like Stripe and Twilio that should be defined as critical infrastructure.’”

The point is that some services, as the recent AWS outage demonstrated, cause significant harm when they become unavailable. The definition of critical infrastructure needs to go beyond power, water, or even core ICT networking.

So let’s talk about that outage. On 19 October 2025, an AWS outage (of course it was the DNS) made the Internet wobble. Payments failed. Authentication broke. Delivery systems froze. For a few hours, the digital economy looked a lot less digital and a lot more fragile.

From my perspective, the strangest things failed. I was in the process of boarding a plane to the Internet Identity Workshop. Air traffic control was fine (yay for archaic systems!), but the gate agent couldn’t run the automated bag check tools. The flight purser couldn’t see what catering had been loaded. And my seatmate completely panicked, wondering if it was even safe to fly.

So many things broke. A lot of things didn’t. Everyday people had no idea how to differentiate what mattered. That moment reminded me how fragile modern “resilience” can be.

We used to worry about power grids, water, and transportation—the visible bones of civilization. Now it is APIs, SaaS platforms, and cloud services that keep everything else alive. The outage didn’t just break a few apps; it exposed how invisible dependencies have become the modern equivalents of roads and power lines.

The invisible backbone

Modern business runs on other people’s APIs (I’m looking at you, too, MCP). Stripe handles payments. Twilio delivers authentication codes and customer messages. Okta provides identity. AWS, Google Cloud, and Azure host it all.

These are not niche conveniences. They form the infrastructure of global commerce. When you tap your phone to pay for coffee, when a government portal confirms your tax filing, or when an airline gate agent scans your boarding pass, one of these services is quietly mediating that interaction.

They don’t look like infrastructure. There are no visible grids, transformers, or pipes. They exist as lines of code, data centers, and service contracts: modular, rentable, and ephemeral. Yet they behave like utilities in every meaningful way.

We have replaced public infrastructure with private platforms. The shift brought convenience and innovation, but also a new kind of risk. Infrastructure used to be something we built and maintained. Now it’s something we subscribe to and assume will stay online. We stopped building things to last and started building things to scale by leveraging someone else’s efficiencies. The assumption that the lights will always stay on has not caught up with reality.

The paradox of “resilient” design

Cloud architecture is often described as inherently resilient. Redundancy, failover, and microservices are meant to prevent collapse. But “resilient” in one dimension can mean “fragile” in another. I talked about this in an earlier post, The End of the Global Internet.

Designing for resilience makes sense in a world where the Internet is fragmenting. Companies build multi-region redundancies, on-prem backups, and hybrid clouds to protect themselves from geopolitical risk, supply chain issues, and simple human error. That same design logic—isolating risk, duplicating services, layering complexity—often increases fragility at the systemic level. Resilience is considered important, but efficiency is even better.

Microservices make each node stronger while the overall network becomes more brittle. Every layer of redundancy adds another point of failure and another dependency. A service might survive the loss of one data center but not the failure of a shared authentication API or DNS resolver. Local resilience frequently amplifies global fragility.
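The arithmetic behind that claim can be sketched in a few lines. This is a minimal illustration, not a model of any real provider: it assumes failures are independent (which real outages often violate), the availability figures are made up, and the function names are hypothetical rather than taken from any library.

```python
from math import prod


def parallel_availability(replica_availability: float, replicas: int) -> float:
    """Availability of N redundant replicas: up if at least one replica is up.

    Assumes replica failures are independent of one another.
    """
    return 1 - (1 - replica_availability) ** replicas


def serial_availability(*component_availabilities: float) -> float:
    """Availability of a chain of hard dependencies: up only if every link is up."""
    return prod(component_availabilities)


# Hypothetical numbers, chosen only for illustration.
single_data_center = 0.99
shared_dns = 0.999
shared_auth_api = 0.999

# Local resilience: two data centers instead of one.
two_data_centers = parallel_availability(single_data_center, 2)

# Systemic view: the redundant data centers still sit behind shared services.
end_to_end = serial_availability(two_data_centers, shared_dns, shared_auth_api)

print(f"one data center:      {single_data_center:.4f}")
print(f"two data centers:     {two_data_centers:.4f}")
print(f"with shared DNS/auth: {end_to_end:.4f}")
```

Under these assumptions, the second data center lifts that tier from 0.99 to 0.9999, but the end-to-end figure comes out at roughly 0.9979, lower than either shared dependency on its own. The shared services, not the redundant tier, set the ceiling.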

The AWS outage demonstrated this clearly. A system built for reliability failed because its dependencies were too successful. Interdependence works in both directions. When everyone relies on the same safety net, everyone falls together.

Utility or vendor?

This raises a larger question: should services like AWS, Stripe, or Twilio be treated as critical infrastructure? Haleluik says yes. I’m trying to decide where I stand on this, which is why I’m writing this series of blog posts.

In the United States, the FTC and FCC have debated for decades whether the Internet itself (aka, “broadband”) qualifies. If you aren’t familiar with that quagmire, you might be interested in “FCC vs FTC: A Primer on Broadband Privacy Oversight.”

The arguments for the designation are clear. Without broadband access, the modern economy falters. The arguments against it are equally clear. Labeling something as critical infrastructure introduces regulation, and regulation remains politically unpopular when applied to the Internet.

To put it another way, declaring something critical brings oversight, compliance requirements, and coordination mandates. Avoiding that label preserves flexibility and profit margins but leaves everyone downstream exposed. The result is an uneasy middle ground. These systems operate as essential infrastructure but remain governed by private interest. Their reach exceeds their obligations.

In traditional utilities, physical constraints limited monopoly power. Another way to look at it, though, is that traditional utilities are monopolized by government agencies (ideally) to the benefit of all. The economics of software, however, reward centralization. Success creates scale, and scale discourages competition. Very few can afford to get there (big enough to mask failures) from here (small enough to feel them).

I think we’re seeing quite a bit of magical thinking in the stories companies tell themselves about resilience. When your infrastructure depends on someone else’s business continuity plan, governance becomes an act of faith.

When “public” meets “critical”

While the debate over “critical infrastructure” in the United States often focuses on regulation versus innovation, the rest of the world is having a different but related conversation under the banner of digital public infrastructure (DPI).

Across the G20 and beyond, governments are grappling with whether digital public infrastructure—such as national payment systems, digital identity programs, and data exchange platforms—should be designated as critical information infrastructure (CII). A recent policy brief from a G20 engagement group argues that while both concepts overlap, they represent opposing design instincts: DPI is built for openness, interoperability, and inclusion, whereas CII emphasizes restriction, control, and national security.

That tension is already visible in India, where systems such as the Unified Payments Interface (UPI) have become de facto critical infrastructure. Although UPI has not been formally designated as CII, its scale and centrality to the nation’s payment system have raised similar questions about oversight and control. Its success has increased trust and security expectations, but also heightened concerns about market access for private and foreign participants, as well as the challenges of cross-border interoperability.

The G20 paper calls for ex-ante (early) designation of digital public systems as critical, rather than ex-post (after deployment), to avoid costly retrofits and policy confusion. But the underlying debate remains unresolved: Should public-facing digital infrastructure be treated like essential utilities, or like regulated assets of the state? The answer may depend less on technology and more on who society believes should bear responsibility for keeping the digital lights on. The answer to that won’t be the same everywhere.

Security versus availability

That tension over control doesn’t stop at the policy level. It runs straight through the design decisions companies make every day. When regulation is ambiguous, incentives fill the gap—and the strongest incentive of all is keeping systems online.

Availability has become the real currency of trust. It’s a strange thing, if you think about it logically, but human trust rarely is. (Cryptographic trust is another matter entirely.) Downtime brings backlash, lost revenue, and penalties, so companies do the rational thing: they optimize for uptime above all else. Security comes later. I don’t like it, but I understand why it happens.

Availability wins because it’s visible. Customers notice an outage immediately. They don’t notice an insecure configuration, a quiet policy failure, or a missing audit trail until something goes horribly wrong and the media gets hold of the after-action report.

That visibility gap distorts priorities. When reliability is measured only by uptime, risk grows quietly in the background. You can’t meaningfully secure systems you don’t control, yet most organizations depend on the illusion that control and accountability can be outsourced while reliability remains intact.

And then there are the incentives, a word I probably use too often, but for good reason. The incentives in this landscape reward continuity, not transparency. Revenue flows as long as the service runs, even if it runs insecurely. Yes, fines exist, but they are exceptions, not deterrents.

What counts as “working” is still negotiated privately, even when the consequences are public. Until those definitions include societal resilience, we’ll continue to mistake uptime for stability.

Regulated by dependence

All of this sounds like arguments for the critical infrastructure label, doesn’t it? But remember, formal regulation is only one kind of control. Dependence is another, because dependence acts as a form of unofficial regulation.

Society already treats many tech platforms as critical infrastructure even without saying so. Governments host essential services on AWS. Health systems use commercial clouds for patient records. Banks rely on private payment APIs to move billions each day.

We trust these companies to act in the public interest, not because they must, but because we lack alternatives. Massive failures result in conversations like this post about whether these companies need to be more thoroughly monitored. This is the logic of “too big to fail,” translated into digital infrastructure. Authentication services, data hosting, and communication gateways now carry systemic risk once reserved for banks.

We have built a layer of critical infrastructure that is privately owned but publicly relied upon. It operates by trust, not by oversight, and that is a fragile foundation for a system this essential.

The illusion of choice

Dependence isn’t only a matter of trust. It’s also the result of market design. The systems we treat as infrastructure are built on platforms that appear competitive but converge around the same few providers.

Vendor neutrality looks fine on a procurement slide but falters in practice.

Ask a CIO whether their organization could migrate off a cloud provider; most will say yes. Ask whether they could do it today, and the answer shortens to silence.

APIs, SDKs, and proprietary integrations make switching possible but painful. That pain enforces dependence. It isn’t necessarily malicious, but it keeps theoretical competition safely theoretical.

Lock-in is the quiet tax on convenience.

The market appears to offer many choices, but those choices often lead back to the same infrastructure. A handful of global providers now underpin authentication, messaging, hosting, and payments.
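One way to picture how separate choices lead back to the same infrastructure is to walk a dependency graph. The sketch below is purely illustrative: the dependency map and every vendor name in it are placeholders invented for this example, not claims about real companies.

```python
from collections import Counter

# Hypothetical dependency map: each service lists what it calls directly.
# All names are placeholders, not claims about real vendors.
DEPENDS_ON = {
    "payments_vendor_a": ["cloud_x"],
    "payments_vendor_b": ["auth_vendor", "cloud_y"],
    "messaging_vendor": ["cloud_x"],
    "auth_vendor": ["cloud_x"],
    "cloud_x": [],
    "cloud_y": [],
}


def transitive_dependencies(service: str, graph: dict[str, list[str]]) -> set[str]:
    """Return everything `service` ultimately relies on, directly or indirectly."""
    seen: set[str] = set()
    stack = list(graph.get(service, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, []))
    return seen


# A buyer choosing among three seemingly independent vendors.
choices = ["payments_vendor_a", "payments_vendor_b", "messaging_vendor"]
exposure = Counter(
    dep for choice in choices for dep in transitive_dependencies(choice, DEPENDS_ON)
)

# A dependency shared by every choice is a single point of failure for all of them.
shared_by_all = sorted(dep for dep, count in exposure.items() if count == len(choices))
print(shared_by_all)
```

In this made-up map, payments_vendor_b looks like the diversified option because it hosts on cloud_y, yet it still reaches cloud_x through its authentication vendor. All three choices end up exposed to the same provider.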

When a platform failure can delay paychecks, ground flights, or disrupt hospital systems, we’re no longer talking about preference or pricing. We’re talking about public safety.

The same qualities that once made the Internet adaptable—modular APIs, composable services, seamless integration—have made it fragile in aggregate. We built a dependency chain and called it innovation.

That dependency chain doesn’t just reshape markets. It reshapes how societies determine what constitutes essential. When the same few providers sit beneath every major system, “critical infrastructure” stops being a policy category and starts being a description of reality.

The expanding definition of “critical”

The challenge is that “critical” is just too big a concept. As societies become more technically complex, the definition of critical infrastructure keeps growing.

Power, water, and transport once defined the baseline. Then came telecommunications. Then the Internet. The stack now includes authentication, payments, communication APIs, and identity services. Each layer improves capability while expanding exposure.

Whether or not you believe that these tools should exist, the impact of their failure now extends beyond the control of any single organization. As dependencies multiply, the distinction between convenience and infrastructure fades.

An AWS outage can make it really hard to check in for your flight. A Twilio misconfiguration can interrupt millions of authentication codes. A payment API failure can halt payroll for small businesses. These systems support not only individual companies but also the systems that support those companies.

If we decide that these systems function as critical infrastructure, the next question is what to do about them. Recognition doesn’t come free. It brings oversight, obligations, and trade-offs that neither governments nor providers are fully prepared to bear.

The cost of recognition

Calling an API a utility isn’t about nationalization. It’s about acknowledging that private infrastructure now performs public functions. With that acknowledgment comes responsibility.

Critical infrastructure is what society cannot function without. That definition once focused on physical essentials; now it includes the digital plumbing that supports everything else. Expanding that list has consequences. Every new addition stretches oversight thinner and diffuses accountability.

Resources are finite. Attention is finite. When every system is declared critical, prioritization becomes impossible. The next challenge isn’t deciding whether to protect these dependencies, but how much protection each truly deserves.

I can (and will!) say a lot more on this particular subject. Stay tuned for next week’s post.

Closing thoughts

Ross Haleluik’s observation was an interesting perspective on what utilities look like in modern life. Stripe, Twilio, AWS, and others do not just enable business; they are the business. They have become the unacknowledged utilities of a digital economy.

When I watched Delta’s systems falter during the AWS outage, it was not just an airline glitch. It was a glimpse into the depth of interdependence that defines a modern technical society. If efficiency is the goal, then labeling these systems as critical infrastructure may be the right path. But if resilience is the goal, then perhaps we have other choices to make.

The next outage will not be an exception. It will serve as another reminder that the foundations of the modern world are built on rented infrastructure, and the bill is coming due.

Transcript

[00:00:29]
Hi and welcome back.

I’m recording this episode while dealing with the cold that everyone seems to have right now — so apologies for it being a little bit late. I had hoped the cold would pass before I picked up the microphone again.

But here we are, and today I want to talk about Critical Infrastructure.

Rethinking What Counts as Critical

A friend of mine recently sent over an article by Ross Haleluik that began with an interesting point:

“It’s not just power grids and water systems that count as critical infrastructure, but also tools like Stripe and Twilio.”

His argument was simple yet powerful — some services have become so essential that when they fail, the impact ripples far beyond their own operations. The AWS outage in October proved that vividly.

Before diving deeper, it’s worth defining what we mean by critical infrastructure.
These are systems and assets so vital that their disruption would harm national security, the economy, or public safety.

In the United States, the Cybersecurity and Infrastructure Security Agency (CISA) identifies 16 sectors, including energy, water, transportation, and communications.
Other countries use similar frameworks, but all share the same idea: protect what society cannot function without — ideally with some level of government oversight.

Yet, as Haleluik and others note, this list keeps expanding.

When AWS Went Down

On October 19, 2025, AWS experienced a major outage in one region. A database error cascaded into failures across DNS, payments, and authentication systems.

For a few hours, the digital economy looked far less digital.

I remember it clearly: I was boarding a flight to the Internet Identity Workshop. Air traffic control was fine — archaic but stable. Yet the gate agent couldn’t check bags, and the purser couldn’t confirm catering. My seatmate was visibly anxious about whether it was even safe to fly.

So many systems failed, yet many didn’t. What struck me most was how few people could tell the difference between what mattered and what didn’t.

Invisible Dependencies and Fragile Resilience

This incident made something clear — modern resilience is fragile because we’ve built it atop invisible dependencies that we rarely acknowledge.

Modern businesses run on other people’s APIs:

Stripe handles payments.
Twilio delivers authentication codes.
Okta manages identity.
AWS, Google Cloud, and Azure host nearly everything.

These aren’t niche conveniences anymore — they’re the infrastructure of global commerce. When you tap your phone to pay for coffee or file taxes online, one of these services is working silently in the background.

They may not look like traditional infrastructure — no visible grids or pipes — but they behave like utilities.

In short, we’ve replaced public infrastructure with private platforms.

Innovation and Its Risks

This shift has brought incredible innovation but also new risks.
Infrastructure used to be something we built and maintained. Now it’s something we subscribe to and assume will always work.

We’ve optimized for scale, not longevity.
But our assumptions about resilience haven’t kept pace.

There’s a paradox here:

Cloud architectures are built for redundancy and fault tolerance.
Yet every layer of resilience adds another dependency — and therefore, another potential point of failure.

When a shared DNS resolver or authentication API fails, the entire ecosystem can crumble, no matter how many backups you have.

Interdependence and Oversight

Interdependence cuts both ways. When everyone relies on the same few providers, a failure for one becomes a failure for all.

So the big question arises:
Should services like AWS or Stripe be treated as critical infrastructure?

Haleluik argues yes. I’m not entirely convinced, but I see both sides.

In the U.S., agencies like the FTC and FCC have debated for decades whether the Internet itself qualifies as critical infrastructure.
Supporters argue that broadband is essential to modern life; opponents worry that regulation could slow innovation.

Declaring something “critical” brings oversight and compliance. Avoiding the label keeps flexibility — but also leaves society exposed.

We now have systems that operate like infrastructure yet remain governed by private interests. Their influence extends far beyond their legal obligations.

Digital Public Infrastructure and Global Perspectives

Outside the U.S., this debate continues under the banner of Digital Public Infrastructure (DPI).
Governments across the G20 are exploring whether payment systems, digital identity networks, and data exchange platforms should be classified as Critical Information Infrastructure (CII).

A recent G20 policy brief captured the tension well:

DPI emphasizes openness and inclusion.
CII emphasizes restriction and control.

For example, India’s Unified Payments Interface (UPI) functions as critical infrastructure in practice, even if not in name.

Its success raises key questions:

Who controls access?
How should foreign participants interact?
Can cross-border interoperability be trusted?

The G20’s advice: identify critical systems early, before they become too big to retrofit with proper governance. But again, recognition invites regulation, which can stifle the innovation that made those systems successful.

The Incentive Problem

When regulation lags, incentives take over — and the biggest incentive of all is uptime.

Companies prioritize continuity because:

Downtime is visible.
Security failures often aren’t.

As a result, availability becomes the currency of trust.
Revenue flows as long as systems run — even if they run insecurely.

Until we include societal resilience in our definition of “working,” we’ll keep mistaking uptime for stability.

The Trust Dilemma

Dependency itself already acts as a form of regulation.
Governments host their services on AWS. Hospitals store patient records in the cloud.

We trust these platforms — not because they’re obligated to serve the public interest, but because we have no alternative.

It’s the logic of too big to fail rewritten for the digital era.
We’ve built a layer of infrastructure that’s privately owned yet publicly indispensable — and it’s running on trust, not oversight.

Lock-In and Market Gravity

Dependence isn’t just about trust — it’s about design.

If you ask most CIOs whether they could migrate off a major cloud provider, they’ll say yes.
Ask if they could do it today, and the answer is no.

Proprietary integrations make switching possible but painful. That pain enforces dependence — not maliciously, but through market gravity.

Lock-in is the tax on convenience.
And when a platform failure can delay paychecks, disrupt hospitals, or ground flights, this isn’t about preference — it’s about public safety.

Expanding the Definition of Critical

As technology grows more complex, the concept of critical infrastructure keeps expanding.

Power, water, and transportation were once the baseline.
Then came telecommunications and the Internet.
Now we’re talking about authentication, payments, messaging, and identity services.

Each layer increases capability — but also multiplies exposure.

The real question isn’t whether these systems are critical. They clearly are.
It’s how to manage the responsibilities that come with that recognition.

Responsibility and Resilience

[00:13:10]
Calling an API a utility doesn’t mean nationalizing it. It means acknowledging that private infrastructure now performs public functions, and that recognition carries responsibility.

Yet every new addition to the “critical” list spreads oversight thinner. If everything’s a priority, nothing truly is.

We have to decide which dependencies deserve protection — and which risks we can live with.

Stripe, AWS, and similar services don’t just enable business. They are the business. They’ve become the unacknowledged utilities of our digital economy.

When I saw my airline systems falter during the AWS outage, it wasn’t just a glitch — it was a glimpse into how deeply interwoven our dependencies have become.

If your goal is efficiency, labeling these systems as critical may help create stability through regulation.

But if your goal is resilience, perhaps it’s time to design for flexibility — to accept failure as part of stability, and to plan for it.

The next outage will happen. It won’t be an exception. It will simply remind us that the foundations of the modern world run on rented infrastructure, and that rent always comes due.

[00:15:26]
And that’s it for this week’s episode of The Digital Identity Digest.

If it helped make things a little clearer — or at least more interesting — share it with a friend or colleague and connect with me on LinkedIn @HLFLanagan.

If you enjoyed the show, please subscribe and leave a rating or review on Apple Podcasts or wherever you listen.

You can also find the full written post at sphericalcowconsulting.com.

Stay curious, stay engaged — and let’s keep these conversations going.

Heather Flanagan
