The API vulnerabilities nobody talks about: excessive data exposure

By: Joviane Jardim

28 October 2025 at 09:31

TLDR: Excessive Data Exposure (leaking internal data via API responses) is a silent, pervasive threat that can be more insidious than a single dramatic flaw like SQL Injection. It amplifies every other API vulnerability (like BOLA) and happens everywhere because developers prioritize speed over explicit data filtering. Fixing it means systematically checking hundreds of endpoints for unneeded PII and sensitive internal data.

After writing about how API security is different from web app security, one idea has stuck with me: APIs can have hundreds of small issues that add up over time, rather than one big dramatic vulnerability.

Let me give you a concrete example of what I mean.

SQL injection is serious. Everyone knows that. But what about APIs that just… hand over sensitive data by design?
I’m not saying it’s worse than SQL injection. But it might be more insidious, because it amplifies every other vulnerability you have.

Excessive data exposure: the silent problem

Common patterns even encourage this, and you can see it everywhere. You have an endpoint like GET /api/users/123 and it returns something like:

{
    "user_id": 123,
    "name": "Joviane",
    "email": "myemail@gmail.com",
    "role": "student"
}

… but also returns

{
    "internal_user_id": 64,
    "full_address": "Secure Street, 403",
    "ssn_last_4": "1234",
    "phone_number": "73737-7373"
}

and a lot of stuff that you weren’t planning to expose. The frontend only displays name and email, but the API is returning EVERYTHING from the database.
You might think, “but only authenticated users can call this endpoint, so it’s fine!”. And yeah, that’s true. But what happens when an attacker compromises ANY user account? When a developer accidentally logs the full response? When a browser extension scrapes the data? When the response gets cached somewhere it shouldn’t be? All of that sensitive data is just sitting there, waiting.

The worst part? This compounds with other vulnerabilities. Say you have a BOLA vulnerability where users can access other users’ data by changing an ID. If your API only returned public fields, the impact would be limited. But if it’s leaking PII, internal IDs, or sensitive business data, now that BOLA just became a massive data breach waiting to happen.
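To make the "only return public fields" idea concrete, here is a minimal sketch of an explicit allowlist applied before a response leaves the backend. The names `PUBLIC_USER_FIELDS` and `to_public_user` are hypothetical, and the row is just the example data from earlier in this post, not a real schema.

```python
# Hypothetical allowlist: the only fields this endpoint is meant to expose.
PUBLIC_USER_FIELDS = ("user_id", "name", "email", "role")


def to_public_user(row: dict) -> dict:
    """Build the response from an explicit allowlist instead of the raw row."""
    return {field: row[field] for field in PUBLIC_USER_FIELDS if field in row}


# Illustrative database row containing both public and internal fields.
db_row = {
    "user_id": 123,
    "name": "Joviane",
    "email": "myemail@gmail.com",
    "role": "student",
    "internal_user_id": 64,
    "full_address": "Secure Street, 403",
    "ssn_last_4": "1234",
    "phone_number": "73737-7373",
}

response_body = to_public_user(db_row)
print(response_body)
```

With something like this in place, a new column added to the table stays private until someone deliberately adds it to the allowlist, which is the opposite of the default described above.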

Why this happens everywhere

Here’s the thing: this isn’t malicious. Usually, it’s convenient. Returning the whole object is faster than filtering fields. ORMs don’t help either: they return everything by default unless you explicitly use projection or select specific fields. Sometimes teams are trying to be clever and “future-proof” their APIs with fields they might need later. And sometimes? It’s just copy-paste. One endpoint did it this way, so all the others followed.
It makes sense from a development velocity perspective. I’ve done this myself when shipping features under pressure. You write a quick endpoint, test that the frontend displays correctly, and ship it. The API is returning 20 fields but the UI only uses 3? Nobody notices because it works.

The real-world impact

Let me give you a concrete example I’ve seen play out in a code review. An e-learning platform had an endpoint GET /api/courses/{courseId}/students that returned student enrollment data. Makes sense for instructors to see their students, right? But it wasn’t just returning names and progress percentages. It was also returning full email addresses, enrollment dates, payment status, quiz attempt histories with timestamps, discussion forum activity metrics, and even device information from where students were accessing the course.

The frontend displayed student names and their course completion percentage. That’s it. And if you were a student? You could only see your own status in the UI. But any enrolled student could hit that endpoint directly, change the course ID, and pull data from other courses. Someone could iterate through course IDs and build a complete database of who’s taking what courses, payment patterns, learning behaviors, and personal contact information. They didn’t need to break anything or find some clever exploit. The API was just handing it all over.
Luckily, this got caught before production. But because the feature worked fine in both the UI and the API, it could easily have slipped through.
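For illustration, here is a rough sketch of how an endpoint like that could combine an authorization check with a trimmed response. Everything in it is an assumption made for the example: the `Forbidden` exception, the `list_course_students` function, and the shape of the `course` and `enrollments` data are hypothetical, not the platform’s actual code.

```python
class Forbidden(Exception):
    """Raised when the requester has no relationship to the course."""


# Hypothetical allowlist: the only fields the UI actually displays.
STUDENT_VIEW_FIELDS = ("name", "progress_percent")


def list_course_students(requester_id, course, enrollments):
    """Return trimmed enrollment rows the requester is allowed to see."""
    rows = [e for e in enrollments if e["course_id"] == course["course_id"]]
    if requester_id == course["instructor_id"]:
        visible = rows  # the instructor sees every student in the course
    else:
        # A student only ever sees their own row, and only if enrolled.
        visible = [e for e in rows if e["student_id"] == requester_id]
        if not visible:
            raise Forbidden("requester is not part of this course")
    return [{f: e[f] for f in STUDENT_VIEW_FIELDS} for e in visible]


course = {"course_id": 7, "instructor_id": 1}
enrollments = [
    {"course_id": 7, "student_id": 10, "name": "Ana", "progress_percent": 80,
     "email": "ana@example.com", "payment_status": "paid"},
    {"course_id": 7, "student_id": 11, "name": "Bo", "progress_percent": 35,
     "email": "bo@example.com", "payment_status": "overdue"},
    {"course_id": 8, "student_id": 12, "name": "Cy", "progress_percent": 50,
     "email": "cy@example.com", "payment_status": "paid"},
]

print(list_course_students(1, course, enrollments))   # instructor view
print(list_course_students(10, course, enrollments))  # student view
```

The two checks are independent on purpose: even if the authorization check were missing, the response would still only contain names and progress percentages.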

And let’s talk about the PII implications here. That leaked student data? We’re talking full names, email addresses, phone numbers, physical addresses, potentially payment information. In a lot of jurisdictions, that’s a GDPR violation or equivalent waiting to happen. Even if the attacker never uses the data maliciously, you’ve just exposed yourself to regulatory fines, mandatory breach notifications, and a PR nightmare. All because the API returned 15 extra fields that nobody actually needed. The business intelligence leak is bad for competitive reasons, sure. But the PII exposure? That’s the kind of thing that gets you on the front page of technical channels for all the wrong reasons.

Another common pattern: pagination endpoints that leak way too much. You call GET /api/students?page=1&limit=100 expecting a list of students, and you get back not just the students, but also their hashed passwords, API keys, internal permissions, last login times, IP addresses… all stuff that should never leave the backend.
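One cheap automated check for this pattern is to walk a response body and flag any key that looks sensitive. The sketch below is only a heuristic under stated assumptions: the hint list `SENSITIVE_KEY_HINTS`, the function `find_sensitive_keys`, and the sample page are all made up for this example, and a real list would need tuning per API.

```python
# Hypothetical substrings that suggest a field should never leave the backend.
SENSITIVE_KEY_HINTS = ("password", "api_key", "ssn", "permission", "ip_address", "last_login")


def find_sensitive_keys(payload, path=""):
    """Recursively collect the paths of keys whose names look sensitive."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}" if path else key
            if any(hint in key.lower() for hint in SENSITIVE_KEY_HINTS):
                hits.append(child)
            hits.extend(find_sensitive_keys(value, child))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_sensitive_keys(item, f"{path}[{index}]"))
    return hits


# Made-up paginated response for illustration.
sample_page = {
    "page": 1,
    "limit": 100,
    "students": [
        {"name": "Ana", "hashed_password": "…", "last_login": "2025-10-01"},
        {"name": "Bo", "api_key": "k-123"},
    ],
}

hits = find_sensitive_keys(sample_page)
print(hits)
```

A check like this could run in integration tests against recorded responses. It only catches fields with suspicious names, so it complements explicit field selection rather than replacing it.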

The scale problem

SQL injection is one vulnerability. You can find it, fix it and you are done. Excessive data exposure? That’s hundreds of endpoints, each leaking a little data, compounding over time.

Which one is easier for an attacker to exploit at scale? The one that exists in every single endpoint. They don’t need to find a clever injection payload. They just need to iterate through your API and collect everything you’re giving them for free. And because it’s “technically working as designed,” it might not even trigger your security monitoring. No failed requests, no suspicious payloads, just normal API calls returning way too much information.

Other “boring” vulnerabilities that actually matter

There’s Mass Assignment – where a user sends {"name": "Deckan", "isAdmin": true} and the API just… accepts both fields. No validation on what should be updatable. Suddenly, regular users are admins. Or Improper Rate Limiting. No limits on password reset? Account takeover via brute force. No limits on OTP verification? Bye-bye 2FA. No limits on search? Congrats, someone just scraped your entire database.
And the classic: Predictable Resource IDs. /api/invoices/1001, /api/invoices/1002… you see where this is going. If the API never checks that the caller actually owns the invoice, an attacker just iterates and collects everything. Classic BOLA.
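Mass Assignment in particular has a simple countermeasure: an allowlist of fields that clients may update. Here is a minimal sketch, assuming a hypothetical `UPDATABLE_FIELDS` set and `apply_profile_update` function. Rejecting unknown fields outright is one design choice; silently dropping them is another.

```python
# Hypothetical allowlist of fields a user may change on their own profile.
UPDATABLE_FIELDS = frozenset({"name", "email"})


def apply_profile_update(user: dict, payload: dict) -> dict:
    """Return an updated copy of user, refusing fields outside the allowlist."""
    rejected = sorted(set(payload) - UPDATABLE_FIELDS)
    if rejected:
        raise ValueError(f"fields not updatable: {', '.join(rejected)}")
    return {**user, **payload}


user = {"name": "Dee", "email": "dee@example.com", "isAdmin": False}

updated = apply_profile_update(user, {"name": "Deckan"})
print(updated)

try:
    apply_profile_update(user, {"name": "Deckan", "isAdmin": True})
except ValueError as error:
    print(error)
```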

What makes this hard

These aren’t the sexy zero-day exploits that make headlines. They’re architectural problems baked into dozens or hundreds of endpoints. Finding them means actually understanding what each endpoint does. You need to know what each endpoint returns, what it needs to return, and what’s just extra baggage. Then multiply that by every endpoint in your API. It’s tedious, but it matters.
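That comparison between what an endpoint returns and what it needs to return can be partly mechanized. The sketch below assumes you keep a per-endpoint contract of expected fields and have one sample record per endpoint. The `EXPECTED_FIELDS` mapping, the `audit` function, and the sample data are hypothetical.

```python
# Hypothetical contract: the fields each endpoint is supposed to return.
EXPECTED_FIELDS = {
    "GET /api/users/{id}": {"user_id", "name", "email", "role"},
    "GET /api/courses/{courseId}/students": {"name", "progress_percent"},
}


def unexpected_fields(endpoint: str, record: dict) -> list:
    """List the fields in a record that the contract does not mention."""
    return sorted(set(record) - EXPECTED_FIELDS[endpoint])


def audit(samples: dict) -> dict:
    """Map each endpoint to its extra fields, skipping clean endpoints."""
    report = {}
    for endpoint, record in samples.items():
        extra = unexpected_fields(endpoint, record)
        if extra:
            report[endpoint] = extra
    return report


# Made-up sample responses, one record per endpoint.
samples = {
    "GET /api/users/{id}": {
        "user_id": 123, "name": "Joviane", "email": "myemail@gmail.com",
        "role": "student", "ssn_last_4": "1234", "full_address": "Secure Street, 403",
    },
    "GET /api/courses/{courseId}/students": {"name": "Ana", "progress_percent": 80},
}

report = audit(samples)
print(report)
```

The tedious part is still writing the contract for every endpoint, but once it exists, drift shows up as a failing check instead of a surprise.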

This is why API security testing is tricky. You’re not hunting for one big vulnerability. You’re checking every single endpoint for these patterns. Data leaking where it shouldn’t, auth checks that are missing, rate limits that don’t exist. These problems are everywhere, and they stack on top of each other. At Detectify, our API scanning handles the tedious part, systematically checking every endpoint for vulnerabilities. That way your team can spend time on the stuff that actually needs human judgment, like business logic vulnerabilities and understanding your specific app’s security context.

How does your team handle this?

And here’s the hard question that we’d love to hear about: when you’re building a new endpoint, how do you make sure developers only return the necessary fields? Code review? Automated checks? Response DTOs that force explicit field selection?

The post The API vulnerabilities nobody talks about: excessive data exposure appeared first on Blog Detectify.

Why API security is different (and why it matters)

Detectify Blog

By: Joviane Jardim

14 October 2025 at 04:23

Two months since I joined Detectify and I’ve realized something: API security is a completely different game from web application security. And honestly? I think a lot of teams don’t see this yet.

APIs are everywhere (but you might not know where)

Let’s look at the modern application. Your mobile app? APIs. Your crucial SaaS integrations? APIs. That complex checkout flow? Probably five or more API calls talking with each other. Modern applications are, fundamentally, just APIs talking to other APIs with a fancy UI layered on top.

But here’s what’s been catching me off guard: many companies don’t even have a complete inventory of their APIs. You’re trying to secure a perimeter you can’t even see the edges of. I have seen:

Shadow APIs: Old endpoints no one remembers deploying.
Zombie APIs: Test/staging endpoints that never got turned off.
Partner APIs: Third-party integrations that extend your attack surface.

How can you secure what you can’t see?
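One rough way to start building that inventory is to compare the paths you have documented against the paths that actually receive traffic. This is a sketch under assumptions, not a full discovery tool: it assumes documented paths use `{param}` placeholders and that observed paths have already been extracted from access logs. The function names and sample data are hypothetical.

```python
import re


def template_to_regex(template: str):
    """Turn a path template like /api/users/{id} into a compiled regex."""
    parts = re.split(r"\{[^/}]+\}", template)
    pattern = "[^/]+".join(re.escape(part) for part in parts)
    return re.compile(f"^{pattern}$")


def undocumented_paths(documented_templates, observed_paths):
    """Return observed paths that match no documented template."""
    matchers = [template_to_regex(t) for t in documented_templates]
    return sorted(
        {path for path in observed_paths if not any(m.match(path) for m in matchers)}
    )


# Made-up inputs: documented templates and paths seen in traffic.
documented = ["/api/users/{id}", "/api/courses/{courseId}/students"]
observed = [
    "/api/users/123",
    "/api/courses/7/students",
    "/api/v1/export",
    "/api/users/123/debug",
]

candidates = undocumented_paths(documented, observed)
print(candidates)
```

Anything in the result is a candidate shadow or zombie endpoint worth a closer look. Endpoints that are deployed but receive no traffic will not show up this way.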

The attack vectors are different

When we talk about web vulnerabilities, usually we’re dealing with XSS, CSRF, clickjacking – stuff that messes with what users see or tricks them into clicking something they shouldn’t. API vulnerabilities are a different beast. We’re talking broken authentication, APIs exposing way too much data, weak rate limiting, injection attacks.

These attacks skip the UI entirely. An attacker doesn’t need to trick a user into clicking something malicious. They just need to understand your API contract and find the weak spots. That’s it. The scary part? They can automate all of this.

Authentication is… well… complicated

Web apps usually use session-based authentication with cookies. It’s pretty standard, most frameworks handle it well, and there are well-known patterns to follow. APIs? That’s where things get messy. OAuth, JWT, API keys, mutual TLS, custom bearer tokens… There are so many different approaches, and each one has its own vulnerability patterns. I’ve been diving deep into the OWASP API Security Top 10, and honestly, the auth issues are wild. Broken Object Level Authorization, Broken Function Level Authorization… these things have scary-long names, but they’re everywhere. Even though everyone knows about them, they still pop up in production all the time.

Why does it matter?

API attacks are growing at an alarming rate for several reasons:

Automation is Easy: APIs return structured data that is easier to parse than HTML, which makes them well suited to automation. This is great for developers, and even better for attackers.
Weak Rate Limiting: Since APIs need to handle high-volume traffic, rate limits are often set loosely or left out entirely.
Documentation as Blueprints: API documentation, while great for developers, also serves as a perfect attack blueprint, showing adversaries exactly where to poke.

This is exactly why we’re constantly enhancing our API Scanning capabilities at Detectify, because understanding these blind spots is the first step to fixing them.

How does your team handle this?

We’d love to hear how other teams are tackling this complex problem.

How do you maintain a complete, up-to-date inventory of ALL your endpoints, including the “zombie” ones?
What’s your strategy for testing authorization at scale when you have hundreds of different endpoints and authentication methods?
How do you approach API versioning and deprecation without accidentally leaving critical security holes in old versions?
What API security challenges keep you up at night?

FAQ

Q: What is the primary difference between web application security and API security?

A: Web application security often focuses on user-facing vulnerabilities like XSS, while API security is concerned with flaws like broken authentication and weak access control that attackers can exploit by directly interacting with the API endpoints, bypassing the UI.

Q: What are Shadow and Zombie APIs?

A: Shadow APIs are old endpoints that are forgotten but still deployed, while Zombie APIs are test or staging endpoints that were never turned off, and both extend the attack surface without the organization’s knowledge.

Q: Why are API attacks easily automated?

A: API attacks are easily automated because APIs return structured data (like JSON or XML) that is much easier for a script or bot to parse and manipulate than the more complex and varied structure of HTML pages.

The post Why API security is different (and why it matters) appeared first on Blog Detectify.