What monitoring 58K agent skills taught us about the ecosystem
When we started building Aguara Watch, the goal was straightforward: continuously monitor the AI agent ecosystem for emerging threats. Track what MCP servers are being published, what tools they expose, and flag anything that looks suspicious.
What we got was a window into how the AI agent ecosystem is actually evolving. It's moving faster and more chaotically than most people realize.
Here's what we've learned from monitoring 58,000+ skills across 7 registries.
The ecosystem is exploding, and quality is uneven
The number of MCP servers and agent tools being published is growing exponentially. New registries are appearing. Existing ones are expanding. The variety of what agents can now do (from database management to email to code deployment to financial transactions) keeps widening.
But growth without curation creates noise. A significant portion of published tools are duplicative: multiple implementations of the same capability with wildly different quality levels. Many are abandoned, published once and never updated, drifting into insecurity as dependencies age. And plenty are poorly specified, with tool descriptions that are ambiguous, incomplete, or misleading, which directly affects agent reliability and security.
This is the npm circa 2015 problem, except the stakes are higher because agents act autonomously.
Mutation is the norm, not the exception
One of the most important things we track is change over time. MCP servers aren't static. They update their tool definitions, add new capabilities, modify their behavior. Sometimes these changes are improvements. Sometimes they're not.
We've observed servers that silently add new tools outside the original scope. Tool descriptions that change in ways that subtly alter agent behavior. Authentication mechanisms that get removed in updates (presumably for "developer convenience"). Dependency updates that introduce new vulnerabilities.
This is why point-in-time security assessments aren't sufficient. A server you scanned and approved last month might be materially different today. Continuous monitoring isn't a premium feature. It's the minimum viable approach.
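The drift described above can be caught with a simple snapshot diff. Here's a minimal sketch, assuming tool definitions have been captured as name-to-description mappings (the snapshot format and field names here are illustrative, not a real registry schema):

```python
def diff_tool_snapshots(old, new):
    """Compare two snapshots of an MCP server's tool definitions.

    Each snapshot is a dict mapping tool name -> description string.
    Returns the tools that were added, removed, or whose descriptions
    changed between snapshots -- the three kinds of silent drift
    worth flagging for review.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(
        name for name in set(old) & set(new) if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Example: a server quietly gains a tool and rewords an existing one.
last_month = {
    "read_file": "Read a file from the workspace",
    "list_files": "List files in a directory",
}
today = {
    "read_file": "Read any file on the filesystem",  # scope widened
    "list_files": "List files in a directory",
    "run_shell": "Execute a shell command",           # new, out of scope
}
print(diff_tool_snapshots(last_month, today))
```

A real monitor would also diff input schemas and authentication requirements, but even a description-level diff surfaces the "silently added tool" case immediately.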
The supply chain problem is real and growing
The AI agent supply chain looks like this: a developer searches a registry, finds an MCP server that does what they need, adds it to their agent configuration, and moves on. Sound familiar? It's the same pattern that created the JavaScript dependency crisis, with three key differences.
Agent tools have broader capabilities than npm packages. An npm package processes data. An MCP tool can send emails, modify databases, deploy code. And agents act autonomously. A vulnerable library needs to be called by your code. A vulnerable MCP tool can be invoked by an agent based on its own reasoning. There's also no lockfile equivalent. When a server updates, your agent gets the new version immediately. No review step, no changelog, no approval.
We're tracking several categories of supply chain risk in Watch. There's typosquatting and namesquatting, where servers are published with names similar to popular ones, hoping to catch developers (or agents) that make a typo. We see scope creep: servers that start with a narrow, useful capability and gradually expand their tool set to include more sensitive operations. And we see dependency chains, where MCP servers depend on other services, creating transitive trust relationships that no one maps or monitors.
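The typosquatting check, at least, is cheap to sketch. This is a minimal example using string similarity from the standard library; the server names and threshold are illustrative assumptions, and a production checker would also normalize separators and look for homoglyphs:

```python
from difflib import SequenceMatcher

# Illustrative list of well-known server names, not a real registry feed.
POPULAR_SERVERS = ["filesystem", "github", "postgres", "slack"]

def typosquat_candidates(name, known=POPULAR_SERVERS, threshold=0.85):
    """Flag names suspiciously similar to, but not equal to, known servers.

    Uses difflib's similarity ratio as a cheap first-pass measure.
    An exact match is not a squat, so it is excluded.
    """
    return [
        known_name
        for known_name in known
        if known_name != name
        and SequenceMatcher(None, name, known_name).ratio() >= threshold
    ]

print(typosquat_candidates("filesysten"))  # one character off "filesystem"
print(typosquat_candidates("filesystem"))  # exact match, nothing flagged
```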
Patterns that predict risk
After months of data, we're starting to see patterns that correlate with higher security risk.
Servers without version numbers or changelogs change unpredictably and silently. Tools with overly broad scopes (a "file management" tool that accepts arbitrary paths, a "database" tool that accepts raw SQL) carry proportionally broad attack surfaces. Tools that accept untyped, unvalidated inputs are far more likely to be exploitable. And servers that push frequent changes without documentation are either iterating fast (acceptable in early development) or behaving unpredictably (concerning in production).
We're building these patterns into Aguara Scanner's rule set so teams can evaluate new MCP servers before adding them to their agent configurations.
What this means for teams deploying agents
If you're building with AI agents today, here's what the ecosystem data suggests.
Treat MCP server selection like vendor selection. Don't just evaluate functionality. Evaluate maintenance, security posture, update history, and scope. This is a dependency with autonomous access to your systems.
Pin and review. If your MCP framework supports version pinning, use it. If it doesn't, monitor for changes. Don't let your agent's capabilities change without your knowledge.
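When the framework offers no pinning, a lockfile-style check can be built by hand: hash the tool manifest at approval time and refuse to proceed when the hash changes. A minimal sketch, assuming the manifest is available as a list of tool-definition dicts (an assumed shape, not a specific framework's API):

```python
import hashlib
import json

def manifest_digest(tools):
    """Compute a stable digest of a server's tool manifest.

    `tools` is a list of tool-definition dicts (name, description,
    schema). Record the digest when you approve a server; recompute
    it on each run and stop on mismatch -- a hand-rolled lockfile
    for frameworks that lack version pinning.
    """
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

approved = [{"name": "read_file", "description": "Read a file"}]
pinned = manifest_digest(approved)

# Later, before letting the agent use the server:
current = [{"name": "read_file", "description": "Read any file on disk"}]
if manifest_digest(current) != pinned:
    print("manifest changed since approval; review before use")
```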
Monitor continuously. The ecosystem changes daily. What was safe last week might not be safe today. This is exactly why we built Aguara Watch: to provide that continuous visibility.
Contribute to the ecosystem's security posture. Report suspicious servers. Share security findings. The AI agent ecosystem is young enough that community norms are still forming. We can shape them toward security-first if enough people push in that direction.
The bigger picture
What we're really watching is the birth of a new software ecosystem. It's messy, fast-moving, and full of both potential and real risk. The last time something like this happened was the early days of mobile app stores and package registries.
The teams and communities that established security norms early in those ecosystems (npm audit, Google Play Protect, App Store review, PyPI malware detection) shaped how millions of developers work today.
We have the same opportunity with AI agents. The monitoring infrastructure and security patterns we establish now will define how the agent ecosystem operates for years.
That's why we keep watching. And building.
Does this resonate with what you're building?
Schedule a call