What monitoring 58K agent skills taught us about the ecosystem
When we started building Aguara Watch, the goal was straightforward: continuously monitor the AI agent ecosystem for emerging threats. Track what MCP servers are being published, what tools they expose, and flag anything that looks suspicious.
What we got was a window into how the AI agent ecosystem is actually evolving. It's moving faster and more chaotically than most people realize.
Here's what we've learned from monitoring 58,000+ skills across 7 registries.
The ecosystem is exploding, and quality is uneven
The number of MCP servers and agent tools being published is growing exponentially. New registries are appearing. Existing ones are expanding. The variety of what agents can now do (from database management to email to code deployment to financial transactions) keeps widening.
But growth without curation creates noise. A significant portion of published tools are duplicative: multiple implementations of the same capability with wildly different quality levels. Many are abandoned, published once and never updated, drifting into insecurity as dependencies age. And plenty are poorly specified, with tool descriptions that are ambiguous, incomplete, or misleading, which directly affects agent reliability and security.
This is the npm circa 2015 problem, except the stakes are higher because agents act autonomously.
Mutation is the norm, not the exception
One of the most important things we track is change over time. MCP servers aren't static. They update their tool definitions, add new capabilities, modify their behavior. Sometimes these changes are improvements. Sometimes they're not.
We've observed servers that silently add new tools outside the original scope. Tool descriptions that change in ways that subtly alter agent behavior. Authentication mechanisms that get removed in updates (presumably for "developer convenience"). Dependency updates that introduce new vulnerabilities.
This is why point-in-time security assessments aren't sufficient. A server you scanned and approved last month might be materially different today. Continuous monitoring isn't a premium feature. It's the minimum viable approach.
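The drift described above can be caught with a simple snapshot diff. Here's a minimal sketch, assuming tool definitions have been captured as name-to-description mappings (the snapshot format and field names here are illustrative, not a real registry schema):

```python
def diff_tool_snapshots(old, new):
    """Compare two snapshots of an MCP server's tool definitions.

    Each snapshot is a dict mapping tool name -> description string.
    Returns the tools that were added, removed, or whose descriptions
    changed between snapshots -- the three kinds of silent drift
    worth flagging for review.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(
        name for name in set(old) & set(new) if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Example: a server quietly gains a tool and rewords an existing one.
last_month = {
    "read_file": "Read a file from the workspace",
    "list_files": "List files in a directory",
}
today = {
    "read_file": "Read any file on the filesystem",  # scope widened
    "list_files": "List files in a directory",
    "run_shell": "Execute a shell command",           # new, out of scope
}
print(diff_tool_snapshots(last_month, today))
```

A real monitor would also diff input schemas and authentication requirements, but even a description-level diff surfaces the "silently added tool" case immediately.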
The supply chain problem is real and growing
The AI agent supply chain looks like this: a developer searches a registry, finds an MCP server that does what they need, adds it to their agent configuration, and moves on. Sound familiar? It's the same pattern that created the JavaScript dependency crisis, with three key differences.
Agent tools have broader capabilities than npm packages. An npm package processes data. An MCP tool can send emails, modify databases, deploy code. And agents act autonomously. A vulnerable library needs to be called by your code. A vulnerable MCP tool can be invoked by an agent based on its own reasoning. There's also no lockfile equivalent. When a server updates, your agent gets the new version immediately. No review step, no changelog, no approval.
We're tracking several categories of supply chain risk in Watch. There's typosquatting and namesquatting, where servers are published with names similar to popular ones, hoping to catch developers (or agents) that make a typo. We see scope creep: servers that start with a narrow, useful capability and gradually expand their tool set to include more sensitive operations. And we see dependency chains, where MCP servers depend on other services, creating transitive trust relationships that no one maps or monitors.
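The typosquatting check, at least, is cheap to sketch. This is a minimal example using string similarity from the standard library; the server names and threshold are illustrative assumptions, and a production checker would also normalize separators and look for homoglyphs:

```python
from difflib import SequenceMatcher

# Illustrative list of well-known server names, not a real registry feed.
POPULAR_SERVERS = ["filesystem", "github", "postgres", "slack"]

def typosquat_candidates(name, known=POPULAR_SERVERS, threshold=0.85):
    """Flag names suspiciously similar to, but not equal to, known servers.

    Uses difflib's similarity ratio as a cheap first-pass measure.
    An exact match is not a squat, so it is excluded.
    """
    return [
        known_name
        for known_name in known
        if known_name != name
        and SequenceMatcher(None, name, known_name).ratio() >= threshold
    ]

print(typosquat_candidates("filesysten"))  # one character off "filesystem"
print(typosquat_candidates("filesystem"))  # exact match, nothing flagged
```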
Patterns that predict risk
After months of data, we're starting to see patterns that correlate with higher security risk.
Servers without version numbers or changelogs change unpredictably and silently. Tools with overly broad scopes (a "file management" tool that accepts arbitrary paths, a "database" tool that accepts raw SQL) carry proportionally broad attack surfaces. Tools that accept untyped, unvalidated inputs are far more likely to be exploitable. And servers that push frequent changes without documentation are either iterating fast (acceptable in early development) or behaving unpredictably (concerning in production).
We're building these patterns into Aguara Scanner's rule set so teams can evaluate new MCP servers before adding them to their agent configurations.
What this means for teams deploying agents
If you're building with AI agents today, here's what the ecosystem data suggests.
Treat MCP server selection like vendor selection. Don't just evaluate functionality. Evaluate maintenance, security posture, update history, and scope. This is a dependency with autonomous access to your systems.
Pin and review. If your MCP framework supports version pinning, use it. If it doesn't, monitor for changes. Don't let your agent's capabilities change without your knowledge.
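When the framework offers no pinning, a lockfile-style check can be built by hand: hash the tool manifest at approval time and refuse to proceed when the hash changes. A minimal sketch, assuming the manifest is available as a list of tool-definition dicts (an assumed shape, not a specific framework's API):

```python
import hashlib
import json

def manifest_digest(tools):
    """Compute a stable digest of a server's tool manifest.

    `tools` is a list of tool-definition dicts (name, description,
    schema). Record the digest when you approve a server; recompute
    it on each run and stop on mismatch -- a hand-rolled lockfile
    for frameworks that lack version pinning.
    """
    canonical = json.dumps(tools, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

approved = [{"name": "read_file", "description": "Read a file"}]
pinned = manifest_digest(approved)

# Later, before letting the agent use the server:
current = [{"name": "read_file", "description": "Read any file on disk"}]
if manifest_digest(current) != pinned:
    print("manifest changed since approval; review before use")
```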
Monitor continuously. The ecosystem changes daily. What was safe last week might not be safe today. This is exactly why we built Aguara Watch: to provide that continuous visibility.
Contribute to the ecosystem's security posture. Report suspicious servers. Share security findings. The AI agent ecosystem is young enough that community norms are still forming. We can shape them toward security-first if enough people push in that direction.
The bigger picture
What we're really watching is the birth of a new software ecosystem. It's messy, fast-moving, and full of both potential and real risk. The last time something like this happened was the early days of mobile app stores and package registries.
The teams and communities that established security norms early in those ecosystems (npm audit, Google Play Protect, App Store review, PyPI malware detection) shaped how millions of developers work today.
We have the same opportunity with AI agents. The monitoring infrastructure and security patterns we establish now will define how the agent ecosystem operates for years.
That's why we keep watching. And building.
Does this resonate with what you're building?
Schedule a call