Introducing Plural Sentinels: Automated Infrastructure Validation for Kubernetes

Every platform engineer knows the feeling. You need to upgrade a fleet of Kubernetes clusters, and what follows is a familiar ritual of hope and anxiety: an engineer makes the change in a development environment, then begins the painstaking work of manual validation.

They’ll squint at Datadog dashboards, tail the logs of various Kubernetes operators, and meticulously work through a Confluence checklist of kubectl commands. This manual, gut-feel approach is slow, prone to human error, and fundamentally unscalable. As teams grow, these checklists are often skipped, replaced by the hope that nothing breaks.

This manual validation bottleneck is a direct threat to velocity, reliability, and compliance. To truly scale operations, teams need to move beyond the checklist and adopt a system that provides packaged, automated, and auditable infrastructure integration testing.

That’s why today, we’re excited to announce the launch of Plural Sentinels.

What are Plural Sentinels?

A Sentinel is a packaged and repeatable way of performing a deep infrastructure integration test. We built Sentinels to replace the manual, ad-hoc validation that happens after any significant change to a complex environment like a Kubernetes cluster. Instead of relying on a human to poke around and confirm that things seem healthy, a Sentinel runs a predefined suite of checks to prove it.

This isn’t just about running a simple health check. A Sentinel is a composite of different validation methods that work together to provide a comprehensive picture of cluster health. It combines AI-assisted analysis with full-blown integration tests to confirm that everything, from the container scheduling layer to the storage and network layers, is functioning as expected.

How Sentinels Work: The Three Pillars of Validation

We designed Sentinels to be both powerful and flexible, built on three distinct types of checks that can be combined to create a robust validation workflow.

1. AI-Assisted Kubernetes Resource Validation

The first and fastest check is an AI-assisted deep dive into your Kubernetes resources. After a change, you can tell a Sentinel to inspect a specific deployment, StatefulSet, or other resource. The system then does what a senior SRE would do: it investigates everything associated with that resource, analyzes its logs, examines related non-Kubernetes components, and presents its findings to an AI to deliver a simple pass or fail grade.

This process is guided by customizable rule files written in natural language. You can provide a set of rules that tells the AI to “ignore this known error” or “consider this a failure.” This allows you to codify the tribal knowledge of your team and apply it consistently every time, replacing the “human squinting effort” with fast, intelligent analysis.

2. Intelligent Log Tailing

While a resource check is great for point-in-time analysis, some issues only reveal themselves over time. The second type of check is an intelligent log tail. You can configure a Sentinel to watch the logs in specific namespaces for a set period, like one full minute after a change.

It doesn’t just dump all the logs. It filters for error logs and applies the same natural language rule files to analyze the output. This is perfect for catching intermittent issues or slow-starting failures that might not be immediately apparent. It’s the automated equivalent of an engineer running `kubectl logs -f` and waiting to see if anything bad happens.
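
The filtering step can be sketched roughly as follows. This is a minimal illustration, not how Sentinels are implemented: regular expressions stand in for the natural-language rules, the severity markers are an assumption about log format, and the time-boxed tailing itself is left out. The function name `collect_error_lines` and the sample log lines are hypothetical.

```python
import re
from typing import Iterable, List

# Assumed severity markers; real log formats vary by workload.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL|PANIC)\b", re.IGNORECASE)


def collect_error_lines(lines: Iterable[str], ignore_patterns: List[str]) -> List[str]:
    """Keep error-level lines that no ignore pattern matches."""
    ignores = [re.compile(p, re.IGNORECASE) for p in ignore_patterns]
    kept = []
    for line in lines:
        if not ERROR_PATTERN.search(line):
            continue  # not an error-level line
        if any(rx.search(line) for rx in ignores):
            continue  # known, accepted noise (the "ignore this known error" case)
        kept.append(line.rstrip("\n"))
    return kept


# Hypothetical log lines gathered during the watch window.
sample_logs = [
    "2024-01-01T00:00:01Z INFO reconciler started",
    "2024-01-01T00:00:02Z ERROR failed to mount volume pvc-123",
    "2024-01-01T00:00:03Z error: leader election lost (retrying)",
    "2024-01-01T00:00:04Z WARN slow response from apiserver",
]
suspicious = collect_error_lines(sample_logs, ignore_patterns=[r"leader election lost"])
print(suspicious)
```

Only the lines that survive both filters would need any further analysis, which keeps the expensive step focused on a small, relevant slice of the output.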

3. Deep Integration Testing

The final and most important check runs full-blown integration tests. This is where you go beyond observing the cluster and start actively exercising it. We provide a default suite of tests that perform fundamental Kubernetes operations to ensure the cluster is fully functional, from starting a StatefulSet and binding storage volumes to confirming network connectivity between pods.

You can also bring your own tests. The runner can execute custom test suites, such as those written in frameworks like Terratest, allowing you to integrate your existing validation logic directly into the Sentinel. This provides a deep, customizable way to exercise the Kubernetes API on demand and confirm that the cluster is not just running, but fully operational.
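
To make the idea of a test suite that yields a single verdict concrete, here is a small sketch of a runner that executes named checks and aggregates the results. It is an illustration under assumptions, not the Sentinel runner: the check functions are placeholders that would call the Kubernetes API in practice, and the names `CheckResult` and `run_checks` are hypothetical. It is written in Python for brevity, whereas Terratest suites are written in Go.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str = ""


def run_checks(checks: Dict[str, Callable[[], None]]) -> List[CheckResult]:
    """Run each check; a raised exception marks that check as failed."""
    results = []
    for name, check in checks.items():
        try:
            check()
            results.append(CheckResult(name, True))
        except Exception as exc:  # keep going so one failure doesn't hide others
            results.append(CheckResult(name, False, str(exc)))
    return results


# Placeholder checks; real ones would create resources and wait on them.
def statefulset_becomes_ready() -> None:
    pass  # pretend the StatefulSet reached Ready


def volume_binds() -> None:
    raise RuntimeError("PVC still Pending after timeout")


results = run_checks({
    "statefulset-ready": statefulset_becomes_ready,
    "volume-binds": volume_binds,
})
suite_passed = all(r.passed for r in results)
for r in results:
    print(r.name, "PASS" if r.passed else f"FAIL: {r.detail}")
```

Running every check even after a failure gives a fuller picture of what broke, which matters when the point of the run is to replace a human checklist.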

Designed for Your Workflow: GitOps Native and API-First

A new tool is only useful if it fits into your existing workflow. We designed Sentinels to be a first-class citizen in a modern GitOps environment. A Sentinel is not a feature that only exists in a UI; it is completely API-driven and GitOps-native.

You define your entire Sentinel (its checks, its rules, its integration tests) as a Kubernetes custom resource, backed by a Custom Resource Definition (CRD), in a Git repository. This means your infrastructure tests are versioned, auditable, and managed with the same rigor as your application code.
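
As a rough picture of what such a definition might contain, the sketch below builds a manifest as a Python dictionary and serializes it to JSON, which `kubectl` accepts alongside YAML. Every field name, the API group, and the resource names are illustrative assumptions and not Plural's actual schema; consult the product documentation for the real one.

```python
import json

# Every field below is illustrative, NOT Plural's actual Sentinel schema.
sentinel = {
    "apiVersion": "example.com/v1alpha1",  # placeholder API group
    "kind": "Sentinel",
    "metadata": {"name": "post-upgrade-validation"},
    "spec": {
        "checks": [
            {
                # AI-assisted resource check with a natural-language rule
                "type": "resource",
                "target": {"kind": "StatefulSet", "namespace": "data", "name": "postgres"},
                "rules": ["Ignore transient leader-election errors."],
            },
            {
                # Log tail over selected namespaces for a fixed window
                "type": "log",
                "namespaces": ["kube-system"],
                "durationSeconds": 60,
            },
            {
                # Integration test suite
                "type": "integration-test",
                "suite": "default",
            },
        ]
    },
}

manifest = json.dumps(sentinel, indent=2)
print(manifest)
```

Whatever the real schema looks like, the point is the same: the three kinds of checks live in one declarative document that can be reviewed and versioned like any other manifest.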

Because everything is exposed through the API, a Sentinel is not just a tool you click through in a UI; it’s a composable building block for your automation pipelines. You can trigger a Sentinel run from a GitHub Actions workflow, a GitLab pipeline, or any other CI/CD system. This allows you to integrate these deep validation checks directly into your existing change management processes, making automated, comprehensive testing a seamless part of every infrastructure change.
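
A pipeline step that gates on a Sentinel run might look something like the sketch below. To avoid guessing at endpoints, the API calls are passed in as plain callables; `trigger`, `get_status`, the status strings, and the exit codes are all assumptions made for illustration.

```python
import time
from typing import Callable

TERMINAL = {"passed", "failed"}  # assumed status names


def gate_on_sentinel(
    trigger: Callable[[], str],
    get_status: Callable[[str], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
) -> int:
    """Start a run, poll until it finishes, and return a process exit code."""
    run_id = trigger()
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = get_status(run_id)
        if status in TERMINAL:
            return 0 if status == "passed" else 1
        time.sleep(poll_seconds)
    return 2  # timed out; CI should treat this as a failure


# Fakes standing in for real API calls.
statuses = iter(["pending", "running", "passed"])
exit_code = gate_on_sentinel(
    trigger=lambda: "run-1",
    get_status=lambda run_id: next(statuses),
    poll_seconds=0.0,
)
print(exit_code)
```

In a real job, the script would end with `sys.exit(exit_code)` so that a failed or timed-out run blocks the rest of the pipeline.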

Automated Compliance and Audit Trails

This API-driven, auditable approach has another massive benefit, particularly for enterprise teams: it solves a major compliance headache. In many organizations, change management procedures require an attestation that testing has occurred after every significant infrastructure change.

Manually producing these reports is tedious and error-prone. A Sentinel automates this entirely. You can hit the API, get the structured test output in a format like JUnit, and automatically forward it to whatever compliance system your organization uses.
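
Once the results are in JUnit-style XML, turning them into a compact record for a compliance system takes only a few lines. The sketch below uses Python's standard `xml.etree.ElementTree`; the sample report and the `summarize_junit` helper are hypothetical, and the exact attributes a Sentinel emits may differ.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict:
    """Count passed, failed, and skipped test cases in a JUnit-style report."""
    root = ET.fromstring(xml_text)
    # The root may be <testsuites> or a single <testsuite>; iter() handles both.
    cases = list(root.iter("testcase"))
    failed = [
        c.get("name")
        for c in cases
        if c.find("failure") is not None or c.find("error") is not None
    ]
    skipped = [c.get("name") for c in cases if c.find("skipped") is not None]
    return {
        "total": len(cases),
        "passed": len(cases) - len(failed) - len(skipped),
        "failed": failed,
        "skipped": skipped,
    }


# Hypothetical report; a real one would come from the API response.
sample_report = """
<testsuite name="sentinel" tests="3">
  <testcase name="statefulset-ready"/>
  <testcase name="volume-binds"><failure message="PVC Pending"/></testcase>
  <testcase name="pod-to-pod-network"/>
</testsuite>
""".strip()

summary = summarize_junit(sample_report)
print(summary)
```

A summary like this can be attached to a change ticket as the attestation, with the full XML kept alongside it for auditors who want the detail.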

This also solves the data retention problem. When an auditor asks, “Where’s the test from four months ago?”, the answer is no longer a frantic search through old Slack messages or Confluence pages. The test results are stored, versioned, and easily retrievable via the API.

From Manual Toil to Automated Confidence

The practice of manually validating complex infrastructure is a relic of a past era. It doesn’t scale, it’s not reliable, and it burns out your best people.

We built Plural Sentinels to break the bottleneck. By packaging a combination of AI-assisted analysis and deep integration testing into a single, automated, and API-driven workflow, we’re providing a modern solution to the age-old problem of infrastructure validation. This isn’t just about saving time or running tests faster; it’s about fundamentally changing the way teams manage risk. It’s about moving from a world of checklists and manual toil to a world of automated guarantees and deep, auditable confidence in your infrastructure.