
Validation API

Query your entire stack like a database

With a single API call, Stacktic validates every database, every topic, every service — tracking each command's success or failure. Get the deep status of your entire stack in a few lines, not a few days.

What makes this unique?

Traditional monitoring tells you if something is up. The Validation API tells you if your stack actually works — end to end, across every relationship.

Think of it as querying your stack like a database. Instead of SSHing into 20 services and running manual checks, you hit one endpoint and get the full picture — every database connection, every Kafka topic, every route, every certificate — validated against real commands, not just health pings.

Stacktic knows your topology. It knows what's connected to what. So the validation pipeline is generated automatically from your stack design — with almost no code.

FEW LINES

A single validate-all.sh or API call replaces hundreds of manual checks across your entire stack.

REAL DEPTH

Not just "is it up" — validates schemas exist, users have access, topics are created, connectors are running, certs are valid.

ZERO CODE

Validation pipelines are auto-generated from your stack topology. Add a component — its tests appear automatically.

How It Works

$STACKTIC_OUTPUT/scripts/stacktic/validate-all.sh
Single command — full stack validation

Reads your topology — auto-discovers every component, sub-component & link

PostgreSQL: 4/4
  • PASS SELECT 1
  • PASS write perms
  • PASS cluster healthy
  • PASS pods running

Kafka: 3/4
  • PASS cluster ready
  • PASS topics exist
  • FAIL sink connector
  • PASS produce E2E

APISIX: 3/4
  • PASS routes 2xx
  • PASS SSL valid
  • PASS response <5s
  • FAIL TLS cert exp

Valkey: 4/4
  • PASS PING→PONG
  • PASS write/read
  • PASS memory ok
  • PASS clients

Grafana: 4/4
  • PASS health ok
  • PASS datasources
  • PASS dashboards
  • PASS Prom query

Pipeline result: 18/20 passed (CNPG 4/4, Kafka 3/4, APISIX 3/4, Valkey 4/4, Grafana 4/4)

Real validation depth — per component

PostgreSQL / CNPG

  • Database connectivity (SELECT 1)
  • Write permissions (CREATE/INSERT)
  • Connection count from pg_stat_activity
  • Database size
  • CNPG cluster health & phase
  • Primary/replica pod status

Kafka

  • Cluster & KafkaConnect ready
  • All topics created (dynamic)
  • All connectors running
  • No failed connector tasks
  • Produce to topic E2E
  • Sink E2E (produce → wait → verify in DB)
  • Consumer groups listed

APISIX (Ingress)

  • Route HTTP status (2xx/3xx)
  • SSL certificate valid & expiry
  • Response time under 5s
  • CORS headers present
  • WebSocket upgrade
  • TLS 1.2+ supported
  • Security headers (HSTS, X-Frame)
  • Backend pods running

Valkey (Redis)

  • PING connectivity
  • Write/read test with TTL
  • Memory usage & limits
  • Connected client count
  • Key count (DBSIZE)
  • Server version

Grafana

  • Health check (/api/health)
  • Datasource exists per link
  • Datasource URL matches namespace
  • Datasource health probe
  • Prometheus query via proxy
  • Loki label query
  • Dashboard count

+ ClickHouse, SeaweedFS, Qdrant, OTel, Cert-Manager...

Every component Stacktic supports gets its own deep validation suite — auto-generated from your topology. Add a component, and its validation pipeline appears.


Plug It Anywhere

The Validation API is a simple script and endpoint — it runs anywhere. Add it to your CI/CD pipeline, your GitHub Actions, your upgrade flow, or run it manually. A few lines and you have full stack validation.

CI/CD & GitHub Actions

Add validate-all.sh as a post-deploy step in any pipeline — GitHub Actions, GitLab CI, Jenkins, ArgoCD hooks. If validation fails, the pipeline fails.
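As a sketch, a tiny wrapper you might drop into any shell-based CI step — the `run_gate` helper is hypothetical; only the validate-all.sh path comes from this page:

```shell
# run_gate: run a validation command and propagate its failure so the CI job
# fails too. Hypothetical helper -- validate-all.sh itself is generated by Stacktic.
run_gate() {
  if "$@"; then
    echo "validation passed"
  else
    echo "validation failed -- blocking deploy"
    return 1
  fi
}

# In a post-deploy step:
# run_gate "$STACKTIC_OUTPUT/scripts/stacktic/validate-all.sh"
```

Because the helper just forwards the exit code, the same line works unchanged in GitHub Actions, GitLab CI, or a Jenkins shell step.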

Versioning & Upgrades

Run validation before and after a stack version upgrade. Compare results to confirm nothing broke — databases still connected, topics still flowing, routes still serving.
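One way to sketch that before/after comparison, assuming each report line looks like `PASS <check>` or `FAIL <check>` — a hypothetical format; adapt the grep patterns to the real validate-all.sh output:

```shell
# compare_reports: list checks that passed in the first report but fail in
# the second. Assumes "PASS <check>" / "FAIL <check>" report lines
# (hypothetical format -- adjust to your actual output).
compare_reports() {
  grep '^PASS ' "$1" | cut -c6- | sort > /tmp/passed_before
  grep '^FAIL ' "$2" | cut -c6- | sort > /tmp/failed_after
  comm -12 /tmp/passed_before /tmp/failed_after > /tmp/regressions
  if [ -s /tmp/regressions ]; then
    echo "Regressions after upgrade:"
    cat /tmp/regressions
    return 1
  fi
  echo "No regressions"
}

# Usage:
#   validate-all.sh > before.txt
#   ...perform the upgrade...
#   validate-all.sh > after.txt
#   compare_reports before.txt after.txt
```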

Migration Validation

Migrating from managed services or VMs? Run validation after each step to confirm data landed, connections work, and services are healthy — before going live.

Security Assessments

Validate TLS certificates, security headers (HSTS, X-Frame-Options), CORS configuration, and RBAC — automatically, on every deploy. No manual audit required.
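For instance, the certificate-expiry part of such a check can be sketched with openssl's `-checkend` — a standalone sketch, not the API's internal implementation:

```shell
# cert_ok_for_days: succeed only if the PEM certificate in $1 is still valid
# $2 days from now. Sketch of the expiry check described above.
cert_ok_for_days() {
  openssl x509 -in "$1" -checkend $(( $2 * 24 * 3600 )) -noout
}

# To check a live endpoint, first fetch its cert (hypothetical host):
# echo | openssl s_client -connect example.com:443 -servername example.com \
#   2>/dev/null | openssl x509 > cert.pem
# cert_ok_for_days cert.pem 14 && echo "cert ok" || echo "expires within 14 days"
```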

Scheduled Health Checks

Run on a cron schedule to catch drift — expired certificates, failed connectors, unhealthy replicas — before your users notice.
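For example, a crontab entry — the install path and log location are illustrative, and cron does not expand $STACKTIC_OUTPUT, so use an absolute path:

```shell
# Illustrative crontab entry: full-stack validation every 15 minutes.
# Paths are examples; cron does not inherit $STACKTIC_OUTPUT.
*/15 * * * * /opt/stacktic/scripts/stacktic/validate-all.sh >> /var/log/stacktic-validate.log 2>&1
```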


API Examples

Every test is a single POST to /metadata/q. The API reads your topology, resolves variables like {namespace} and {database} automatically, executes the command on the right target, and returns passed or failed with diagnostics.

Request Structure
curl -s -X POST http://localhost:8080/metadata/q \
-H "Content-Type: application/json" \
-d '{
  "source":      "type:cnpg",           // which component to target
  "target":      "sub_components",       // component | sub_components | links_to | links_from
  "command":     "SELECT 1",            // the actual test — exit 0 = passed
  "description": "Connect {database}",  // label in the report
  "severity":    "critical",            // critical | warning | info
  "on_failure":  "kubectl get pods ...",  // runs when command fails
  "on_success":  "SELECT current_db()"   // runs when command passes
}'
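The inline `//` annotations above are for reading only — JSON itself has no comments — so a copy-paste-ready version of the same request looks like:

```shell
# The same request with the annotations stripped (JSON does not allow comments).
payload='{
  "source":      "type:cnpg",
  "target":      "sub_components",
  "command":     "SELECT 1",
  "description": "Connect {database}",
  "severity":    "critical"
}'

echo "$payload" | jq -e . > /dev/null   # sanity-check: payload is valid JSON

curl -s -X POST http://localhost:8080/metadata/q \
  -H "Content-Type: application/json" \
  -d "$payload" | jq .
```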

Example: Kafka Version Upgrade — Validate All Topics & Sinks

You just upgraded Kafka from 3.6 to 3.7. Before calling it done, run these 4 calls and know in seconds if everything survived the upgrade.

1. Cluster ready after upgrade
curl -s -X POST http://localhost:8080/metadata/q -H 'Content-Type: application/json' \
-d '{
  "source": "type:kafka", "target": "component",
  "command": "kubectl get kafka -n {namespace} -o jsonpath=\"{.items[0].status.conditions[?(@.type==\\\"Ready\\\")].status}\" | grep -q True",
  "description": "Kafka cluster ready",
  "severity": "critical",
  "on_success": "kubectl get kafka -n {namespace} -o jsonpath=\"{.items[0].metadata.name}: replicas={.items[0].spec.kafka.replicas}, version={.items[0].spec.kafka.version}\""
}' | jq
2. All topics still ready
curl -s -X POST http://localhost:8080/metadata/q -H 'Content-Type: application/json' \
-d '{
  "source": "type:kafka", "target": "component",
  "command": "not_ready=$(kubectl get kafkatopic -n {namespace} -o jsonpath=\"{range .items[*]}{.metadata.name}:{.status.conditions[?(@.type==\\\"Ready\\\")].status}{\\\"\\\\n\\\"}{end}\" | grep -v True | grep -v \"^$\"); if [ -n \"$not_ready\" ]; then echo NOT READY: $not_ready; exit 1; else echo All topics ready; fi",
  "description": "All topics ready",
  "severity": "critical",
  "on_failure": "kubectl get kafkatopic -n {namespace}"
}' | jq
3. All sink connectors running
curl -s -X POST http://localhost:8080/metadata/q -H 'Content-Type: application/json' \
-d '{
  "source": "type:kafka", "target": "component",
  "command": "echo \"=== Sink Connectors ===\"; for connector in $(kubectl get kafkaconnector -n {namespace} -o jsonpath=\"{.items[*].metadata.name}\"); do state=$(kubectl get kafkaconnector $connector -n {namespace} -o jsonpath=\"{.status.connectorStatus.connector.state}\"); task=$(kubectl get kafkaconnector $connector -n {namespace} -o jsonpath=\"{.status.connectorStatus.tasks[0].state}\"); topic=$(kubectl get kafkaconnector $connector -n {namespace} -o jsonpath=\"{.spec.config.topics}\"); echo \"$connector: state=$state task=$task topic=$topic\"; done",
  "description": "All sinks summary",
  "severity": "info"
}' | jq
4. End-to-end: produce a message, verify it lands in the PostgreSQL sink
curl -s -X POST http://localhost:8080/metadata/q -H 'Content-Type: application/json' \
-d '{
  "source": "type:kafka", "target": "component",
  "command": "broker=$(kubectl get pods -n {namespace} -l strimzi.io/component-type=kafka -o jsonpath=\"{.items[0].metadata.name}\"); bootstrap=$(kubectl get kafka -n {namespace} -o jsonpath=\"{.items[0].status.listeners[0].bootstrapServers}\"); connector=$(kubectl get kafkaconnector -n {namespace} --no-headers -o custom-columns=NAME:.metadata.name,CLASS:.spec.class | grep -i postgresql | head -1 | awk \"{print \\$1}\"); topic=$(kubectl get kafkaconnector $connector -n {namespace} -o jsonpath=\"{.spec.config.topics}\"); test_id=\"e2e-$(date +%s)\"; echo \"{\\\"username\\\":\\\"$test_id\\\",\\\"city\\\":\\\"test\\\"}\" | kubectl exec -n {namespace} -i $broker -- /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server $bootstrap --topic $topic 2>/dev/null && echo Produced: $test_id to $topic",
  "verify_delay": 5,
  "on_success": "pg_pod=$(kubectl get pods -n cnpg -l cnpg.io/cluster -o jsonpath=\"{.items[0].metadata.name}\"); kubectl exec -n cnpg $pg_pod -- psql -c \"SELECT count(*) FROM pg_stat_activity\" 2>/dev/null || echo PostgreSQL query executed",
  "on_failure": "kubectl logs -n {namespace} -l strimzi.io/name=kafka-connect --tail=30",
  "description": "PostgreSQL sink E2E",
  "severity": "warning"
}' | jq

4 calls. Full Kafka validation. Cluster health, all topics, all connectors, end-to-end data flow — zero hardcoded names. Same calls work on any stack, any environment.
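To gate a script on one of these calls, you can parse the result. This sketch assumes the response exposes a top-level `status` of `passed`/`failed` — the page only promises a passed/failed result, so check your actual response shape and adjust the jq filter:

```shell
# gate_on_status: read a Validation API JSON response on stdin and exit
# nonzero unless it reports "passed".
# ASSUMPTION: a top-level "status" field holding "passed"/"failed".
gate_on_status() {
  status=$(jq -r '.status // "unknown"')
  if [ "$status" = "passed" ]; then
    echo "validation passed"
  else
    echo "validation failed: $status"
    return 1
  fi
}

# Usage (response piped straight from the API call):
# curl -s -X POST http://localhost:8080/metadata/q \
#   -H 'Content-Type: application/json' -d "$payload" | gate_on_status
```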


Query Reference

Beyond running tests, you can query your stack metadata directly — list components, inspect links, dry-run commands before executing.

List all databases

curl -s -X POST http://localhost:8080/metadata/q \
-H "Content-Type: application/json" \
-d '{
  "source": "type:cnpg",
  "target": "sub_components",
  "select": ["database", "username", "consumers"]
}' | jq

List all APISIX routes

curl -s -X POST http://localhost:8080/metadata/q \
-H "Content-Type: application/json" \
-d '{
  "source": "apisix",
  "target": "links_to",
  "select": ["subdomain", "domain", "cors", "websocket"]
}' | jq

Dry run — preview without executing

curl -s -X POST http://localhost:8080/metadata/q \
-H "Content-Type: application/json" \
-d '{
  "source": "apisix",
  "target": "links_to",
  "command": "curl -sk https://{subdomain}.{domain}",
  "dry_run": true
}' | jq '.results[] | {name, command}'

Who connects to Valkey?

curl -s -X POST http://localhost:8080/metadata/q \
-H "Content-Type: application/json" \
-d '{
  "source": "valkey",
  "target": "links_from",
  "select": ["link_name", "link_type"]
}' | jq

Available for every component: PostgreSQL, Kafka, APISIX, Valkey, Grafana, ClickHouse, SeaweedFS, Qdrant, OpenTelemetry, Cert-Manager — and any component you add to your topology.