Troubleshooting¶
Common errors and how to fix them — covering deployment, gateway policy, registry, analytics, and connectivity issues.
Server deployment¶
tool_side_effect_unknown¶
The gateway returned this error on a tool call.
Cause: The tool name in the grant or the session does not match a tool listed in
the server's .mcp/servers.yaml metadata. The gateway cannot look up the tool's
side-effect class, so it denies the call.
Fix:
# 1. Check what tools are in the metadata
cat .mcp/servers.yaml
# 2. Check what tools the running server actually exposes
mcp-runtime server init --from-server http://localhost:8088 --force
# 3. Validate the grant against the metadata
mcp-runtime server validate --metadata-dir .mcp --grant-file grant.yaml
# 4. Re-apply the corrected grant
mcp-runtime access grant apply --file grant.yaml
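When steps 1 and 2 show different tool lists, comparing the names directly narrows down which grant entries trigger the error. The sketch below assumes you have copied the tool names into two plain-text files, one name per line. The function name and the file names are hypothetical and not part of mcp-runtime.

```shell
# Print every tool name that is in the grant list but not in the metadata list.
# Both files are hypothetical hand-made lists with one tool name per line.
tools_missing_from_metadata() {
  grant_list=$1
  metadata_list=$2
  # -v invert, -x whole-line match, -F fixed strings, -f read patterns from file.
  # grep exits 1 when nothing is printed, so "|| true" keeps "set -e" scripts alive.
  grep -vxFf "$metadata_list" "$grant_list" || true
}

# Example (file names are placeholders):
#   tools_missing_from_metadata grant-tools.txt metadata-tools.txt
```

Any name this prints is a candidate for the mismatch described under Cause.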
tool_not_granted¶
The agent tried to call a tool that is not in any allow rule in the active grant.
Fix: Add the tool to the grant with --tool <name> and re-apply.
Server stuck in Pending or NotReady¶
mcp-runtime server status --namespace mcp-team-<slug>
kubectl describe pod -n mcp-team-<slug> -l app=<server-name>
kubectl get events -n mcp-team-<slug> --sort-by='.lastTimestamp'
Common causes:
| Symptom | Cause | Fix |
|---|---|---|
| ImagePullBackOff | Registry credentials stale | Re-push image; check pull secret |
| Pending (no node) | Cluster resource exhaustion | Scale nodes or reduce replicas |
| CrashLoopBackOff | Server crashes on start | mcp-runtime server logs <name> --use-kube |
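The table can also be read as a lookup from the observed pod state to the next step. The helper below is an illustrative sketch, not part of mcp-runtime. The function name is hypothetical and the messages are copied from the table.

```shell
# Map a symptom seen in "kubectl describe pod" to the fix listed in the table.
suggest_fix() {
  case "$1" in
    ImagePullBackOff) echo "Re-push image; check pull secret" ;;
    Pending)          echo "Scale nodes or reduce replicas" ;;
    CrashLoopBackOff) echo "Run: mcp-runtime server logs <name> --use-kube" ;;
    *)                echo "No table entry; inspect the pod events" ;;
  esac
}

# Example:
#   suggest_fix CrashLoopBackOff
```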
registry push returns 401¶
# Check that the registry pull secret exists
kubectl get secret mcp-runtime-registry-pull -n mcp-team-<slug>
# Re-login and retry
mcp-runtime auth login --api-url https://platform.example.com
mcp-runtime registry push --image ...
Access control¶
Grant applied but calls still denied¶
- Confirm the grant exists:
  mcp-runtime access grant list --namespace mcp-team-<slug>
- Confirm the session exists and is not expired or revoked:
  mcp-runtime access session list --namespace mcp-team-<slug>
- Check the gateway logs for the denial reason:
  mcp-runtime server logs <server-name> --namespace mcp-team-<slug> \
    --use-kube 2>&1 | grep -E "deny|allow|session|grant"
- Inspect the effective policy:
  mcp-runtime server policy inspect <server-name> \
    --namespace mcp-team-<slug>
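The grep in the gateway-log step is a plain extended-regex filter, so it can be tried on any text before pointing it at real logs. A minimal sketch, assuming made-up input lines (they are not actual gateway output) and a hypothetical function name:

```shell
# Keep only lines that mention a policy keyword, using the same pattern
# as the gateway-log command.
# Matching is case-sensitive; add -i to grep if your logs capitalize these words.
policy_lines() {
  grep -E "deny|allow|session|grant"
}

# Example with made-up lines (not real gateway output):
#   printf 'session expired\nhealth check ok\n' | policy_lines
```

Because the pattern is case-sensitive, a line containing only "Denied" would be dropped, which is worth keeping in mind when the filtered output is empty.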
Session expired or session_not_found¶
The adapter refreshes sessions automatically when started with --auto-refresh. If
you manage sessions manually, create and apply a new session:
# Create a new session
mcp-runtime access session init new-session \
--server <name> --namespace mcp-team-<slug> \
--agent-id cursor --trust low --expires-in 4h \
--output session.yaml
MCP_PLATFORM_API_PROFILE=admin \
mcp-runtime access session apply --file session.yaml
Registry and images¶
x509: certificate signed by unknown authority¶
The cluster node does not trust the registry's TLS certificate.
For bundled-https mode, the registry uses the internal mcp-runtime-ca. Nodes
must trust this CA. See Cluster Readiness for distribution-specific node trust
configuration.
no basic auth credentials on image pull¶
The mcp-runtime-registry-pull pull secret in the server's namespace has stale
credentials. This happens after a mcp-runtime setup rerun that rotates API keys.
# Check secret exists
kubectl get secret mcp-runtime-registry-pull -n mcp-team-<slug>
# Re-run setup or re-create the secret manually:
ADMIN_KEY=$(kubectl get secret mcp-sentinel-secrets -n mcp-sentinel \
-o jsonpath='{.data.UI_API_KEY}' | base64 -d)
kubectl create secret docker-registry mcp-runtime-registry-pull \
-n mcp-team-<slug> \
--docker-server=registry.example.com \
--docker-username=platform-service \
--docker-password="$ADMIN_KEY" \
--dry-run=client -o yaml | kubectl apply -f -
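If the lookup of mcp-sentinel-secrets fails, ADMIN_KEY ends up empty and the second command runs with an empty password. A small guard between the two commands avoids that. The helper name below is hypothetical and not part of mcp-runtime or kubectl.

```shell
# Fail with a message on stderr when a required value is empty.
# $1: label used in the message, $2: the value to check.
require_nonempty() {
  if [ -z "$2" ]; then
    echo "error: $1 is empty; check the source secret before continuing" >&2
    return 1
  fi
}

# Example, placed right after ADMIN_KEY is assigned:
#   require_nonempty ADMIN_KEY "$ADMIN_KEY" || exit 1
```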
Analytics and observability¶
Tool calls not showing in the Analytics dashboard¶
- Check the ingest service is receiving events:
  KUBECONFIG=~/.kube/config mcp-runtime sentinel logs ingest --since 5m
  If you see 401 errors, the analytics API key in the gateway sidecar is stale; re-run setup or restart the analytics deployments.
- Check the processor is consuming from Kafka:
  KUBECONFIG=~/.kube/config mcp-runtime sentinel logs processor --since 5m
- Check that Kafka has the mcp.events topic:
  kubectl exec -n mcp-sentinel kafka-0 -- \
    kafka-topics --list --bootstrap-server localhost:9092
  If mcp.events is missing, re-run setup.
Sentinel API returns 401¶
The mcp-sentinel-api pods may have started with stale API keys from a previous
setup run.
# Restart to pick up current keys
kubectl rollout restart deployment/mcp-sentinel-api -n mcp-sentinel
kubectl rollout status deployment/mcp-sentinel-api -n mcp-sentinel --timeout=120s
Platform and cluster health¶
cluster doctor reports failures¶
KUBECONFIG=~/.kube/config mcp-runtime cluster doctor
The doctor runs 37 checks and prints a remedy for each failure. Follow the printed
instructions — most failures point to missing ingress, stale certificates, or
image pull errors with specific kubectl commands to fix them.
Setup pre-flight check blocked by stale Certificate¶
ERROR Stale Certificate "registry-cert" has DNS names [registry.local]
but the expected registry host is "registry.example.com"
Fix: Delete the stale Certificate and its CertificateRequests:
kubectl delete certificate -n registry registry-cert
kubectl delete certificaterequest -n registry --all
# Re-run setup
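The pre-flight error boils down to the expected registry host being absent from the certificate's DNS names. The sketch below shows that comparison in isolation. The function name is hypothetical, and the kubectl query in the comment assumes the Certificate is a cert-manager resource that stores its names under .spec.dnsNames, which this page does not confirm.

```shell
# Succeed (exit 0) when expected_host is NOT among the given DNS names,
# i.e. the certificate is stale in the sense of the pre-flight error.
cert_is_stale() {
  expected_host=$1
  shift
  for name in "$@"; do
    if [ "$name" = "$expected_host" ]; then
      return 1
    fi
  done
  return 0
}

# Example (assumes a cert-manager Certificate with .spec.dnsNames):
#   names=$(kubectl get certificate -n registry registry-cert \
#     -o jsonpath='{.spec.dnsNames[*]}')
#   cert_is_stale registry.example.com $names && echo "stale"
```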
Namespace stuck in Terminating¶
kubectl patch ns <namespace> \
-p '{"metadata":{"finalizers":null}}' \
--type=merge
Getting more help¶
- Run mcp-runtime <command> --help for flag reference
- Run mcp-runtime cluster doctor for a full 37-point cluster diagnostic
- Check GitHub Issues
- Contributor Troubleshooting for development-environment specific issues