Install: claude install-skill diegosouzapw/awesome-omni-skill
# Check Ceph Health
Use this guide to diagnose and remediate Ceph storage issues on OpenShift clusters running OCS (OpenShift Container Storage) or ODF (OpenShift Data Foundation).
## 1. Ceph Cluster Health
```bash
# Quick health status
kubectl -n openshift-storage get cephcluster -o jsonpath='{.items[*].status.ceph.health}'
# Detailed health with error messages
kubectl -n openshift-storage get cephcluster -o jsonpath='{.items[*].status.ceph.details}' | python3 -m json.tool
# Capacity overview (bytesAvailable, bytesUsed, bytesTotal)
kubectl -n openshift-storage get cephcluster -o jsonpath='{.items[*].status.ceph.capacity}' | python3 -m json.tool
```
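The capacity fields can be reduced to a single usage percentage for a quick check. The following is a minimal sketch, not part of the original guide: `pct_used` is a hypothetical helper name, and it assumes the `bytesUsed` and `bytesTotal` values are present and numeric.

```bash
# Hypothetical helper: integer percentage of used capacity.
# Arguments: bytesUsed bytesTotal (as reported in .status.ceph.capacity)
pct_used() {
  local used=$1 total=$2
  # Guard against a missing or zero total to avoid division by zero
  if [ -z "$total" ] || [ "$total" -eq 0 ]; then
    echo "unknown"
    return 1
  fi
  echo $(( used * 100 / total ))
}

# Usage (assumes a single CephCluster in the namespace):
# USED=$(kubectl -n openshift-storage get cephcluster -o jsonpath='{.items[0].status.ceph.capacity.bytesUsed}')
# TOTAL=$(kubectl -n openshift-storage get cephcluster -o jsonpath='{.items[0].status.ceph.capacity.bytesTotal}')
# echo "Used: $(pct_used "$USED" "$TOTAL")%"
```

The result is truncated to a whole number, which is usually enough to tell whether the cluster is approaching the nearfull range.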
Health states:
- `HEALTH_OK` -- cluster is healthy
- `HEALTH_WARN` -- degraded but functional (backfillfull, nearfull, degraded PGs)
- `HEALTH_ERR` -- critical, writes may be blocked (full OSDs, too few OSDs, down PGs)
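For scripted checks, the three states above can be mapped to exit codes. This is a sketch under assumptions: `health_exit_code` is a hypothetical helper, and the numeric codes are an arbitrary choice for illustration, not something defined by Ceph or ODF.

```bash
# Hypothetical helper: map a Ceph health string to a script exit code.
#   0 = HEALTH_OK, 1 = HEALTH_WARN, 2 = HEALTH_ERR, 3 = empty or unexpected value
health_exit_code() {
  case "$1" in
    HEALTH_OK)   return 0 ;;
    HEALTH_WARN) return 1 ;;
    HEALTH_ERR)  return 2 ;;
    *)           return 3 ;;
  esac
}

# Usage (assumes a single CephCluster in the namespace):
# HEALTH=$(kubectl -n openshift-storage get cephcluster -o jsonpath='{.items[0].status.ceph.health}')
# health_exit_code "$HEALTH" || echo "Ceph needs attention: $HEALTH"
```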
## 2. Running Ceph Commands
OCS/ODF clusters may not have a `rook-ceph-tools` pod deployed. In that case, run `ceph` commands directly from a mon pod.
```bash
# Find the mon pod and its service address
MON_POD=$(kubectl -n openshift-storage get pods -l app=rook-ceph-mon -o jsonpath='{.items[0].metadata.name}')
MON_ADDR=$(kubectl -n openshift-storage get pod "$MON_POD" -o jsonpath='{.spec.containers[0].env[?(@.name=="ROOK_CEPH_MON_HOST")].value}' | sed 's/\[//;s/\]//')
# Run any ceph command via the mon pod
kubectl -n openshift-storage exec "$MON_POD" -c mon -- \
  ceph -m "$MON_ADDR" --keyring /etc/ceph/keyring-store/keyring status
```
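To avoid retyping the long `exec` line, the command above can be wrapped in a shell function. This is a minimal sketch: `ceph_exec` is a hypothetical name, and it assumes `MON_POD` and `MON_ADDR` have already been set as in the previous block.

```bash
# Hypothetical wrapper around the exec command above.
# Expects MON_POD and MON_ADDR to be set in the current shell.
ceph_exec() {
  if [ -z "$MON_POD" ] || [ -z "$MON_ADDR" ]; then
    echo "MON_POD and MON_ADDR must be set" >&2
    return 1
  fi
  # Pass all arguments through to the ceph CLI inside the mon container
  kubectl -n openshift-storage exec "$MON_POD" -c mon -- \
    ceph -m "$MON_ADDR" --keyring /etc/ceph/keyring-store/keyring "$@"
}

# Usage:
# ceph_exec status
# ceph_exec health detail
```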
Useful ceph commands to