phi-desensitize

Use this skill before any prompt, log message, error trace, test fixture, or external API call that may include protected health information (PHI) or personally identifiable information (PII) in a medical-data SaaS context. It replaces sensitive tokens with reversible synthetic substitutes, emits an encrypted desensitize map, and produces a residual-risk report. It is a mandatory pre-stage before any L3/L4 data enters a model context window. Chinese trigger examples: "脱敏" (desensitize), "PHI 脱敏" (PHI desensitization), "数据脱敏" (data desensitization), "去标识化" (de-identification), "patient_id 替换" (replace patient_id), "把这段日志脱敏" (desensitize this log), "把样例匿名化" (anonymize this sample). Do NOT use as a one-way redactor when the workflow needs round-trip mapping; do NOT use for fully synthetic data that contains no real source values (use test-data-generation instead). Success = every L3/L4 token in the input is replaced, the output passes mcp-phi-detector with zero hits, the reversal map is encrypted with a KMS-managed key, and the residual-risk report is empty or explicitly approved.
charliehzm/medharness · ★ 101 · AI & Automation · score 81

Install: claude install-skill charliehzm/medharness
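
As a rough illustration of the round trip the description refers to (replace sensitive tokens with reversible placeholders, keep a map, reverse later), here is a minimal Python sketch. It is not the skill's implementation: the names `PATTERNS`, `desensitize`, and `reverse` are hypothetical, the two regular expressions only catch 18-character ID numbers and ISO dates, and encryption of the map and the residual-risk report are left out entirely. Names and other free-text identifiers would need a real classifier.

```python
import re

# Hypothetical patterns, for illustration only. A real detector would cover
# many more token types (names, addresses, phone numbers, ...).
PATTERNS = {
    "ID": re.compile(r"\b\d{17}[\dXx]\b"),         # 18-character ID number
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),  # ISO date, e.g. 2026-03-12
}


def desensitize(text):
    """Replace matched tokens with placeholders; return (sanitized, mapping)."""
    mapping = {}  # placeholder -> original value (the reversal map)
    seen = {}     # original value -> placeholder, so repeats share one token

    for kind, pattern in PATTERNS.items():
        def substitute(match, kind=kind):
            value = match.group(0)
            if value not in seen:
                placeholder = "{{%s_%d}}" % (kind, len(mapping) + 1)
                seen[value] = placeholder
                mapping[placeholder] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, mapping


def reverse(text, mapping):
    """Restore original values; meant to run only in an approved environment."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


original = "ID 110101199001011234 seen 2026-03-12, follow-up 2026-03-12"
sanitized, mapping = desensitize(original)
print(sanitized)
print(reverse(sanitized, mapping) == original)
```

Reusing one placeholder per distinct value keeps relationships visible to the downstream model (both visits fall on the same date) without exposing the value itself. In this sketch the map is a plain in-memory dict; per the description, the real map is encrypted with a KMS-managed key.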

# PHI Desensitize

The single most important runtime gate in the entire AI Coding system: **no L3/L4 token should ever enter a model context window without first passing through this skill.**

## Core mental model

Desensitization is **not redaction**. Redaction destroys information; desensitization replaces it with reversible placeholders so the downstream LLM can still reason about structure and relationships, while the operator can reverse the mapping on a vetted output in a controlled environment.

```
"Patient 张三 (ID 110101199001011234) seen 2026-03-12"
        ↓ desensitize
"Patient {{PT_A1}} (ID {{ID_B7}}) seen {{DATE_C3}}"
        ↓ LLM reasons, produces analysis referencing {{PT_A1}}
        ↓ controlled reversal in approved environment
"Patient 张三 (ID 110101199001011234) seen 2026-03-12 — diagnosis: ..."
```

## What this skill produces

For each invocation:

1. `desensitized_payload` — sanitized text / JSON / fixture
2. `<source>.desensitize_map.json.enc` — encrypted reversal map (AES-256-GCM, key from KMS)
3. `residual_risk_report.md` — listing any tokens the classifier was uncertain about

## When NOT to use this skill

Skip for:

- Already fully synthetic data (use `test-data-generation` instead)
- Purely L1 / public content (no PHI possible)
- One-way logging where reversal is never needed (use simple redaction)
- Encryption / hashing at storage layer (that's data-at-rest concern, different domain)

## Active context bundle

**Always load first**

1. This `SKILL.md`
2. `referenc