bio-annotation
Install: `claude install-skill fmschulz/omics-skills`
# Bio Annotation
Functional annotation and taxonomy inference from sequence homology.
## Instructions
1. Read `docs/README.md` and the relevant tool guides before running anything.
2. When a nucleotide assembly, MAG, genome, or contig FASTA is available, run `/tracking-taxonomy-updates` first to perform the BBTools-container QuickClade `percontig` domain screen. Use its routing table to choose the right taxonomy/QC path before interpreting protein annotations.
3. For InterProScan, read `docs/interproscan-usage.md` and validate the exact CLI with `--help` or `--version`. Current stable is v5.77-108.0; InterProScan 6 (Nextflow-based) is a forward-looking migration target.
4. Run InterProScan for domain/family annotation.
5. Run eggNOG-mapper v2.1.13+ for orthology-based annotation.
6. Run sequence-vs-database search and resolve taxonomy with TaxonKit v0.20.0+ (required for the March 2025 NCBI rank update that replaces "superkingdom" with "domain" and adds "realm" for viruses).
- Default CPU path: DIAMOND v2.1.20+. For any search against NCBI **nr**, prefer a clustered nr database (e.g., a `clusterednr` build under `$BIO_DB_ROOT`), which is much faster than full nr at comparable sensitivity for most annotation tasks. Check whether a clusterednr build is available under the reference root. If not, build one with `diamond makedb` from a clustered FASTA (MMseqs2/CD-HIT-reduced nr), or fall back to full nr and record the choice in the run log.
- GPU node available (CUDA Turi