pysam-genomic-files

Read/write SAM/BAM/CRAM, VCF/BCF, FASTA/FASTQ. Region queries, pileup, variant filtering, read groups. Python htslib wrapper exposing samtools/bcftools CLI. Use STAR/BWA for alignment; GATK/DeepVariant for variant calling.
jaechang-hits/SciAgent-Skills · ★ 183 · Data & Documents · score 81
Install: claude install-skill jaechang-hits/SciAgent-Skills
# Pysam — Genomic File Toolkit

## Overview

Pysam provides a Pythonic interface to htslib for reading, manipulating, and writing genomic data files. It handles SAM/BAM/CRAM alignments, VCF/BCF variants, and FASTA/FASTQ sequences with efficient region-based random access. Also exposes samtools and bcftools as callable Python functions.

## When to Use

- Reading and querying BAM/CRAM alignment files (region extraction, read filtering)
- Analyzing VCF/BCF variant files (genotype access, variant filtering, annotation)
- Extracting reference sequences from indexed FASTA files
- Calculating per-base coverage and pileup statistics
- Building custom bioinformatics pipelines that combine alignment + variant + sequence data
- Quality control of NGS data (mapping quality, flag filtering, coverage)
- For **alignment from FASTQ** (read mapping), use STAR, BWA, or minimap2 instead
- For **variant calling from BAM**, use GATK or DeepVariant instead

## Prerequisites

```bash
pip install pysam
```

**Note**: Requires htslib C library (bundled with pip install on most platforms). On some Linux systems, may need `libhts-dev` or equivalent. Index files (`.bai`, `.tbi`, `.fai`) required for random access — create with `pysam.index()`, `pysam.tabix_index()`, or `pysam.faidx()`.

## Quick Start

```python
import pysam

# Read BAM file, fetch reads in a region
with pysam.AlignmentFile("sample.bam", "rb") as bam:
    for read in bam.fetch("chr1", 1000, 2000):
        print(f"{read.query_name}: pos