
bio-read-qc-quality-reports

Generate and interpret quality reports from FASTQ files using FastQC and MultiQC. Assess per-base quality, adapter content, GC bias, duplication levels, and overrepresented sequences. Use when performing initial QC on raw sequencing data or validating preprocessing results.
majiayu000/claude-skill-registry-data · ★ 3 · Data & Documents · score 60
Install: claude install-skill majiayu000/claude-skill-registry-data
# Quality Reports

Generate quality reports for FASTQ files using FastQC and aggregate multiple reports with MultiQC.

## FastQC - Single Sample Reports

### Basic Usage

```bash
# Single file
fastqc sample.fastq.gz

# Multiple files
fastqc *.fastq.gz

# Specify output directory (must already exist; FastQC does not create it)
fastqc -o qc_reports/ sample_R1.fastq.gz sample_R2.fastq.gz

# Set threads
fastqc -t 4 *.fastq.gz
```

### Output Files

FastQC produces two files per input:

- `sample_fastqc.html` - Interactive HTML report
- `sample_fastqc.zip` - Data files and images

### Key Modules

| Module | What It Shows | Warning Signs |
|--------|---------------|---------------|
| Per base sequence quality | Quality scores across read | Drop below Q20 at 3' end |
| Per sequence quality | Quality score distribution | Bimodal distribution |
| Per base sequence content | Nucleotide composition | Imbalance at start (normal) |
| Per sequence GC content | GC distribution | Secondary peak (contamination) |
| Per base N content | Unknown bases | High N content |
| Sequence length distribution | Read lengths | Unexpected variation |
| Sequence duplication | Duplicate reads | High duplication (PCR) |
| Overrepresented sequences | Common sequences | Adapter contamination |
| Adapter content | Adapter sequences | Visible adapter curves |

### Extract Data from ZIP

```bash
# Unzip to access raw data
unzip sample_fastqc.zip

# View summary
cat sample_fastqc/summary.txt

# Get per-base quality
cat sample_fastqc/fastqc_data.txt | grep -A 50