tiledbvcf

Efficient storage and retrieval of genomic variant data using TileDB. Scalable VCF/BCF ingestion, incremental sample addition, compressed storage, parallel queries, and export capabilities for population genomics.

AI & Automation | 28,028 stars | 2,882 forks | Updated today | MIT

Quality Score: 96/100

Stars (20% weight): 100
Recency (20% weight): 100
Frontmatter (20% weight): 70
Documentation (15% weight): 100
Issue Health (10% weight): 50
License (10% weight): 100
Description (5% weight): 100

Skill Content

# TileDB-VCF

## Overview

TileDB-VCF is a high-performance C++ library with Python and CLI interfaces for efficient storage and retrieval of genomic variant-call data. Built on TileDB's sparse array technology, it enables scalable ingestion of VCF/BCF files, incremental sample addition without expensive merging operations, and efficient parallel queries of variant data stored locally or in the cloud.

## When to Use This Skill

Use this skill when:

- Learning TileDB-VCF concepts and workflows
- Prototyping genomics analyses and pipelines
- Working with small-to-medium datasets (< 1000 samples)
- Adding new samples incrementally to existing datasets
- Querying specific genomic regions efficiently across many samples
- Working with cloud-stored variant data (S3, Azure, GCS)
- Exporting subsets of large VCF datasets
- Building variant databases for cohort studies
- Working on educational projects and method development
- Running variant data operations where performance is critical

## Quick Start

### Installation

**Preferred Method: Conda/Mamba**

```bash
# Enter the following two lines if you are on an M1 Mac
CONDA_SUBDIR=osx-64
conda config --env --set subdir osx-64

# Create the conda environment
conda create -n tiledb-vcf "python<3.10"
conda activate tiledb-vcf

# Mamba is a faster and more reliable alternative to conda
conda install -c conda-forge mamba

# Install TileDB-Py and TileDB-VCF, align with other useful libraries
mamba install -y -c conda-forge -c bi...
```
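The workflow named in the Overview (create a dataset, ingest samples, add more samples later, then query specific regions) can be sketched in Python roughly as follows. This is a minimal sketch, not the skill's own example: it assumes the `tiledbvcf` Python package exposes `Dataset`, `create_dataset`, `ingest_samples`, and `read` as used here, which should be checked against the installed version. The helper names `format_region`, `ingest`, and `query`, the attribute list, and the 1-based inclusive region convention are illustrative assumptions.

```python
from typing import List, Optional


def format_region(contig: str, start: int, end: int) -> str:
    """Build a 'contig:start-end' region string.

    Assumption: coordinates are 1-based and inclusive.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid interval {start}-{end} on {contig}")
    return f"{contig}:{start}-{end}"


def ingest(dataset_uri: str, vcf_uris: List[str], create: bool = False) -> None:
    """Ingest VCF/BCF files into a dataset.

    With create=True the dataset is created first; with create=False the
    samples are added to an existing dataset (incremental addition).
    """
    import tiledbvcf  # imported lazily so format_region works without it

    ds = tiledbvcf.Dataset(dataset_uri, mode="w")
    if create:
        ds.create_dataset()
    ds.ingest_samples(sample_uris=vcf_uris)


def query(dataset_uri: str, regions: List[str],
          samples: Optional[List[str]] = None):
    """Read selected attributes for the given regions (and optional samples)."""
    import tiledbvcf

    ds = tiledbvcf.Dataset(dataset_uri, mode="r")
    # Attribute names below are assumed; adjust to those in your dataset.
    return ds.read(
        attrs=["sample_name", "contig", "pos_start", "pos_end", "alleles"],
        regions=regions,
        samples=samples,
    )
```

A hypothetical session (all paths and sample names are placeholders) might call `ingest("my_dataset", ["sample1.vcf.gz"], create=True)`, later `ingest("my_dataset", ["sample2.vcf.gz"])` to add a sample without rebuilding, and then `query("my_dataset", [format_region("chr1", 1000, 2000)])` to fetch variants in one region.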

Details

Author: K-Dense-AI
Repository: K-Dense-AI/scientific-agent-skills
Created: 7 months ago
Last Updated: today
Language: Python
License: MIT
