arboreto

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.
aiskillstore/marketplace · ★ 350 · Data & Documents · score 80
Install: claude install-skill aiskillstore/marketplace
# Arboreto

## Overview

Arboreto is a computational library for inferring gene regulatory networks (GRNs) from gene expression data using parallelized algorithms that scale from single machines to multi-node clusters.

**Core capability**: Identify which transcription factors (TFs) regulate which target genes based on expression patterns across observations (cells, samples, conditions).

## Quick Start

Install arboreto:

```bash
uv pip install arboreto
```

Basic GRN inference:

```python
import pandas as pd
from arboreto.algo import grnboost2

if __name__ == '__main__':
    # Load expression data (genes as columns)
    expression_matrix = pd.read_csv('expression_data.tsv', sep='\t')

    # Infer regulatory network
    network = grnboost2(expression_data=expression_matrix)

    # Save results (TF, target, importance)
    network.to_csv('network.tsv', sep='\t', index=False, header=False)
```

**Critical**: Always use the `if __name__ == '__main__':` guard because Dask spawns new processes.

## Core Capabilities

### 1. Basic GRN Inference

For standard GRN inference workflows including:

- Input data preparation (Pandas DataFrame or NumPy array)
- Running inference with GRNBoost2 or GENIE3
- Filtering by transcription factors
- Output format and interpretation

**See**: `references/basic_inference.md`

**Use the ready-to-run script**: `scripts/basic_grn_inference.py` for standard inference tasks:

```bash
python scripts/basic_grn_inference.py expression_data.tsv output_network.tsv -