Install: claude install-skill aiskillstore/marketplace
# PyTDC (Therapeutics Data Commons)
## Overview
PyTDC is the Python package for Therapeutics Data Commons, an open-science platform providing AI-ready datasets and benchmarks for drug discovery and development. It gives access to curated datasets spanning the entire therapeutics pipeline, with standardized evaluation metrics and meaningful data splits. Tasks are organized into three categories: single-instance prediction (molecular/protein properties), multi-instance prediction (drug-target interactions, drug-drug interactions), and generation (molecule generation, retrosynthesis).
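As a rough illustration of how the three categories map onto the package layout, the sketch below builds the import statement for a given task. The module names (`single_pred`, `multi_pred`, `generation`) and task class names are assumptions based on the commonly documented PyTDC layout and should be checked against the installed version; `TDC_PROBLEMS` and `import_line` are hypothetical helpers, not part of PyTDC.

```python
# Hypothetical lookup table: problem category (assumed tdc submodule name)
# mapped to a few example task classes (assumed names, verify locally).
TDC_PROBLEMS = {
    "single_pred": ["ADME", "Tox"],          # single-instance prediction
    "multi_pred": ["DTI", "DDI"],            # multi-instance prediction
    "generation": ["MolGen", "RetroSyn"],    # generation
}


def import_line(task: str) -> str:
    """Return the import statement for a task, e.g. 'from tdc.multi_pred import DDI'."""
    for module, tasks in TDC_PROBLEMS.items():
        if task in tasks:
            return f"from tdc.{module} import {task}"
    raise KeyError(f"Unknown task: {task!r}")


# Print one import statement per listed task.
for tasks in TDC_PROBLEMS.values():
    for task in tasks:
        print(import_line(task))
```

The helper only formats strings, so it runs without PyTDC installed; it is meant to make the category-to-module relationship concrete rather than to replace the real imports.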
## When to Use This Skill
This skill should be used when:
- Working with drug discovery or therapeutic ML datasets
- Benchmarking machine learning models on standardized pharmaceutical tasks
- Predicting molecular properties (ADME, toxicity, bioactivity)
- Predicting drug-target or drug-drug interactions
- Generating novel molecules with desired properties
- Accessing curated datasets with proper train/test splits (scaffold, cold-split)
- Using molecular oracles for property optimization
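To make the idea of a cold split concrete, here is a minimal pure-Python sketch, assuming a dataset of (drug, target) pairs. It is not PyTDC's implementation; `cold_split` and the toy pairs are hypothetical and only illustrate the principle that every held-out drug is unseen during training.

```python
import random


def cold_split(pairs, frac_test=0.2, seed=0):
    """Split (drug, target) pairs so that no test drug appears in training."""
    drugs = sorted({drug for drug, _ in pairs})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    # Hold out whole drugs, not individual pairs.
    n_test = max(1, round(frac_test * len(drugs)))
    test_drugs = set(drugs[:n_test])
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test


# Toy interaction pairs, for illustration only.
pairs = [
    ("aspirin", "COX1"), ("aspirin", "COX2"),
    ("ibuprofen", "COX1"), ("ibuprofen", "COX2"),
    ("caffeine", "ADORA2A"), ("warfarin", "VKORC1"),
]
train, test = cold_split(pairs, frac_test=0.25)
print("train drugs:", sorted({d for d, _ in train}))
print("test drugs:", sorted({d for d, _ in test}))
```

A random split over pairs would let the same drug appear on both sides, which tends to overstate how well a model generalizes to new compounds. A scaffold split follows the same logic but groups molecules by shared core structure instead of by identity.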
## Installation & Setup
Install PyTDC using uv's pip interface (plain `pip install PyTDC` works the same way if uv is not available):
```bash
uv pip install PyTDC
```
To upgrade to the latest version:
```bash
uv pip install PyTDC --upgrade
```
Core dependencies (automatically installed):
- numpy, pandas, tqdm, seaborn, scikit_learn, fuzzywuzzy
Additional packages are installed automatically as needed for specific features.
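After installing, a quick check like the following can confirm that the package is visible to the current interpreter. This is a minimal sketch that assumes the distribution is named `PyTDC` and the importable module is `tdc`; `pytdc_status` is a hypothetical helper, and it uses only the standard library so it runs whether or not the install succeeded.

```python
from importlib import metadata, util


def pytdc_status():
    """Return (importable, version) for PyTDC in the current environment."""
    # Assumption: the package is imported as `tdc`.
    importable = util.find_spec("tdc") is not None
    try:
        # Assumption: the installed distribution is named `PyTDC`.
        version = metadata.version("PyTDC")
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


importable, version = pytdc_status()
if importable:
    print(f"PyTDC is importable (version: {version or 'unknown'})")
else:
    print("PyTDC is not installed in this environment")
```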
## Quick Start
The basic pattern for accessing any TDC dataset follows this structure:
```python
from tdc.<problem> import <Task>
data = <Task>(name='<Dat