alterlab-datamol

Solid

Wraps RDKit in a Pythonic datamol interface with sensible defaults for standard drug discovery tasks: SMILES parsing, molecule standardization, descriptors, fingerprints, clustering, 3D conformer generation, and parallel processing. Functions return native rdkit.Chem.Mol objects. Use it for everyday cheminformatics on molecules with minimal boilerplate; for advanced control or custom parameters, use RDKit directly. Part of the AlterLab Academic Skills suite.

AI & Automation · 27 stars · 4 forks · Updated today · MIT

Quality Score: 87/100

Stars (weight 20%): 48
Recency (weight 20%): 100
Frontmatter (weight 20%): 70
Documentation (weight 15%): 100
Issue Health (weight 10%): 50
License (weight 10%): 100
Description (weight 5%): 100

Skill Content

# Datamol Cheminformatics Skill

## Overview

Datamol is a Python library that provides a lightweight, Pythonic abstraction layer over RDKit for molecular cheminformatics. Simplify complex molecular operations with sensible defaults, efficient parallelization, and modern I/O capabilities. All molecular objects are native `rdkit.Chem.Mol` instances, ensuring full compatibility with the RDKit ecosystem.

**Key capabilities**:

- Molecular format conversion (SMILES, SELFIES, InChI)
- Structure standardization and sanitization
- Molecular descriptors and fingerprints
- 3D conformer generation and analysis
- Clustering and diversity selection
- Scaffold and fragment analysis
- Chemical reaction application
- Visualization and alignment
- Batch processing with parallelization
- Cloud storage support via fsspec

## Installation and Setup

Guide users to install datamol:

```bash
uv pip install datamol
```

Examples here are verified against **datamol 0.12.x** (pulls in RDKit automatically). The descriptor key names below are stable in this line; pin if you depend on them: `uv pip install 'datamol>=0.12,<0.13'`.

**Import convention**:

```python
import datamol as dm
```

## Core Workflows

### 1. Basic Molecule Handling

**Creating molecules from SMILES**:

```python
import datamol as dm

# Single molecule
mol = dm.to_mol("CCO")  # Ethanol

# From list of SMILES
smiles_list = ["CCO", "c1ccccc1", "CC(=O)O"]
mols = [dm.to_mol(smi) for smi in smiles_list]

# Error handling
mol = dm.to_mol...
```

Details

Author: AlterLab-IEU
Repository: AlterLab-IEU/AlterLab-Academic-Skills
Created: 2 months ago
Last Updated: today
Language: Python
License: MIT
