Install: claude install-skill oaustegard/claude-skills
# Extracting Keywords
Extract keywords from text using YAKE (Yet Another Keyword Extractor), an unsupervised statistical keyword extraction algorithm.
## Installation
**First time only:** Install YAKE with optimized dependencies to avoid unnecessary downloads.
```bash
cd /home/claude
uv venv yake-venv --system-site-packages
uv pip install yake --python yake-venv/bin/python --no-deps
uv pip install jellyfish segtok regex --python yake-venv/bin/python
```
The `--system-site-packages` flag lets the venv reuse system packages (numpy, networkx) instead of downloading them (~0.08s vs ~5s install time).
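To confirm the environment is usable before extracting anything, a quick import check can help. This is a minimal sketch, not part of the skill itself: the helper name `missing_modules` is hypothetical, and the module list simply mirrors the packages named in the install commands above.

```python
import importlib.util

# Packages named in the install steps above (YAKE, its explicit deps,
# and the two reused system packages).
REQUIRED = ["yake", "jellyfish", "segtok", "regex", "numpy", "networkx"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All YAKE dependencies are importable.")
```

Run it with the venv interpreter (`yake-venv/bin/python`) so the check reflects that environment rather than the system Python.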
## Stopwords Configuration
**Built-in YAKE stopwords (34 languages):** Pass the `lan="<code>"` parameter to select a language
- See Parameters section below for all 34 supported language codes
- English (`lan="en"`) is the default
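As an illustration of the `lan` parameter, the sketch below wraps YAKE's `KeywordExtractor` in a small helper. The wrapper name `extract_keywords` and its defaults (`max_ngram=3`, `top=10`) are assumptions chosen for the example, not values prescribed by this skill.

```python
def extract_keywords(text, lan="en", max_ngram=3, top=10):
    """Return (keyword, score) pairs using YAKE's built-in stopwords.

    Lower scores indicate more relevant keywords.
    """
    # Imported lazily so this module can be loaded without YAKE installed.
    import yake

    extractor = yake.KeywordExtractor(lan=lan, n=max_ngram, top=top)
    return extractor.extract_keywords(text)


if __name__ == "__main__":
    sample = (
        "YAKE is an unsupervised statistical keyword extraction algorithm "
        "that needs no training corpus."
    )
    for keyword, score in extract_keywords(sample, lan="en"):
        print(f"{score:.4f}  {keyword}")
```

Switching languages only requires a different code, for example `extract_keywords(text, lan="pt")`, provided the code is one of the 34 that YAKE bundles.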
**Custom domain stopwords (bundled in `assets/`):**
**AI/ML:** `stopwords_ai.txt`
- English stopwords + 783 AI/ML domain-specific terms (1357 total)
- Filters AI/ML methodology noise (model, training, network, algorithm, parameter)
- Filters ML boilerplate (dataset, baseline, benchmark, experiment, evaluation)
- Filters technical terms (transformer, embedding, attention, optimization, inference)
- Includes all inflected forms of each term (train/trains/trained/training/trainer)
- Use for AI/ML papers, technical reports, machine learning literature
- **Performance impact:** +4-5% runtime vs English stopwords
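A custom list such as `stopwords_ai.txt` can be handed to YAKE through the `stopwords` argument of `KeywordExtractor`. The sketch below assumes the file holds one stopword per line, which this section does not confirm. The helper names `load_stopwords` and `make_extractor` are hypothetical.

```python
from pathlib import Path


def load_stopwords(path):
    """Read a stopword file into a set (assumed format: one term per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Lowercase and strip so matching is insensitive to case and padding.
    return {line.strip().lower() for line in lines if line.strip()}


def make_extractor(stopwords_path, max_ngram=3, top=10):
    """Build a YAKE extractor that uses a custom stopword list."""
    # Imported lazily so load_stopwords works without YAKE installed.
    import yake

    return yake.KeywordExtractor(
        lan="en",
        n=max_ngram,
        top=top,
        stopwords=load_stopwords(stopwords_path),
    )


if __name__ == "__main__":
    extractor = make_extractor("assets/stopwords_ai.txt")
    text = "We train a transformer model on a new benchmark for protein folding."
    for keyword, score in extractor.extract_keywords(text):
        print(f"{score:.4f}  {keyword}")
```

When `stopwords` is supplied, YAKE uses that collection in place of the built-in list for the chosen `lan`, which is why the bundled files already include the English stopwords.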
**Life Sciences:** `stopwords_ls.txt`
- English stopwords + 719 life sciences domain-specific terms (1293 total)
- Filters research met