model-merging

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding + chat), improving performance beyond single models, or experimenting rapidly with model variants. Covers SLERP, TIES-Merging, DARE, Task Arithmetic, linear merging, and production deployment strategies.

AI & Automation | 27,984 stars | 2,901 forks | Updated today | MIT

Quality Score: 99/100

Stars (20%): 100
Recency (20%): 100
Frontmatter (20%): 70
Documentation (15%): 100
Issue Health (10%): 50
License (10%): 100
Description (5%): 100

Skill Content

# Model Merging: Combining Pre-trained Models

## When to Use This Skill

Use Model Merging when you need to:

- **Combine capabilities** from multiple fine-tuned models without retraining
- **Create specialized models** by blending domain-specific expertise (math + coding + chat)
- **Improve performance** beyond single models (often +5-10% on benchmarks)
- **Reduce training costs** - no GPUs needed, merges run on CPU
- **Experiment rapidly** - create new model variants in minutes, not days
- **Preserve multiple skills** - merge without catastrophic forgetting

**Success Stories**: Marcoro14-7B-slerp (best on Open LLM Leaderboard 02/2024), many top HuggingFace models use merging

**Tools**: mergekit (Arcee AI), LazyMergekit, Model Soup

## Installation

```bash
# Install mergekit from source
git clone https://github.com/arcee-ai/mergekit.git
cd mergekit
pip install -e .

# Or via pip
pip install mergekit

# Optional: Transformers library, for loading the merged model
pip install transformers torch
```

## Quick Start

### Simple Linear Merge

```yaml
# config.yml - Merge two models with equal weights
merge_method: linear
models:
  - model: mistralai/Mistral-7B-v0.1
    parameters:
      weight: 0.5
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
dtype: bfloat16
```

```bash
# Run merge (--cuda is optional; omit it to merge on CPU)
mergekit-yaml config.yml ./merged-model --cuda

# Use merged model: load it with the Transformers library
python -c "from transformers import AutoModelForCausalLM; AutoModelForCausalLM.from_pretrained('./merged-model')"
```

### SLERP Merge (Best for 2 Models)

```yaml
# config.yml - Sph...
```
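To make the two Quick Start methods more concrete, here is a minimal pure-Python sketch of what linear merging and SLERP compute on a single parameter tensor, flattened to a list of floats. It is an illustration under stated assumptions, not mergekit's implementation: the function names `linear_merge` and `slerp` are hypothetical, normalizing by the weight sum is an assumption, and the fallback to plain interpolation for near-parallel vectors is one common choice among several.

```python
import math


def linear_merge(tensors, weights):
    """Weighted average of equally sized flat parameter vectors.

    Assumption: weights are normalized by their sum, so 0.5/0.5 and 1/1
    give the same result.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    size = len(tensors[0])
    return [
        sum(w * t[i] for t, w in zip(tensors, weights)) / total
        for i in range(size)
    ]


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between flat vectors a and b.

    t=0 returns a, t=1 returns b. Falls back to plain linear
    interpolation when a vector is (near) zero or the two are (near)
    parallel, where the spherical formula is numerically unstable.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a < eps or norm_b < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return [(1 - t) * x + t * y for x, y in zip(a, b)]

    coeff_a = math.sin((1 - t) * omega) / sin_omega
    coeff_b = math.sin(t * omega) / sin_omega
    return [coeff_a * x + coeff_b * y for x, y in zip(a, b)]


if __name__ == "__main__":
    print(linear_merge([[1.0, 2.0], [3.0, 4.0]], [0.5, 0.5]))
    print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

In a real merge the same operation would be applied tensor by tensor across both checkpoints. SLERP interpolates along the arc between the two weight vectors rather than along the straight line between them, which is why it is defined for exactly two models.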

Details

Author: davila7
Repository: davila7/claude-code-templates
Created: 11 months ago
Last Updated: today
Language: Python
License: MIT
