
hugging-face-datasets

Create and manage datasets on the Hugging Face Hub. Supports initializing repos, defining configs and system prompts, streaming row updates, and SQL-based dataset querying and transformation. Designed to work alongside the HF MCP server for comprehensive dataset workflows.
tayyabexe/skills · ★ 3 · Data & Documents · score 76
Install: claude install-skill tayyabexe/skills
# Overview

This skill provides tools to manage datasets on the Hugging Face Hub with a focus on creation, configuration, content management, and SQL-based data manipulation. It is designed to complement the existing Hugging Face MCP server by providing dataset editing and querying capabilities.

## Integration with HF MCP Server

- **Use HF MCP Server for**: Dataset discovery, search, and metadata retrieval
- **Use This Skill for**: Dataset creation, content editing, SQL queries, data transformation, and structured data formatting

# Version

2.1.0

# Dependencies

This skill uses PEP 723 scripts with inline dependency management. Scripts auto-install requirements when run with: `uv run scripts/script_name.py`

- uv (Python package manager)
- Getting Started: See "Usage Instructions" below for PEP 723 usage

# Core Capabilities

## 1. Dataset Lifecycle Management

- **Initialize**: Create new dataset repositories with proper structure
- **Configure**: Store detailed configuration including system prompts and metadata
- **Stream Updates**: Add rows efficiently without downloading entire datasets

## 2. SQL-Based Dataset Querying (NEW)

Query any Hugging Face dataset using DuckDB SQL via `scripts/sql_manager.py`:

- **Direct Queries**: Run SQL on datasets using the `hf://` protocol
- **Schema Discovery**: Describe dataset structure and column types
- **Data Sampling**: Get random samples for exploration
- **Aggregations**: Count, histogram, unique values analysis
- **Transformations