dev-tpu-ray

Use the legacy `scripts/ray/dev_tpu.py` workflow to allocate a temporary Ray-backed TPU VM for fast debugging, testing, and benchmark iteration. Use only when you specifically need the Ray-backed dev TPU path.

Testing & QA 840 stars 103 forks Updated 4 days ago Apache-2.0

Skill Content

# Skill: Legacy Ray Dev TPU

Use this skill only when you specifically need the legacy Ray-backed dev TPU workflow. Prefer `.agents/skills/dev-tpu/SKILL.md` for the current Iris-backed path.

`scripts/ray/dev_tpu.py` can reserve a temporary TPU VM, sync the repo, and run commands remotely. It is good for:

- quick test and benchmark loops,
- memory debugging,
- profiling and trace capture,
- short experiments where you want direct shell access.

It is a bad fit for long unattended experiments or many concurrent TPU commands.

## Critical concurrency rule

Run at most one TPU job at a time on a given dev TPU VM. Do not launch concurrent TPU commands from separate shells, tmux panes, or background jobs against the same dev TPU.

## Commands

- `allocate`: reserve a TPU VM and keep it alive while the command runs. This also writes an SSH alias into `~/.ssh/config`.
- `connect`: open an interactive shell on the TPU.
- `execute`: sync local files to remote `~/marin/` unless `--no-sync`, then run one command.
- `watch`: rsync + restart on local file changes.

## Prerequisites

1. Authenticate to GCP and set up the Marin development environment.

   ```bash
   gcloud auth login
   gcloud config set project hai-gcp-models
   gcloud auth application-default login
   make dev_setup
   ```

2. Ensure your SSH public key is in project metadata: `https://console.cloud.google.com/compute/metadata?resourceTab=sshkeys&project=hai-gcp-models&scopeTab=projectMetadata`

## Quick Start

Allocate:

```bash
RAY_...
```
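The critical concurrency rule above (at most one TPU job at a time per dev TPU VM) can be guarded locally with a small wrapper. The following is a minimal sketch, not part of `scripts/ray/dev_tpu.py`: the function name `run_exclusive` and the lock path are hypothetical, and the lock only protects against concurrent launches from the same workstation. It relies on `mkdir` being atomic, so a second caller fails fast instead of starting a second TPU command.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not provided by the repo): allow only one wrapped
# command at a time by using a lock directory on the local machine.

# Usage: run_exclusive LOCKDIR CMD [ARGS...]
# Returns 75 if LOCKDIR already exists, otherwise the exit status of CMD.
run_exclusive() {
  local lockdir="$1"
  shift
  # mkdir either creates the directory or fails; it never does both,
  # so only one caller can acquire the lock.
  if ! mkdir "$lockdir" 2>/dev/null; then
    echo "another TPU job appears to be running (lock: ${lockdir})" >&2
    return 75
  fi
  local rc=0
  "$@" || rc=$?
  # Release the lock once the command has finished, whatever its status.
  rmdir "$lockdir"
  return "$rc"
}
```

To use it, wrap whatever TPU command you normally run, for example `run_exclusive "$HOME/.dev_tpu.lock" <your dev_tpu.py execute invocation>`. If the wrapped process is killed before it can clean up, the lock directory stays behind and must be removed by hand, which is the usual trade-off of this simple pattern.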

Details

Author: marin-community
Repository: marin-community/marin
Created: 2 years ago
Last Updated: 4 days ago
Language: Python
License: Apache-2.0
