fine-tuning-openvla-oft


Fine-tunes and evaluates OpenVLA-OFT and OpenVLA-OFT+ policies for robot action generation with continuous action heads, LoRA adaptation, and FiLM conditioning on LIBERO simulation and ALOHA real-world setups. Use when reproducing OpenVLA-OFT paper results, training custom VLA action heads (L1 or diffusion), deploying server-client inference for ALOHA, or debugging normalization, LoRA merge, and cross-GPU issues.

AI & Automation 9,609 stars 724 forks Updated 1 month ago MIT



Skill Content

# OpenVLA-OFT

Fine-tuning and evaluation workflows for OpenVLA-OFT and OpenVLA-OFT+ from the official `openvla-oft` codebase. Covers blank-machine setup plus LoRA-based adaptation of OpenVLA for robot action generation with continuous action prediction heads.

## Quick start

Clone the public repo, follow the official setup, then evaluate a pretrained LIBERO checkpoint:

```bash
git clone https://github.com/moojink/openvla-oft.git
cd openvla-oft

python experiments/robot/libero/run_libero_eval.py \
  --pretrained_checkpoint moojink/openvla-7b-oft-finetuned-libero-spatial \
  --task_suite_name libero_spatial \
  --center_crop True \
  --num_trials_per_task 50 \
  --seed 7
```

## Core concepts

**What OpenVLA-OFT changes**: Standard OpenVLA tokenizes continuous actions into discrete bins, losing precision. OFT replaces this with dedicated continuous action heads (L1 regression or diffusion) while keeping the VLA backbone frozen and adapting via LoRA.

**OFT vs OFT+ variants**:

| Variant | FiLM | Images | Typical use |
|---------|------|--------|-------------|
| OFT | Off | 2 (front + wrist) | LIBERO simulation |
| OFT+ | On | 3 (high + left + right wrist) | ALOHA real-world |

**Key architecture choices**:

- **LoRA adaptation**: Rank-32 LoRA on VLA backbone (no full fine-tuning needed)
- **Continuous actions**: L1 regression head (default) or diffusion head
- **FiLM conditioning**: Feature-wise Linear Modulation for stronger language grounding in OFT+
- **Multi-image input**:...
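
Two of the core concepts above can be illustrated with a small NumPy sketch; it is not code from the `openvla-oft` repository. The first part shows why binning continuous actions loses precision, assuming 256 uniform bins over a normalized range of [-1, 1] (both the bin count and the range are assumptions made for illustration). The second part shows FiLM in its plain form, a per-channel scale and shift of the features; the helper names `discretize`, `undiscretize`, and `film` are hypothetical, and the actual codebase may parameterize FiLM differently.

```python
import numpy as np

N_BINS = 256  # assumption: uniform bins over the normalized action range


def discretize(actions, n_bins=N_BINS, low=-1.0, high=1.0):
    """Map continuous actions to integer bin indices in [0, n_bins - 1]."""
    edges = np.linspace(low, high, n_bins + 1)
    idx = np.digitize(actions, edges) - 1
    # Values exactly at `high` fall past the last edge; clip them into the last bin.
    return np.clip(idx, 0, n_bins - 1)


def undiscretize(idx, n_bins=N_BINS, low=-1.0, high=1.0):
    """Map bin indices back to bin-center values."""
    edges = np.linspace(low, high, n_bins + 1)
    centers = (edges[:-1] + edges[1:]) / 2.0
    return centers[idx]


def film(features, gamma, beta):
    """Plain FiLM: scale and shift each feature channel.

    features: (tokens, channels); gamma, beta: (channels,).
    In a FiLM-conditioned model, gamma and beta would be predicted
    from the language embedding rather than passed in by hand.
    """
    return gamma * features + beta


# Round-trip 7-dimensional actions through the bins and measure the loss.
rng = np.random.default_rng(0)
actions = rng.uniform(-1.0, 1.0, size=(1000, 7))
recovered = undiscretize(discretize(actions))
max_err = np.abs(recovered - actions).max()

# The worst-case error is half a bin width: (2 / 256) / 2 = 1 / 256.
half_bin = (2.0 / N_BINS) / 2.0
print(f"max round-trip error: {max_err:.5f} (bound: {half_bin:.5f})")

# FiLM on toy features: each channel gets its own scale and shift.
features = np.ones((4, 3))
modulated = film(features, gamma=np.array([1.0, 2.0, 3.0]), beta=np.array([0.0, 1.0, -1.0]))
```

A continuous head trained with an L1 loss regresses the action values directly, so it is not subject to this bin-width error floor.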

Details

Author: Orchestra-Research
Repository: Orchestra-Research/AI-Research-SKILLs
Created: 7 months ago
Last Updated: 1 month ago
Language: TeX
License: MIT
