Skip to content

First-time Setup

Before running training you need three things: a HuggingFace account with a token, access to the gated base models, and your dataset uploaded to HF Hub. This is a one-time process per account.

1. HuggingFace token

Create an account at huggingface.co if you don't have one. Generate a token with write permission at huggingface.co/settings/tokens — write access is needed to push trained checkpoints back to the Hub.

export HF_TOKEN=hf_xxx   # add to your shell profile or .env

Warning

Never commit HF_TOKEN to the repo. Add .env to your .gitignore.

2. Accept the base model licenses

VLASH fine-tunes PaliGemma-based models that are gated. You must accept the license on the HF website once before your token can download the weights.

Visit each model page and click Agree and access repository:

Acceptance is per-account and propagates immediately. The container downloads weights automatically on first run and caches them in $SCRATCH/.cache/huggingface.

3. Upload your dataset

Your dataset must be in LeRobot format (data/, videos/, meta/ folders with a meta/info.json).

# Authenticate
huggingface-cli login --token $HF_TOKEN

# Create the repo
huggingface-cli repo create your-dataset-name --type dataset --private

# Upload — preserves directory structure exactly
huggingface-cli upload your-hf-username/your-dataset-name \
  /path/to/local/lerobot/dataset/ \
  --repo-type dataset

Set DATASET_REPO_ID=your-hf-username/your-dataset-name when running training.

Team datasets

Create a HuggingFace organisation, push under your-org/your-dataset-name, and invite collaborators at huggingface.co/your-org → Settings → Members.

4. Push trained checkpoints (optional)

Set push_to_hub: true and repo_id: your-org/your-model in your training config to automatically upload the final checkpoint after training. Recommended for HPC clusters where scratch storage is purged periodically.