First of all, thank you so much for your work and for open-sourcing it!
Currently, we have a Franka Research 3 set up to solve a single task. We first tried different out-of-the-box policies, which, as expected, resulted in a 0% success rate. We then implemented a script to extract demonstrations of this specific task from the DROID dataset and fine-tuned OpenVLA on them. The robot performed much better but still could not solve the task even once. We then recorded some demonstrations ourselves to fine-tune a new OpenVLA checkpoint, which led us to the following questions:
- Could we improve the success rate by fine-tuning on both the DROID-extracted demonstrations and the recorded demonstrations, which both solve the same task?
- If we wanted to fine-tune a checkpoint on both sets of demonstrations, what would be the ideal way to do so, given that the magnitudes of the actions differ greatly between the two sets?
  - Should we fine-tune on the DROID demonstrations first, and then fine-tune the resulting checkpoint on our own demonstrations?
  - Or would it be better to create a mixed dataset combining both sets of demonstrations?
  - If so, would it make sense to normalize the actions of both sets before fine-tuning, to close the gap in action scale between them? (A rough sketch of what we mean is below.)
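To make the last point concrete, here is a rough sketch (plain NumPy, with hypothetical array shapes and function names of our own, not from the OpenVLA codebase) of what we have in mind: normalizing each set with its own 1st/99th-percentile statistics so both land in [-1, 1], and only then mixing them.

```python
import numpy as np

def compute_action_stats(actions: np.ndarray) -> dict:
    """Per-dimension 1st/99th percentile statistics for one demonstration set.

    `actions` is assumed to be an (N, action_dim) array of raw actions.
    """
    return {
        "q01": np.percentile(actions, 1, axis=0),
        "q99": np.percentile(actions, 99, axis=0),
    }

def normalize_actions(actions: np.ndarray, stats: dict) -> np.ndarray:
    """Map actions into [-1, 1] using the per-set statistics."""
    q01, q99 = stats["q01"], stats["q99"]
    scaled = 2.0 * (actions - q01) / np.maximum(q99 - q01, 1e-8) - 1.0
    return np.clip(scaled, -1.0, 1.0)

# Placeholder arrays standing in for the two demonstration sets.
droid_actions = np.random.uniform(-0.05, 0.05, size=(1000, 7))  # e.g. small action deltas
own_actions = np.random.uniform(-0.5, 0.5, size=(200, 7))       # e.g. larger-scale actions

# Normalize each set with its *own* statistics so both end up on the same scale...
droid_norm = normalize_actions(droid_actions, compute_action_stats(droid_actions))
own_norm = normalize_actions(own_actions, compute_action_stats(own_actions))

# ...then concatenate into a single mixed fine-tuning dataset.
mixed_actions = np.concatenate([droid_norm, own_norm], axis=0)
```

Would something along these lines be reasonable, or is there a better way to reconcile the two action distributions before fine-tuning?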
Thank you very much. I would appreciate any input on this.