lerobot/aloha_sim_insertion_human_image

| Backend | VCodec | MSE ↓ | PSNR ↑ | SSIM ↑ |
|---|---|---|---|---|
| PyAV | libsvtav1 | 5.15E-05 | 43.22 | 0.9948 |
| TorchCodec | libsvtav1 | 5.10E-05 | 43.27 | 0.9949 |
| PyAV | libx264 | 1.59E-04 | 40.96 | 0.9784 |
| TorchCodec | libx264 | 1.58E-04 | 40.99 | 0.9785 |
| PyAV | libx265 | 1.85E-04 | 39.84 | 0.9802 |
| TorchCodec | libx265 | 1.42E-04 | 40.74 | 0.9815 |

lerobot/aloha_sim_transfer_cube_human_image

| Backend | VCodec | MSE ↓ | PSNR ↑ | SSIM ↑ |
|---|---|---|---|---|
| PyAV | libsvtav1 | 5.47E-05 | 44.62 | 0.9950 |
| TorchCodec | libsvtav1 | 5.18E-05 | 44.68 | 0.9950 |
| PyAV | libx264 | 1.71E-04 | 41.84 | 0.9795 |
| TorchCodec | libx264 | 1.68E-04 | 41.92 | 0.9793 |
| PyAV | libx265 | 2.23E-04 | 40.21 | 0.9805 |
| TorchCodec | libx265 | 1.46E-04 | 41.60 | 0.9826 |

lerobot/pusht_image

| Backend | VCodec | MSE ↓ | PSNR ↑ | SSIM ↑ |
|---|---|---|---|---|
| PyAV | libsvtav1 | 1.77E-04 | 37.79 | 0.9894 |
| TorchCodec | libsvtav1 | 1.82E-04 | 37.70 | 0.9891 |
| PyAV | libx264 | 2.88E-04 | 37.23 | 0.9826 |
| TorchCodec | libx264 | 2.88E-04 | 37.21 | 0.9826 |
| PyAV | libx265 | 4.34E-04 | 35.59 | 0.9782 |
| TorchCodec | libx265 | 3.34E-04 | 36.45 | 0.9802 |

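
For readers less familiar with the metric columns, here is a minimal sketch of how MSE and PSNR can be computed between an original frame and its decoded counterpart. It assumes frames are float arrays with pixel values in [0, 1]; the function names `mse` and `psnr` are illustrative and are not taken from the benchmark script, which may compute these metrics differently. SSIM is omitted because it needs a windowed implementation that does not fit in a short example.

```python
import numpy as np


def mse(reference, decoded):
    """Mean squared error between two frames of identical shape."""
    ref = np.asarray(reference, dtype=np.float64)
    dec = np.asarray(decoded, dtype=np.float64)
    if ref.shape != dec.shape:
        raise ValueError("frames must have the same shape")
    return float(np.mean((ref - dec) ** 2))


def psnr(reference, decoded, data_range=1.0):
    """Peak signal-to-noise ratio in dB.

    data_range is the maximum possible pixel value (assumed to be 1.0 here).
    """
    err = mse(reference, decoded)
    if err == 0.0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(data_range**2 / err))


if __name__ == "__main__":
    # Synthetic stand-in for an (original, decoded) frame pair.
    rng = np.random.default_rng(0)
    original = rng.random((96, 96, 3))
    decoded = np.clip(original + rng.normal(0.0, 0.01, original.shape), 0.0, 1.0)
    print(f"MSE:  {mse(original, decoded):.2e}")
    print(f"PSNR: {psnr(original, decoded):.2f} dB")
```

Lower MSE and higher PSNR both indicate a decoded frame closer to the original. If the reported values are averages over sampled frames (an assumption), the PSNR column will not equal the PSNR derived from the MSE column, because the mean of per-frame PSNR values differs from the PSNR of the mean MSE.
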
To reproduce the full results, you can run:
python benchmarks/video/run_video_benchmark.py \
    --output-dir outputs/video_benchmark \
    --repo-ids \
        lerobot/aloha_sim_transfer_cube_human_image \
        lerobot/pusht_image \
        lerobot/aloha_sim_insertion_human_image \
    --vcodec libsvtav1 libx265 libx264 \
    --pix-fmt yuv420p \
    --g 1 2 3 4 5 6 10 15 20 40 None \
    --crf 0 5 10 15 20 25 30 40 50 None \
    --timestamps-modes 1_frame 2_frames 6_frames \
    --backends torchcodec-cpu pyav \
    --num-samples 50 \
    --num-workers 4 \
    --save-frames 1
Alternatively, the full results are available as a CSV file:
https://drive.google.com/file/d/1AErjcDxi-DdLuBxD5DIHUAxdbCl_Gskv/view?usp=sharing