TorchCodec Benchmark Results

📊 Backend Performance Comparison for Different VCodecs

🛠 Dataset: `lerobot/aloha_sim_insertion_human_image`

| Backend | VCodec | MSE ↓ | PSNR ↑ | SSIM ↑ |
| --- | --- | --- | --- | --- |
| PyAV | libsvtav1 | 5.15E-05 | 43.22 | 0.9948 |
| TorchCodec | libsvtav1 | 5.10E-05 | 43.27 | 0.9949 |
| PyAV | libx264 | 1.59E-04 | 40.96 | 0.9784 |
| TorchCodec | libx264 | 1.58E-04 | 40.99 | 0.9785 |
| PyAV | libx265 | 1.85E-04 | 39.84 | 0.9802 |
| TorchCodec | libx265 | 1.42E-04 | 40.74 | 0.9815 |

🛠 Dataset: `lerobot/aloha_sim_transfer_cube_human_image`

| Backend | VCodec | MSE ↓ | PSNR ↑ | SSIM ↑ |
| --- | --- | --- | --- | --- |
| PyAV | libsvtav1 | 5.47E-05 | 44.62 | 0.9950 |
| TorchCodec | libsvtav1 | 5.18E-05 | 44.68 | 0.9950 |
| PyAV | libx264 | 1.71E-04 | 41.84 | 0.9795 |
| TorchCodec | libx264 | 1.68E-04 | 41.92 | 0.9793 |
| PyAV | libx265 | 2.23E-04 | 40.21 | 0.9805 |
| TorchCodec | libx265 | 1.46E-04 | 41.60 | 0.9826 |

🛠 Dataset: `lerobot/pusht_image`

| Backend | VCodec | MSE ↓ | PSNR ↑ | SSIM ↑ |
| --- | --- | --- | --- | --- |
| PyAV | libsvtav1 | 1.77E-04 | 37.79 | 0.9894 |
| TorchCodec | libsvtav1 | 1.82E-04 | 37.70 | 0.9891 |
| PyAV | libx264 | 2.88E-04 | 37.23 | 0.9826 |
| TorchCodec | libx264 | 2.88E-04 | 37.21 | 0.9826 |
| PyAV | libx265 | 4.34E-04 | 35.59 | 0.9782 |
| TorchCodec | libx265 | 3.34E-04 | 36.45 | 0.9802 |
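
For readers unfamiliar with the MSE and PSNR columns, here is a minimal sketch of how these two metrics are commonly computed between a reference frame and a decoded frame. It is not the benchmark script's implementation. It assumes pixel values normalized to [0, 1], which is consistent with the magnitudes reported above (MSE around 1e-4 alongside PSNR around 40 dB). The function names `mse` and `psnr` are hypothetical. SSIM is omitted because it needs a windowed computation that does not fit a short example.

```python
import math
from typing import Sequence


def mse(reference: Sequence[float], decoded: Sequence[float]) -> float:
    """Mean squared error between two flattened frames of equal size."""
    if len(reference) != len(decoded):
        raise ValueError("frames must have the same number of values")
    if len(reference) == 0:
        raise ValueError("frames must not be empty")
    return sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)


def psnr(
    reference: Sequence[float],
    decoded: Sequence[float],
    max_value: float = 1.0,  # assumption: pixels normalized to [0, 1]
) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    error = mse(reference, decoded)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 4-pixel "frames": only one pixel differs, by 0.1.
reference_frame = [0.0, 0.5, 1.0, 0.25]
decoded_frame = [0.0, 0.5, 0.9, 0.25]
print(f"MSE:  {mse(reference_frame, decoded_frame):.6f}")
print(f"PSNR: {psnr(reference_frame, decoded_frame):.2f} dB")
```

If PSNR is averaged per frame rather than derived from the averaged MSE, the reported PSNR will generally not equal `10 * log10(1 / MSE)` computed from the MSE column. That may explain the small differences between the two columns.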

To reproduce the full results, you can run:

python benchmarks/video/run_video_benchmark.py \
    --output-dir outputs/video_benchmark \
    --repo-ids \
        lerobot/aloha_sim_transfer_cube_human_image \
        lerobot/pusht_image \
        lerobot/aloha_sim_insertion_human_image \
    --vcodec libsvtav1 libx265 libx264 \
    --pix-fmt yuv420p \
    --g 1 2 3 4 5 6 10 15 20 40 None \
    --crf 0 5 10 15 20 25 30 40 50 None \
    --timestamps-modes 1_frame 2_frames 6_frames \
    --backends torchcodec-cpu pyav \
    --num-samples 50 \
    --num-workers 4 \
    --save-frames 1

Alternatively, the full CSV file is available here:

https://drive.google.com/file/d/1AErjcDxi-DdLuBxD5DIHUAxdbCl_Gskv/view?usp=sharing
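
To condense a per-configuration CSV into per-backend, per-codec averages like those in the tables above, a sketch along the following lines might be a starting point. The column names `backend`, `vcodec`, and `avg_psnr` are assumptions chosen for illustration. Check the header of the actual CSV and adjust them. The function name `mean_metric` is likewise hypothetical.

```python
import csv
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Mapping, Tuple


def mean_metric(
    rows: Iterable[Mapping[str, str]],
    metric: str = "avg_psnr",     # hypothetical column name
    backend_col: str = "backend",  # hypothetical column name
    vcodec_col: str = "vcodec",    # hypothetical column name
) -> Dict[Tuple[str, str], float]:
    """Average one metric column over all rows sharing (backend, vcodec)."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get(metric)
        if value is None or value == "":
            continue  # skip rows where the metric is missing
        groups[(row[backend_col], row[vcodec_col])].append(float(value))
    return {key: mean(values) for key, values in groups.items()}


def mean_metric_from_csv(csv_path: str, **kwargs) -> Dict[Tuple[str, str], float]:
    """Convenience wrapper that reads the rows from a CSV file on disk."""
    with open(csv_path, newline="") as f:
        return mean_metric(csv.DictReader(f), **kwargs)
```

Calling `mean_metric_from_csv` with the path of the downloaded file returns a dictionary keyed by `(backend, vcodec)` pairs, with one entry per row of the summary tables.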
