wchai/AuroraCap-7B-VID
Updated
•
23
•
2
Efficient, Performant Video Detailed Captioning and a New Benchmark
Note The VDC benchmark contains 1,027 videos with captions averaging over 500 words.
Note over 20M image and video data collection for AuroraCap training with vicuna and llama-3 pre-tokenize.
Note video data recaptioned by AuroraCap.