- Title: Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
- Paper: https://arxiv.org/abs/2306.05424
- ACL 2024
- Data: an instruction-tuning dataset constructed by the authors
- Metrics: a benchmark proposed by the authors; open-ended video QA
- Hardware: 8 A100 (40 GB), batch size 32
- Open source: https://github.com/mbzuai-oryx/Video-ChatGPT