2025.03.19 | Advantages of Dynamic Sequence Modeling, Challenges in Video Generation and Understanding - HuggingFace Daily AI Paper Digest

The 15 papers in this episode are:

00:21 🦢 RWKV-7 "Goose" with Expressive Dynamic State Evolution

00:55 🤯 Impossible Videos

01:38 🎨 Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

02:17 🤖 DAPO: An Open-Source LLM Reinforcement Learning System at Scale

02:58 🧠 DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

03:39 🖼 CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

04:25 🤖 Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation

05:13 🧠 Frac-Connections: Fractional Extension of Hyper-Connections

05:52 🌍 Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

06:30 🧐 MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

07:13 🤖 Aligning Multimodal LLM with Human Preference: A Survey

07:51 ⏱ Measuring AI Ability to Complete Long Tasks

08:38 🎭 Concat-ID: Towards Universal Identity-Preserving Video Synthesis

09:13 🖼 FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

09:50 🤔 Temporal Consistency for LLM Reasoning Process Error Identification

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast content.

Xiaohongshu: AI速递