2024.12.11 Daily AI Papers | Improved Evaluation of Code Models, Breakthroughs in Video Generation

17 minutes

The 23 papers in this episode are:

00:25 🧑 Evaluating and Aligning CodeLLMs on Human Preference

01:19 🎥 STIV: Scalable Text and Image Conditioned Video Generation

01:59 🎨 DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

02:39 🔒 Hidden in the Noise: Two-Stage Robust Watermarking for Images

03:19 🎥 UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

04:04 📄 OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

04:50 🎨 FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

05:32 🎥 3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation

06:09 🧠 Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation

06:55 🧠 Perception Tokens Enhance Visual Reasoning in Multimodal Language Models

07:41 🎥 Video Motion Transfer with Diffusion Transformers

08:23 🚀 EMOv2: Pushing 5M Vision Model Frontier

09:02 🛡 Granite Guardian

09:44 🌟 ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance

10:30 🎥 ObjCtrl-2.5D: Training-free Object Control with Camera Poses

11:21 🚀 LoRA.rar: Learning to Merge LoRAs via Hypernetworks for Subject-Style Conditioned Image Generation

12:12 📱 MoViE: Mobile Diffusion for Video Editing

12:46 🧬 Chimera: Improving Generalist Model with Domain-Specific Experts

13:28 🌐 Fully Open Source Moxin-7B Technical Report

14:09 📱 Mobile Video Diffusion

14:45 🤖 Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation

15:24 🤖 Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment

16:15 🔒 A New Federated Learning Framework Against Gradient Inversion Attacks

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递