Collections
Collections including paper arxiv:2309.02591
- AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
  Paper • 2402.12226 • Published • 40
- M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
  Paper • 2401.11649 • Published • 3
- Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
  Paper • 2402.15504 • Published • 21
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 185

- Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
  Paper • 2309.02591 • Published • 14
- LMDX: Language Model-based Document Information Extraction and Localization
  Paper • 2309.10952 • Published • 63
- FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation
  Paper • 2310.03214 • Published • 17
- Lemur: Harmonizing Natural Language and Code for Language Agents
  Paper • 2310.06830 • Published • 30

- SLiMe: Segment Like Me
  Paper • 2309.03179 • Published • 29
- Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
  Paper • 2309.02591 • Published • 14
- Efficient Memory Management for Large Language Model Serving with PagedAttention
  Paper • 2309.06180 • Published • 25
- LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning
  Paper • 2309.06440 • Published • 9