- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
  Paper • 2210.14986 • Published • 4
- Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
  Paper • 2311.10702 • Published • 18
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 75
- From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
  Paper • 2309.04269 • Published • 32
Collections including paper arxiv:2403.09611
- Building and better understanding vision-language models: insights and future directions
  Paper • 2408.12637 • Published • 109
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
  Paper • 2408.11039 • Published • 54
- Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming
  Paper • 2408.16725 • Published • 49
- Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
  Paper • 2408.15998 • Published • 81
- VILA^2: VILA Augmented VILA
  Paper • 2407.17453 • Published • 38
- Octopus v4: Graph of language models
  Paper • 2404.19296 • Published • 118
- Octo-planner: On-device Language Model for Planner-Action Agents
  Paper • 2406.18082 • Published • 47
- Recursive Introspection: Teaching Language Model Agents How to Self-Improve
  Paper • 2407.18219 • Published • 3
- Needle In A Multimodal Haystack
  Paper • 2406.07230 • Published • 52
- OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
  Paper • 2406.08418 • Published • 28
- Chameleon: Mixed-Modal Early-Fusion Foundation Models
  Paper • 2405.09818 • Published • 125
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 123
- Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation
  Paper • 2404.19752 • Published • 22
- How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
  Paper • 2404.16821 • Published • 53
- MoAI: Mixture of All Intelligence for Large Language and Vision Models
  Paper • 2403.07508 • Published • 75
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 123
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
  Paper • 2402.14905 • Published • 107
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 590
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 123
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 103
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 103
- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 38
- ViTAR: Vision Transformer with Any Resolution
  Paper • 2403.18361 • Published • 51
- Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
  Paper • 2403.18814 • Published • 44