abunchofrandomwords

abunchofrandomwords's activity

upvoted a paper 8 days ago

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published 17 days ago • 72

upvoted an article about 2 months ago

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Article • Jun 24 • 166

upvoted an article 2 months ago

Announcing Finance Commons and the Bad Data Toolbox: Pioneering Open Data and Advanced Document Processing

Article • Jul 19 • 17

upvoted 2 papers 3 months ago

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 77

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6 • 71

upvoted a collection 5 months ago

📀 Dataset comparison models

1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12 • 27

upvoted a paper 11 months ago

From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting

Paper • 2309.04269 • Published Sep 8, 2023 • 32

upvoted 53 papers about 1 year ago

Large Language Models as Optimizers

Paper • 2309.03409 • Published Sep 7, 2023 • 75

FocalFormer3D : Focusing on Hard Instance for 3D Object Detection

Paper • 2308.04556 • Published Aug 8, 2023 • 8

JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models

Paper • 2308.04729 • Published Aug 9, 2023 • 31

Shepherd: A Critic for Language Model Generation

Paper • 2308.04592 • Published Aug 8, 2023 • 29

PDE-Refiner: Achieving Accurate Long Rollouts with Neural PDE Solvers

Paper • 2308.05732 • Published Aug 10, 2023 • 8

Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI

Paper • 2308.05221 • Published Aug 9, 2023 • 9

Flexible Isosurface Extraction for Gradient-Based Mesh Optimization

Paper • 2308.05371 • Published Aug 10, 2023 • 10

Follow Anything: Open-set detection, tracking, and following in real-time

Paper • 2308.05737 • Published Aug 10, 2023 • 11

OpenProteinSet: Training data for structural biology at scale

Paper • 2308.05326 • Published Aug 10, 2023 • 10

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

Paper • 2308.05374 • Published Aug 10, 2023 • 27

AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Paper • 2308.05734 • Published Aug 10, 2023 • 36

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Paper • 2308.01390 • Published Aug 2, 2023 • 31

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

Paper • 2308.02151 • Published Aug 4, 2023 • 18

Mirror-NeRF: Learning Neural Radiance Fields for Mirrors with Whitted-Style Ray Tracing

Paper • 2308.03280 • Published Aug 7, 2023 • 6

Tiny LVLM-eHub: Early Multimodal Experiments with Bard

Paper • 2308.03729 • Published Aug 7, 2023 • 9

TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents

Paper • 2308.03427 • Published Aug 7, 2023 • 14

Seeing through the Brain: Image Reconstruction of Visual Perception from Human Brain Signals

Paper • 2308.02510 • Published Jul 27, 2023 • 21

AgentBench: Evaluating LLMs as Agents

Paper • 2308.03688 • Published Aug 7, 2023 • 24

UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition

Paper • 2308.03279 • Published Aug 7, 2023 • 21

AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose

Paper • 2308.03610 • Published Aug 7, 2023 • 23

ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation

Paper • 2308.03793 • Published Aug 4, 2023 • 10

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

Paper • 2308.04430 • Published Aug 8, 2023 • 9

3D Gaussian Splatting for Real-Time Radiance Field Rendering

Paper • 2308.04079 • Published Aug 8, 2023 • 165

Simple synthetic data reduces sycophancy in large language models

Paper • 2308.03958 • Published Aug 7, 2023 • 21

Ambient Adventures: Teaching ChatGPT on Developing Complex Stories

Paper • 2308.01734 • Published Aug 3, 2023 • 6

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

Paper • 2308.01907 • Published Aug 3, 2023 • 10

HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions

Paper • 2308.01477 • Published Aug 2, 2023 • 11

Multimodal Neurons in Pretrained Text-Only Transformers

Paper • 2308.01544 • Published Aug 3, 2023 • 15

DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

Paper • 2308.01320 • Published Aug 2, 2023 • 44

Training Data Protection with Compositional Diffusion Models

Paper • 2308.01937 • Published Aug 2, 2023 • 5

Scaling Clinical Trial Matching Using Large Language Models: A Case Study in Oncology

Paper • 2308.02180 • Published Aug 4, 2023 • 9

Getting the Ball Rolling: Learning a Dexterous Policy for a Biomimetic Tendon-Driven Hand with Rolling Contact Joints

Paper • 2308.02453 • Published Aug 4, 2023 • 8

Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP

Paper • 2308.02487 • Published Aug 4, 2023 • 12

MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities

Paper • 2308.02490 • Published Aug 4, 2023 • 16

Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks

Paper • 2306.04362 • Published Jun 7, 2023 • 2

MobileNMT: Enabling Translation in 15MB and 30ms

Paper • 2306.04235 • Published Jun 7, 2023 • 3

ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections

Paper • 2306.04619 • Published Jun 7, 2023 • 4

LLMZip: Lossless Text Compression using Large Language Models

Paper • 2306.04050 • Published Jun 6, 2023 • 4

M^3IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning

Paper • 2306.04387 • Published Jun 7, 2023 • 8

Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts

Paper • 2306.04845 • Published Jun 8, 2023 • 4

Modular Visual Question Answering via Code Generation

Paper • 2306.05392 • Published Jun 8, 2023 • 2

Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models

Paper • 2306.05357 • Published Jun 8, 2023 • 3

Optimizing ViViT Training: Time and Memory Reduction for Action Recognition

Paper • 2306.04822 • Published Jun 7, 2023 • 2

LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs

Paper • 2306.05410 • Published Jun 8, 2023 • 2

Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

Paper • 2306.05424 • Published Jun 8, 2023 • 7

Improving Open Language Models by Learning from Organic Interactions

Paper • 2306.04707 • Published Jun 7, 2023 • 3

MIMIC-IT: Multi-Modal In-Context Instruction Tuning

Paper • 2306.05425 • Published Jun 8, 2023 • 11

SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions

Paper • 2306.05178 • Published Jun 8, 2023 • 6

Background Prompting for Improved Object Depth

Paper • 2306.05428 • Published Jun 8, 2023 • 3

Matting Anything

Paper • 2306.05399 • Published Jun 8, 2023 • 6

Tracking Everything Everywhere All at Once

Paper • 2306.05422 • Published Jun 8, 2023 • 10

Simple and Controllable Music Generation

Paper • 2306.05284 • Published Jun 8, 2023 • 141

Embodied Executable Policy Learning with Language-based Scene Summarization

Paper • 2306.05696 • Published Jun 9, 2023 • 3