- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 47
- Linear Transformers are Versatile In-Context Learners
  Paper • 2402.14180 • Published • 7
- Training-Free Long-Context Scaling of Large Language Models
  Paper • 2402.17463 • Published • 23
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 627

Collections including paper arxiv:2405.01535

- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 28
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 72
- Poro 34B and the Blessing of Multilinguality
  Paper • 2404.01856 • Published • 15

- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 95
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 48
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 73
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
  Paper • 2401.16380 • Published • 50

- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment
  Paper • 2303.16634 • Published • 3
- miracl/miracl-corpus
  Viewer • Updated • 77.2M • 2.13k • 51
- Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
  Paper • 2306.05685 • Published • 39
- How is ChatGPT's behavior changing over time?
  Paper • 2307.09009 • Published • 24

- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 15
- AgentTuning: Enabling Generalized Agent Abilities for LLMs
  Paper • 2310.12823 • Published • 36
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
  Paper • 2308.10848 • Published • 1
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 10

- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 23
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 85
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25

- Fusion-Eval: Integrating Evaluators with LLMs
  Paper • 2311.09204 • Published • 6
- Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
  Paper • 2311.06720 • Published • 9
- Safurai 001: New Qualitative Approach for Code LLM Evaluation
  Paper • 2309.11385 • Published • 2
- Assessment of Pre-Trained Models Across Languages and Grammars
  Paper • 2309.11165 • Published • 1

- UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
  Paper • 2311.09257 • Published • 47
- VideoPoet: A Large Language Model for Zero-Shot Video Generation
  Paper • 2312.14125 • Published • 46
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 31
- VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
  Paper • 2401.01256 • Published • 21

- JudgeLM: Fine-tuned Large Language Models are Scalable Judges
  Paper • 2310.17631 • Published • 35
- Prometheus: Inducing Fine-grained Evaluation Capability in Language Models
  Paper • 2310.08491 • Published • 55
- Generative Judge for Evaluating Alignment
  Paper • 2310.05470 • Published • 1
- Calibrating LLM-Based Evaluator
  Paper • 2309.13308 • Published • 12

- SMOTE: Synthetic Minority Over-sampling Technique
  Paper • 1106.1813 • Published • 1
- Scikit-learn: Machine Learning in Python
  Paper • 1201.0490 • Published • 1
- Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
  Paper • 1406.1078 • Published • 1
- Distributed Representations of Sentences and Documents
  Paper • 1405.4053 • Published