Adapting Large Language Models via Reading Comprehension
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
PDFTriage: Question Answering over Long, Structured Documents
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
MindAgent: Emergent Gaming Interaction
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Recovering from Privacy-Preserving Masking with Large Language Models
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
Augmenting text for spoken language understanding with Large Language Models
Language Modeling Is Compression
Baichuan 2: Open Large-scale Language Models
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Chain-of-Verification Reduces Hallucination in Large Language Models
LMDX: Language Model-based Document Information Extraction and Localization
SlimPajama-DC: Understanding Data Combinations for LLM Training
Contrastive Decoding Improves Reasoning in Large Language Models
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
A Data Source for Reasoning Embodied Agents
Leveraging Contextual Information for Effective Entity Salience Detection
LASER: LLM Agent with State-Space Exploration for Web Navigation
Sparse Autoencoders Find Highly Interpretable Features in Language Models
Investigating Answerability of LLMs for Long-Form Question Answering
Scaling Laws for Sparsely-Connected Foundation Models
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Ambiguity-Aware In-Context Learning with Large Language Models
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
Agents: An Open-source Framework for Autonomous Language Agents
Statistical Rejection Sampling Improves Preference Optimization
Large Language Models for Compiler Optimization
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Large Language Model for Science: A Study on P vs. NP
Efficient Memory Management for Large Language Model Serving with PagedAttention
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
Neurons in Large Language Models: Dead, N-gram, Positional
Textbooks Are All You Need II: phi-1.5 technical report
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
XGen-7B Technical Report
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
GPT Can Solve Mathematical Problems Without a Calculator
Large Language Models as Optimizers
Efficient RLHF: Reducing the Memory Usage of PPO
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback [Summary]
Llama 2: Open Foundation and Fine-Tuned Chat Models [Commentary]
Challenges and Applications of Large Language Models [Summary]
LoraHub: Efficient Cross-Task Generalization Via Dynamic LoRA Composition [Commentary]
ToolLLM: Facilitating Large Language Models To Master 16000+ Real-World APIs [Commentary]
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Efficient Guided Generation for Large Language Models
Predicting transcriptional outcomes of novel multigene perturbations with GEARS
A Survey on Model Compression for Large Language Models
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
LLM As DBA
Self-Alignment with Instruction Backtranslation
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior
BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Can Programming Languages Boost Each Other via Instruction Tuning?
WeatherBench 2: A benchmark for the next generation of data-driven global weather models
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
SoTaNa: The Open-Source Software Development Assistant
Teach LLMs to Personalize -- An Approach inspired by Writing Education
RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
CausalLM is not optimal for in-context learning
Platypus: Quick, Cheap, and Powerful Refinement of LLMs
OctoPack: Instruction Tuning Code Large Language Models
Enhancing Network Management Using Code Generated by Large Language Models
Improving Joint Speech-Text Representations Without Alignment
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
PIPPA: A Partially Synthetic Conversational Dataset
OpenProteinSet: Training data for structural biology at scale
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Accelerating LLM Inference with Staged Speculative Decoding
Shepherd: A Critic for Language Model Generation
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
Simple synthetic data reduces sycophancy in large language models