AI Research
Adapting Large Language Models via Reading Comprehension
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
PDFTriage: Question Answering over Long, Structured Documents
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
MindAgent: Emergent Gaming Interaction
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
Prof. Otto Nomos ∙ May 25, 2024 ∙ 1 min read
Recovering from Privacy-Preserving Masking with Large Language Models
Prof. Otto Nomos ∙ May 25, 2024 ∙ 1 min read
S3-DST: Structured Open-Domain Dialogue Segmentation and State Tracking in the Era of LLMs
Prof. Otto Nomos ∙ May 25, 2024 ∙ 1 min read
Augmenting text for spoken language understanding with Large Language Models
Prof. Otto Nomos ∙ May 25, 2024 ∙ 1 min read
Language Modeling Is Compression
Prof. Otto Nomos ∙ May 25, 2024 ∙ 1 min read
Baichuan 2: Open Large-scale Language Models
Prof. Otto Nomos ∙ May 24, 2024 ∙ 1 min read
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Prof. Otto Nomos ∙ May 24, 2024 ∙ 1 min read
Chain-of-Verification Reduces Hallucination in Large Language Models
Prof. Otto Nomos ∙ May 24, 2024 ∙ 1 min read
LMDX: Language Model-based Document Information Extraction and Localization
Prof. Otto Nomos ∙ May 24, 2024 ∙ 1 min read
SlimPajama-DC: Understanding Data Combinations for LLM Training
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Contrastive Decoding Improves Reasoning in Large Language Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
A Data Source for Reasoning Embodied Agents
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Leveraging Contextual Information for Effective Entity Salience Detection
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
LASER: LLM Agent with State-Space Exploration for Web Navigation
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Sparse Autoencoders Find Highly Interpretable Features in Language Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Investigating Answerability of LLMs for Long-Form Question Answering
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Scaling Laws for Sparsely-Connected Foundation Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Ambiguity-Aware In-Context Learning with Large Language Models
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Agents: An Open-source Framework for Autonomous Language Agents
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Statistical Rejection Sampling Improves Preference Optimization
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Large Language Models for Compiler Optimization
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Large Language Model for Science: A Study on P vs. NP
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Efficient Memory Management for Large Language Model Serving with PagedAttention
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Neurons in Large Language Models: Dead, N-gram, Positional
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Textbooks Are All You Need II: phi-1.5 technical report
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
XGen-7B Technical Report
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
GPT Can Solve Mathematical Problems Without a Calculator
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Large Language Models as Optimizers
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Efficient RLHF: Reducing the Memory Usage of PPO
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback [Summary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Llama 2: Open Foundation and Fine-Tuned Chat Models [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Challenges and Applications of Large Language Models [Summary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
LoraHub: Efficient Cross-Task Generalization Via Dynamic LoRA Composition [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
ToolLLM: Facilitating Large Language Models To Master 16000+ Real-World APIs [Commentary]
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Efficient Guided Generation for Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Predicting transcriptional outcomes of novel multigene perturbations with GEARS
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
A Survey on Model Compression for Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
LLM As DBA
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Self-Alignment with Instruction Backtranslation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Can Programming Languages Boost Each Other via Instruction Tuning?
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
WeatherBench 2: A benchmark for the next generation of data-driven global weather models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
SoTaNa: The Open-Source Software Development Assistant
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Teach LLMs to Personalize -- An Approach inspired by Writing Education
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
CausalLM is not optimal for in-context learning
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Platypus: Quick, Cheap, and Powerful Refinement of LLMs
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
OctoPack: Instruction Tuning Code Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Enhancing Network Management Using Code Generated by Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Improving Joint Speech-Text Representations Without Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
PIPPA: A Partially Synthetic Conversational Dataset
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Self-Alignment with Instruction Backtranslation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
OpenProteinSet: Training data for structural biology at scale
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Accelerating LLM Inference with Staged Speculative Decoding
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Shepherd: A Critic for Language Model Generation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Simple synthetic data reduces sycophancy in large language models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read

OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
Baichuan 2: Open Large-scale Language Models
Prof. Otto Nomos ∙ May 24, 2024 ∙ 1 min read
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?
Prof. Otto Nomos ∙ Oct 04, 2023 ∙ 1 min read
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Improving Joint Speech-Text Representations Without Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
XGen-7B Technical Report
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Self-Alignment with Instruction Backtranslation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Can Programming Languages Boost Each Other via Instruction Tuning?
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
OctoPack: Instruction Tuning Code Large Language Models
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read
Self-Alignment with Instruction Backtranslation
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Prof. Otto Nomos ∙ May 27, 2024 ∙ 1 min read
Natural Language Supervision for General-Purpose Audio Representations
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Improving Joint Speech-Text Representations Without Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read

Augmenting text for spoken language understanding with Large Language Models
Prof. Otto Nomos ∙ May 25, 2024 ∙ 1 min read
Natural Language Supervision for General-Purpose Audio Representations
Prof. Otto Nomos ∙ Oct 03, 2023 ∙ 1 min read
Improving Joint Speech-Text Representations Without Alignment
Prof. Otto Nomos ∙ Oct 02, 2023 ∙ 1 min read