Dialogue

[*] = found in both arXiv and HF search   [HF] = found via HF semantic search

written on 2026-06-06

title authors categories displaydate upvotes
An Infectious Disease Spread Simulation Based on Large Language Model Decision Making Yonchanok Khaokaew, Ruochen Kong, Andreas Zufle, Hao Xue, Taylor Anderson, Chandini Raina MacIntyre, Matthew Scotch, Flora D. Salim, David J Heslop cs.AI 2026-06-04  
Beyond tokens: a unified framework for latent communication in LLM-based multi-agent systems Yingzhuo Liu cs.CL 2026-06-04  
VASO: Formally Verifiable Self-Evolving Skills for Physical AI Agents Yunhao Yang, Neel P. Bhatt, Kevin Wang, Samuel Tetteh, Zhangyang Wang, Ufuk Topcu cs.RO, cs.AI 2026-06-03  
Context-as-AI-Service: Surfacing Cross-File Dependency Chains for LLM-Generated Developer Documentation Ameya Gawde, Vyzantinos Repantis, Harshvardhan Singh, Lucy Moys cs.SE, cs.IR 2026-06-03  
Rethinking Sales Lead Scoring with LLM-based Hierarchical Preference Ranking Chenyu Zhang, Yiwen Liu, Yin Sun, Xinyuan Zhang, Yuji Cao, Junming Jiao, Juyi Qiao cs.IR, cs.AI 2026-06-03  
Organizational Control Layer: Governance Infrastructure at the Execution Boundary of LLM Agent Systems Tianyu Shi, Yang Mo, Yiou Liu, Zhuonan Hao, Yin Wang, Wenzhuo Hu, Nan Yu, Meng Zhou, Jiangbo Yu cs.MA 2026-06-03  
Efficient ASR Training with Conversations that Never Happened Máté Gedeon, Péter Mihajlik cs.CL, cs.AI, cs.SD, eess.AS 2026-06-02  
StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems Taiyu Zhu, Yifan Wu, Weilin Jin, Ying Li, Gang Huang cs.AI 2026-06-02  
RUBAS: Rubric-Based Reinforcement Learning for Agent Safety Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui, Qi Zhu, Fei Mi, Hongning Wang, Minlie Huang cs.LG, cs.AI, cs.CR 2026-06-02  
Chatbots Output Meaningful (but Problematic) Language Matthew Stone, Una Stojnić cs.CL 2026-06-02  
Topics as Proxies for Sociodemographics: How Conversational Context Affects LLM Answers Vera Neplenbroek, Gabriele Sarti, Arianna Bisazza, Raquel Fernández cs.CL 2026-06-01  
Trust-Calibrated Code Review: A Participatory Design Study of Review Workflows for LLM-Generated Multi-File Changes Lo Gullstrand Heander, Agnia Sergeyuk, Ilya Zakharov, Emma Söderberg, Nikita Mukhortov cs.SE, cs.HC 2026-06-01  
BraveGuard: From Open-World Threats to Safer Computer-Use Agents Yunhao Feng, Xiaohu Du, Xinhao Deng, Yifan Ding, Ming Wen, Yixu Wang, Yuxiang Xie, Baihui Zheng, Yingshui Tan, Yige Li, Yutao Wu, Kerui Cao, Wenke Huang, Yanming Guo, Xingjun Ma, Yu-Gang Jiang cs.CR, cs.CL 2026-05-31  
SkillRevise: Improving LLM-Authored Agent Skills via Trace-Conditioned Skill Revision Yuxuan Liu, Zhaochen Su, Lingyun Xie, Yuhao Zhang, Qing Zong, Jiahe Guo, Zhongwei Xie, Yiyan Ji, Yauwai Yim, Hongyu Luo, Xiyu Ren, Ruan Chenyu, Haoran Li, Yangqiu Song cs.AI 2026-05-31  
Hybrid Verified Decoding: Learning to Allocate Verification in Speculative Decoding Xin Su, Dawid Majchrowski, Fangyuan Yu, Vanshil Atul Shah, Sebastian Rogawski, Pawel Morkisz, Anahita Bhiwandiwalla, Phillip Howard cs.CL, cs.AI 2026-05-31  
Agentic Authoring of Interactive Multiview Visualizations in Genomics Astrid van den Brandt, Kiroong Choe, Sehi L’Yi, Devin Lange, Nils Gehlenborg cs.HC, cs.AI 2026-05-29  
Preference-Aware Rubric Learning for Personalized Evaluation Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yuxin Chen, Cilin Yan, Jiayin Cai, Xiaolong Jiang, Yao Hu, Yoko Yamakata, Tat-Seng Chua cs.CL 2026-05-29  
Unifying Temporal and Structural Credit Assignment in LLM-Based Multi-Agent Prompt Optimization Wenwu Li, Yuran Song, Mingze Zhao, Bo Jin, Wenhao Li cs.MA, cs.AI 2026-05-28  
Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents Alejandra Zambrano, Sara Vera Marjanovic, Imene Kerboua, Xing Han Lù, Leila Kosseim cs.CL, cs.AI, cs.LG 2026-05-28  
Minimal Prompt Perturbations Lead to Code Vulnerabilities: Prompt Fragility and Hidden-State Signals in Coding LLMs Alexander Sternfeld, Andrei Kucharavy, Ljiljana Dolamic cs.CR, cs.CL, cs.SE 2026-05-28  
AgentCVR: Active Multi-Agent Cross-Video Reasoning via Script-Simulated Reinforcement Learning Yilun Qiu, Jiahe Wang, Cilin Yan, Jiayin Cai, Xiaolong Jiang, Yao Hu, Chun Yuan cs.CV, cs.MA 2026-05-28  
Improving Collaborative Storytelling with a Multi-Agent Framework Based on Large Language Models Arturo Valdivia, Paolo Burelli cs.AI 2026-05-28  
LLM-ALSO: LLM-Driven Adaptive Learning-Signal Optimization for Multi-Agent Reinforcement Learning Xiaoguang Wu, Zhi Zheng, Hui Xiong cs.MA 2026-05-28  
When LLM Reward Design Fails: Diagnostic-Driven Refinement for Sparse Structured RL Youting Wang, Yuan Tang, Bowen Liu, Xuan Liu, Dingyan Shang cs.LG, cs.AI, cs.IR 2026-05-27  
Evaluating the Realism of LLM-powered Social Agents: A Case Study of Reactions to Spanish Online News Alejandro Buitrago López, Alberto Ortega Pastor, Javier Pastor-Galindo, José A. Ruipérez-Valiente cs.CL, cs.AI 2026-05-27  
Beyond One Path: Evaluating and Enhancing Divergent Thinking in Interactive LLM Agents Jihyeong Park, Ingeol Baek, Jeonghyun Park, Hwanhee Lee cs.CL 2026-05-27  
OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents Chenyu Zhou, Xinyun Lu, Jiangyue Zhao, Jianghao Lin, Dongdong Ge, Yinyu Ye cs.AI 2026-05-27  
Personality, Role, and Expressive Style in Large Language Models: An Interactionist Analysis Moe Nagao, Koichiro Terao, Mikio Nakano, Naoto Iwahashi cs.CL 2026-05-27  
Keyphrase Generative Representation of Youth Crisis Conversations Beyond Static Taxonomies Abeer Badawi, Will Aitken, Lydia Sequeira, Jocelyn Rankin, Maia Norman, Elham Dolatabadi cs.CL, cs.HC 2026-05-26  
Agentic Separation Logic Specification Synthesis Tarun Suresh, David Korczynski, Julien Vanegue cs.PL, cs.CL, cs.SE 2026-05-26  
TADDLE: A Tool-Augmented Agent for Detecting Deficient LLM-Generated Peer Reviews Hanqi Duan, Xiang Li cs.AI 2026-05-26  
Knowledge Graphs as the Missing Data Layer for LLM-Based Industrial Asset Operations Madhulatha Mandarapu, Sandeep Kunkunuru cs.DB, cs.AI, cs.LG 2026-05-26  
Causal methods for LLM development and evaluation Dennis Frauen, Marie Brockschmidt, Konstantin Hess, Haorui Ma, Yuchen Ma, Abdurahman Maarouf, Maresa Schröder, Jonas Schweisthal, Yuxin Wang, Athiya Deviyani, Sonali Parbhoo, Rahul G. Krishnan, Stefan Feuerriegel cs.LG 2026-05-25  
Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing Shugang Hao, Lingjie Duan cs.LG, cs.AI 2026-05-22  
HawkesLLM: Semantic Uncertainty Propagation in Agentic Text Simulation Zewei Deng, Tinghan Ye, Liyan Xie cs.CL, stat.ML 2026-05-21  
Self-Evolving Multi-Agent Systems via Decentralized Memory Guangya Hao, Yunbo Long, Zhuokai Zhao cs.MA 2026-05-21  
Boiling the Frog: A Multi-Turn Benchmark for Agentic Safety Piercosma Bisconti, Matteo Prandi, Federico Pierucci, Federico Sartore, Enrico Panai, Laura Caroli, Yue Zhu, Adam Leon Smith, Luca Nannini, Marcello Galisai, Susanna Cifani, Francesco Giarrusso, Marcantonio Bracale Syrnikov, Daniele Nardi cs.CL 2026-05-21  
Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents Ting Liu cs.SE, cs.AI 2026-05-21  
Polite on the Surface, Wrong in Practice: A Curated Dataset for Fixing Honorific Failures in Multilingual Bangla Generation Md. Asaduzzaman Shuvo, Mahedi Hasan, Md. Tashin Parvez, Azizul Haque Noman, Md. Shafayet Hossain Ovi cs.CL 2026-05-21  
SWE-Mutation: Can LLMs Generate Reliable Test Suites in Software Engineering? Yuxuan Sun, Yuze Zhao, Yufeng Wang, Yao Du, Zhiyuan Ma, Jinbo Wang, Mengdi Zhang, Kai Zhang, Zhenya Huang cs.SE, cs.AI 2026-05-21  
RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator Zhenwei Tang, Zhaoyan Liu, Rasa Hosseinzadeh, Tongzi Wu, Keyvan Golestan, Jesse C. Cresswell cs.CL 2026-05-20  
OrgForge-IT: A Verifiable Synthetic Benchmark for LLM-Based Insider Threat Detection Jeffrey Flynt cs.CR, cs.LG 2026-03-23  
Optimizing Multi-Agent Weather Captioning via Text Gradient Descent: A Training-Free Approach with Consensus-Aware Gradient Fusion Shixu Liu cs.CL 2026-03-23  
Emergent Formal Verification: How an Autonomous AI Ecosystem Independently Discovered SMT-Based Safety Across Six Domains Octavian Untila cs.SE, cs.AI, cs.MA 2026-03-22  
Reasoning Gets Harder for LLMs Inside A Dialogue Ivan Kartáč, Mateusz Lango, Ondřej Dušek cs.CL 2026-03-20  
An Agentic Approach to Generating XAI-Narratives Yifan He, David Martens cs.CL 2026-03-20  
Semantic Delta: An Interpretable Signal Differentiating Human and LLMs Dialogue Riccardo Scantamburlo, Mauro Mezzanzana, Giacomo Buonanno, Francesco Bertolotti cs.CL, cs.AI 2026-03-20  
Skilled AI Agents for Embedded and IoT Systems Development Yiming Li, Yuhan Cheng, Mingchen Ma, Yihang Zou, Ningyuan Yang, Wei Cheng, Hai “Helen” Li, Yiran Chen, Tingjun Chen cs.SE, cs.AI 2026-03-20  
Mi:dm K 2.5 Pro KT Tech innovation Group cs.CL, cs.AI 2026-03-19  
Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM Zizhao Hu, Mohammad Rostami, Jesse Thomason cs.AI 2026-03-19  
When Only the Final Text Survives: Implicit Execution Tracing for Multi-Agent Attribution Yi Nian, Haosen Cao, Shenzhe Zhu, Henry Peng Zou, Qingqing Luan, Yue Zhao cs.AI, cs.CL 2026-03-18  
Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures Young Bin Park cs.AI, cs.IR, cs.LO 2026-03-18  
Evaluating LLM-Simulated Conversations in Modeling Inconsistent and Uncollaborative Behaviors in Human Social Interaction Ryo Kamoi, Ameya Godbole, Longqi Yang, Rui Zhang, Mengting Wan, Pei Zhou cs.CL 2026-03-17  
Differential Harm Propensity in Personalized LLM Agents: The Curious Case of Mental Health Disclosure Caglar Yildirim cs.AI 2026-03-17  
Proactive Rejection and Grounded Execution: A Dual-Stage Intent Analysis Paradigm for Safe and Efficient AIoT Smart Homes Xinxin Jin, Zhengwei Ni, Zhengguo Sheng, Victor C. M. Leung cs.AI 2026-03-17  
VIBEPASS: Can Vibe Coders Really Pass the Vibe Check? Srijan Bansal, Jiao Fangkai, Yilun Zhou, Austin Xu, Shafiq Joty, Semih Yavuz cs.SE, cs.AI 2026-03-16  
Practicing with Language Models Cultivates Human Empathic Communication Aakriti Kumar, Nalin Poungpeth, Diyi Yang, Bruce Lambert, Matthew Groh cs.CL, cs.HC 2026-03-16  
OrgForge: A Multi-Agent Simulation Framework for Verifiable Synthetic Corporate Corpora Jeffrey Flynt cs.CL, cs.AI, cs.IR 2026-03-16  
GNNVerifier: Graph-based Verifier for LLM Task Planning Yu Hao, Qiuyu Wang, Cheng Yang, Yawen Li, Zhiqiang Zhang, Chuan Shi cs.LG 2026-03-16  
GameUIAgent: An LLM-Powered Framework for Automated Game UI Design with Structured Intermediate Representation Wei Zeng, Fengwei An, Zhen Liu, Jian Zhao cs.AI 2026-03-16  
CangjieBench: Benchmarking LLMs on a Low-Resource General-Purpose Programming Language Junhang Cheng, Fang Liu, Jia Li, Chengru Wu, Nanxiang Jiang, Li Zhang cs.SE, cs.AI, cs.CL 2026-03-15  
Infinite Problem Generator: Verifiably Scaling Physics Reasoning Data with Agentic Workflows Aditya Sharan, Sriram Hebbale, Dhruv Kumar cs.CL, cs.AI 2026-03-15  
QChunker: Learning Question-Aware Text Chunking for Domain RAG via Multi-Agent Debate Jihao Zhao, Daixuan Li, Pengfei Li, Shuaishuai Zu, Biao Qin, Hongyan Liu cs.CL 2026-03-12  
[HF] End-to-End Chatbot Evaluation with Adaptive Reasoning and Uncertainty Filtering Nhi Dang, Tung Le, Huy Tien Nguyen   2026-03-11  
SPAR-K: Scheduled Periodic Alternating Early Exit for Spoken Language Models Hsiao-Ying Huang, Cheng-Han Chiang, Hung-yi Lee cs.CL, eess.AS 2026-03-10  
SCALAR: Learning and Composing Skills through LLM Guided Symbolic Planning and Deep RL Grounding Renos Zabounidis, Yue Wu, Simon Stepputtis, Woojun Kim, Yuanzhi Li, Tom Mitchell, Katia Sycara cs.LG 2026-03-10  
Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers Pengfei Du cs.AI 2026-03-08  
FireBench: Evaluating Instruction Following in Enterprise and API-Driven LLM Applications Yunfan Zhang, Yijie Bei, Jetashree Ravi, Pawel Garbacki cs.CL, cs.SE 2026-03-05  
EchoGuard: An Agentic Framework with Knowledge-Graph Memory for Detecting Manipulative Communication in Longitudinal Dialogue Ratna Kandala, Niva Manchanda, Akshata Kishore Moharir, Ananth Kandala cs.AI 2026-03-05  
Agentics 2.0: Logical Transduction Algebra for Agentic Data Workflows Alfio Massimiliano Gliozzo, Junkyu Lee, Nahuel Defosse cs.AI, cs.LG 2026-03-04  
Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy Navdeep Singh Bedi, Ana-Maria Bucur, Noriko Kando, Fabio Crestani cs.CL 2026-03-04  
BLUFF: Benchmarking the Detection of False and Synthetic Content across 58 Low-Resource Languages Jason Lucas, Matt Murtagh-White, Adaku Uchendu, Ali Al-Lawati, Michiharu Yamashita, Dominik Macko, Ivan Srba, Robert Moro, Dongwon Lee cs.CL 2026-02-28  
LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning Yu Zhu, Kai Yang cs.CL, cs.AI 2026-02-27  
Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains Xiaochong Jiang, Shiqi Yang, Wenting Yang, Yichen Liu, Cheng Ji cs.CR, cs.AI 2026-02-23  
TherapyGym: Evaluating and Aligning Clinical Fidelity and Safety in Therapy Chatbots Fangrui Huang, Souhad Chbeir, Arpandeep Khatua, Sheng Wang, Sijun Tan, Kenan Ye, Lily Bailey, Merryn Daniel, Ryan Louie, Sanmi Koyejo, Ehsan Adeli cs.CL, cs.AI, cs.CY 2026-02-23  
NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs Zihan Guan, Rituparna Datta, Mengxuan Hu, Shunshun Liu, Aiying Zhang, Prasanna Balachandran, Sheng Li, Anil Vullikanti cs.LG, cs.AI, cs.CL 2026-02-20  
What Do LLMs Associate with Your Name? A Human-Centered Black-Box Audit of Personal Data Dimitri Staufer, Kirsten Morehouse cs.HC, cs.AI, cs.CL, cs.CY 2026-02-19  
From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan’s Humanities and Social Sciences Yi-Chih Huang cs.AI, cs.CL, cs.CY 2026-02-19  
Evaluating Collective Behaviour of Hundreds of LLM Agents Richard Willis, Jianing Zhao, Yali Du, Joel Z. Leibo cs.MA 2026-02-18  
AREG: Adversarial Resource Extraction Game for Evaluating Persuasion and Resistance in Large Language Models Adib Sakhawat, Fardeen Sadab cs.CL 2026-02-18  
LLM-to-Speech: A Synthetic Data Pipeline for Training Dialectal Text-to-Speech Models Ahmed Khaled Khamis, Hesham Ali cs.CL 2026-02-17  
AgriWorld: A World Tools Protocol Framework for Verifiable Agricultural Reasoning with Code-Executing LLM Agents Zhixing Zhang, Jesen Zhang, Hao Liu, Qinhan Lv, Jing Yang, Kaitong Cai, Keze Wang cs.AI 2026-02-17  
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 Dongrui Liu, Yi Yu, Jie Zhang, Guanxu Chen, Qihao Lin, Hanxi Zhu, Lige Huang, Yijin Zhou, Peng Wang, Shuai Shao, Boxuan Zhang, Zicheng Liu, Jingwei Sun, Yu Li, Yuejin Xie, Jiaxuan Guo, Jia Xu, Chaochao Lu, Bowen Zhou, Xia Hu, Jing Shao cs.AI, cs.CL, cs.CV, cs.CY, cs.LG 2026-02-16  
TruthStance: An Annotated Dataset of Conversations on Truth Social Fathima Ameen, Danielle Brown, Manusha Malgareddy, Amanul Haque cs.CL, cs.AI 2026-02-16  
An end-to-end agentic pipeline for smart contract translation and quality evaluation Abhinav Goel, Chaitya Shah, Agostino Capponi, Alfio Gliozzo cs.AI, cs.SE 2026-02-14  
Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues Stephan Vonschallen, Rahel Häusler, Theresa Schmiedel, Friederike Eyssel cs.HC, cs.AI 2026-02-13  
WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models Yangzhuo Li, Shengpeng Ji, Yifu Chen, Tianle Liang, Haorong Ying, Yule Wang, Junbo Li, Jun Fang, Zhou Zhao cs.CL 2026-02-12  
Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, Martin Vechev cs.SE, cs.AI 2026-02-12  
Do Large Language Models Adapt to Language Variation across Socioeconomic Status? Elisa Bassignana, Mike Zhang, Dirk Hovy, Amanda Cercas Curry cs.CL 2026-02-12  
RELATE: A Reinforcement Learning-Enhanced LLM Framework for Advertising Text Generation Jinfang Wang, Jiajie Liu, Jianwei Wu, Ziqin Luo, Zhen Chen, Chunlei Li, Biao Han, Tao Deng, Yi Li, Shuanglong Li, Lin Liu cs.AI 2026-02-12  
AIR: Improving Agent Safety through Incident Response Zibo Xiao, Jun Sun, Junjie Chen cs.AI 2026-02-12  
TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning Sina Tayebati, Divake Kumar, Nastaran Darabi, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi cs.AI 2026-02-11  
Learning to Compose for Cross-domain Agentic Workflow Generation Jialiang Wang, Shengxiang Xu, Hanmo Liu, Jiachuan Wang, Yuyu Luo, Shimin Di, Min-Ling Zhang, Lei Chen cs.MA, cs.AI, cs.LG, cs.SE 2026-02-11  
AlphaForgeBench: Benchmarking End-to-End Trading Strategy Design with Large Language Models Wentao Zhang, Mingxuan Zhao, Jincheng Gao, Jieshun You, Huaiyu Jia, Yilei Zhao, Bo An, Shuo Sun q-fin.TR, cs.AI 2026-02-10  
Towards Poisoning Robustness Certification for Natural Language Generation Mihnea Ghitu, Matthew Wicker cs.LG 2026-02-10  
Large Language Models for Designing Participatory Budgeting Rules Nguyen Thach, Xingchen Sha, Hau Chan cs.LG 2026-02-10  
Accelerating Social Science Research via Agentic Hypothesization and Experimentation Jishu Sen Gupta, Harini SI, Somesh Kumar Singh, Syed Mohamad Tawseeq, Yaman Kumar Singla, David Doermann, Rajiv Ratn Shah, Balaji Krishnamurthy cs.AI, cs.CL 2026-02-08  
Exploring AI-Augmented Sensemaking of Patient-Generated Health Data: A Mixed-Method Study with Healthcare Professionals in Cardiac Risk Reduction Pavithren V S Pakianathan, Rania Islambouli, Diogo Branco, Albrecht Schmidt, Tiago Guerreiro, Jan David Smeddinck cs.HC, cs.AI 2026-02-05  
Generative Ontology: When Structured Knowledge Learns to Create Benny Cheung cs.AI, cs.CL 2026-02-05  
Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning John Yan, Michael Yu, Yuqi Sun, Alexander Duffy, Tyler Marques, Matthew Lyle Olson cs.LG, cs.AI 2026-02-05  
RA-QA: Towards Respiratory Audio-based Health Question Answering Gaia A. Bertolino, Yuwei Zhang, Tong Xia, Domenico Talia, Cecilia Mascolo cs.SD, cs.LG, eess.AS 2026-02-04  
ProxyWar: Dynamic Assessment of LLM Code Generation in Game Arenas Wenjun Peng, Xinyu Wang, Qi Wu cs.SE, cs.AI 2026-02-04  
A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model Xiaolin Hu, Hang Yuan, Xinzhu Sang, Binbin Yan, Zhou Yu, Cong Huang, Kai Chen cs.LG, cs.AI, cs.SD 2026-02-04  
From Crafting Text to Crafting Thought: Grounding AI Writing Support to Writing Center Pedagogy Yijun Liu, John Gallagher, Sarah Sterman, Tal August cs.HC 2026-02-03  
The Necessity of a Unified Framework for LLM-Based Agent Evaluation Pengyu Zhu, Li Sun, Philip S. Yu, Sen Su cs.AI 2026-02-03  
GuideWeb: A Benchmark for Automatic In-App Guide Generation on Real-World Web UIs Chengguang Gan, Yoshihiro Tsujii, Yunhao Liang, Tatsunori Mori, Shiwen Ni, Hiroki Itoh cs.CL 2026-02-02  
Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles Shaohan Wang, Benfeng Xu, Licheng Zhang, Mingxuan Du, Chiwei Zhu, Xiaorui Wang, Zhendong Mao, Yongdong Zhang cs.CL 2026-02-02  
PedagoSense: A Pedology Grounded LLM System for Pedagogical Strategy Detection and Contextual Response Generation in Learning Dialogues Shahem Sultan, Shahem Fadi, Yousef Melhim, Ibrahim Alsarraj, Besher Hassan cs.CL 2026-02-01  
PaperBanana: Automating Academic Illustration for AI Scientists Dawei Zhu, Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, Jinsung Yoon cs.CL, cs.CV 2026-01-30  
WebArbiter: A Principle-Guided Reasoning Process Reward Model for Web Agents Yao Zhang, Shijie Tang, Zeyu Li, Zhen Han, Volker Tresp cs.AI 2026-01-29  
Embodied Task Planning via Graph-Informed Action Generation with Large Language Model Xiang Li, Ning Yan, Masood Mortazavi cs.CL 2026-01-29  
More Code, Less Reuse: Investigating Code Quality and Reviewer Sentiment towards AI-generated Pull Requests Haoming Huang, Pongchai Jaisri, Shota Shimizu, Lingfeng Chen, Sota Nakashima, Gema Rodríguez-Pérez cs.SE, cs.AI, cs.HC 2026-01-29  
Planner-Auditor Twin: Agentic Discharge Planning with FHIR-Based LLM Planning, Guideline Recall, Optional Caching and Self-Improvement Kaiyuan Wu, Aditya Nagori, Rishikesan Kamaleswaran cs.AI, cs.MA 2026-01-28  
A Dialectic Pipeline for Improving LLM Robustness Sara Candussio cs.CL, cs.MA 2026-01-28  
RobustExplain: Evaluating Robustness of LLM-Based Explanation Agents for Recommendation Guilin Zhang, Kai Zhao, Jeffrey Friedman, Xu Chu cs.IR, cs.AI, cs.LG 2026-01-27  
Assessing the Quality of Mental Health Support in LLM Responses through Multi-Attribute Human Evaluation Abeer Badawi, Md Tahmid Rahman Laskar, Elahe Rahimi, Sheri Grach, Lindsay Bertrand, Lames Danok, Frank Rudzicz, Jimmy Huang, Elham Dolatabadi cs.AI, cs.HC 2026-01-26  
LegalMALR: Multi-Agent Query Understanding and LLM-Based Reranking for Chinese Statute Retrieval Yunhan Li, Mingjie Xie, Gaoli Kang, Zihan Gong, Gengshen Wu, Min Yang cs.IR, cs.CL 2026-01-25  
Status Hierarchies in Language Models Emilio Barkett cs.HC, cs.AI, cs.CL 2026-01-24  
The Shadow Self: Intrinsic Value Misalignment in Large Language Model Agents Chen Chen, Kim Young Il, Yuan Yang, Wenhao Su, Yilin Zhang, Xueluan Gong, Qian Wang, Yongsen Zheng, Ziyao Liu, Kwok-Yan Lam cs.CL 2026-01-24  
On the Insecurity of Keystroke-Based AI Authorship Detection: Timing-Forgery Attacks Against Motor-Signal Verification David Condrey cs.CR, cs.AI, cs.HC 2026-01-24  
LLMs Got Rhythm? Hybrid Phonological Filtering for Greek Poetry Rhyme Detection and Generation Stergios Chatzikyriakidis cs.CL 2026-01-14  
Efficient Multilingual Dialogue Processing via Translation Pipelines and Distilled Language Models Santiago Martínez Novoa, Nicolás Rozo Fajardo, Diego Alejandro González Vargas, Nicolás Bedoya Figueroa cs.CL 2026-01-14  
Can LLMs interpret figurative language as humans do?: surface-level vs representational similarity Samhita Bollepally, Aurora Sloman-Moll, Takashi Yamauchi cs.CL, cs.AI 2026-01-14  
OpenMic: A Multi-Agent-Based Stand-Up Comedy Generation System Yuyang Wu, Hanzhong Cao, Jianhao Chen, Yufei Li cs.AI 2026-01-13  
Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends Jing Yang, Nils Feldhus, Salar Mohtaj, Leonhard Hennig, Qianli Wang, Eleni Metheniti, Sherzod Hakimov, Charlott Jakob, Veronika Solopova, Konrad Rieck, David Schlangen, Sebastian Möller, Vera Schmitt cs.CL 2026-01-12  
PsyCLIENT: Client Simulation via Conversational Trajectory Modeling for Trainee Practice and Model Evaluation in Mental Health Counseling Huachuan Qiu, Zhaoming Chen, Yuqian Chen, Yuan Xie, Yu Lu, Zhenzhong Lan cs.CL 2026-01-12  
Agents of Diffusion: Enhancing Diffusion Language Models with Multi-Agent Reinforcement Learning for Structured Data Generation (Extended Version) Aja Khanal, Kaushik T. Ranade, Rishabh Agrawal, Kalyan S. Basu, Apurva Narayan cs.MA 2026-01-12  
Can a Unimodal Language Agent Provide Preferences to Tune a Multimodal Vision-Language Model? Sazia Tabasum Mim, Jack Morris, Manish Dhakal, Yanming Xiu, Maria Gorlatova, Yi Ding cs.CL 2026-01-10  
STELP: Secure Transpilation and Execution of LLM-Generated Programs Swapnil Shinde, Sahil Wadhwa, Andy Luo, Akshay Gupta, Mohammad Shahed Sorower cs.SE, cs.AI 2026-01-09  
A Preliminary Agentic Framework for Matrix Deflation Paimon Goulart, Evangelos E. Papalexakis cs.LG 2026-01-06  
The Path Ahead for Agentic AI: Challenges and Opportunities Nadia Sibai, Yara Ahmed, Serry Sibaee, Sawsan AlHalawani, Adel Ammar, Wadii Boulila cs.AI 2026-01-06  
AgentMark: Utility-Preserving Behavioral Watermarking for Agents Kaibo Huang, Jin Tan, Yukun Wei, Wanling Li, Zipei Zhang, Hui Tian, Zhongliang Yang, Linna Zhou cs.CR, cs.AI 2026-01-05  
WebCoderBench: Benchmarking Web Application Generation with Comprehensive and Interpretable Evaluation Metrics Chenxu Liu, Yingjie Fu, Wei Yang, Ying Zhang, Tao Xie cs.SE, cs.AI 2026-01-05  
CaveAgent: Transforming LLMs into Stateful Runtime Operators Maohao Ran, Zhenglin Wan, Cooper Lin, Yanting Zhang, Hongyu Xin, Hongwei Fan, Yibo Xu, Beier Luo, Yaxin Zhou, Wangbo Zhao, Lijie Yang, Lang Feng, Fuchao Yang, Jingxuan Wu, Yiqiao Huang, Chendong Ma, Dailing Jiang, Jianbo Deng, Sihui Han, Bo An, Yike Guo, Jun Song cs.AI, cs.SE 2026-01-04  
MAMA-Memeia! Multi-Aspect Multi-Agent Collaboration for Depressive Symptoms Identification in Memes Siddhant Agarwal, Adya Dhuler, Polly Ruhnke, Melvin Speisman, Md Shad Akhtar, Shweta Yadav cs.CL 2025-12-31  
Do Large Language Models Know What They Are Capable Of? Casey O. Barkan, Sid Black, Oliver Sourbut cs.CL, cs.AI 2025-12-31  
The Silicon Psyche: Anthropomorphic Vulnerabilities in Large Language Models Giuseppe Canale, Kashyap Thimmaraju cs.CR, cs.AI, cs.CY, cs.HC 2025-12-30  
Web World Models Jichen Feng, Yifan Zhang, Chenggong Zhang, Yifu Lu, Shilong Liu, Mengdi Wang cs.AI, cs.CL, cs.CV 2025-12-29  
TCEval: Using Thermal Comfort to Assess Cognitive and Perceptual Abilities of AI Jingming Li cs.AI 2025-12-29  
AI-Generated Code Is Not Reproducible (Yet): An Empirical Study of Dependency Gaps in LLM-Based Coding Agents Bhanu Prakash Vangala, Ali Adibifar, Tanu Malik, Ashish Gehani cs.SE, cs.AI, cs.MA 2025-12-26  
Emotion Diffusion in Real and Simulated Social Graphs: Structural Limits of LLM-Based Social Simulation Qiqi Qiang cs.SI 2025-12-24  
NVIDIA Nemotron 3: Efficient and Open Intelligence NVIDIA, Aaron Blakeman, Aaron Grattafiori, Aarti Basant, Abhibha Gupta, Abhinav Khattar, Adi Renduchintala, Aditya Vavre, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Kondratenko, Alexander Bukharin, Alexandre Milesi, Ali Taghibakhshi, Alisa Liu, Amelia Barton, Ameya Sunil Mahabaleshwarkar, Amir Klein, Amit Zuker, Amnon Geifman, Amy Shen, Anahita Bhiwandiwalla, Andrew Tao, Anjulie Agrusa, Ankur Verma, Ann Guan, Anubhav Mandarwal, Arham Mehta, Ashwath Aithal, Ashwin Poojary, Asif Ahamed, Asit Mishra, Asma Kuriparambil Thekkumpate, Ayush Dattagupta, Banghua Zhu, Bardiya Sadeghi, Barnaby Simkin, Ben Lanir, Benedikt Schifferer, Besmira Nushi, Bilal Kartal, Bita Darvish Rouhani, Boris Ginsburg, Brandon Norick, Brandon Soubasis, Branislav Kisacanin, Brian Yu, Bryan Catanzaro, Carlo del Mundo, Chantal Hwang, Charles Wang, Cheng-Ping Hsieh, Chenghao Zhang, Chenhan Yu, Chetan Mungekar, Chintan Patel, Chris Alexiuk, Christopher Parisien, Collin Neale, Cyril Meurillon, Damon Mosk-Aoyama, Dan Su, Dane Corneil, Daniel Afrimi, Daniel Lo, Daniel Rohrer, Daniel Serebrenik, Daria Gitman, Daria Levy, Darko Stosic, David Mosallanezhad, Deepak Narayanan, Dhruv Nathawani, Dima Rekesh, Dina Yared, Divyanshu Kakwani, Dong Ahn, Duncan Riach, Dusan Stosic, Edgar Minasyan, Edward Lin, Eileen Long, Eileen Peters Long, Elad Segal, Elena Lantz, Ellie Evans, Elliott Ning, Eric Chung, Eric Harper, Eric Tramel, Erick Galinkin, Erik Pounds, Evan Briones, Evelina Bakhturina, Evgeny Tsykunov, Faisal Ladhak, Fay Wang, Fei Jia, Felipe Soares, Feng Chen, Ferenc Galko, Frank Sun, Frankie Siino, Gal Hubara Agam, Ganesh Ajjanagadde, Gantavya Bhatt, Gargi Prasad, George Armstrong, Gerald Shen, Gorkem Batmaz, Grigor Nalbandyan, Haifeng Qian, Harsh Sharma, Hayley Ross, Helen Ngo, Herbert Hum, Herman Sahota, Hexin Wang, Himanshu Soni, Hiren Upadhyay, Huizi Mao, Huy C Nguyen, Huy Q Nguyen, Iain Cunningham, Ido Galil, Ido Shahaf, Igor Gitman, Ilya Loshchilov, Itamar Schen, Itay Levy, Ivan Moshkov, Izik Golan, Izzy Putterman, Jan Kautz, Jane Polak Scowcroft, Jared Casper, Jatin Mitra, Jeffrey Glick, Jenny Chen, Jesse Oliver, Jian Zhang, Jiaqi Zeng, Jie Lou, Jimmy Zhang, Jinhang Choi, Jining Huang, Joey Conway, Joey Guman, John Kamalu, Johnny Greco, Jonathan Cohen, Joseph Jennings, Joyjit Daw, Julien Veron Vialard, Junkeun Yi, Jupinder Parmar, Kai Xu, Kan Zhu, Kari Briski, Katherine Cheung, Katherine Luna, Keith Wyss, Keshav Santhanam, Kevin Shih, Kezhi Kong, Khushi Bhardwaj, Kirthi Shankar, Krishna C. Puvvada, Krzysztof Pawelec, Kumar Anik, Lawrence McAfee, Laya Sleiman, Leon Derczynski, Li Ding, Lizzie Wei, Lucas Liebenwein, Luis Vega, Maanu Grover, Maarten Van Segbroeck, Maer Rodrigues de Melo, Mahdi Nazemi, Makesh Narsimhan Sreedhar, Manoj Kilaru, Maor Ashkenazi, Marc Romeijn, Marcin Chochowski, Mark Cai, Markus Kliegl, Maryam Moosaei, Matt Kulka, Matvei Novikov, Mehrzad Samadi, Melissa Corpuz, Mengru Wang, Meredith Price, Michael Andersch, Michael Boone, Michael Evans, Miguel Martinez, Mikail Khona, Mike Chrzanowski, Minseok Lee, Mohammad Dabbah, Mohammad Shoeybi, Mostofa Patwary, Nabin Mulepati, Najeeb Nabwani, Natalie Hereth, Nave Assaf, Negar Habibi, Neta Zmora, Netanel Haber, Nicola Sessions, Nidhi Bhatia, Nikhil Jukar, Nikki Pope, Nikolai Ludwig, Nima Tajbakhsh, Nir Ailon, Nirmal Juluru, Nishant Sharma, Oleksii Hrinchuk, Oleksii Kuchaiev, Olivier Delalleau, Oluwatobi Olabiyi, Omer Ullman Argov, Omri Puny, Oren Tropp, Ouye Xie, Parth Chadha, Pasha Shamis, Paul Gibbons, Pavlo Molchanov, Pawel Morkisz, Peter Dykas, Peter Jin, Pinky Xu, Piotr Januszewski, Pranav Prashant Thombre, Prasoon Varshney, Pritam Gundecha, Przemek Tredak, Qing Miao, Qiyu Wan, Rabeeh Karimi Mahabadi, Rachit Garg, Ran El-Yaniv, Ran Zilberstein, Rasoul Shafipour, Rich Harang, Rick Izzo, Rima Shahbazyan, Rishabh Garg, Ritika Borkar, Ritu Gala, Riyad Islam, Robert Hesse, Roger Waleffe, Rohit Watve, Roi Koren, Ruoxi Zhang, Russell Hewett, Russell J. Hewett, Ryan Prenger, Ryan Timbrook, Sadegh Mahdavi, Sahil Modi, Samuel Kriman, Sangkug Lim, Sanjay Kariyappa, Sanjeev Satheesh, Saori Kaji, Satish Pasumarthi, Saurav Muralidharan, Sean Narentharen, Sean Narenthiran, Seonmyeong Bak, Sergey Kashirsky, Seth Poulos, Shahar Mor, Shanmugam Ramasamy, Shantanu Acharya, Shaona Ghosh, Sharath Turuvekere Sreenivas, Shelby Thomas, Shiqing Fan, Shreya Gopal, Shrimai Prabhumoye, Shubham Pachori, Shubham Toshniwal, Shuoyang Ding, Siddharth Singh, Simeng Sun, Smita Ithape, Somshubra Majumdar, Soumye Singhal, Stas Sergienko, Stefania Alborghetti, Stephen Ge, Sugam Dipak Devare, Sumeet Kumar Barua, Suseella Panguluri, Suyog Gupta, Sweta Priyadarshi, Syeda Nahida Akter, Tan Bui, Teodor-Dumitru Ene, Terry Kong, Thanh Do, Tijmen Blankevoort, Tim Moon, Tom Balough, Tomer Asida, Tomer Bar Natan, Tomer Ronen, Tugrul Konuk, Twinkle Vashishth, Udi Karpas, Ushnish De, Vahid Noorozi, Vahid Noroozi, Venkat Srinivasan, Venmugil Elango, Victor Cui, Vijay Korthikanti, Vinay Rao, Vitaly Kurin, Vitaly Lavrukhin, Vladimir Anisimov, Wanli Jiang, Wasi Uddin Ahmad, Wei Du, Wei Ping, Wenfei Zhou, Will Jennings, William Zhang, Wojciech Prazuch, Xiaowei Ren, Yashaswi Karnati, Yejin Choi, Yev Meyer, Yi-Fu Wu, Yian Zhang, Yigong Qin, Ying Lin, Yonatan Geifman, Yonggan Fu, Yoshi Subara, Yoshi Suhara, Yubo Gao, Zach Moshe, Zhen Dong, Zhongbo Zhu, Zihan Liu, Zijia Chen, Zijie Yan cs.CL, cs.AI, cs.LG 2025-12-24  
AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent Haipeng Luo, Huawen Feng, Qingfeng Sun, Can Xu, Kai Zheng, Yufei Wang, Tao Yang, Han Hu, Yansong Tang, Di Wang cs.AI, cs.CL, cs.LG 2025-12-23  
SA-DiffuSeq: Addressing Computational and Scalability Challenges in Long-Document Generation with Sparse Attention Alexandros Christoforos, Chadbourne Davis cs.CL, cs.AI 2025-12-23  
MoE-DiffuSeq: Enhancing Long-Document Diffusion Models with Sparse Attention and Mixture of Experts Alexandros Christoforos, Chadbourne Davis cs.CL 2025-12-23  
Distilling to Hybrid Attention Models via KL-Guided Layer Selection Yanhong Li, Songlin Yang, Shawn Tan, Mayank Mishra, Rameswar Panda, Jiawei Zhou, Yoon Kim cs.CL, cs.AI 2025-12-23  
LLM Agents Implement an NLG System from Scratch: Building Interpretable Rule-Based RDF-to-Text Generators Mateusz Lango, Ondřej Dušek cs.CL, cs.AI 2025-12-20  
ShareChat: A Dataset of Chatbot Conversations in the Wild Yueru Yan, Tuc Nguyen, Bo Su, Melissa Lieffers, Thai Le cs.CL, cs.AI, cs.HC 2025-12-19  
Polypersona: Persona-Grounded LLM for Synthetic Survey Responses Tejaswani Dash, Dinesh Karri, Anudeep Vurity, Gautam Datla, Tazeem Ahmad, Saima Rafi, Rohith Tangudu cs.CL, cs.AI 2025-12-16  
Evaluation of AI Ethics Tools in Language Models: A Developers’ Perspective Case Study Jhessica Silva, Diego A. B. Moreira, Gabriel O. dos Santos, Alef Ferreira, Helena Maia, Sandra Avila, Helio Pedrini cs.CY, cs.AI, cs.CL 2025-12-16  
Workflow is All You Need: Escaping the “Statistical Smoothing Trap” via High-Entropy Information Foraging and Adversarial Pacing Zhongjie Jiang cs.CL, cs.AI, cs.CY, q-fin.GN 2025-12-10  
Knowledge-Augmented Large Language Model Agents for Explainable Financial Decision-Making Qingyuan Zhang, Yuxi Wang, Cancan Hua, Yulin Huang, Ning Lyu cs.CL 2025-12-10  
The Erosion of LLM Signatures: Can We Still Distinguish Human and LLM-Generated Scientific Ideas After Iterative Paraphrasing? Sadat Shahriar, Navid Ayoobi, Arjun Mukherjee cs.LG, cs.AI 2025-12-04  
Learning Evolving Latent Strategies for Multi-Agent Language Systems without Model Fine-Tuning Wenlong Tang cs.LG, cs.AI 2025-11-28  
Towards Improving Interpretability of Language Model Generation through a Structured Knowledge Discovery Approach Shuqi Liu, Han Wu, Guanzhi Deng, Jianshu Chen, Xiaoyang Wang, Linqi Song cs.CL, cs.AI 2025-11-28  
Adaptive LLM Agents: Toward Personalized Empathetic Care Priyanka Singh, Sebastian Von Mammen cs.HC 2025-11-25  
Deep Research: A Systematic Survey Zhengliang Shi, Yiqun Chen, Haitao Li, Weiwei Sun, Shiyu Ni, Yougang Lyu, Run-Ze Fan, Bowen Jin, Yixuan Weng, Minjun Zhu, Qiujie Xie, Xinyu Guo, Qu Yang, Jiayi Wu, Jujia Zhao, Xiaqiang Tang, Xinbei Ma, Cunxiang Wang, Jiaxin Mao, Qingyao Ai, Jen-Tse Huang, Wenxuan Wang, Yue Zhang, Yiming Yang, Zhaopeng Tu, Zhaochun Ren cs.CL, cs.AI, cs.IR 2025-11-24  
MindEval: Benchmarking Language Models on Multi-turn Mental Health Support José Pombal, Maya D’Eon, Nuno M. Guerreiro, Pedro Henrique Martins, António Farinhas, Ricardo Rei cs.CL, cs.AI 2025-11-23  
NAMeGEn: Creative Name Generation via A Novel Agent-based Multiple Personalized Goal Enhancement Framework Shanlin Zhou, Xinpeng Wang, Jianxun Lian, Zhenghao Liu, Laks V. S. Lakshmanan, Xiaoyuan Yi, Yongtao Hao cs.CL, cs.AI, cs.IR, cs.MA, cs.NE 2025-11-19  
AfriSpeech-MultiBench: A Verticalized Multidomain Multicountry Benchmark Suite for African Accented English ASR Gabrial Zencha Ashungafac, Mardhiyah Sanni, Busayo Awobade, Alex Gichamba, Tobi Olatunji cs.CL 2025-11-18  
Generalist Foundation Models Are Not Clinical Enough for Hospital Operations Lavender Y. Jiang, Angelica Chen, Xu Han, Xujin Chris Liu, Radhika Dua, Kevin Eaton, Frederick Wolff, Robert Steele, Jeff Zhang, Anton Alyakin, Qingkai Pan, Yanbing Chen, Karl L. Sangwon, Daniel A. Alber, Jaden Stryker, Jin Vivian Lee, Yindalon Aphinyanaphongs, Kyunghyun Cho, Eric Karl Oermann cs.CL, cs.AI, cs.LG 2025-11-17  
Prompt-Based Value Steering of Large Language Models Giulio Antonio Abbo, Tony Belpaeme cs.CL, cs.AI 2025-11-14  
Self-Correcting Large Language Models: Generation vs. Multiple Choice Hossein A. Rahmani, Satyapriya Krishna, Xi Wang, Mohammadmehdi Naghiaei, Emine Yilmaz cs.CL, cs.AI 2025-11-12  
HalluClean: A Unified Framework to Combat Hallucinations in LLMs Yaxin Zhao, Yu Zhang cs.CL 2025-11-12  
Simulating Students with Large Language Models: A Review of Architecture, Mechanisms, and Role Modelling in Education with Generative AI Luis Marquez-Carpintero, Alberto Lopez-Sellers, Miguel Cazorla cs.CY, cs.AI, cs.CL 2025-11-08  
Transforming Mentorship: An AI Powered Chatbot Approach to University Guidance Mashrur Rahman, Mantaqa Abedin, Monowar Zamil Abir, Faizul Islam Ansari, Adib Reza, Farig Yousuf Sadeque, Niloy Farhan cs.IR, cs.CL 2025-11-06  
Multi-Agent Collaborative Framework For Math Problem Generation Kia Karbasi, Kevin Hong, Mohammad Amin Samadi, Gregory Pottie cs.MA, cs.CL, cs.HC 2025-11-06  
Bayesian Evaluation of Large Language Model Behavior Rachel Longjohn, Shang Wu, Saatvik Kher, Catarina Belém, Padhraic Smyth cs.CL, cs.LG, stat.AP, stat.ML 2025-11-04  
Hybrid Quantum Transformer for Language Generation Desheng Kong, Xiangshuo Cui, Jiaying Jin, Jing Xu, Donglin Wang cs.CL, cs.AI, quant-ph 2025-11-02  
Fine-Tuning DialoGPT on Common Diseases in Rural Nepal for Medical Conversations Birat Poudel, Satyam Ghimire, Er. Prakash Chandra Prasad cs.CL 2025-11-01  
Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning Marwa Abdulhai, Ryan Cheng, Donovan Clay, Tim Althoff, Sergey Levine, Natasha Jaques cs.CL, cs.AI 2025-10-31  
CATArena: Evaluation of LLM Agents through Iterative Tournament Competitions Lingyue Fu, Xin Ding, Yaoming Zhu, Shao Zhang, Lin Qiu, Weiwen Liu, Weinan Zhang, Xuezhi Cao, Xunliang Cai, Jiaxin Ding, Yong Yu cs.AI, cs.CL 2025-10-30  
Evaluating LLMs on Generating Age-Appropriate Child-Like Conversations Syed Zohaib Hassan, Pål Halvorsen, Miriam S. Johnson, Pierre Lison cs.CL 2025-10-28  
Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning) Francesca Padovani, Bastian Bunzeck, Manar Ali, Omar Momen, Arianna Bisazza, Hendrik Buschmeier, Sina Zarrieß cs.CL 2025-10-23  
Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety Vamshi Krishna Bonagiri, Ponnurangam Kumaraguru, Khanh Nguyen, Benjamin Plaut cs.CL 2025-10-18  
Efficient Seq2seq Coreference Resolution Using Entity Representations Matt Grenander, Shay B. Cohen, Mark Steedman cs.CL 2025-10-16  
Generating Fair Consensus Statements with Social Choice on Token-Level MDPs Carter Blair, Kate Larson cs.AI, cs.CL, cs.GT 2025-10-15  
[HF] Deflanderization for Game Dialogue: Balancing Character Authenticity with Task Execution in LLM-based NPCs Pasin Buakhaw, Kun Kerdthaisong, Phuree Phenhiran, Pitikorn Khlaisamniang, Supasate Vorathammathorn, Piyalitt Ittichaiwong, Nutchanon Yongsatianchot   2025-10-15 1
MADREC: A Multi-Aspect Driven LLM Agent for Explainable and Adaptive Recommendation Jiin Park, Misuk Kim cs.IR, cs.AI 2025-10-15  
CiteGuard: Faithful Citation Attribution for LLMs via Retrieval-Augmented Validation Yee Man Choi, Xuehang Guo, Yi R. Fung, Qingyun Wang cs.DL 2025-10-15  
GOAT: A Training Framework for Goal-Oriented Agent with Tools Hyunji Min, Sangwon Jung, Junyoung Sung, Dosung Lee, Leekyeung Han, Paul Hongsuck Seo cs.AI 2025-10-14  
ToolMem: Enhancing Multimodal Agents with Learnable Tool Capability Memory Yunzhong Xiao, Yangmin Li, Hewei Wang, Yunlong Tang, Zora Zhiruo Wang cs.CL 2025-10-08  
What Do Humans Hear When Interacting? Experiments on Selective Listening for Evaluating ASR of Spoken Dialogue Systems Kiyotada Mori, Seiya Kawano, Chaoran Liu, Carlos Toshinori Ishi, Angel Fernando Garcia Contreras, Koichiro Yoshino cs.CL 2025-08-06  
Investigating Hallucination in Conversations for Low Resource Languages Amit Das, Md. Najib Hasan, Souvika Sarkar, Zheng Zhang, Fatemeh Jamshidi, Tathagata Bhattacharya, Nilanjana Raychawdhury, Dongji Feng, Vinija Jain, Aman Chadha cs.CL 2025-07-30  
Teaching Language Models To Gather Information Proactively Tenghao Huang, Sihao Chen, Muhao Chen, Jonathan May, Longqi Yang, Mengting Wan, Pei Zhou cs.AI, cs.CL 2025-07-28  
[HF] RMTBench: Benchmarking LLMs Through Multi-Turn User-Centric Role-Playing Hao Xiang, Tianyi Tang, Yang Su, Bowen Yu, An Yang, Fei Huang, Yichang Zhang, Yaojie Lu, Hongyu Lin, Xianpei Han, Jingren Zhou, Junyang Lin, Le Sun   2025-07-27  
AI-Driven Generation of Old English: A Framework for Low-Resource Languages Rodrigo Gabriel Salazar Alva, Matías Nuñez, Cristian López, Javier Martín Arista cs.CL, cs.AI 2025-07-27  
CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards Cheng Liu, Yifei Lu, Fanghua Ye, Jian Li, Xingyu Chen, Feiliang Ren, Zhaopeng Tu, Xiaolong Li cs.CL 2025-07-23  
[HF] DialogueForge: LLM Simulation of Human-Chatbot Dialogue Ruizhe Zhu, Hao Zhu, Yaxuan Li, Syang Zhou, Shijing Cai, Malgorzata Lazuka, Elliott Ash   2025-07-21 1
On the Semantics of Large Language Models Martin Schuele cs.CL, cs.AI 2025-07-07  
SHNU Multilingual Conversational Speech Recognition System for INTERSPEECH 2025 MLC-SLM Challenge Yuxiang Mei, Yuang Zheng, Dongxing Xu, Yanhua Long cs.CL, eess.AS 2025-07-04  
The Future is Agentic: Definitions, Perspectives, and Open Challenges of Multi-Agent Recommender Systems Reza Yousefi Maragheh, Yashar Deldjoo cs.IR 2025-07-02  
Decision-Oriented Text Evaluation Yu-Shiang Huang, Chuan-Ju Wang, Chung-Chi Chen cs.CL 2025-07-02  
[HF] SPADE: Systematic Prompt Framework for Automated Dialogue Expansion in Machine-Generated Text Detection Haoyi Li, Angela Yifei Yuan, Soyeon Caren Han, Christopher Leckie   2025-03-19  
[HF] Open-Source Large Language Models as Multilingual Crowdworkers: Synthesizing Open-Domain Dialogues in Several Languages With No Examples in Targets and No Machine Translation Ahmed Njifenjou, Virgile Sucal, Bassam Jabaian, Fabrice Lefèvre   2025-03-05  
[HF] Dialogue Benchmark Generation from Knowledge Graphs with Cost-Effective Retrieval-Augmented LLMs Reham Omar, Omij Mangukiya, Essam Mansour   2025-01-17  
[HF] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li   2024-12-06 8
[HF] DiaSynth – Synthetic Dialogue Generation Framework Sathya Krishnan Suresh, Wu Mengjun, Tushar Pranav, Eng Siong Chng   2024-09-25 20
[HF] J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling Wataru Nakata, Kentaro Seki, Hitomi Yanaka, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari   2024-07-22  
[HF] PSYDIAL: Personality-based Synthetic Dialogue Generation using Large Language Models Ji-Eun Han, Jun-Seok Koh, Hyeon-Tae Seo, Du-Seong Chang, Kyung-Ah Sohn   2024-04-01  
[HF] StyleChat: Learning Recitation-Augmented Memory in LLMs for Stylized Dialogue Generation Jinpeng Li, Zekai Zhang, Quan Tu, Xin Cheng, Dongyan Zhao, Rui Yan   2024-03-18  
[HF] KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark Seongbo Jang, Seonghyeon Lee, Hwanjo Yu   2024-02-27  
[HF] Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiao-Yong Wei   2024-02-10  
[HF] Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk Dennis Ulmer, Elman Mansimov, Kaixiang Lin, Justin Sun, Xibin Gao, Yi Zhang   2024-01-10 18
[HF] Faithful Persona-based Conversational Dataset Generation with Large Language Models Pegah Jandaghi, XiangHai Sheng, Xinyi Bai, Jay Pujara, Hakim Sidahmed   2023-12-15 11
[HF] CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models Jinfeng Zhou, Zhuang Chen, Dazhen Wan, Bosi Wen, Yi Song, Jifan Yu, Yongkang Huang, Libiao Peng, Jiaming Yang, Xiyao Xiao, Sahand Sabour, Xiaohan Zhang, Wenjing Hou, Yijia Zhang, Yuxiao Dong, Jie Tang, Minlie Huang   2023-11-28 1
[HF] PRODIGy: a PROfile-based DIalogue Generation dataset Daniela Occhipinti, Serra Sinem Tekiroglu, Marco Guerini   2023-11-09  
[HF] Learning From Free-Text Human Feedback – Collect New Datasets Or Extend Existing Ones? Dominic Petrak, Nafise Sadat Moosavi, Ye Tian, Nikolai Rozanov, Iryna Gurevych   2023-10-24  
[HF] MIRACLE: Towards Personalized Dialogue Generation with Latent-Space Multiple Personal Attribute Control Zhenyi Lu, Wei Wei, Xiaoye Qu, XianLing Mao, Dangyang Chen, Jixiong Chen   2023-10-22  
[HF] BotChat: Evaluating LLMs’ Capabilities of Having Multi-Turn Dialogues Haodong Duan, Jueqi Wei, Chonghua Wang, Hongwei Liu, Yixiao Fang, Songyang Zhang, Dahua Lin, Kai Chen   2023-10-20  
[HF] Conversation Chronicles: Towards Diverse Temporal and Relational Dynamics in Multi-Session Conversations Jihyoung Jang, Minseong Boo, Hyounghun Kim   2023-10-20 2
[HF] We are what we repeatedly do: Inducing and deploying habitual schemas in persona-based responses Benjamin Kane, Lenhart Schubert   2023-10-10 1
[HF] Chat Vector: A Simple Approach to Equip LLMs With New Language Chat Capabilities Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee   2023-10-07  
[HF] Towards human-like spoken dialogue generation between AI agents from written dialogue Kentaro Mitsui, Yukiya Hono, Kei Sawada   2023-10-02  
[HF] ChatHaruhi: Reviving Anime Character in Reality via Large Language Model Cheng Li, Ziang Leng, Chenxi Yan, Junyi Shen, Hao Wang, Weishi MI, Yaying Fei, Xiaoyang Feng, Song Yan, HaoSheng Wang, Linkang Zhan, Yaokai Jia, Pingyu Wu, Haozhen Sun   2023-08-18 1
[HF] Three Ways of Using Large Language Models to Evaluate Chat Ondřej Plátek, Vojtěch Hudeček, Patricia Schmidtová, Mateusz Lango, Ondřej Dušek   2023-08-12  
[HF] DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations Bo-Ru Lu, Nikita Haduong, Chia-Hsuan Lee, Zeqiu Wu, Hao Cheng, Paul Koester, Jean Utke, Tao Yu, Noah A. Smith, Mari Ostendorf   2023-07-13 17
[HF] Prompted LLMs as Chatbot Modules for Long Open-domain Conversation Gibbeum Lee, Volker Hartmann, Jongho Park, Dimitris Papailiopoulos, Kangwook Lee   2023-05-08  
[HF] Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models Jimmy Wei, Kurt Shuster, Arthur Szlam, Jason Weston, Jack Urbanek, Mojtaba Komeili   2023-04-26 1
[HF] ChatLLM Network: More brains, More intelligence Rui Hao, Linmei Hu, Weijian Qi, Qingliu Wu, Yirui Zhang, Liqiang Nie   2023-04-24  
[HF] Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data Jing Wei, Sungdong Kim, Hyunhoon Jung, Young-Ho Kim   2023-01-14  
[HF] Controllable Dialogue Simulation with In-Context Learning Zekun Li, Wenhu Chen, Shiyang Li, Hong Wang, Jing Qian, Xifeng Yan   2022-10-09  
[HF] A Benchmark for Understanding and Generating Dialogue between Characters in Stories Jianzhu Yao, Ziqi Liu, Jian Guan, Minlie Huang   2022-09-18  
[HF] MDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages Qingyu Zhang, Xiaoyu Shen, Ernie Chang, Jidong Ge, Pengke Chen   2022-08-27  
[HF] Building a Personalized Dialogue System with Prompt-Tuning Tomohito Kasahara, Daisuke Kawahara, Nguyen Tung, Shengzhe Li, Kenta Shinzato, Toshinori Sato   2022-06-11  
[HF] A Mixture-of-Expert Approach to RL-based Dialogue Management Yinlam Chow, Aza Tulepbergenov, Ofir Nachum, MoonKyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier   2022-05-31 1
[HF] Towards a Progression-Aware Autonomous Dialogue Agent Abraham Sanders, Tomek Strzalkowski, Mei Si, Albert Chang, Deepanshu Dey, Jonas Braasch, Dakuo Wang   2022-05-07  
[HF] Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models Sanghwan Bae, Donghyun Kwak, Sungdong Kim, Donghoon Ham, Soyoung Kang, Sang-Woo Lee, Woomyoung Park   2022-04-30  
[HF] Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a Few Utterances Seungju Han, Beomsu Kim, Jin Yong Yoo, Seokjun Seo, Sangbum Kim, Enkhbayar Erdenee, Buru Chang   2022-04-22 1
[HF] Multimodal Dialogue Response Generation Qingfeng Sun, Yujing Wang, Can Xu, Kai Zheng, Yaming Yang, Huang Hu, Fei Xu, Jessica Zhang, Xiubo Geng, Daxin Jiang   2021-10-16  
[HF] CharacterChat: Supporting the Creation of Fictional Characters through Conversation and Progressive Manifestation with a Chatbot Oliver Schmitt, Daniel Buschek   2021-06-23  
[HF] Recent Advances in Deep Learning Based Dialogue Systems: A Systematic Survey Jinjie Ni, Tom Young, Vlad Pandelea, Fuzhao Xue, Erik Cambria   2021-05-10  
[HF] Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions Bodhisattwa Prasad Majumder, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Julian McAuley   2020-10-07  
[HF] A Large-Scale Chinese Short-Text Conversation Dataset Yida Wang, Pei Ke, Yinhe Zheng, Kaili Huang, Yong Jiang, Xiaoyan Zhu, Minlie Huang   2020-08-10  
[HF] Recipes for building an open-domain chatbot Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston   2020-04-28 1
[HF] A Pre-training Based Personalized Dialogue Generation Model with Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang   2019-11-12  
[HF] DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan   2019-11-01 2
[HF] ALOHA: Artificial Learning of Human Attributes for Dialogue Agents Aaron W. Li, Veronica Jiang, Steven Y. Feng, Julia Sprague, Wei Zhou, Jesse Hoey   2019-10-18  
[HF] PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang   2019-10-17  
[HF] Towards Deep Conversational Recommendations Raymond Li, Samira Kahou, Hannes Schulz, Vincent Michalski, Laurent Charlin, Chris Pal   2018-12-18  
