Dialogue

[*] = found in both arXiv and HF search   [HF] = found via HF semantic search

written on 2026-03-28

title authors categories date upvotes
OrgForge-IT: A Verifiable Synthetic Benchmark for LLM-Based Insider Threat Detection Jeffrey Flynt cs.CR, cs.LG 2026-03-23  
Optimizing Multi-Agent Weather Captioning via Text Gradient Descent: A Training-Free Approach with Consensus-Aware Gradient Fusion Shixu Liu cs.CL 2026-03-23  
Emergent Formal Verification: How an Autonomous AI Ecosystem Independently Discovered SMT-Based Safety Across Six Domains Octavian Untila cs.SE, cs.AI, cs.MA 2026-03-22  
Reasoning Gets Harder for LLMs Inside A Dialogue Ivan Kartáč, Mateusz Lango, Ondřej Dušek cs.CL 2026-03-20  
An Agentic Approach to Generating XAI-Narratives Yifan He, David Martens cs.CL 2026-03-20  
Semantic Delta: An Interpretable Signal Differentiating Human and LLMs Dialogue Riccardo Scantamburlo, Mauro Mezzanzana, Giacomo Buonanno, Francesco Bertolotti cs.CL, cs.AI 2026-03-20  
Skilled AI Agents for Embedded and IoT Systems Development Yiming Li, Yuhan Cheng, Mingchen Ma, Yihang Zou, Ningyuan Yang, Wei Cheng, Hai “Helen” Li, Yiran Chen, Tingjun Chen cs.SE, cs.AI 2026-03-20  
Mi:dm K 2.5 Pro KT Tech innovation Group cs.CL, cs.AI 2026-03-19  
Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM Zizhao Hu, Mohammad Rostami, Jesse Thomason cs.AI 2026-03-19  
When Only the Final Text Survives: Implicit Execution Tracing for Multi-Agent Attribution Yi Nian, Haosen Cao, Shenzhe Zhu, Henry Peng Zou, Qingqing Luan, Yue Zhao cs.AI, cs.CL 2026-03-18  
Graph-Native Cognitive Memory for AI Agents: Formal Belief Revision Semantics for Versioned Memory Architectures Young Bin Park cs.AI, cs.IR, cs.LO 2026-03-18  
Evaluating LLM-Simulated Conversations in Modeling Inconsistent and Uncollaborative Behaviors in Human Social Interaction Ryo Kamoi, Ameya Godbole, Longqi Yang, Rui Zhang, Mengting Wan, Pei Zhou cs.CL 2026-03-17  
Differential Harm Propensity in Personalized LLM Agents: The Curious Case of Mental Health Disclosure Caglar Yildirim cs.AI 2026-03-17  
Proactive Rejection and Grounded Execution: A Dual-Stage Intent Analysis Paradigm for Safe and Efficient AIoT Smart Homes Xinxin Jin, Zhengwei Ni, Zhengguo Sheng, Victor C. M. Leung cs.AI 2026-03-17  
VIBEPASS: Can Vibe Coders Really Pass the Vibe Check? Srijan Bansal, Jiao Fangkai, Yilun Zhou, Austin Xu, Shafiq Joty, Semih Yavuz cs.SE, cs.AI 2026-03-16  
Practicing with Language Models Cultivates Human Empathic Communication Aakriti Kumar, Nalin Poungpeth, Diyi Yang, Bruce Lambert, Matthew Groh cs.CL, cs.HC 2026-03-16  
OrgForge: A Multi-Agent Simulation Framework for Verifiable Synthetic Corporate Corpora Jeffrey Flynt cs.CL, cs.AI, cs.IR 2026-03-16  
GNNVerifier: Graph-based Verifier for LLM Task Planning Yu Hao, Qiuyu Wang, Cheng Yang, Yawen Li, Zhiqiang Zhang, Chuan Shi cs.LG 2026-03-16  
GameUIAgent: An LLM-Powered Framework for Automated Game UI Design with Structured Intermediate Representation Wei Zeng, Fengwei An, Zhen Liu, Jian Zhao cs.AI 2026-03-16  
CangjieBench: Benchmarking LLMs on a Low-Resource General-Purpose Programming Language Junhang Cheng, Fang Liu, Jia Li, Chengru Wu, Nanxiang Jiang, Li Zhang cs.SE, cs.AI, cs.CL 2026-03-15  
Infinite Problem Generator: Verifiably Scaling Physics Reasoning Data with Agentic Workflows Aditya Sharan, Sriram Hebbale, Dhruv Kumar cs.CL, cs.AI 2026-03-15  
QChunker: Learning Question-Aware Text Chunking for Domain RAG via Multi-Agent Debate Jihao Zhao, Daixuan Li, Pengfei Li, Shuaishuai Zu, Biao Qin, Hongyan Liu cs.CL 2026-03-12  
[HF] End-to-End Chatbot Evaluation with Adaptive Reasoning and Uncertainty Filtering Nhi Dang, Tung Le, Huy Tien Nguyen   2026-03-11  
SPAR-K: Scheduled Periodic Alternating Early Exit for Spoken Language Models Hsiao-Ying Huang, Cheng-Han Chiang, Hung-yi Lee cs.CL, eess.AS 2026-03-10  
SCALAR: Learning and Composing Skills through LLM Guided Symbolic Planning and Deep RL Grounding Renos Zabounidis, Yue Wu, Simon Stepputtis, Woojun Kim, Yuanzhi Li, Tom Mitchell, Katia Sycara cs.LG 2026-03-10  
Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers Pengfei Du cs.AI 2026-03-08  
FireBench: Evaluating Instruction Following in Enterprise and API-Driven LLM Applications Yunfan Zhang, Yijie Bei, Jetashree Ravi, Pawel Garbacki cs.CL, cs.SE 2026-03-05  
EchoGuard: An Agentic Framework with Knowledge-Graph Memory for Detecting Manipulative Communication in Longitudinal Dialogue Ratna Kandala, Niva Manchanda, Akshata Kishore Moharir, Ananth Kandala cs.AI 2026-03-05  
Agentics 2.0: Logical Transduction Algebra for Agentic Data Workflows Alfio Massimiliano Gliozzo, Junkyu Lee, Nahuel Defosse cs.AI, cs.LG 2026-03-04  
Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy Navdeep Singh Bedi, Ana-Maria Bucur, Noriko Kando, Fabio Crestani cs.CL 2026-03-04  
BLUFF: Benchmarking the Detection of False and Synthetic Content across 58 Low-Resource Languages Jason Lucas, Matt Murtagh-White, Adaku Uchendu, Ali Al-Lawati, Michiharu Yamashita, Dominik Macko, Ivan Srba, Robert Moro, Dongwon Lee cs.CL 2026-02-28  
LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning Yu Zhu, Kai Yang cs.CL, cs.AI 2026-02-27  
Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains Xiaochong Jiang, Shiqi Yang, Wenting Yang, Yichen Liu, Cheng Ji cs.CR, cs.AI 2026-02-23  
TherapyGym: Evaluating and Aligning Clinical Fidelity and Safety in Therapy Chatbots Fangrui Huang, Souhad Chbeir, Arpandeep Khatua, Sheng Wang, Sijun Tan, Kenan Ye, Lily Bailey, Merryn Daniel, Ryan Louie, Sanmi Koyejo, Ehsan Adeli cs.CL, cs.AI, cs.CY 2026-02-23  
NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs Zihan Guan, Rituparna Datta, Mengxuan Hu, Shunshun Liu, Aiying Zhang, Prasanna Balachandran, Sheng Li, Anil Vullikanti cs.LG, cs.AI, cs.CL 2026-02-20  
What Do LLMs Associate with Your Name? A Human-Centered Black-Box Audit of Personal Data Dimitri Staufer, Kirsten Morehouse cs.HC, cs.AI, cs.CL, cs.CY 2026-02-19  
From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan’s Humanities and Social Sciences Yi-Chih Huang cs.AI, cs.CL, cs.CY 2026-02-19  
Evaluating Collective Behaviour of Hundreds of LLM Agents Richard Willis, Jianing Zhao, Yali Du, Joel Z. Leibo cs.MA 2026-02-18  
AREG: Adversarial Resource Extraction Game for Evaluating Persuasion and Resistance in Large Language Models Adib Sakhawat, Fardeen Sadab cs.CL 2026-02-18  
LLM-to-Speech: A Synthetic Data Pipeline for Training Dialectal Text-to-Speech Models Ahmed Khaled Khamis, Hesham Ali cs.CL 2026-02-17  
AgriWorld: A World Tools Protocol Framework for Verifiable Agricultural Reasoning with Code-Executing LLM Agents Zhixing Zhang, Jesen Zhang, Hao Liu, Qinhan Lv, Jing Yang, Kaitong Cai, Keze Wang cs.AI 2026-02-17  
Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5 Dongrui Liu, Yi Yu, Jie Zhang, Guanxu Chen, Qihao Lin, Hanxi Zhu, Lige Huang, Yijin Zhou, Peng Wang, Shuai Shao, Boxuan Zhang, Zicheng Liu, Jingwei Sun, Yu Li, Yuejin Xie, Jiaxuan Guo, Jia Xu, Chaochao Lu, Bowen Zhou, Xia Hu, Jing Shao cs.AI, cs.CL, cs.CV, cs.CY, cs.LG 2026-02-16  
TruthStance: An Annotated Dataset of Conversations on Truth Social Fathima Ameen, Danielle Brown, Manusha Malgareddy, Amanul Haque cs.CL, cs.AI 2026-02-16  
An end-to-end agentic pipeline for smart contract translation and quality evaluation Abhinav Goel, Chaitya Shah, Agostino Capponi, Alfio Gliozzo cs.AI, cs.SE 2026-02-14  
Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues Stephan Vonschallen, Rahel Häusler, Theresa Schmiedel, Friederike Eyssel cs.HC, cs.AI 2026-02-13  
WavBench: Benchmarking Reasoning, Colloquialism, and Paralinguistics for End-to-End Spoken Dialogue Models Yangzhuo Li, Shengpeng Ji, Yifu Chen, Tianle Liang, Haorong Ying, Yule Wang, Junbo Li, Jun Fang, Zhou Zhao cs.CL 2026-02-12  
Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? Thibaud Gloaguen, Niels Mündler, Mark Müller, Veselin Raychev, Martin Vechev cs.SE, cs.AI 2026-02-12  
Do Large Language Models Adapt to Language Variation across Socioeconomic Status? Elisa Bassignana, Mike Zhang, Dirk Hovy, Amanda Cercas Curry cs.CL 2026-02-12  
RELATE: A Reinforcement Learning-Enhanced LLM Framework for Advertising Text Generation Jinfang Wang, Jiajie Liu, Jianwei Wu, Ziqin Luo, Zhen Chen, Chunlei Li, Biao Han, Tao Deng, Yi Li, Shuanglong Li, Lin Liu cs.AI 2026-02-12  
AIR: Improving Agent Safety through Incident Response Zibo Xiao, Jun Sun, Junjie Chen cs.AI 2026-02-12  
TRACER: Trajectory Risk Aggregation for Critical Episodes in Agentic Reasoning Sina Tayebati, Divake Kumar, Nastaran Darabi, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi cs.AI 2026-02-11  
Learning to Compose for Cross-domain Agentic Workflow Generation Jialiang Wang, Shengxiang Xu, Hanmo Liu, Jiachuan Wang, Yuyu Luo, Shimin Di, Min-Ling Zhang, Lei Chen cs.MA, cs.AI, cs.LG, cs.SE 2026-02-11  
AlphaForgeBench: Benchmarking End-to-End Trading Strategy Design with Large Language Models Wentao Zhang, Mingxuan Zhao, Jincheng Gao, Jieshun You, Huaiyu Jia, Yilei Zhao, Bo An, Shuo Sun q-fin.TR, cs.AI 2026-02-10  
Towards Poisoning Robustness Certification for Natural Language Generation Mihnea Ghitu, Matthew Wicker cs.LG 2026-02-10  
Large Language Models for Designing Participatory Budgeting Rules Nguyen Thach, Xingchen Sha, Hau Chan cs.LG 2026-02-10  
Accelerating Social Science Research via Agentic Hypothesization and Experimentation Jishu Sen Gupta, Harini SI, Somesh Kumar Singh, Syed Mohamad Tawseeq, Yaman Kumar Singla, David Doermann, Rajiv Ratn Shah, Balaji Krishnamurthy cs.AI, cs.CL 2026-02-08  
Exploring AI-Augmented Sensemaking of Patient-Generated Health Data: A Mixed-Method Study with Healthcare Professionals in Cardiac Risk Reduction Pavithren V S Pakianathan, Rania Islambouli, Diogo Branco, Albrecht Schmidt, Tiago Guerreiro, Jan David Smeddinck cs.HC, cs.AI 2026-02-05  
Generative Ontology: When Structured Knowledge Learns to Create Benny Cheung cs.AI, cs.CL 2026-02-05  
Data-Centric Interpretability for LLM-based Multi-Agent Reinforcement Learning John Yan, Michael Yu, Yuqi Sun, Alexander Duffy, Tyler Marques, Matthew Lyle Olson cs.LG, cs.AI 2026-02-05  
RA-QA: Towards Respiratory Audio-based Health Question Answering Gaia A. Bertolino, Yuwei Zhang, Tong Xia, Domenico Talia, Cecilia Mascolo cs.SD, cs.LG, eess.AS 2026-02-04  
ProxyWar: Dynamic Assessment of LLM Code Generation in Game Arenas Wenjun Peng, Xinyu Wang, Qi Wu cs.SE, cs.AI 2026-02-04  
A$^2$-LLM: An End-to-end Conversational Audio Avatar Large Language Model Xiaolin Hu, Hang Yuan, Xinzhu Sang, Binbin Yan, Zhou Yu, Cong Huang, Kai Chen cs.LG, cs.AI, cs.SD 2026-02-04  
From Crafting Text to Crafting Thought: Grounding AI Writing Support to Writing Center Pedagogy Yijun Liu, John Gallagher, Sarah Sterman, Tal August cs.HC 2026-02-03  
The Necessity of a Unified Framework for LLM-Based Agent Evaluation Pengyu Zhu, Li Sun, Philip S. Yu, Sen Su cs.AI 2026-02-03  
GuideWeb: A Benchmark for Automatic In-App Guide Generation on Real-World Web UIs Chengguang Gan, Yoshihiro Tsujii, Yunhao Liang, Tatsunori Mori, Shiwen Ni, Hiroki Itoh cs.CL 2026-02-02  
Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles Shaohan Wang, Benfeng Xu, Licheng Zhang, Mingxuan Du, Chiwei Zhu, Xiaorui Wang, Zhendong Mao, Yongdong Zhang cs.CL 2026-02-02  
PedagoSense: A Pedology Grounded LLM System for Pedagogical Strategy Detection and Contextual Response Generation in Learning Dialogues Shahem Sultan, Shahem Fadi, Yousef Melhim, Ibrahim Alsarraj, Besher Hassan cs.CL 2026-02-01  
PaperBanana: Automating Academic Illustration for AI Scientists Dawei Zhu, Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, Jinsung Yoon cs.CL, cs.CV 2026-01-30  
WebArbiter: A Principle-Guided Reasoning Process Reward Model for Web Agents Yao Zhang, Shijie Tang, Zeyu Li, Zhen Han, Volker Tresp cs.AI 2026-01-29  
Embodied Task Planning via Graph-Informed Action Generation with Large Language Model Xiang Li, Ning Yan, Masood Mortazavi cs.CL 2026-01-29  
More Code, Less Reuse: Investigating Code Quality and Reviewer Sentiment towards AI-generated Pull Requests Haoming Huang, Pongchai Jaisri, Shota Shimizu, Lingfeng Chen, Sota Nakashima, Gema Rodríguez-Pérez cs.SE, cs.AI, cs.HC 2026-01-29  
Planner-Auditor Twin: Agentic Discharge Planning with FHIR-Based LLM Planning, Guideline Recall, Optional Caching and Self-Improvement Kaiyuan Wu, Aditya Nagori, Rishikesan Kamaleswaran cs.AI, cs.MA 2026-01-28  
A Dialectic Pipeline for Improving LLM Robustness Sara Candussio cs.CL, cs.MA 2026-01-28  
RobustExplain: Evaluating Robustness of LLM-Based Explanation Agents for Recommendation Guilin Zhang, Kai Zhao, Jeffrey Friedman, Xu Chu cs.IR, cs.AI, cs.LG 2026-01-27  
Assessing the Quality of Mental Health Support in LLM Responses through Multi-Attribute Human Evaluation Abeer Badawi, Md Tahmid Rahman Laskar, Elahe Rahimi, Sheri Grach, Lindsay Bertrand, Lames Danok, Frank Rudzicz, Jimmy Huang, Elham Dolatabadi cs.AI, cs.HC 2026-01-26  
LegalMALR: Multi-Agent Query Understanding and LLM-Based Reranking for Chinese Statute Retrieval Yunhan Li, Mingjie Xie, Gaoli Kang, Zihan Gong, Gengshen Wu, Min Yang cs.IR, cs.CL 2026-01-25  
Status Hierarchies in Language Models Emilio Barkett cs.HC, cs.AI, cs.CL 2026-01-24  
The Shadow Self: Intrinsic Value Misalignment in Large Language Model Agents Chen Chen, Kim Young Il, Yuan Yang, Wenhao Su, Yilin Zhang, Xueluan Gong, Qian Wang, Yongsen Zheng, Ziyao Liu, Kwok-Yan Lam cs.CL 2026-01-24  
On the Insecurity of Keystroke-Based AI Authorship Detection: Timing-Forgery Attacks Against Motor-Signal Verification David Condrey cs.CR, cs.AI, cs.HC 2026-01-24  
LLMs Got Rhythm? Hybrid Phonological Filtering for Greek Poetry Rhyme Detection and Generation Stergios Chatzikyriakidis cs.CL 2026-01-14  
Efficient Multilingual Dialogue Processing via Translation Pipelines and Distilled Language Models Santiago Martínez Novoa, Nicolás Rozo Fajardo, Diego Alejandro González Vargas, Nicolás Bedoya Figueroa cs.CL 2026-01-14  
Can LLMs interpret figurative language as humans do?: surface-level vs representational similarity Samhita Bollepally, Aurora Sloman-Moll, Takashi Yamauchi cs.CL, cs.AI 2026-01-14  
OpenMic: A Multi-Agent-Based Stand-Up Comedy Generation System Yuyang Wu, Hanzhong Cao, Jianhao Chen, Yufei Li cs.AI 2026-01-13  
Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends Jing Yang, Nils Feldhus, Salar Mohtaj, Leonhard Hennig, Qianli Wang, Eleni Metheniti, Sherzod Hakimov, Charlott Jakob, Veronika Solopova, Konrad Rieck, David Schlangen, Sebastian Möller, Vera Schmitt cs.CL 2026-01-12  
PsyCLIENT: Client Simulation via Conversational Trajectory Modeling for Trainee Practice and Model Evaluation in Mental Health Counseling Huachuan Qiu, Zhaoming Chen, Yuqian Chen, Yuan Xie, Yu Lu, Zhenzhong Lan cs.CL 2026-01-12  
Agents of Diffusion: Enhancing Diffusion Language Models with Multi-Agent Reinforcement Learning for Structured Data Generation (Extended Version) Aja Khanal, Kaushik T. Ranade, Rishabh Agrawal, Kalyan S. Basu, Apurva Narayan cs.MA 2026-01-12  
Can a Unimodal Language Agent Provide Preferences to Tune a Multimodal Vision-Language Model? Sazia Tabasum Mim, Jack Morris, Manish Dhakal, Yanming Xiu, Maria Gorlatova, Yi Ding cs.CL 2026-01-10  
STELP: Secure Transpilation and Execution of LLM-Generated Programs Swapnil Shinde, Sahil Wadhwa, Andy Luo, Akshay Gupta, Mohammad Shahed Sorower cs.SE, cs.AI 2026-01-09  
A Preliminary Agentic Framework for Matrix Deflation Paimon Goulart, Evangelos E. Papalexakis cs.LG 2026-01-06  
The Path Ahead for Agentic AI: Challenges and Opportunities Nadia Sibai, Yara Ahmed, Serry Sibaee, Sawsan AlHalawani, Adel Ammar, Wadii Boulila cs.AI 2026-01-06  
AgentMark: Utility-Preserving Behavioral Watermarking for Agents Kaibo Huang, Jin Tan, Yukun Wei, Wanling Li, Zipei Zhang, Hui Tian, Zhongliang Yang, Linna Zhou cs.CR, cs.AI 2026-01-05  
WebCoderBench: Benchmarking Web Application Generation with Comprehensive and Interpretable Evaluation Metrics Chenxu Liu, Yingjie Fu, Wei Yang, Ying Zhang, Tao Xie cs.SE, cs.AI 2026-01-05  
CaveAgent: Transforming LLMs into Stateful Runtime Operators Maohao Ran, Zhenglin Wan, Cooper Lin, Yanting Zhang, Hongyu Xin, Hongwei Fan, Yibo Xu, Beier Luo, Yaxin Zhou, Wangbo Zhao, Lijie Yang, Lang Feng, Fuchao Yang, Jingxuan Wu, Yiqiao Huang, Chendong Ma, Dailing Jiang, Jianbo Deng, Sihui Han, Bo An, Yike Guo, Jun Song cs.AI, cs.SE 2026-01-04  
MAMA-Memeia! Multi-Aspect Multi-Agent Collaboration for Depressive Symptoms Identification in Memes Siddhant Agarwal, Adya Dhuler, Polly Ruhnke, Melvin Speisman, Md Shad Akhtar, Shweta Yadav cs.CL 2025-12-31  
Do Large Language Models Know What They Are Capable Of? Casey O. Barkan, Sid Black, Oliver Sourbut cs.CL, cs.AI 2025-12-31  
The Silicon Psyche: Anthropomorphic Vulnerabilities in Large Language Models Giuseppe Canale, Kashyap Thimmaraju cs.CR, cs.AI, cs.CY, cs.HC 2025-12-30  
Web World Models Jichen Feng, Yifan Zhang, Chenggong Zhang, Yifu Lu, Shilong Liu, Mengdi Wang cs.AI, cs.CL, cs.CV 2025-12-29  
TCEval: Using Thermal Comfort to Assess Cognitive and Perceptual Abilities of AI Jingming Li cs.AI 2025-12-29  
AI-Generated Code Is Not Reproducible (Yet): An Empirical Study of Dependency Gaps in LLM-Based Coding Agents Bhanu Prakash Vangala, Ali Adibifar, Tanu Malik, Ashish Gehani cs.SE, cs.AI, cs.MA 2025-12-26  
Emotion Diffusion in Real and Simulated Social Graphs: Structural Limits of LLM-Based Social Simulation Qiqi Qiang cs.SI 2025-12-24  
NVIDIA Nemotron 3: Efficient and Open Intelligence NVIDIA, Aaron Blakeman, Aaron Grattafiori, Aarti Basant, Abhibha Gupta, Abhinav Khattar, Adi Renduchintala, Aditya Vavre, Akanksha Shukla, Akhiad Bercovich, Aleksander Ficek, Aleksandr Shaposhnikov, Alex Kondratenko, Alexander Bukharin, Alexandre Milesi, Ali Taghibakhshi, Alisa Liu, Amelia Barton, Ameya Sunil Mahabaleshwarkar, Amir Klein, Amit Zuker, Amnon Geifman, Amy Shen, Anahita Bhiwandiwalla, Andrew Tao, Anjulie Agrusa, Ankur Verma, Ann Guan, Anubhav Mandarwal, Arham Mehta, Ashwath Aithal, Ashwin Poojary, Asif Ahamed, Asit Mishra, Asma Kuriparambil Thekkumpate, Ayush Dattagupta, Banghua Zhu, Bardiya Sadeghi, Barnaby Simkin, Ben Lanir, Benedikt Schifferer, Besmira Nushi, Bilal Kartal, Bita Darvish Rouhani, Boris Ginsburg, Brandon Norick, Brandon Soubasis, Branislav Kisacanin, Brian Yu, Bryan Catanzaro, Carlo del Mundo, Chantal Hwang, Charles Wang, Cheng-Ping Hsieh, Chenghao Zhang, Chenhan Yu, Chetan Mungekar, Chintan Patel, Chris Alexiuk, Christopher Parisien, Collin Neale, Cyril Meurillon, Damon Mosk-Aoyama, Dan Su, Dane Corneil, Daniel Afrimi, Daniel Lo, Daniel Rohrer, Daniel Serebrenik, Daria Gitman, Daria Levy, Darko Stosic, David Mosallanezhad, Deepak Narayanan, Dhruv Nathawani, Dima Rekesh, Dina Yared, Divyanshu Kakwani, Dong Ahn, Duncan Riach, Dusan Stosic, Edgar Minasyan, Edward Lin, Eileen Long, Eileen Peters Long, Elad Segal, Elena Lantz, Ellie Evans, Elliott Ning, Eric Chung, Eric Harper, Eric Tramel, Erick Galinkin, Erik Pounds, Evan Briones, Evelina Bakhturina, Evgeny Tsykunov, Faisal Ladhak, Fay Wang, Fei Jia, Felipe Soares, Feng Chen, Ferenc Galko, Frank Sun, Frankie Siino, Gal Hubara Agam, Ganesh Ajjanagadde, Gantavya Bhatt, Gargi Prasad, George Armstrong, Gerald Shen, Gorkem Batmaz, Grigor Nalbandyan, Haifeng Qian, Harsh Sharma, Hayley Ross, Helen Ngo, Herbert Hum, Herman Sahota, Hexin Wang, Himanshu Soni, Hiren Upadhyay, Huizi Mao, Huy C Nguyen, Huy Q Nguyen, Iain Cunningham, Ido Galil, Ido Shahaf, Igor Gitman, Ilya Loshchilov, Itamar Schen, Itay Levy, Ivan Moshkov, Izik Golan, Izzy Putterman, Jan Kautz, Jane Polak Scowcroft, Jared Casper, Jatin Mitra, Jeffrey Glick, Jenny Chen, Jesse Oliver, Jian Zhang, Jiaqi Zeng, Jie Lou, Jimmy Zhang, Jinhang Choi, Jining Huang, Joey Conway, Joey Guman, John Kamalu, Johnny Greco, Jonathan Cohen, Joseph Jennings, Joyjit Daw, Julien Veron Vialard, Junkeun Yi, Jupinder Parmar, Kai Xu, Kan Zhu, Kari Briski, Katherine Cheung, Katherine Luna, Keith Wyss, Keshav Santhanam, Kevin Shih, Kezhi Kong, Khushi Bhardwaj, Kirthi Shankar, Krishna C. Puvvada, Krzysztof Pawelec, Kumar Anik, Lawrence McAfee, Laya Sleiman, Leon Derczynski, Li Ding, Lizzie Wei, Lucas Liebenwein, Luis Vega, Maanu Grover, Maarten Van Segbroeck, Maer Rodrigues de Melo, Mahdi Nazemi, Makesh Narsimhan Sreedhar, Manoj Kilaru, Maor Ashkenazi, Marc Romeijn, Marcin Chochowski, Mark Cai, Markus Kliegl, Maryam Moosaei, Matt Kulka, Matvei Novikov, Mehrzad Samadi, Melissa Corpuz, Mengru Wang, Meredith Price, Michael Andersch, Michael Boone, Michael Evans, Miguel Martinez, Mikail Khona, Mike Chrzanowski, Minseok Lee, Mohammad Dabbah, Mohammad Shoeybi, Mostofa Patwary, Nabin Mulepati, Najeeb Nabwani, Natalie Hereth, Nave Assaf, Negar Habibi, Neta Zmora, Netanel Haber, Nicola Sessions, Nidhi Bhatia, Nikhil Jukar, Nikki Pope, Nikolai Ludwig, Nima Tajbakhsh, Nir Ailon, Nirmal Juluru, Nishant Sharma, Oleksii Hrinchuk, Oleksii Kuchaiev, Olivier Delalleau, Oluwatobi Olabiyi, Omer Ullman Argov, Omri Puny, Oren Tropp, Ouye Xie, Parth Chadha, Pasha Shamis, Paul Gibbons, Pavlo Molchanov, Pawel Morkisz, Peter Dykas, Peter Jin, Pinky Xu, Piotr Januszewski, Pranav Prashant Thombre, Prasoon Varshney, Pritam Gundecha, Przemek Tredak, Qing Miao, Qiyu Wan, Rabeeh Karimi Mahabadi, Rachit Garg, Ran El-Yaniv, Ran Zilberstein, Rasoul Shafipour, Rich Harang, Rick Izzo, Rima Shahbazyan, Rishabh Garg, Ritika Borkar, Ritu Gala, Riyad Islam, Robert Hesse, Roger Waleffe, Rohit Watve, Roi Koren, Ruoxi Zhang, Russell Hewett, Russell J. Hewett, Ryan Prenger, Ryan Timbrook, Sadegh Mahdavi, Sahil Modi, Samuel Kriman, Sangkug Lim, Sanjay Kariyappa, Sanjeev Satheesh, Saori Kaji, Satish Pasumarthi, Saurav Muralidharan, Sean Narentharen, Sean Narenthiran, Seonmyeong Bak, Sergey Kashirsky, Seth Poulos, Shahar Mor, Shanmugam Ramasamy, Shantanu Acharya, Shaona Ghosh, Sharath Turuvekere Sreenivas, Shelby Thomas, Shiqing Fan, Shreya Gopal, Shrimai Prabhumoye, Shubham Pachori, Shubham Toshniwal, Shuoyang Ding, Siddharth Singh, Simeng Sun, Smita Ithape, Somshubra Majumdar, Soumye Singhal, Stas Sergienko, Stefania Alborghetti, Stephen Ge, Sugam Dipak Devare, Sumeet Kumar Barua, Suseella Panguluri, Suyog Gupta, Sweta Priyadarshi, Syeda Nahida Akter, Tan Bui, Teodor-Dumitru Ene, Terry Kong, Thanh Do, Tijmen Blankevoort, Tim Moon, Tom Balough, Tomer Asida, Tomer Bar Natan, Tomer Ronen, Tugrul Konuk, Twinkle Vashishth, Udi Karpas, Ushnish De, Vahid Noorozi, Vahid Noroozi, Venkat Srinivasan, Venmugil Elango, Victor Cui, Vijay Korthikanti, Vinay Rao, Vitaly Kurin, Vitaly Lavrukhin, Vladimir Anisimov, Wanli Jiang, Wasi Uddin Ahmad, Wei Du, Wei Ping, Wenfei Zhou, Will Jennings, William Zhang, Wojciech Prazuch, Xiaowei Ren, Yashaswi Karnati, Yejin Choi, Yev Meyer, Yi-Fu Wu, Yian Zhang, Yigong Qin, Ying Lin, Yonatan Geifman, Yonggan Fu, Yoshi Subara, Yoshi Suhara, Yubo Gao, Zach Moshe, Zhen Dong, Zhongbo Zhu, Zihan Liu, Zijia Chen, Zijie Yan cs.CL, cs.AI, cs.LG 2025-12-24  
AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent Haipeng Luo, Huawen Feng, Qingfeng Sun, Can Xu, Kai Zheng, Yufei Wang, Tao Yang, Han Hu, Yansong Tang, Di Wang cs.AI, cs.CL, cs.LG 2025-12-23  
SA-DiffuSeq: Addressing Computational and Scalability Challenges in Long-Document Generation with Sparse Attention Alexandros Christoforos, Chadbourne Davis cs.CL, cs.AI 2025-12-23  
MoE-DiffuSeq: Enhancing Long-Document Diffusion Models with Sparse Attention and Mixture of Experts Alexandros Christoforos, Chadbourne Davis cs.CL 2025-12-23  
Distilling to Hybrid Attention Models via KL-Guided Layer Selection Yanhong Li, Songlin Yang, Shawn Tan, Mayank Mishra, Rameswar Panda, Jiawei Zhou, Yoon Kim cs.CL, cs.AI 2025-12-23  
LLM Agents Implement an NLG System from Scratch: Building Interpretable Rule-Based RDF-to-Text Generators Mateusz Lango, Ondřej Dušek cs.CL, cs.AI 2025-12-20  
ShareChat: A Dataset of Chatbot Conversations in the Wild Yueru Yan, Tuc Nguyen, Bo Su, Melissa Lieffers, Thai Le cs.CL, cs.AI, cs.HC 2025-12-19  
Polypersona: Persona-Grounded LLM for Synthetic Survey Responses Tejaswani Dash, Dinesh Karri, Anudeep Vurity, Gautam Datla, Tazeem Ahmad, Saima Rafi, Rohith Tangudu cs.CL, cs.AI 2025-12-16  
Evaluation of AI Ethics Tools in Language Models: A Developers’ Perspective Case Stud Jhessica Silva, Diego A. B. Moreira, Gabriel O. dos Santos, Alef Ferreira, Helena Maia, Sandra Avila, Helio Pedrini cs.CY, cs.AI, cs.CL 2025-12-16  
Workflow is All You Need: Escaping the “Statistical Smoothing Trap” via High-Entropy Information Foraging and Adversarial Pacing Zhongjie Jiang cs.CL, cs.AI, cs.CY, q-fin.GN 2025-12-10  
Knowledge-Augmented Large Language Model Agents for Explainable Financial Decision-Making Qingyuan Zhang, Yuxi Wang, Cancan Hua, Yulin Huang, Ning Lyu cs.CL 2025-12-10  
The Erosion of LLM Signatures: Can We Still Distinguish Human and LLM-Generated Scientific Ideas After Iterative Paraphrasing? Sadat Shahriar, Navid Ayoobi, Arjun Mukherjee cs.LG, cs.AI 2025-12-04  
Learning Evolving Latent Strategies for Multi-Agent Language Systems without Model Fine-Tuning Wenlong Tang cs.LG, cs.AI 2025-11-28  
Towards Improving Interpretability of Language Model Generation through a Structured Knowledge Discovery Approach Shuqi Liu, Han Wu, Guanzhi Deng, Jianshu Chen, Xiaoyang Wang, Linqi Song cs.CL, cs.AI 2025-11-28  
Adaptive LLM Agents: Toward Personalized Empathetic Care Priyanka Singh, Sebastian Von Mammen cs.HC 2025-11-25  
Deep Research: A Systematic Survey Zhengliang Shi, Yiqun Chen, Haitao Li, Weiwei Sun, Shiyu Ni, Yougang Lyu, Run-Ze Fan, Bowen Jin, Yixuan Weng, Minjun Zhu, Qiujie Xie, Xinyu Guo, Qu Yang, Jiayi Wu, Jujia Zhao, Xiaqiang Tang, Xinbei Ma, Cunxiang Wang, Jiaxin Mao, Qingyao Ai, Jen-Tse Huang, Wenxuan Wang, Yue Zhang, Yiming Yang, Zhaopeng Tu, Zhaochun Ren cs.CL, cs.AI, cs.IR 2025-11-24  
MindEval: Benchmarking Language Models on Multi-turn Mental Health Support José Pombal, Maya D’Eon, Nuno M. Guerreiro, Pedro Henrique Martins, António Farinhas, Ricardo Rei cs.CL, cs.AI 2025-11-23  
NAMeGEn: Creative Name Generation via A Novel Agent-based Multiple Personalized Goal Enhancement Framework Shanlin Zhou, Xinpeng Wang, Jianxun Lian, Zhenghao Liu, Laks V. S. Lakshmanan, Xiaoyuan Yi, Yongtao Hao cs.CL, cs.AI, cs.IR, cs.MA, cs.NE 2025-11-19  
AfriSpeech-MultiBench: A Verticalized Multidomain Multicountry Benchmark Suite for African Accented English ASR Gabrial Zencha Ashungafac, Mardhiyah Sanni, Busayo Awobade, Alex Gichamba, Tobi Olatunji cs.CL 2025-11-18  
Generalist Foundation Models Are Not Clinical Enough for Hospital Operations Lavender Y. Jiang, Angelica Chen, Xu Han, Xujin Chris Liu, Radhika Dua, Kevin Eaton, Frederick Wolff, Robert Steele, Jeff Zhang, Anton Alyakin, Qingkai Pan, Yanbing Chen, Karl L. Sangwon, Daniel A. Alber, Jaden Stryker, Jin Vivian Lee, Yindalon Aphinyanaphongs, Kyunghyun Cho, Eric Karl Oermann cs.CL, cs.AI, cs.LG 2025-11-17  
Prompt-Based Value Steering of Large Language Models Giulio Antonio Abbo, Tony Belpaeme cs.CL, cs.AI 2025-11-14  
Self-Correcting Large Language Models: Generation vs. Multiple Choice Hossein A. Rahmani, Satyapriya Krishna, Xi Wang, Mohammadmehdi Naghiaei, Emine Yilmaz cs.CL, cs.AI 2025-11-12  
HalluClean: A Unified Framework to Combat Hallucinations in LLMs Yaxin Zhao, Yu Zhang cs.CL 2025-11-12  
Simulating Students with Large Language Models: A Review of Architecture, Mechanisms, and Role Modelling in Education with Generative AI Luis Marquez-Carpintero, Alberto Lopez-Sellers, Miguel Cazorla cs.CY, cs.AI, cs.CL 2025-11-08  
Transforming Mentorship: An AI Powered Chatbot Approach to University Guidance Mashrur Rahman, Mantaqa abedin, Monowar Zamil Abir, Faizul Islam Ansari, Adib Reza, Farig Yousuf Sadeque, Niloy Farhan cs.IR, cs.CL 2025-11-06  
Multi-Agent Collaborative Framework For Math Problem Generation Kia Karbasi, Kevin Hong, Mohammad Amin Samadi, Gregory Pottie cs.MA, cs.CL, cs.HC 2025-11-06  
Bayesian Evaluation of Large Language Model Behavior Rachel Longjohn, Shang Wu, Saatvik Kher, Catarina Belém, Padhraic Smyth cs.CL, cs.LG, stat.AP, stat.ML 2025-11-04  
Hybrid Quantum Transformer for Language Generation Desheng Kong, Xiangshuo Cui, Jiaying Jin, Jing Xu, Donglin Wang cs.CL, cs.AI, quant-ph 2025-11-02  
Fine-Tuning DialoGPT on Common Diseases in Rural Nepal for Medical Conversations Birat Poudel, Satyam Ghimire, Er. Prakash Chandra Prasad cs.CL 2025-11-01  
Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning Marwa Abdulhai, Ryan Cheng, Donovan Clay, Tim Althoff, Sergey Levine, Natasha Jaques cs.CL, cs.AI 2025-10-31  
CATArena: Evaluation of LLM Agents through Iterative Tournament Competitions Lingyue Fu, Xin Ding, Yaoming Zhu, Shao Zhang, Lin Qiu, Weiwen Liu, Weinan Zhang, Xuezhi Cao, Xunliang Cai, Jiaxin Ding, Yong Yu cs.AI, cs.CL 2025-10-30  
Evaluating LLMs on Generating Age-Appropriate Child-Like Conversations Syed Zohaib Hassan, Pål Halvorsen, Miriam S. Johnson, Pierre Lison cs.CL 2025-10-28  
Dialogue Is Not Enough to Make a Communicative BabyLM (But Neither Is Developmentally Inspired Reinforcement Learning) Francesca Padovani, Bastian Bunzeck, Manar Ali, Omar Momen, Arianna Bisazza, Hendrik Buschmeier, Sina Zarrieß cs.CL 2025-10-23  
Check Yourself Before You Wreck Yourself: Selectively Quitting Improves LLM Agent Safety Vamshi Krishna Bonagiri, Ponnurangam Kumaragurum, Khanh Nguyen, Benjamin Plaut cs.CL 2025-10-18  
Efficient Seq2seq Coreference Resolution Using Entity Representations Matt Grenander, Shay B. Cohen, Mark Steedman cs.CL 2025-10-16  
Generating Fair Consensus Statements with Social Choice on Token-Level MDPs Carter Blair, Kate Larson cs.AI, cs.CL, cs.GT 2025-10-15  
MADREC: A Multi-Aspect Driven LLM Agent for Explainable and Adaptive Recommendation Jiin Park, Misuk Kim cs.IR, cs.AI 2025-10-15  
CiteGuard: Faithful Citation Attribution for LLMs via Retrieval-Augmented Validation Yee Man Choi, Xuehang Guo, Yi R. Fung, Qingyun Wang cs.DL 2025-10-15  
GOAT: A Training Framework for Goal-Oriented Agent with Tools Hyunji Min, Sangwon Jung, Junyoung Sung, Dosung Lee, Leekyeung Han, Paul Hongsuck Seo cs.AI 2025-10-14  
ToolMem: Enhancing Multimodal Agents with Learnable Tool Capability Memory Yunzhong Xiao, Yangmin Li, Hewei Wang, Yunlong Tang, Zora Zhiruo Wang cs.CL 2025-10-08  
What Do Humans Hear When Interacting? Experiments on Selective Listening for Evaluating ASR of Spoken Dialogue Systems Kiyotada Mori, Seiya Kawano, Chaoran Liu, Carlos Toshinori Ishi, Angel Fernando Garcia Contreras, Koichiro Yoshino cs.CL 2025-08-06  
Investigating Hallucination in Conversations for Low Resource Languages Amit Das, Md. Najib Hasan, Souvika Sarkar, Zheng Zhang, Fatemeh Jamshidi, Tathagata Bhattacharya, Nilanjana Raychawdhury, Dongji Feng, Vinija Jain, Aman Chadha cs.CL 2025-07-30  
Teaching Language Models To Gather Information Proactively Tenghao Huang, Sihao Chen, Muhao Chen, Jonathan May, Longqi Yang, Mengting Wan, Pei Zhou cs.AI, cs.CL 2025-07-28  
AI-Driven Generation of Old English: A Framework for Low-Resource Languages Rodrigo Gabriel Salazar Alva, Matías Nuñez, Cristian López, Javier Martín Arista cs.CL, cs.AI 2025-07-27  
CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards Cheng Liu, Yifei Lu, Fanghua Ye, Jian Li, Xingyu Chen, Feiliang Ren, Zhaopeng Tu, Xiaolong Li cs.CL 2025-07-23  
[HF] DialogueForge: LLM Simulation of Human-Chatbot Dialogue Ruizhe Zhu, Hao Zhu, Yaxuan Li, Syang Zhou, Shijing Cai, Malgorzata Lazuka, Elliott Ash   2025-07-21 1
On the Semantics of Large Language Models Martin Schuele cs.CL, cs.AI 2025-07-07  
SHNU Multilingual Conversational Speech Recognition System for INTERSPEECH 2025 MLC-SLM Challenge Yuxiang Mei, Yuang Zheng, Dongxing Xu, Yanhua Long cs.CL, eess.AS 2025-07-04  
The Future is Agentic: Definitions, Perspectives, and Open Challenges of Multi-Agent Recommender Systems Reza Yousefi Maragheh, Yashar Deldjoo cs.IR 2025-07-02  
Decision-Oriented Text Evaluation Yu-Shiang Huang, Chuan-Ju Wang, Chung-Chi Chen cs.CL 2025-07-02  
