publications | Hwaran Lee

J: Journal, C: Conference, W: Workshop, A: Arxiv Preprint, T: Tech Report, D: Dissertation, P: Patent
* indicates equal contribution among authors
† indicates co-corresponding authors

2026

[C29] BLEnD-Vis: Benchmarking Multimodal Cultural Understanding in Vision Language Models

Bryan Chen Zhengyu Tan, Zheng Weihua, Zhengyuan Liu, Nancy F. Chen, Hwaran Lee, Kenny Tsu Wei Choo, Roy Ka-Wei Lee
EACL, 2026

2025

[A1] Dataset Cartography for Large Language Model Alignment: Mapping and Diagnosing Preference Data

Seohyeong Lee, Eunwon Kim, Hwaran Lee, Buru Chang
Arxiv, 2025

[C28] Drift: Decoding-time Personalized Alignments with Implicit User Preferences

Minbeom Kim, Kang-il Lee, Seongho Joo, Hwaran Lee, Thibaut Thonet, Kyomin Jung
Bi-Align Workshop @ ICLR 2025 (non-archival)
EMNLP Findings, 2025

[C27] Code-Switching Curriculum Learning for Multilingual Transfer in LLMs

Haneul Yoo, Cheonbok Park, Sangdoo Yun, Alice Oh, Hwaran Lee
CALCS Workshop @ NAACL 2025 (non-archival)
ACL Findings, 2025

[C26] CSRT: Evaluation and Analysis of LLMs using Code-Switching Red-Teaming Dataset

Haneul Yoo, Yongjin Yang, Hwaran Lee
Red Teaming GenAI @ NeurIPS 2024 (non-archival)
CALCS Workshop @ NAACL 2025 (non-archival)
ACL, 2025

[C25] AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence

Minbeom Kim, Hwanhee Lee, Joonsuk Park, Hwaran Lee†, Kyomin Jung†
NAACL, 2025

[C24] MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty

Yongjin Yang, Haneul Yoo, Hwaran Lee
NAACL Findings, 2025

[C23] Guaranteed Generation from Large Language Models

Minbeom Kim, Thibaut Thonet, Jos Rozen, Hwaran Lee, Kyomin Jung, Marc Dymetman
ICLR, 2025

2024

[C22] BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages

Junho Myung*, Nayeon Lee*, Yi Zhou*, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Perez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Kiwoong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, Jose Camacho-Collados, Alice Oh
Cross-Cultural Considerations in NLP workshop @ ACL 2024 (Best paper)
NeurIPS Dataset & Benchmark Track, 2024

[T1] HyperCLOVA X Technical Report

HyperCLOVA X Team, NAVER Cloud
Role as a Safety Leader
Arxiv, 2024

[C21] TimeChara: Evaluating Point-in-Time Character Hallucination of Role-Playing Large Language Models

Jaewoo Ahn, Taehyun Lee, Junyoung Lim, Jin-Hwa Kim, Sangdoo Yun, Hwaran Lee, Gunhee Kim
ACL Findings, 2024

[C20] Calibrating Large Language Models Using Their Generations Only

Dennis Thomas Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh
ACL, 2024

[C19] TRAP: Targeted Random Adversarial Prompt Honeypot for Black-Box Identification

Martin Gubri, Dennis Thomas Ulmer, Hwaran Lee, Sangdoo Yun, Seong Joon Oh
ACL Findings, 2024

[C18] KorNAT: LLM Alignment Benchmark for Korean Social Values and Common Knowledge

Jiyoung Lee, Minwoo Kim, Seungho Kim, Junghwan Kim, Seunghyun Won, Hwaran Lee, Edward Choi
ACL Findings, 2024
project page

[C17] Who Wrote this Code? Watermarking for Code Generation

Taehyun Lee*, Seokhee Hong*, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin†, Gunhee Kim†
ACL, 2024

[C16] LifeTox: Unveiling Implicit Toxicity in Life Advice

Minbeom Kim, Jahyun Koo, Hwanhee Lee, Joonsuk Park, Hwaran Lee, Kyomin Jung
NAACL (Short), 2024

[C15, W6] Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Seungone Kim*, Jamin Shin*, Yejin Cho*, Joel Jang, Shayne Longpre, Hwaran Lee, Sangdoo Yun, Seongjin Shin, Sungdong Kim, James Thorne, Minjoon Seo
ICLR, 2024
Instruction Workshop @ NeurIPS 2023
code & dataset

[J7] KoBBQ: Korean Bias Benchmark for Question Answering

Jiho Jin*, Jiseon Kim*, Nayeon Lee*, Haneul Yoo*, Alice Oh, Hwaran Lee
Transactions of the Association for Computational Linguistics (IF=9.194), 2024
Presented at ACL, 2024
Cross-Cultural Considerations in NLP workshop @ ACL 2024
project page | code & dataset | dataset@huggingface

2023

[C14] ProPILE: Probing Privacy Leakage in Large Language Models

Siwon Kim, Sangdoo Yun, Hwaran Lee, Martin Gubri, Sungroh Yoon, Seong Joon Oh
NeurIPS (Spotlight), 2023

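[C13] KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Applications

Hwaran Lee*, Seokhee Hong*, Joonsuk Park, Takyoung Kim, Gunhee Kim, and Jung-Woo Ha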
ACL (Industrial Track), 2023
code & dataset

[C12] SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created through Human-Machine Collaboration

Hwaran Lee*, Seokhee Hong*, Joonsuk Park, Takyoung Kim, Meeyoung Cha, Yejin Choi, Byoungpil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh, Sangchul Park and Jung-Woo Ha
ACL, 2023 (Oral, Best Paper Nominated)
code & dataset

[C11] Query-Efficient Black-Box Red Teaming via Bayesian Optimization

Deokjae Lee, JunYeong Lee, Jung-Woo Ha, Jin-Hwa Kim, Sang-Woo Lee, Hwaran Lee and Hyun Oh Song
ACL, 2023
code

[C10] Critic-Guided Decoding for Controlled Text Generation

Minbeom Kim, Hwanhee Lee, Kang Min Yoo, Joonsuk Park, Hwaran Lee†, Kyomin Jung†
ACL Findings, 2023
code

[C9] ClaimDiff: Comparing and Contrasting Claims on Contentious Topics

Miyoung Ko, Ingyu Seong, Hwaran Lee, Joonsuk Park, Minsuk Chang, Minjoon Seo
ACL Findings, 2023
dataset

2022

[W5] Why Knowledge Distillation Amplifies Gender Bias and How to Mitigate - from the Perspective of DistilBERT

Jaimeen Ahn, Hwaran Lee, Jin-Hwa Kim, Alice Oh
In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (@ NAACL workshop), 2022

[C8] Masked Summarization to Generate Factually Inconsistent Summaries for Improved Factual Consistency Checking

Hwanhee Lee, Kang Min Yoo, Joonsuk Park, Hwaran Lee†, Kyomin Jung†
NAACL Findings, 2022

[C7] Plug-and-Play Adaptation for Continuously-updated QA

Kyungjae Lee, Wookje Han, Seung-won Hwang, Hwaran Lee, Joonsuk Park, Sang-Woo Lee
ACL Findings, 2022

[C6] TaleBrush: Sketching Stories with Generative Pretrained Language Models

John Yoon Young Chung, Wooseok Kim, Kang Min Yoo, Hwaran Lee, Eytan Adar, Minsuk Chang
CHI, 2022
video

[W4] TaleBrush: Visual Sketching of Story Generation with Pretrained Language Models

John Yoon Young Chung, Wooseok Kim, Kang Min Yoo, Hwaran Lee, Eytan Adar, Minsuk Chang
CHI EA '22: CHI Conference on Human Factors in Computing Systems Extended Abstracts

2020-2021

[J6] SUMBT+LaRL: Effective Multi-domain End-to-end Neural Task-oriented Dialog System

Hwaran Lee, Seokhwan Jo, HyungJun Kim, Sangkeun Jung, and Tae-Yoon Kim
IEEE Access (IF=3.9), 2021
pdf | code

[C5] Reasoning Visual Dialog with Sparse Graph Learning and Knowledge Transfer

Gi-Cheon Kang, Junseok Park, Hwaran Lee, Byoung-Tak Zhang, and Jin-Hwa Kim
EMNLP Findings, 2021
code

[P2] Method and Apparatus for Providing Hybrid Intelligent Customer Consultation

Tae-Yoon Kim, Jin Kim, Hyungjoon Kim, Jinsik Lee, Hwaran Lee, Heewon Jeon, Seokhwan Jo
KR Patent KR102488886B1, filed March 2022 and issued January 2023

[P1] Method and Apparatus for Dialogue State Tracking for Use in Goal-oriented Dialog System

Hwaran Lee, Jinsik Lee, Tae-Yoon Kim
KR Patent KR102281581B1, filed July 2019 and issued July 2021
CN Patent CN114127711A, filed July 2020
WO Patent WO2021010636A1, filed July 2020
US Patent US11687731B2, filed July 2020 and issued Jan 2021

2019

[C4] SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking

Hwaran Lee*, Jinsik Lee*, and Tae-Yoon Kim
ACL, 2019
code | poster

[J5] Unpaired Speech Enhancement by Acoustic and Adversarial Supervision for Speech Recognition

Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, Sang-Hoon Oh, and Soo-Young Lee
IEEE Signal Processing Letters (SPL), 2019.
pdf | code

2018

[D] Neural representations for speech recognition and natural language generation

Hwaran Lee
Ph.D. Dissertation, Korea Advanced Institute of Science and Technology (KAIST), 2018.

[J4] Rescoring of N-best Hypotheses using Top-down Selective Attention for Automatic Speech Recognition

Ho-Gyeong Kim, Hwaran Lee, Geonmin Kim, Sang-Hoon Oh, and Soo-Young Lee
IEEE Signal Processing Letters (SPL), 2018.
pdf

2017

[W3] A Deep Chatbot for QA and Chitchat

Geonmin Kim*, Hwaran Lee*, CheongAn Lee, Eunmi Hong, Byunggeun Kim, and Soo-Young Lee (team kAIb)
The Conversational Intelligence Challenge section on NIPS 2017 Competition Track Workshop (ConvAI, NIPS), 2017.
Ranked 3rd | pdf | code | poster

[C3] Compositional Sentence Representation from Character within Large Context Text

Geonmin Kim, Hwaran Lee, Bo-Kyeong Kim, and Soo-Young Lee
International Conference on Neural Information Processing (ICONIP), 2017.
Oral | pdf

2016

[J3] Deep CNNs Along the Time Axis With Intermap Pooling for Robustness to Spectral Variations

Hwaran Lee, Geonmin Kim, Ho-Gyeong Kim, Sang-Hoon Oh, and Soo-Young Lee
IEEE Signal Processing Letters (SPL), 2016.
pdf | code | demo

2015

[C2] Hierarchical committee of deep CNNs with exponentially-weighted decision fusion for static facial expression recognition

Bo-Kyeong Kim, Hwaran Lee, Jihyeon Roh, and Soo-Young Lee
International Conference on Multimodal Interaction (ICMI), 2015.
Challenge Winner | Oral | demo

[C1] Active Learning for Large-scale Object Classification: from Exploration to Exploitation

Ho-Gyeong Kim, Jihyeon Roh, Hwaran Lee, Geonmin Kim, and Soo-Young Lee
In Proceedings of the 3rd International Conference on Human-Agent Interaction (HAI), 2015.
Best paper | Qualcomm Innovation Awards

[W2] Learning Tonotopically Organized Auditory Feature-map from Speech by an Intermap Pooling Layer in a Deep CNN

Hwaran Lee, Geonmin Kim, Jihyeon Roh, and Soo-Young Lee
China-Japan-Korea Joint Workshop on Neurobiology and Neuroinformatics (NBNI), 2015. (only abstract)
pdf

[W1] Spoken Sentence Embedding from Character by Jointly Learning Character-level Compositional Word Model and RNN Sentence Encoder

Geonmin Kim, Hwaran Lee, Jaemyung Yu, and Soo-Young Lee
China-Japan-Korea Joint Workshop on Neurobiology and Neuroinformatics (NBNI), 2015. (only abstract)
pdf

2013

[J2] A Calibration Method for Eye-Gaze Estimation Systems Based on 3D Geometrical Optics

Hwaran Lee, Nadeem Iqbal, Wonil Chang, and Soo-Young Lee
IEEE Sensors Journal (Sensors), 2013.
pdf

[J1] Smart User Interface for Mobile Consumer Devices Using Model-Based Eye-Gaze Estimation

Nadeem Iqbal, Hwaran Lee, and Soo-Young Lee
IEEE Transactions on Consumer Electronics (TCE), 2013.
pdf
