Hyeonsu B. Kang
[News!] I am seeking Research Scientist or Tenure-Track Assistant Professor roles.
I specialize in novel Human-AI interactive systems to enhance human creativity in practical applications.
Please find details about my research below.
[News!] Nouran's work on meronymous interaction has won the Best Paper Award at CHI 2024!
[News!] Nouran's work on meronymous interaction has been featured on MIT News!
Industry Bio |
Academic Bio
I am a Ph.D. candidate in computer science at Carnegie Mellon University, specializing in human-computer interaction and natural language processing. My research focuses on the design and development of large language model (LLM)-based algorithms and interactive systems to address practical problems, such as helping people discover and make sense of relevant knowledge from large repositories like scientific literature, and finding novel analogical inspirations to enhance design creativity.
In my work, I build and fine-tune ML models and design and develop fully functional interactive systems for end-users.
My work has been recognized with a Google Cloud Research Innovator (2021) award and a Best Paper Award at ACM SIGCHI (2024).
With 8+ years of experience in full-stack development and more than 3 years in machine learning and NLP, I bring a robust technical skill set.
My history of human-centered system-building research and cross-functional collaboration in academic and industry settings, such as with MIT, Conservation X (a conservation-focused non-profit), the Allen Institute for AI, and the Toyota Research Institute, enables me to effectively drive and deploy AI-powered features and systems.
I am eager to contribute my expertise to cutting-edge projects and enhance real-world applications in any innovative and dynamic environment.
One notable project is BioSpark, an LLM-based, end-to-end system I built for generating analogical bio-inspirations from diverse species at scale, and engaging designers in transferring the inspirations into novel designs.
BioSpark iteratively prompts LLMs using structured knowledge, starting from a small set of high-quality seed inspirations available online and continuously constructing a tree-of-life structure and identifying sparse branches on it for further generation.
Beyond generating bio-inspirations, BioSpark assists designers in applying these ideas to specific domains. It does so implicitly by clustering similar mechanisms by their 'active ingredients' to help users abstract schemas, and explicitly by providing 'sparks' that map the inspiration to the domain, highlighting trade-offs, and offering a chat interface for contextualizing the inspiration. A demo video of BioSpark can be found here.
In another project, I built Synergi, a mixed-initiative tool that introduces a new form of augmented reading.
When users highlight an interesting passage in a paper, Synergi provides synthesized knowledge from related works by searching and summarizing the content of relevant papers. Synergi begins by searching for relevant papers by traversing neighborhoods within the citation graph around papers cited in the user-highlighted section.
This traversal is guided by both citation-based signals (prioritizing frequently cited papers) and content similarity-based signals (prioritizing papers with related content). Synergi retrieves the top 20 relevant paper PDFs for further hierarchical summarization, where it employs an iterative algorithm for hierarchical clustering of parsed paper content and retrieval-augmented generation at each hierarchy node. A demo video of Synergi is available here.
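The two traversal signals described above can be illustrated with a minimal sketch. It is not Synergi's actual implementation: the function name `rank_neighbors`, the hop limit, the log-scaled citation score, and the mixing weight `alpha` are all assumptions made for illustration, and the embeddings are toy vectors standing in for real content representations.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_neighbors(graph, seeds, citation_counts, embeddings, query_vec,
                   max_hops=2, top_k=20, alpha=0.5):
    """Collect papers within max_hops of the seed papers, then rank them.

    The score mixes a citation-based signal and a content-similarity signal.
    The weighting scheme (alpha) is a hypothetical choice, not the published one.
    """
    visited = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    candidates = set()
    # Breadth-first traversal of the citation graph around the seed papers.
    while frontier:
        paper, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbor in graph.get(paper, ()):
            if neighbor not in visited:
                visited.add(neighbor)
                candidates.add(neighbor)
                frontier.append((neighbor, depth + 1))
    if not candidates:
        return []

    # Normalize log-scaled citation counts to [0, 1]; fall back to 1.0 if all are zero.
    max_log = max(math.log1p(citation_counts.get(p, 0)) for p in candidates) or 1.0

    def score(paper):
        cite_signal = math.log1p(citation_counts.get(paper, 0)) / max_log
        content_signal = cosine(embeddings[paper], query_vec)
        return alpha * cite_signal + (1 - alpha) * content_signal

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

With `top_k=20`, the returned list corresponds to the set of papers passed on to summarization; the clustering and retrieval-augmented generation steps are outside the scope of this sketch.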
In yet another project, I developed an analogical search engine for increasing scientific creativity.
By fine-tuning a sequence-to-sequence model on a dataset of over 2,000 annotated paper abstracts, I created a real-time search engine where users can enter natural language queries to find new insights from a corpus of 2M+ papers. Unlike conventional search engines, this analogical search engine aims to maximize the discovery of analogical scientific insights, defined as 'diverse mechanisms applied to similar problems,' across different scientific fields. In a controlled laboratory study, I found that matching high-level problem descriptions (e.g., 'transferring heat') with different low-level details (e.g., macro vs. nano scales) helps open new design spaces for generative explorations (e.g., using nano-scale fins in semiconductor circuits to harness energy from phonon transport).
Hyeonsu Kang is a CS Ph.D. candidate at Carnegie Mellon University, advised by Niki Kittur and affiliated with the Human-Computer Interaction Institute.
His research in human-computer interaction and natural language processing is on reimagining interaction paradigms by creating novel systems for synthesis and ideation, enhancing cognitive creativity and efficiency with AI.
He focuses on designing and implementing innovative interactive systems in the real world and computational methods for empowering people to think outside the box when approaching a challenge [TOCHI'22, CHI'22, NAACL'22, NeurIPS'23, AAAI'24], helping them effectively discover relevant prior knowledge and synthesize insights from it [UIST'23, CHI'23, UIST'22, CHI'22, CHI'24], and facilitating social learning and idea development through augmented feedback and expertise exchange with peers and domain experts [CHI'18, UIST'17, Collective Intelligence'19, CHI'24 🏆].
In his work, he draws from cognitive theories to examine how people use higher-order cognition to transfer ideas from one domain to another.
He also develops new interaction and natural language processing techniques to computationally augment the process of analogical transfer and insight generation.
His research endeavors have fostered collaborations with academic institutions like MIT, the University of Maryland, the University of Washington, and KAIST, and industry partners such as Conservation X (a conservation-focused non-profit), the Allen Institute for Artificial Intelligence, and Toyota Research Institute.
He has published papers in premier NLP and HCI conferences and journals such as ACM CHI, UIST, TOCHI, AAAI, NAACL, and NeurIPS, including a best paper award at CHI 2024.
His work has been applied in practical scenarios, such as the allocation of nearly 2M in prize money for conservation innovation contests, in collaboration with Conservation X, and at Semantic Scholar.
As part of his research dissemination, he has presented at several conferences and delivered guest lectures at the Allen Institute for AI.
Hyeonsu's work garnered him recognition as a Google Cloud Research Innovator (2021).
His research has been funded by the National Science Foundation, the Allen Institute for Artificial Intelligence, the Office of Naval Research, Toyota Research Institute, and Google Cloud.
He was previously supported by the South Korean National Scholarship for Science and Engineering.
He received his BS in Computer Science and Engineering from Seoul National University.
He also worked and interned at MIT, the Allen Institute for AI, UC San Diego, and Tableau Software.
CHI 2024
Yoonjoo Lee, Hyeonsu B. Kang, Matthew Latzke, Juho Kim, Jonathan Bragg, Joseph Chee Chang, Pao Siangliulue
With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to previously collected papers. However, researchers sometimes struggle to make sense of nuanced connections between recommended papers and their own research context, as existing systems only present paper titles and abstracts. To help researchers spot these connections, we present PaperWeaver, an enriched paper alerts system that provides contextualized text descriptions of recommended papers based on user-collected papers. PaperWeaver employs a computational method based on Large Language Models (LLMs) to infer users' research interests from their collected papers, extract context-specific aspects of papers, and compare recommended and collected papers on these aspects. Our user study (N=15) showed that participants using PaperWeaver were able to better understand the relevance of recommended papers and triage them more confidently when compared to a baseline that presented the related work sections from recommended papers.
TOCHI 2022
Hyeonsu B. Kang, Xin Qian, Tom Hope, Dafna Shahaf, Joel Chan, and Aniket Kittur
Analogies have been central to creative problem-solving throughout the history of science and technology. As the number of scientific papers continues to increase exponentially, there is a growing opportunity for finding diverse solutions to existing problems. However, realizing this potential requires the development of a means for searching through a large corpus that goes beyond surface matches and simple keywords. Here we contribute the first end-to-end system for analogical search on scientific papers and evaluate its effectiveness with scientists' own problems. Using a human-in-the-loop AI system as a probe, we find that our system facilitates creative ideation, and that ideation success is mediated by an intermediate level of matching on the problem abstraction (i.e., high versus low). We also demonstrate a fully automated AI search engine that achieves accuracy similar to that of the human-in-the-loop system. We conclude with design implications for enabling automated analogical inspiration engines to accelerate scientific innovation.
CACM 2024
Kyle Lo, Joseph C. Chang, Andrew Head et al. (including Hyeonsu B. Kang)
Communications of the ACM (forthcoming, 2024)
[PDF · BibTeX]
Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support scholars grows. In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades. For instance, the PDF format for sharing papers remains widely used due to its portability but has significant downsides, inter alia, static content and poor accessibility for low-vision readers. This paper explores the question "Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces—even for legacy PDFs?" We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers. Through this project, we've developed many novel prototype interfaces and evaluated them with user study participants and real-world users to show improved reading experiences for scholars. We've also released a production research paper reading interface that will incorporate novel features as they mature. We structure this paper around five key opportunities for AI assistance in scholarly reading -- discovery, efficiency, comprehension, synthesis, and accessibility -- and present an overview of our progress and remaining open challenges.
UIST 2022
Hyeonsu B. Kang, Joseph Chee Chang, Yongsung Kim, Aniket Kittur
[PDF · ACM DL · BibTeX · ACM Ref · EndNote]
Reviewing the literature to understand relevant threads of past work is a critical part of research and a vehicle for learning. However, as the scientific literature grows, the challenges for users to find and make sense of the many different threads of research grow as well. Previous work has helped scholars to find and group papers with citation information or textual similarity using standalone tools or overview visualizations. Instead, in this work we explore a tool integrated into users' reading process that helps them leverage authors' existing summarization of threads, typically in introduction or related work sections, in order to situate their own work's contributions. To explore this, we developed a prototype that supports efficient extraction and organization of threads along with supporting evidence as scientists read research articles. The system then recommends further relevant articles based on user-created threads. We evaluate the system in a lab study and find that it helps scientists to follow and curate research threads without breaking out of their flow of reading, collect relevant papers and clips, and discover interesting new articles to further grow threads.
CHI 2022
Hyeonsu B. Kang, Rafal Kocielnik, Andrew Head, Jiangjiang Yang, Matt Latzke, Aniket Kittur, Daniel Weld, Doug Downey, and Jonathan Bragg
[PDF · ACM DL · BibTeX · ACM Ref · EndNote]
Finding and engaging with the relevant scientific knowledge is foundational for intellectual progress in a society. Yet, with an exponential growth in publication rates, this becomes a challenging task. While personalized recommendations can help, they still may lack explanations of how certain papers are relevant and thus should be prioritized or attended to. To combat this, we developed one citation-based approach and two social relation-based approaches to boost user engagement with scholarly paper recommendations. For users who opted in, these approaches augmented paper recommendations included in email alerts with textual relevance descriptions underneath the recommendations. We evaluated our approaches in a randomized field experiment that ran for over two months with 7,000+ users, and also in a controlled lab study (N=14) for deeper qualitative insights. We report on our findings and implications for the design of future approaches that aim to augment scholarly recommendations.
CHI 2022
Tom Hope, Ronen Tamari, Hyeonsu Kang, Daniel Hershcovich, Joel Chan, Aniket Kittur, and Dafna Shahaf
[PDF · ACM DL · BibTeX · ACM Ref · EndNote]
We explore a novel representation for automatically breaking up product ideas described in natural language into fine-grained functional aspects. This representation can capture the core purposes and mechanisms in ideas, and support the backbone interactions (e.g., functional search of ideas, mapping and exploration of the design space around a focal problem) for augmenting human intelligence and accelerating the rate of innovation.
CHI 2018
Hyeonsu B. Kang, Gabriel Amoako, Neil Sengupta, Steven Dow
[PDF · ACM DL · BibTeX · ACM Ref · EndNote]
“A picture is worth a thousand words.” We developed Paragon, a system that supports crowdworkers and peers during feedback exchange by enabling search of design examples that supplement the written feedback. In two lab studies, we found that i) feedback providers select poster examples that complement their feedback and align with a provided rubric and that ii) feedback providers give significantly more specific, actionable, and novel input when using an example-centric approach, as opposed to text alone.
NeurIPS 2023
Hyeonsu B. Kang, David Chuan-En Lin, Nikolas Martelaro, Aniket Kittur, Yan-Ying Chen, Matthew K. Hong
Nature is often used to inspire solutions for complex engineering problems, but achieving its full potential is challenging due to difficulties in discovering relevant analogies and synthesizing from them.
Here, we present an end-to-end system, BioSpark, that generates biological-analogical mechanisms and provides an interactive interface to comprehend and synthesize from them.
The BioSpark pipeline starts with a small seed set of mechanisms and expands it using iteratively constructed taxonomic hierarchies, overcoming data sparsity in manual expert curation and limited conceptual diversity in automated analogy generation via LLMs.
The interface helps designers with recognizing and understanding relevant analogs to design problems using four main interaction features.
We evaluate the biological-analogical mechanism generation pipeline and showcase the value of BioSpark through case studies.
We end with discussion and implications for future work in this area.
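As a rough illustration of how sparse regions of a taxonomic hierarchy might be located before further generation, consider the sketch below. It is an assumption-laden toy, not the BioSpark pipeline: the lineage tuples, the function name `sparse_branches`, and the `min_count` threshold are hypothetical, and only branches that already contain at least one species are considered.

```python
from collections import Counter


def sparse_branches(lineages, depth, min_count):
    """Return taxonomic branches (prefixes of length `depth`) with fewer than
    `min_count` species, as candidate targets for further generation.

    Each lineage is a tuple of taxon names from the root downward.
    The threshold rule is an illustrative assumption.
    """
    counts = Counter(
        tuple(lineage[:depth]) for lineage in lineages if len(lineage) >= depth
    )
    return sorted(branch for branch, n in counts.items() if n < min_count)


# Toy data: five species described by kingdom, phylum, and class.
lineages = [
    ("Animalia", "Chordata", "Aves"),
    ("Animalia", "Chordata", "Mammalia"),
    ("Animalia", "Chordata", "Aves"),
    ("Animalia", "Arthropoda", "Insecta"),
    ("Plantae", "Tracheophyta", "Magnoliopsida"),
]
under_covered = sparse_branches(lineages, depth=2, min_count=2)
```

In this toy data, the phylum-level branches with a single species are flagged, which is where a generation step would be directed next under these assumptions.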
NAACL 2022
Hyeonsu B. Kang*, Sheshera Mysore*, Kevin Huang*, Haw-Shiuan Chang, Thorben Prein, Andrew McCallum, Aniket Kittur, Elsa Olivetti
NAACL 2022 Workshop
[PDF · BibTeX]
Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas. While improved performance in scholarly search engines can help scientists efficiently identify relevant advances in domains they may already be familiar with, it may fall short of helping them explore diverse ideas outside such domains. In this paper we explore the design of systems aimed at augmenting the end-user ability in cross-domain exploration with flexible query specification. To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that have a high similarity to the user-selected core aspect but differ in terms of domains. Furthermore, end-users can 'zoom in' to specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters. Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration.
CI 2019
Matching Open Innovation Projects for Analogical Feedback Exchange
Hyeonsu Kang, Felicia Ng, Aniket Kittur
Collective Intelligence 2019
We developed an algorithm for matching teams in open innovation contests that tackle related conservation challenges using diverse approaches, thereby encouraging the transfer of analogical inspirations between teams. To this end, our algorithm used pre-trained language models to encode the natural language descriptions of team challenges and their solution approaches into a vector space, then computed semantic similarity between them to systematically find teams tackling similar problems with diverse approaches, a combination shown to be conducive to analogical transfer.
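The matching idea can be sketched as follows, assuming each team already has one embedding for its problem description and one for its solution approach. The scoring formula (problem similarity multiplied by approach dissimilarity), the function names, and the toy vectors are illustrative assumptions, not the published algorithm; in practice the vectors would come from a pre-trained language model.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_score(team_a, team_b):
    """High when problems are similar and approaches differ (hypothetical formula)."""
    problem_sim = cosine(team_a["problem"], team_b["problem"])
    approach_sim = cosine(team_a["approach"], team_b["approach"])
    return problem_sim * (1.0 - approach_sim)


def rank_pairs(teams):
    """Score every pair of teams and return (score, name_a, name_b), best first."""
    pairs = [
        (match_score(teams[a], teams[b]), a, b)
        for a, b in combinations(sorted(teams), 2)
    ]
    return sorted(pairs, reverse=True)


# Toy embeddings standing in for language-model encodings.
teams = {
    "X": {"problem": [1.0, 0.0], "approach": [1.0, 0.0]},
    "Y": {"problem": [1.0, 0.0], "approach": [0.0, 1.0]},
    "Z": {"problem": [0.0, 1.0], "approach": [0.0, 1.0]},
}
ranked = rank_pairs(teams)
```

Here teams X and Y share a problem but use different approaches, so their pair ranks first; a multiplicative score is only one option, and a weighted difference of the two similarities would serve the same purpose.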