The LAS Human-Machine Collaboration team aims to bring closer a world in which computers help analysts without having to be explicitly told what to do. Themes of 2020 HMC research include instrumentation of and intervention into analyst workflows, applications of augmented and virtual reality, and efforts to understand and support analyst searches for tradecraft documentation.
Team Leads: Kirk Mancinik, Jascha Swisher, Ken Thompson
Throughout the last decade, researchers have shown that the effectiveness of a visualization tool depends on the experience, personality, and cognitive abilities of the user. This work has also demonstrated that these individual traits can have significant implications for tools that support reasoning and decision-making with data. In this multi-year collaboration with LAS, we set an agenda for transforming the one-size-fits-all methodology for visualization design. Our work seeks to broaden access to visualization tools by systematically investigating how individual traits modulate effectiveness, and by developing design guidelines for tools that are tailored to the user.
Participants: R. Jordan Crouser (Smith College), Alvitta Ottley (WUSTL), Zhengliang Liu (WUSTL), Kendra Swanson (Smith College), Ananda Montoly (Smith College)
As data becomes more voluminous and complex, queries have also become more complex. Query complexity can increase the chance of error or make data less accessible. Part of the Human-Machine Collaboration team's work in 2020 is to reduce query complexity, which can give analysts more time to focus on other essential activities. We therefore frame this challenge as a text-to-query generation task in zero-shot (unseen data) and few-shot (limited-access data) settings. We propose a transformer-based model that generates SQL queries in the same way it would generate natural language sequences. Our results show that the proposed model outperforms strong baselines in both zero-shot and few-shot settings. In addition, we diagnose error types and identify potential future directions.
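To illustrate what treating SQL generation as sequence generation can look like, here is a minimal sketch of preparing one training pair. The summary above does not describe the model's input format, so the serialization scheme, the `Table` class, the `serialize_example` function, and the example schema are all hypothetical. The idea is that a question and a database schema are flattened into one source string, and the SQL query is simply the target string a sequence-to-sequence transformer would be trained to emit.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical schema description: a table name and its column names."""
    name: str
    columns: list[str]


def serialize_example(question: str, tables: list[Table]) -> str:
    """Flatten a question and a schema into a single input sequence.

    The 'question: ... schema: ...' layout is an assumption made for
    illustration, not the format used by the project.
    """
    schema = " | ".join(f"{t.name} : {', '.join(t.columns)}" for t in tables)
    return f"question: {question} schema: {schema}"


# One (source, target) pair; the target is ordinary text that happens to be SQL.
tables = [Table("reports", ["id", "author", "year"])]
source = serialize_example("How many reports were written in 2020?", tables)
target = "SELECT COUNT(*) FROM reports WHERE year = 2020"
```

Because the schema travels with the question in the input, the same serialization can in principle be applied to a database the model never saw during training, which is what the zero-shot setting requires.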
Participants: Zhen Guo (NCSU), Ruijie Xi (NCSU), Munindar Singh (NCSU)
Language analysts do much more than translate. They must provide nuance and context to communications, which often requires significant on-the-job experience and training, extensive research, or both. Much of this target knowledge resides in the minds of our senior language analysts, and is recorded as comments in transcripts, documented in scan notes, or is inherent in the transcripts themselves. The TurboPotato project attempts to automatically extract and organize this information as a structured knowledge graph. As part of exploring how best to extract this information from dialogue, the TurboPotato project is attempting to build a "Seinfeld Knowledge Graph" (SKnoG). Extracting information from scripts of the Seinfeld television show is a surprisingly useful proxy problem, as characters on the show frequently refer to others with pseudonyms such as “K-Man” or “Man Hands,” or otherwise coin terms that outsiders might not understand.
Participants: Alexis Sparko (LAS)
This video summarizes the contributions of the 2020 "Explainable Interventions for Analyst Workflow" project, conducted at the University of Kentucky. In this research, we collected detailed logs of how analysts use tools to investigate a set of documents. These logs are annotated with the analysts' intentions and insights as they work. From these logs, we build a workflow graph to model how individual analysts, and analysts in general, explore the documents, use their tools, and draw conclusions. This graph-based model of workflow will enable intelligent interventions.
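One simple way to picture a workflow graph built from logs is as a set of weighted transitions between actions. The sketch below is only an illustration under assumed conditions: the log format (time-ordered `(analyst, action)` pairs), the action names, and the `build_workflow_graph` function are hypothetical, and the project's actual graph model is not described in this summary.

```python
from collections import Counter, defaultdict


def build_workflow_graph(log):
    """Count transitions between consecutive actions of the same analyst.

    `log` is a time-ordered list of (analyst, action) pairs. Returns one
    edge counter per analyst plus a combined counter over all analysts.
    """
    per_analyst = defaultdict(Counter)
    last_action = {}
    for analyst, action in log:
        if analyst in last_action:
            # Edge from the analyst's previous action to the current one.
            per_analyst[analyst][(last_action[analyst], action)] += 1
        last_action[analyst] = action

    combined = Counter()
    for edges in per_analyst.values():
        combined.update(edges)
    return dict(per_analyst), combined


# Hypothetical log with two analysts whose events interleave in time.
log = [
    ("a1", "search"), ("a1", "open_doc"), ("a2", "search"),
    ("a1", "take_note"), ("a2", "open_doc"), ("a2", "open_doc"),
]
per_analyst, combined = build_workflow_graph(log)
```

Keeping a counter per analyst alongside the combined counter mirrors the distinction between modeling individual analysts and analysts in general.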
Participants: Brent Harrison (University of Kentucky), Stephen Ware (University of Kentucky)
Intelligence analysis is becoming more challenging as data grows, and machine learning automation may help. However, interacting with current machine learning systems is tedious. In this project, we are building an interactive Augmented Reality front-end interface to help analysts communicate with back-end machine learning assistants.
Participants: Teresa Hong (NCSU), Benjamin Watson (NCSU), Ken Thompson (LAS), Jascha Swisher (LAS), Kirk Mancinik (LAS), Sarah Margaret Tullos (LAS), Courtney Sheldon (LAS), Paul Davis (LAS)
In intelligence analysis, sensemaking of large amounts of textual information is a cognitively demanding task. Our prior work has shown the value of large 2D high-resolution-display spaces in supporting sensemaking, by providing a “Space to Think” in which analysts can triage and organize information, externalize their thought process and workflow, and synthesize hypotheses. Recent advances in interactive, immersive 3D display technologies offer new opportunities for scaling up the space to think by exploiting depth, orientation, and physical navigation through the space. We propose to investigate how analysts can use 3D immersive spaces for sensemaking tasks in intelligence analysis scenarios. We will investigate both virtual reality (VR) and augmented reality (AR) settings. To explore these issues, we will develop an Immersive Space to Think (IST) system, which will allow users to interact with a set of documents in head-worn VR/AR displays. We will test and refine the system through usability studies in which participants will complete an intelligence analysis task using IST. We hypothesize that IST will provide a more expressive and expansive space to think during analytic synthesis, exploit embodied cognitive capabilities that improve analytic outcomes and efficiency, and support rich semantic interactions that enable future augmentation with machine learning.
Participants: Kylie Davidson (Virginia Tech), Doug Bowman (Virginia Tech), Chris North (Virginia Tech)
It’s not uncommon for an analyst to have a task that they do not immediately know how to accomplish. In 2020, the University of North Carolina research team worked to better understand how Department of Defense analysts and others search for such procedural knowledge, through detailed surveys of current users of a tradecraft knowledge repository, and a lab study where participants searched online to determine how best to accomplish a set of provided tasks.
Participants: Jamie Arguello (UNC), Rob Capra (UNC), Bogeum Choi (UNC), Sarah Casteel (UNC)
In many real-world problems, analysts deal with information found in large and heterogeneous data sources. For instance, in social science, these sources include structured census data, unstructured social media posts, and large-scale social networks. In cybersecurity, analysts look at structured device logs and their visualizations on a dashboard, vulnerability assessment reports, as well as news about security breaches. In the healthcare domain, relevant data sources include electronic health records, biomedical literature, patient surveys, public health surveillance data, and so on. Traditionally, different data are managed by different tools because the underlying techniques for data search and retrieval can be quite different. This separation of tools for structured and unstructured data management presents a big challenge when a data analyst tries to explore and synthesize information across data sources. The analyst has to take note of her current focus in one data source and manually transfer it to another. This project aims to computationally model a user’s ever-changing analytic focus during exploratory visual analysis, inferred from user interactions. This requires a unified representation of the analytic focus. Our approach was prototyped within a pre-existing visual analysis system in the medical domain that presents visualizations of structured electronic medical record data. The prototype implementation uses this focus model to capture all user interactions and continuously search PubMed abstracts for documents relevant to the user’s focus. These relevant abstracts are displayed to users to contextualize the current visualization and to suggest new opportunities for future exploratory analysis. Evaluation results with 24 users show that the modeling approach has high levels of accuracy and is able to surface highly relevant medical abstracts.
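As a rough illustration of inferring a changing focus from interactions and using it to retrieve documents, the sketch below keeps a weight per concept, decays old weights on each new interaction, and ranks texts by how strongly they match the current weights. Everything here is an assumption made for illustration: the summary does not specify the focus representation, and the decay factor, the `update_focus` and `rank_documents` functions, the keyword matching, and the sample titles are hypothetical. The actual prototype searches PubMed abstracts rather than an in-memory list.

```python
def update_focus(focus, concepts, decay=0.5):
    """Return a new focus: older concepts fade, interacted concepts gain weight."""
    updated = {concept: weight * decay for concept, weight in focus.items()}
    for concept in concepts:
        updated[concept] = updated.get(concept, 0.0) + 1.0
    return updated


def rank_documents(focus, documents):
    """Order documents by the total focus weight of the concepts they mention."""
    def score(text):
        words = set(text.lower().split())
        return sum(weight for concept, weight in focus.items() if concept in words)

    return sorted(documents, key=score, reverse=True)


# Two hypothetical interactions: the user first selects a diabetes cohort,
# then brushes a chart of insulin prescriptions.
focus = update_focus({}, ["diabetes"])
focus = update_focus(focus, ["insulin"])

documents = [
    "Statins and cardiac risk",
    "Insulin dosing in type 2 diabetes",
    "Diabetes prevalence survey",
]
ranked = rank_documents(focus, documents)
```

The decay step is what makes the focus "ever-changing" in this sketch: recent interactions dominate the ranking, while earlier ones still contribute at reduced weight.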
Participants: Zhilan Zhou (UNC), Ximing Wen (UNC), Yue Wang (UNC), David Gotz (UNC)