Research Showcase Agenda
Tuesday, November 15, 2022
|10:00 - 10:30am||Registration and coffee
Singh Gallery (4th floor Gates Center)
|10:30 - 11:10am||Welcome and Overview by Ed Lazowska and Magda Balazinska + various faculty on research areas
Zillow Commons (4th floor Gates Center)
11:15am - 12:20pm
|Networking & Systems
Gates Center, Room 271
Gates Center, Room 371
Gates Center, Zillow Commons
|12:25 - 1:25pm||Lunch + Keynote Talk:
David vs. Goliath: the Art of Leaderboarding in the Era of Extreme-Scale Neural Models, Yejin Choi, Paul G. Allen School of Computer Science & Engineering
Microsoft Atrium in the Allen Center
1:30 - 2:35pm
Gates Center, Room 271
|Computing for the Environment
Gates Center, Room 371
|Security & Privacy
Gates Center, Zillow Commons
2:40 - 3:45pm
Gates Center, Room 271
Gates Center, Room 371
|Computing for Society
Gates Center, Zillow Commons
3:50 - 4:55pm
Gates Center, Room 271
|Novel Input & Interaction
Gates Center, Room 371
Gates Center, Zillow Commons
|5:00 - 7:00pm||Open House: Reception + Poster Session
Microsoft Atrium in the Allen Center
|7:15 - 7:45pm||Program: Madrona Prize, People's Choice Awards
Microsoft Atrium in the Allen Center
- 11:15-11:20: Introduction and Overview, Simon Peter
- 11:20-11:35: FlexTOE: Flexible TCP Offload with Fine-Grained Parallelism, Rajath Shashidhara
FlexTOE is a flexible, yet high-performance TCP offload engine (TOE) to SmartNICs. FlexTOE eliminates almost all host data-path TCP processing and is fully customizable. FlexTOE interoperates well with other TCP stacks, is robust under adverse network conditions, and supports POSIX sockets.
FlexTOE focuses on data-path offload of established connections, avoiding complex control logic and packet buffering in the NIC. FlexTOE leverages fine-grained parallelization of the TCP data-path and segment reordering for high performance on wimpy SmartNIC architectures, while remaining flexible via a modular design. We compare FlexTOE on an Agilio-CX40 to host TCP stacks Linux and TAS, and to the Chelsio Terminator TOE. We find that Memcached scales up to 38% better on FlexTOE versus TAS, while saving up to 81% host CPU cycles versus Chelsio. FlexTOE provides competitive performance for RPCs, even with wimpy SmartNICs. FlexTOE cuts 99.99th-percentile RPC RTT by 3.2× and 50% versus Chelsio and TAS, respectively. FlexTOE's data-path parallelism generalizes across hardware architectures, improving single connection RPC throughput up to 2.4× on x86 and 4× on BlueField. FlexTOE supports C and XDP programs written in eBPF. It allows us to implement popular data center transport features, such as TCP tracing, packet filtering and capture, VLAN stripping, flow classification, firewalling, and connection splicing.
- 11:35-11:50: Gimbal: enabling multi-tenant storage disaggregation on SmartNIC JBOFs, Jaehong Min
Emerging SmartNIC-based disaggregated NVMe storage has become a promising storage infrastructure due to its competitive IO performance and low cost. These SmartNIC JBOFs are shared among multiple co-resident applications, and there is a need for the platform to ensure fairness, QoS, and high utilization. Unfortunately, given the limited computing capability of the SmartNICs and the non-deterministic nature of NVMe drives, it is challenging to provide such support on today's SmartNIC JBOFs.
This talk presents Gimbal, a software storage switch that orchestrates IO traffic between Ethernet ports and NVMe drives for co-located tenants. It enables efficient multi-tenancy on SmartNIC JBOFs using the following techniques: a delay-based SSD congestion control algorithm, dynamic estimation of SSD write costs, a fair scheduler that operates at the granularity of a virtual slot, and an end-to-end credit-based flow control channel. Our prototyped system not only achieves up to 6.6x better utilization and 62.6% less tail latency but also improves fairness for complex workloads. It also improves the performance of a commercial key-value store in a multi-tenant environment, with 1.7x better throughput and 35.0% less tail latency on average.
- 11:50-12:05: Towards an Architecture for Network Tail Latency SLOs, Kevin Zhao
To application programmers, network performance remains inconsistent, with a huge variance between best and worst case RPC responsiveness. Faster networks, which are thought to improve matters, are surprisingly likely to make the problem worse, and existing techniques are a poor match for application needs. In this talk we argue that we need a new approach that operates on network traffic as a whole, addressing workload observability and modeling, efficient real-time prediction of the effect of changes to workloads and control knobs, a control loop that operates on traffic classes and network aggregates rather than individual flows or paths, and enforcement mechanisms that are resilient to bursty traffic. We discuss incremental progress we’ve made towards this vision, including a highly parallel network simulator that can produce approximate tail latency predictions for large scale networks about three orders of magnitude faster than traditional simulators.
- 12:05-12:20: Xenic: SmartNIC-Accelerated Distributed Transactions, Henry Schuh
High-performance distributed transactions require efficient remote operations on database memory and protocol metadata. The high communication cost of this workload calls for hardware acceleration. Recent research has applied RDMA to this end, leveraging the network controller to manipulate host memory without consuming CPU cycles on the target server. However, the basic read/write RDMA primitives demand trade-offs in data structure and protocol design, limiting their benefits. SmartNICs are a flexible alternative for fast distributed transactions, adding programmable compute cores and on-board memory to the network interface. Applying measured performance characteristics, we design Xenic, a SmartNIC-optimized transaction processing system. Xenic applies an asynchronous, aggregated execution model to maximize network and core efficiency. Xenic's co-designed data store achieves low-overhead remote object accesses. Additionally, Xenic uses flexible, point-to-point communication patterns between SmartNICs to minimize transaction commit latency. We compare Xenic against prior RDMA- and RPC-based transaction systems with the TPC-C, Retwis, and Smallbank benchmarks. Our results for the three benchmarks show 2.42x, 2.07x, and 2.21x throughput improvement, 59%, 42%, and 22% latency reduction, while saving 2.3, 8.1, and 10.1 threads per server.
- 11:15-11:20: Introduction and Overview, Jon Froehlich
- 11:20-11:32: Anticipate and Adjust: Cultivating Access in Human-Centered Methods, Kelly Mack
Methods are fundamental to doing research and can directly impact who is included in scientific advances. Given accessibility research's increasing popularity and pervasive barriers to conducting and participating in research experienced by people with disabilities, it is critical to ask how methods are made accessible. Yet papers rarely describe their methods in detail. This talk reports on 17 interviews with accessibility experts about how they include both facilitators and participants with disabilities in popular user research methods. Our findings offer strategies for anticipating access needs while remaining flexible and responsive to unexpected access barriers. We emphasize the importance of considering accessibility at all stages of the research process, and contextualize access work in recent disability and accessibility literature. We explore how technology or processes could reflect a norm of accessibility. Finally, we discuss how various needs intersect and conflict and offer a practical structure for planning accessible research.
- 11:32-11:44: Quantifying Touch: New Metrics for Characterizing What Happens During a Touch, Judy Kong
Measures of human performance for touch-based systems have focused mainly on overall metrics like touch accuracy and target acquisition speed. But touches are not atomic—they unfold over time and space, especially for users with limited fine motor function, for whom it can be difficult to perform quick, accurate touches. To gain insight into what happens during a touch, we offer 15 target-agnostic touch metrics, most of which have not been mathematically formalized in the literature. They are touch direction, variability, drift, duration, extent, absolute/signed area change, area variability, area deviation, area extent, absolute/signed angle change, angle variability, angle deviation, and angle extent. These metrics regard a touch as a time series of ovals instead of a mere (x, y) coordinate. We provide mathematical definitions and visual depictions of our metrics, and consider policies for calculating our metrics when multiple fingers perform coincident touches. To exercise our metrics, we collected touch data from 27 participants, 15 of whom reported having limited fine motor function. Our results show that our metrics effectively characterize touch behaviors including fine-motor challenges. Our metrics can be useful for both understanding users and for evaluating touch-based systems to inform their design.
- 11:44-11:56: A Large-Scale Longitudinal Analysis of Missing Label Accessibility Failures in Android Apps, Mingyuan Zhong
We present the first large-scale longitudinal analysis of missing label accessibility failures in Android apps. We developed a crawler and collected monthly snapshots of 312 apps over 16 months. We use this unique dataset in empirical examinations of accessibility not possible in prior datasets. Key large-scale findings include missing label failures in 55.6% of unique image-based elements, longitudinal improvement in ImageButton elements but not in more prevalent ImageView elements, that 8.8% of unique screens are unreachable without navigating at least one missing label failure, that app failure rate does not improve with number of downloads, and that effective labeling is neither limited to nor guaranteed by large software organizations. We then examine longitudinal data in individual apps, presenting illustrative examples of accessibility impacts of systematic improvements, incomplete improvements, interface redesigns, and accessibility regressions. We discuss these findings and potential opportunities for tools and practices to improve label-based accessibility.
- 11:56-12:08: CodeWalk: Facilitating Shared Awareness in Mixed-Ability Collaborative Software Development, Venkatesh Potluri
COVID-19 accelerated the trend toward remote software development, increasing the need for tightly-coupled synchronous collaboration. Existing tools and practices impose high coordination overhead on blind or visually impaired (BVI) developers, impeding their abilities to collaborate effectively, compromising their agency, and limiting their contribution. To make remote collaboration more accessible, we created CodeWalk, a set of features added to Microsoft’s Live Share VS Code extension, for synchronous code review and refactoring. We chose design criteria to ease the coordination burden felt by BVI developers by conveying sighted colleagues’ navigation and edit actions via sound effects and speech. We evaluated our design in a within-subjects experiment with 10 BVI developers. Our results show that CodeWalk streamlines the dialogue required to refer to shared workspace locations, enabling participants to spend more time contributing to coding tasks. This design offers a path towards enabling BVI and sighted developers to collaborate on more equal terms.
- 12:08-12:20: Towards Semi-Automatic Detection and Localization of Indoor Accessibility Issues using Mobile Depth Scanning and Computer Vision, Xia Su
To help improve the safety and accessibility of indoor spaces, researchers and health professionals have created assessment instruments that enable homeowners and trained experts to audit and improve homes. With advances in computer vision, augmented reality (AR), and mobile sensors, new approaches are now possible. We introduce RASSAR (Room Accessibility and Safety Scanning in Augmented Reality), a new proof-of-concept prototype for semi-automatically identifying, categorizing, and localizing indoor accessibility and safety issues using LiDAR + camera data, machine learning, and AR. We present an overview of the current RASSAR prototype and a preliminary evaluation in a single home.
- 11:15-11:20: Introduction and Overview, Magda Balazinska
- 11:20-11:35: Human-AI Collaboration Enables More Empathic Conversations in Text-based Peer-to-Peer Mental Health Support, Ashish Sharma
- 11:35-11:50: Optimizing Dataflow Systems for Scalable Interactive Visualization, Junran Yang
- 11:50-12:05: Data Management for Model Exploration and Debugging, Dong He
- 12:05-12:20: Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships, Eunice Jun
- 1:30-1:35: Introduction and Overview, Su-In Lee
- 1:35-1:50: Explainable AI: where we are and how to move forward for biology and healthcare, Su-In Lee
I will present some of the research done in the AIMS lab on explainable AI applied to the biomedical sciences. The main message is that explainable AI can help us make new biological discoveries from data or make informed clinical decisions, and can even open new research directions in biomedicine; however, it will need to evolve and improve before it can solve real-world problems in computational biology, medicine, and healthcare.
- 1:50-2:05: Flaws in the reasoning processes of clinical AI, Alex DeGrave + Soham Gadgil
As clinical AI devices gain regulatory approval and enter deployment, lack of knowledge about the reasoning processes of these devices heightens risk of unexpected failures and exposes patients to potential harm. In this talk, we recount how two types of medical imaging AI devices — for detection of COVID-19 in chest X-rays, and for detection of skin cancer — arrive at their predictions in problematic or unexpected ways. We highlight how tools from explainable AI act as a "computational magnifying glass" to more closely scrutinize medical AI devices and understand their reasoning processes. Finally, we preview our plans for future explainable AI tools to enable more direct, hypothesis-driven interrogation of high-stakes AI systems.
- 2:05-2:20: Accurate and Efficient Vision Transformer Explanations for Medical AI, Chanwoo Kim
Transformers have been increasingly adopted in medical image analysis, but understanding what drives their predictions remains a challenging problem. Current explanation approaches rely on attention values or input gradients, but these provide a limited understanding of a model's dependencies. Shapley values offer a theoretically sound alternative, but their high computational cost makes them impractical for large, high-dimensional models. In this work, we aim to make Shapley values practical for vision transformers (ViTs). Our experiments compare Shapley values to many baseline methods (e.g., attention rollout, GradCAM, LRP), and we find that our approach provides more accurate explanations than existing methods for ViTs.
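To illustrate why the abstract calls Shapley values computationally costly, here is a minimal sketch of the exact definition: each feature's value is its marginal contribution averaged over every ordering of the features. This is not the speakers' method (which aims to avoid this enumeration); the names `shapley_values` and `toy_model` and the three "patches" are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values by enumerating every ordering of the players.

    `value` maps a frozenset of players to a number. The cost grows
    factorially with the number of players, which is why exact
    computation is impractical for large, high-dimensional models.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p given the players already present.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_model(patches):
    """Hypothetical stand-in for a model's output given visible image patches."""
    score = 0.0
    if "a" in patches:
        score += 1.0
    if "b" in patches:
        score += 2.0
    if "a" in patches and "b" in patches:
        score += 1.0  # interaction term, split evenly between a and b
    return score


phi = shapley_values(["a", "b", "c"], toy_model)
print(phi)
```

With only three patches there are 6 orderings; a vision transformer with a few hundred patches makes the same enumeration infeasible, which is the gap the talk addresses.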
- 2:20-2:35: Contrastive Corpus Attribution for Explaining Representations, Chris Lin
Despite the widespread use of unsupervised models, very few methods are designed to explain them. Most explanation methods explain a scalar model output. However, unsupervised models output representation vectors, the elements of which are not good candidates to explain because they lack semantic meaning. To bridge this gap, recent works defined a scalar explanation output: a dot product-based similarity in the representation space to the sample being explained (i.e., an explicand). Although this enabled explanations of unsupervised models, the interpretation of this approach can still be opaque because similarity to the explicand's representation may not be meaningful to humans. To address this, we propose contrastive corpus similarity, a novel and semantically meaningful scalar explanation output based on a reference corpus and a contrasting foil set of samples. We demonstrate that contrastive corpus similarity is compatible with many post-hoc feature attribution methods to generate COntrastive COrpus Attributions (COCOA) and quantitatively verify that features important to the corpus are identified. We showcase the utility of COCOA in two ways: (i) we draw insights by explaining augmentations of the same image in a contrastive learning setting (SimCLR); and (ii) we perform zero-shot object localization by explaining the similarity of image representations to jointly learned text representations (CLIP).
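The idea of a scalar explanation target built from a corpus and a foil set can be sketched as follows. The abstract does not state the formula, so this sketch assumes cosine similarity and a simple difference of mean similarities; the function names and the two-dimensional representation vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_corpus_similarity(explicand, corpus, foil):
    """Mean similarity to the corpus minus mean similarity to the foil set.

    Assumed form for illustration: a positive score means the explicand's
    representation is closer to the corpus than to the foil samples.
    """
    corpus_sim = sum(cosine(explicand, c) for c in corpus) / len(corpus)
    foil_sim = sum(cosine(explicand, f) for f in foil) / len(foil)
    return corpus_sim - foil_sim


# Hypothetical representation vectors (a real encoder would produce these).
explicand = [1.0, 0.0]
corpus = [[1.0, 0.0], [1.0, 1.0]]
foil = [[0.0, 1.0], [-1.0, 0.0]]

score = contrastive_corpus_similarity(explicand, corpus, foil)
print(score)
```

Because the result is a single number, it could in principle be handed to any feature attribution method that explains a scalar output, which is the compatibility the abstract describes.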
- 1:30-1:35: Introduction and Overview, Vikram Iyer
- 1:35-1:55: Reconstructing Whaling Voyages to build Maps of Whale Densities by Species, Ameya Patil
- 1:55-2:15: Machine learning for building energy usage prediction during climate extremes, Joe Breda
Buildings make up nearly 70% of the total electricity and 40% of total energy consumption in most countries, with HVAC accounting for nearly half that. Currently, set-points are based on seasonal changes in outdoor temperature which depend on the consistency of the climate in a given location. On top of this, many variables that affect indoor temperature and thermal comfort are not included in set-point selection. We propose a distributed sensing system, including multi-point temperature sensing in buildings and wearable devices, to gain a fine-grained understanding of how temperature changes in buildings as a way to improve the efficiency of HVAC control systems.
- 2:15-2:35: Title forthcoming, TBD
- 1:30-1:35: Introduction and Overview, Franzi Roesner
- 1:35-1:55: Exploring User Reactions and Mental Models Towards Perceptual Manipulation Attacks in Mixed Reality, Kaiming Cheng
Perceptual Manipulation Attacks (PMA) involve manipulating users’ multi-sensory (e.g., visual, auditory, haptic) perceptions of the world through Mixed Reality (MR) content, in order to influence users’ judgments and subsequent actions. For example, an MR driving application that is expected to show safety-critical output might also (maliciously or unintentionally) overlay the wrong signal on a traffic sign, misleading the user into slamming on the brake. While current MR technology is sufficient to create such attacks, little research has been done to understand how users perceive, react to, and defend against such potential manipulations. To provide a foundation for understanding and addressing PMA in MR, we conducted an in-person study with 21 participants. We developed three PMA in which we focused on attacking three different perceptions: visual, auditory, and situational awareness. Our study first investigates how user reactions are affected by evaluating their performance on “microbenchmark” tasks under benchmark and different attack conditions. We observe both primary and secondary impacts from attacks, later impacting participants’ performance even under non-attack conditions. We follow up with interviews, surfacing a range of user reactions and interpretations of PMA. Through qualitative data analysis of our observations and interviews, we identify various defensive strategies participants developed, and we observe how these strategies sometimes backfire. We derive recommendations for future investigation and defensive directions based on our findings.
- 1:55-2:15: Electronic Monitoring Smartphone Apps: An Analysis of Risks from Technical, Human-Centered, and Legal Perspectives, Kentrell Owens
Electronic monitoring is the use of technology to track individuals accused or convicted of a crime (or civil violation) as an "alternative to incarceration." Traditionally, this technology has been in the form of ankle monitors, but recently federal, state, and local entities around the U.S. are shifting to using smartphone applications for electronic monitoring. These applications (apps) purport to make the monitoring simpler and more convenient for both the community supervisor and the person being monitored. However, due to the multipurpose nature of smartphones in people's lives and the amount of sensitive information (e.g., sensor data) smartphones make available, this introduces new risks to people coerced to use these apps.
To understand what type of privacy-related and other risks might be introduced to people who use these applications, we conducted a privacy-oriented analysis of 16 Android apps used for electronic monitoring. We analyzed the apps first technically, with static and (limited) dynamic analysis techniques. We also analyzed user reviews in the Google Play Store to understand the experiences of the people using these apps, and also the privacy policies. We found that apps contain numerous trackers, the permissions requested by them vary widely (with the most common one being location), and the reviews indicate that people find the apps invasive and frequently dysfunctional. We end the paper by encouraging mobile app marketplaces to reconsider their role in the future of electronic monitoring apps, and computer security and privacy researchers to consider their potential role in auditing carceral technologies. We hope that this work will lead to more transparency in this obfuscated ecosystem.
- 2:15-2:35: Anti-Privacy and Anti-Security Advice on TikTok: Case Studies of Technology-Enabled Surveillance and Control in Intimate Partner and Parent-Child Relationships, Miranda Wei
Modern technologies including smartphones, AirTags, and tracking apps enable surveillance and control in interpersonal relationships. In this work, we study videos posted on TikTok that give advice for how to surveil or control others through technology, focusing on two interpersonal contexts: intimate partner relationships and parent-child relationships. We collected 98 videos across both contexts and investigate (a) what types of surveillance or control techniques the videos describe, (b) what assets are being targeted, (c) the reasons that TikTok creators give for using these techniques, and (d) defensive techniques discussed. Additionally, we make observations about how social factors – including social acceptability, gender, and TikTok culture – are critical context for the existence of this anti-privacy and anti-security advice. We discuss the use of TikTok as a rich source of qualitative data for future studies and make recommendations for technology designers around interpersonal surveillance and control.
- 2:40-2:45: Introduction and Overview, Shwetak Patel
- 2:45-3:00: Making Medical Devices Accessible to the Next Billion People, Justin Chan
This is an exciting time to be a researcher in the field of computational health, where projects can have a significant and immediate effect on reducing health inequity both in the USA and around the world. In this talk, I will describe my work on developing frugal solutions for three medical problems for which there are significant barriers to care and testing, specifically: 1) Universal Newborn Hearing Screening Using Earphones in low-income countries, 2) Detecting Middle Ear Fluid Using Smartphones, and 3) Blood Clot Testing Using Smartphones. These projects center around creatively repurposing commodity hardware and materials like smartphones, earphones, paper, and plastic to develop medical devices that can reduce the cost of medical testing by orders of magnitude without sacrificing performance.
- 3:00-3:15: GlucoScreen: Diabetes prescreening using smartphones, Anand Waghmare
Blood glucose measurement is commonly used to screen for and monitor diabetes, a chronic condition characterized by the inability to effectively modulate blood glucose that can lead to heart disease, vision loss, and kidney failure. Early detection of prediabetes can forestall or reverse more serious illness if healthy lifestyle adjustments or medical interventions are made in a timely manner. Current diabetes screening methods require visits to a healthcare facility and the use of over-the-counter glucose-testing devices (glucometers), both of which are costly or inaccessible for many populations, reducing the chances of early disease detection. Therefore, we developed GlucoScreen, a readerless glucose test strip that enables affordable, single-use, at-home glucose testing, leveraging the user's touchscreen cellphone for reading and displaying the result. By integrating minimal, low-cost electronics with commercially available blood glucose testing strips, the GlucoScreen prototype introduces a new type of low-cost, battery-free glucose testing tool that works with any smartphone, obviating the need to purchase a separate dedicated reader.
- 3:15-3:30: Smartphone Camera Oximetry in an Induced Hypoxemia Study, Jason Hoffman
Hypoxemia, a medical condition that occurs when the blood is not carrying enough oxygen to adequately supply the tissues, is a leading indicator for dangerous complications of respiratory diseases like asthma, COPD, and COVID-19. While purpose-built pulse oximeters can provide accurate blood-oxygen saturation (SpO2) readings that allow for diagnosis of hypoxemia, enabling this capability in unmodified smartphone cameras via a software update could give more people access to important information about their health. Towards this goal, we performed the first clinical development validation on a smartphone camera-based SpO2 sensing system using a varied fraction of inspired oxygen (FiO2) protocol, creating a clinically relevant validation dataset for solely smartphone-based contact PPG methods on a wider range of SpO2 values (70–100%) than prior studies (85–100%). We built a deep learning model using this data to demonstrate an overall MAE = 5.00% SpO2 while identifying positive cases of low SpO2 < 90% with 81% sensitivity and 79% specificity. We also provide the data in open-source format, so that others may build on this work.
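For readers unfamiliar with the reported metrics, the sketch below shows how mean absolute error, sensitivity, and specificity could be computed from paired readings, treating SpO2 below 90% as the positive (low) class as in the abstract. The function name and the six readings are hypothetical and are not study data.

```python
def evaluate_spo2(true_spo2, predicted_spo2, threshold=90.0):
    """Return (MAE, sensitivity, specificity) for SpO2 readings in percent.

    A reading below `threshold` counts as a positive (low SpO2) case.
    """
    if not true_spo2 or len(true_spo2) != len(predicted_spo2):
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(true_spo2, predicted_spo2))
    mae = sum(abs(t - p) for t, p in pairs) / len(pairs)

    tp = fn = tn = fp = 0
    for t, p in pairs:
        actual_low = t < threshold
        flagged_low = p < threshold
        if actual_low and flagged_low:
            tp += 1  # low SpO2 correctly flagged
        elif actual_low:
            fn += 1  # low SpO2 missed
        elif flagged_low:
            fp += 1  # normal SpO2 wrongly flagged
        else:
            tn += 1  # normal SpO2 correctly passed

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return mae, sensitivity, specificity


# Hypothetical reference and model readings, for illustration only.
true_vals = [98.0, 95.0, 88.0, 85.0, 92.0, 80.0]
pred_vals = [96.0, 89.0, 86.0, 91.0, 93.0, 78.0]

mae, sensitivity, specificity = evaluate_spo2(true_vals, pred_vals)
print(mae, sensitivity, specificity)
```

Sensitivity here is the share of truly low readings that the model flags, and specificity is the share of normal readings it leaves unflagged.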
- 3:30-3:45: GLOBEM: Generalization of LOngitudinal BEhavior Modeling, Xuhai "Orson" Xu
There is a growing body of research revealing that longitudinal passive sensing data from smartphones and wearable devices can capture daily behavior signals for human behavior modeling, such as depression detection. Most prior studies build and evaluate machine learning models using data collected from a single population. However, to ensure that a behavior model can work for a larger group of users, its generalizability needs to be verified on multiple datasets from different populations. We present the first work evaluating cross-dataset generalizability of longitudinal behavior models, using depression detection as an application. We present the first multi-year passive sensing datasets, containing over 700 user-years and 497 unique users’ data collected from mobile and wearable sensors, together with a wide range of well-being metrics. Our datasets can support multiple cross-dataset evaluations of behavior modeling algorithms’ generalizability across different users and years. As a starting point, we provide the benchmark results of 18 algorithms on the task of depression detection. Our results indicate that both prior depression detection algorithms and domain generalization techniques show potential but need further research to achieve adequate cross-dataset generalizability. We envision our multi-year datasets can support the UbiComp and ML community in developing generalizable longitudinal behavior modeling algorithms.
- 2:40-2:45: Introduction and Overview, Dieter Fox
- 2:45-3:00: Robot Learning in Competitive Games, Boling Yang
Competition is one of the most common forms of human interaction, but there has only been limited discussion of competitive interaction between a robot and other embodied agents, such as another robot or even a human. In this presentation, we will share our research on robot learning in competitive settings in the context of the following two applications: 1. Human-Robot Interaction -- A competitive robot can serve a variety of positive roles, including motivating human users and inspiring their potential in certain scenarios, such as sports and physical exercise. 2. Dexterous Manipulation -- We will discuss our ongoing efforts on robot manipulation for densely packed containers via competitive training.
- 3:00-3:15: Robust high speed autonomous driving in complex outdoor terrain, Matt Schmittle
Autonomous driving is becoming more ubiquitous but what happens when we take autonomous vehicles off-road? The off-road terrain is unstructured, lacks substantial prior information, and is difficult to segment. These challenges force us to rethink our approaches to perception, planning, and control. In this talk I will cover the foundations of autonomous driving in off-road terrain, discuss the main challenges in this setting, and share some demonstrations of real world off-road autonomy from the UW RACER team.
- 3:15-3:30: Modeling Human Helpfulness with Individual and Contextual Factors for Robot Planning, Amal Nanavati
Robots deployed in human-populated spaces often need human help to effectively complete their tasks. Yet, a robot that asks for help too frequently or at the wrong times may cause annoyance, and a robot that asks too infrequently may be unable to complete its tasks. In this paper, we present a model of humans' helpfulness towards a robot in an office environment, learnt from online user study data. Our key insight is that effectively planning for a task that involves bystander help requires disaggregating individual and contextual factors and explicitly reasoning about uncertainty over individual factors. Our model incorporates the individual factor of latent helpfulness and the contextual factors of human busyness and robot frequency of asking. We integrate the model into a Bayes-Adaptive Markov Decision Process (BAMDP) framework and run a user study that compares it to baseline models that do not incorporate individual or contextual factors. The results show that our model significantly outperforms baseline models by a factor of 1.5X, and it does so by asking for help more effectively: asking 1.2X less often while still receiving more human help on average.
- 3:30-3:45: Learning Robust Real-World Dexterous Grasping Policies via Implicit Shape Augmentation, Zoey Chen
Dexterous robotic hands have the capability to interact with a wide variety of household objects to perform tasks like grasping. However, learning robust real-world grasping policies for arbitrary objects has proven challenging due to the difficulty of generating high-quality training data. In this work, we propose a learning system (ISAGrasp) for leveraging a small number of human demonstrations to bootstrap the generation of a much larger dataset containing successful grasps on a variety of novel objects. Our key insight is to use a correspondence-aware implicit generative model to deform object meshes and demonstrated human grasps in order to generate a diverse dataset of novel objects and successful grasps for supervised learning while maintaining semantic realism. We use this dataset to train a robust grasping policy in simulation which can be deployed in the real world. We demonstrate grasping performance with a four-fingered Allegro hand in both simulation and the real world, and show this method can handle entirely new semantic classes and achieve a 79% success rate on grasping unseen objects in the real world.
- 2:40-2:45: Introduction and Overview, Franzi Roesner
- 2:45-2:57: Seattle Community Network, Kurtis Heimerl
- 2:57-3:09: Gendered Mental Health Stigma in Masked Language Models, Lucille Njoo
- 3:09-3:21: Supporting the Exploration of Harms of Technologies, Yuren "Rock" Pang
- 3:21-3:33: Disparate Impacts on Online Information Access during the COVID-19 Pandemic, Jina Suh
- 3:33-3:45: ARTT: Supporting Peer-Based Misinformation Response, Amy Zhang + Franzi Roesner
- 3:50-3:55: Introduction and Overview, Sheng Wang
- 3:55-4:10: Cross-Linked Unified Embedding for single cell multi-omics representation learning, Xinming Tu
Multi-modal learning is essential for understanding information in the real world. Jointly learning from multi-modal data enables global integration of both shared and modality-specific information, but current strategies often fail when observations from certain modalities are incomplete or missing for part of the subjects. To learn comprehensive representations based on such modality-incomplete data, we present a semi-supervised neural network model called CLUE (Cross-Linked Unified Embedding). Extending from multi-modal VAEs, CLUE introduces the use of cross-encoders to construct latent representations from modality-incomplete observations. Representation learning for modality-incomplete observations is common in genomics. For example, human cells are tightly regulated across multiple related but distinct modalities such as DNA, RNA, and protein, jointly defining a cell’s function. We benchmark CLUE on multi-modal data from single cell measurements, illustrating CLUE’s superior performance in all assessed categories of the NeurIPS 2021 Multimodal Single-cell Data Integration Competition. While we focus on analysis of single cell genomic datasets, we note that the proposed cross-linked embedding strategy could be readily applied to other cross-modality representation learning problems.
- 4:10-4:25: An explainable AI framework for interpretable biological age, Wei Qiu
An individual's biological age is a measurement of health status and provides a mechanistic understanding of aging. Age clocks estimate an individual's biological age based on their various features. Existing clocks have key limitations caused by the undesirable tradeoff between accuracy and interpretability. Here, we present 'ENABL Age,' a framework that combines machine learning models with explainable AI methods to accurately estimate biological age with individualized explanations. To construct an ENABL Age clock, we predict an age-related outcome of interest, and then rescale the predictions nonlinearly to estimate biological age. To explain the ENABL Age clock, we extend existing XAI methods so that we can decompose any individual’s ENABL Age into contributing risk factors. The individualized explanations provide insights into the important risk factors for biological age. We further show that ENABL Age clocks trained on different age-related outcomes capture different and sensible aging mechanisms. Our results show strong mortality prediction power, interpretability, and flexibility. ENABL Age takes a consequential step towards accurate interpretable biological age prediction built with complex, high-performance ML models.
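One plausible reading of the "rescale the predictions nonlinearly" step is to map a predicted risk to the chronological age at which a reference population shows the same average risk. The sketch below assumes that reading; it is not the ENABL Age procedure itself, the function name `risk_to_age` is hypothetical, and the reference numbers are invented purely for illustration.

```python
from bisect import bisect_left

# Hypothetical reference curve: mean predicted risk at each chronological age.
# The numbers are invented for illustration and must be strictly increasing.
REFERENCE_AGES = [40.0, 50.0, 60.0, 70.0, 80.0]
REFERENCE_RISK = [0.01, 0.02, 0.05, 0.12, 0.30]


def risk_to_age(risk: float) -> float:
    """Map a predicted risk to the age whose reference risk matches it.

    Interpolates linearly between table entries and clamps to the table
    range when the risk falls outside it.
    """
    if risk <= REFERENCE_RISK[0]:
        return REFERENCE_AGES[0]
    if risk >= REFERENCE_RISK[-1]:
        return REFERENCE_AGES[-1]
    hi = bisect_left(REFERENCE_RISK, risk)  # first entry with risk >= target
    lo = hi - 1
    span = REFERENCE_RISK[hi] - REFERENCE_RISK[lo]
    frac = (risk - REFERENCE_RISK[lo]) / span
    return REFERENCE_AGES[lo] + frac * (REFERENCE_AGES[hi] - REFERENCE_AGES[lo])
```

Because the reference curve is not a straight line, equal increments in predicted risk translate into unequal increments in estimated age, which is what makes the overall rescaling nonlinear.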
- 4:25-4:40: ProTranslator: Zero-Shot Protein Function Prediction Using Textual Description, Hanwen Xu
Accurately finding proteins and genes that have a certain function is the prerequisite for a broad range of biomedical applications. Despite the encouraging progress of existing computational approaches in protein function prediction, it remains challenging to annotate proteins to a novel function that is not collected in the Gene Ontology and does not have any annotated proteins. This limitation, a “side effect” from the widely-used multi-label classification problem setting of protein function prediction, hampers the progress of studying new pathways and biological processes, and further slows down research in various biomedical areas. Here, we tackle this problem by annotating proteins to a function based only on its textual description so that we don’t need to know any associated proteins for this function. The key idea of our method ProTranslator is to redefine protein function prediction as a machine translation problem, which translates the description word sequence of a function to the amino acid sequence of a protein. We can then transfer annotations from functions that have similar textual descriptions to annotate a novel function. We observed substantial improvement in annotating novel functions and sparsely annotated functions on CAFA3, SwissProt and GOA datasets. We further demonstrated how our method accurately predicted gene members for a given pathway in Reactome, KEGG and MSigDB based only on the pathway description. Finally, we showed how ProTranslator enabled us to generate the textual description instead of the function label for a set of proteins, providing a new scheme for protein function prediction. We envision ProTranslator will give rise to a protein function “search engine” that returns a list of proteins based on the free text queried by the user.
- 4:40-4:55: Symmetry-aware representation learning of 3D protein structures, Gian Marco + Mike Pun
Characterizing a protein’s function, such as its ability to interact with a cognate protein or ligand, is a key component for the development of novel therapies and drug discovery. While protein function is determined by interactions at the structural level, predicting function from structure remains a difficult task due to the complexity of three dimensional data. Deep learning models that are built to respect underlying symmetries explore a constrained search space leading to physical robustness and data-efficiency. In this talk, we will present a novel framework for symmetry-aware representation learning of protein structures. We will first introduce a model trained to predict amino-acid preferences in proteins, highlighting its efficiency, physical robustness and out-of-task predictive power on useful protein engineering problems like mutational effect prediction. We will then show how the framework can be extended to the domain of unsupervised learning, to learn compact, symmetry-aware representations of protein structures.
- 3:50-3:55: Introduction and Overview, Amy Zhang
- 3:55-4:10: Underwater messaging using mobile devices, Justin Chan
Since its inception, underwater digital acoustic communication has required custom hardware that neither has the economies of scale nor is pervasive. We present the first acoustic system that brings underwater messaging capabilities to existing mobile devices like smartphones and smart watches. Our software-only solution leverages audio sensors, i.e., microphones and speakers, ubiquitous in today's devices to enable acoustic underwater communication between mobile devices. To achieve this, we design a communication system that adapts in real time to differences in frequency responses across mobile devices, changes in multipath and noise levels at different locations, and dynamic channel changes due to mobility. We evaluate our system in six different real-world underwater environments with depths of 2-15 m in the presence of boats, ships and people fishing and kayaking. Our results show that our system can adapt its frequency band in real time and achieve bit rates of 100 bps to 1.8 kbps and a range of 30 m. By using a lower bit rate of 10-20 bps, we can further increase the range to 100 m. As smartphones and watches are increasingly being used in underwater scenarios, our software-based approach has the potential to make underwater messaging capabilities widely available to anyone with a mobile device.
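Adapting the frequency band to the current channel can be pictured as choosing the widest stretch of spectrum that is still usable. The sketch below is an illustrative assumption rather than the system's actual algorithm: it takes hypothetical per-bin signal-to-noise estimates and returns the longest contiguous run of bins above a threshold. The function name `pick_band` and the threshold parameter are invented for this example.

```python
from typing import List, Optional, Tuple


def pick_band(snr_db: List[float], min_snr_db: float) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) indices of the longest contiguous run of
    bins whose SNR is at least min_snr_db, or None if no bin qualifies.

    min_snr_db must be finite. When two runs tie in length, the first wins.
    """
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    # A trailing -inf sentinel guarantees the final run gets closed.
    for i, snr in enumerate(snr_db + [float("-inf")]):
        if snr >= min_snr_db:
            if start is None:
                start = i  # a new run of usable bins begins here
        elif start is not None:
            run_length = i - start
            if best is None or run_length > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best
```

For example, `pick_band([3, 9, 10, 2, 8, 9, 11, 1], 8)` selects bins 4 through 6, the widest run that clears the threshold. A real system would re-estimate the per-bin values as devices and conditions change and repeat the selection.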
- 4:10-4:25: ClearBuds: Wireless Binaural Earbuds for Learning-based Speech Enhancement, Ishan Chatterjee, Maruchi Kim, Vivek Jayaram
We present ClearBuds, a state-of-the-art hardware and software system for real-time speech enhancement. Our neural network runs completely on an iPhone, allowing you to suppress unwanted noises while taking phone calls on the go. ClearBuds bridges state-of-the-art deep learning for blind audio source separation and in-ear mobile systems by making two key technical contributions: 1) a new wireless earbud design capable of operating as a synchronized, binaural microphone array, and 2) a lightweight dual-channel speech enhancement neural network that runs on a mobile device. Results show that our wireless earbuds achieve a synchronization error less than 64 microseconds and our network has a runtime of 21.4 ms on an accompanying mobile phone.
- 4:25-4:40: Z-Ring: Single point sensing for input, object recognition and authentication, Anand Waghmare
The hand provides a window into the intention, context, and activity important to the user. As the body's primary manipulator, it engages in various tasks, such as grasping objects, gesturing to signal intention, and operating interactive controls. Wearable sensing can elucidate these interactions, providing context or input to enable richer and more powerful computational experiences. Z-Ring is a wearable ring that enables gesture input, object detection, user identification, and interaction with passive user interface (UI) elements using a single sensing modality and a single instrumentation point on the finger. Z-Ring uses active electrical impedance sensing to detect hand impedance changes caused by finger motions or touch to external surfaces. We develop a diverse set of interactions enabled by the technology and evaluate them in a set of studies.
- 4:40-4:55: Enabling hand gesture customization on wrist-worn devices, Xuhai "Orson" Xu
We present a framework for gesture customization requiring minimal examples from users, all without degrading the performance of existing gesture sets. To achieve this, we first deployed a large-scale study (N=500+) to collect data and train an accelerometer-gyroscope recognition model with a cross-user accuracy of 95.7% and a false-positive rate of 0.6 per hour when tested on everyday non-gesture data. Next, we design a few-shot learning framework which derives a lightweight model from our pre-trained model, enabling knowledge transfer without performance degradation. We validate our approach through a user study (N=20) examining on-device customization from 12 new gestures, resulting in an average accuracy of 55.3%, 83.1%, and 87.2% when using one, three, or five shots, respectively, to add a new gesture, while maintaining the same recognition accuracy and false-positive rate from the pre-existing gesture set. We further evaluate the usability of our real-time implementation with a user experience study (N=20). Our results highlight the effectiveness, learnability, and usability of our customization framework. Our approach paves the way for a future where users are no longer bound to pre-existing gestures, freeing them to creatively introduce new gestures tailored to their preferences and abilities.
- 3:50-3:55: Introduction and Overview, Adriana Schulz
- 3:55-4:05: Computational Design of Passive Grippers, Milin Kodnongbua
This work proposes a novel generative design tool for passive grippers—robot end effectors that have no additional actuation and instead leverage the existing degrees of freedom in a robotic arm to perform grasping tasks. Passive grippers are used because they offer interesting trade-offs between cost and capabilities. However, existing designs are limited in the types of shapes that can be grasped. This work proposes to use rapid-manufacturing and design optimization to expand the space of shapes that can be passively grasped. Our novel generative design algorithm takes in an object and its positioning with respect to a robotic arm and generates a 3D printable passive gripper that can stably pick the object up. To achieve this, we address the key challenge of jointly optimizing the shape and the insert trajectory to ensure a passively stable grasp. We evaluate our method on a testing suite of 22 objects (23 experiments), all of which were evaluated with physical experiments to bridge the virtual-to-real gap.
- 4:05-4:15: A Tale of Two Mice: Sustainable Electronic Design and Prototyping, Vicente Arroyos
Electronics have become integral to all aspects of life and form the physical foundation of computing; however, electronic waste (e-waste) is among the fastest growing global waste streams and poses significant health and climate implications. We present a design guideline for sustainable electronics and use it to build a functional computer mouse with a biodegradable printed circuit board and case. We develop an end-to-end digital fabrication process using accessible maker tools to build circuits on biodegradable substrates that reduce embodied carbon and toxic waste. Our biodegradable circuit board sends data over USB at 800 kbps and generates 12 MHz signals without distortion. Our circuit board dissolves in water (in 5.5 min at 100 °C, 5 hrs at 20 °C) and we successfully recover and reuse two types of chips after dissolving. We also present an environmental assessment showing our design reduces the environmental carbon impact (kg CO2e) by 60.2% compared to a traditional mouse.
- 4:15-4:25: Carpentry Compiler, Amy Zhu
Traditional manufacturing workflows strongly decouple design and fabrication phases. As a result, fabrication-related objectives such as manufacturing time and precision are difficult to optimize in the design space, and vice versa. We present novel abstraction languages in the carpentry domain for decoupling design and manufacturing. We leverage these abstractions to design a compiler that supports the real-time, unoptimized translation of high-level, geometric fabrication operations into concrete, tool-specific fabrication instructions; this gives users immediate feedback on the physical feasibility of plans as they design them. Finally, we present a new approach to jointly optimize design and fabrication plans for carpentered objects. To make this bi-level optimization tractable, we adapt recent work from program synthesis based on equality graphs (e-graphs), which encode sets of equivalent programs.
- 4:25-4:35: Computational Design of Knit Templates, Ben Jones
We present an interactive design system for knitting that allows users to create template patterns that can be fabricated using an industrial knitting machine. Our interactive design tool is novel in that it allows direct control of key knitting design axes we have identified in our formative study and does so consistently across the variations of an input parametric template geometry. This is achieved with two key technical advances. First, we present an interactive meshing tool that lets users build a coarse quadrilateral mesh that adheres to their knit design guidelines. This solution ensures consistency across the parameter space for further customization over shape variations and avoids helices, promoting knittability. Second, we lift and formalize low-level machine knitting constraints to the level of this coarse quad mesh. This enables us to not only guarantee hand- and machine-knittability, but also provide automatic design assistance through auto-completion and suggestions. We show the capabilities through a set of fabricated examples that illustrate the effectiveness of our approach in creating a wide variety of objects and interactively exploring the space of design variations.
- 4:35-4:45: A DSL for Referencing CAD Geometry, Dan Cascaval
3D Computer-Aided Design (CAD) modeling is ubiquitous for constructing digital prototypes in mechanical engineering and design. CAD models are programs that produce geometry, and can be used to implement high-level geometric edits by changing input parameters. Exciting new applications are enabled by the application of program optimization techniques to CAD programs that represent a design space of many possible models rather than a single, canonical model. However, to enable such optimization, we argue one fundamental primitive remains unresolved. CAD models use references to pass geometric arguments to operations, but such references are not fully defined across all input parameters and possible designs. We propose a domain-specific language that exposes references as a first-class language construct, using user-authored queries to introspect element history and define references safely over all executions. This allows CAD models to safely represent parametric design spaces with varying topology.
- 4:45-4:55: ARDW: An Augmented Reality Workbench for Printed Circuit Board Debugging, Ishan Chatterjee
Debugging printed circuit boards (PCBs) can be a time-consuming process, requiring frequent context switching between PCB design files (schematic and layout) and the physical PCB. To assist electrical engineers in debugging PCBs, we present ARDW, an augmented reality workbench consisting of a monitor interface featuring PCB design files, a projector-augmented workspace for PCBs, tracked test probes for selection and measurement, and a connected test instrument. The system supports common debugging workflows for augmented visualization on the physical PCB as well as augmented interaction with the tracked probes. We quantitatively and qualitatively evaluate the system with 10 electrical engineers from industry and academia, finding that ARDW speeds up board navigation and provides engineers with greater confidence in debugging. We discuss practical design considerations and paths for improvement to future systems.