WearMail: On-the-Go Access to Information in Your Email with a Privacy-Preserving Human Computation Workflow
Saiganesh Swaminathan, Raymond Fok, Fanglin Chen, Ting-Hao (Kenneth) Huang, Irene Lin, Rohan Jadvani, Walter S. Lasecki, Jeffrey P. Bigham · 2017 · Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology (UIST 2017) · doi:10.1145/3126594.3126603
Summary
WearMail is a conversational system that lets users retrieve specific pieces of information from their email by issuing voice queries on wearable devices such as smartwatches, using a novel privacy-preserving human computation workflow. The system starts from the observation that email functions as personal external memory (it holds reservation numbers, meeting details, phone numbers, and other critical information), yet searching it on mobile devices with limited input and output is cumbersome. A preliminary study of 200 mobile email users found that over 30% of email searches happened while users were in transit, and 85% of queries targeted specific information like order codes, contact details, event information, and account numbers.

WearMail's two-step workflow ensures crowd workers never see actual email content. In the first step (email filtering), workers view only obfuscated email metadata, in which person names are randomized and uncommon words in subject lines are blurred out, and select the emails likely to contain the requested information. In the second step (information extraction), workers generate examples of what the requested information might look like (e.g., sample confirmation numbers found via web search); these examples are used to automatically construct regular expression extractors that run against the filtered emails without any human viewing their content. The system combines these crowd-generated regex extractors with Stanford Named Entity Recognition for structured data types like dates, locations, and monetary amounts.
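The two steps of the workflow can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not say how subject lines are blurred or how worker examples are turned into regular expressions, so the dictionary `COMMON_WORDS`, the masking character, and the character-class generalization in `pattern_from_example` are all assumptions made for illustration. Name randomization and the NER component are not shown.

```python
import re

# Hypothetical stand-in for a "most common English words" dictionary.
COMMON_WORDS = {"your", "order", "has", "been", "for", "the", "is", "to"}


def obfuscate_subject(subject, visible_words=COMMON_WORDS, mask="*"):
    """Mask every alphanumeric token that is not in the visible dictionary."""
    def mask_token(match):
        word = match.group(0)
        return word if word.lower() in visible_words else mask * len(word)
    return re.sub(r"[A-Za-z0-9]+", mask_token, subject)


def char_class(ch):
    """Map one character to a regex fragment (assumed generalization rule)."""
    if ch.isdigit():
        return r"\d"
    if ch.isalpha():
        return "[A-Z]" if ch.isupper() else "[a-z]"
    return re.escape(ch)


def pattern_from_example(example):
    """Generalize one worker-supplied example into a regex by
    run-length encoding its character classes."""
    runs = []
    for ch in example:
        cls = char_class(ch)
        if runs and runs[-1][0] == cls:
            runs[-1][1] += 1
        else:
            runs.append([cls, 1])
    body = "".join(cls if n == 1 else f"{cls}{{{n}}}" for cls, n in runs)
    return rf"\b{body}\b"


def build_extractor(examples):
    """Combine the patterns from all examples into one compiled alternation."""
    patterns = sorted({pattern_from_example(e) for e in examples})
    return re.compile("|".join(patterns))


def extract(extractor, email_bodies):
    """Run the extractor over email text; no person reads the bodies."""
    return [m.group(0) for body in email_bodies for m in extractor.finditer(body)]


# Step 1: what a worker might see for a subject line.
masked = obfuscate_subject("Your order A1B2C3 has shipped")

# Step 2: workers supply look-alike examples, never the real value.
extractor = build_extractor(["ABC123", "XYZ789", "QW-4821"])
found = extract(extractor, ["Your confirmation number is KLM456. Thanks!"])
```

In this sketch `masked` keeps the common words and hides the order code, while `found` contains the code from the email body even though none of the worker examples matched it literally. A real system would need to rank candidates, since such generalized patterns can also match unrelated strings.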
Key findings
The crowd-powered email filtering approach significantly outperformed automated keyword search, producing filtersets containing the ground truth email for 25 of 30 queries (HIT@3 = 0.833) compared to only 17 for the automated approach (HIT@3 = 0.567). The full extraction pipeline returned correct results for 20 of 30 queries (HIT@3 = 0.667), with 17 of the 20 successful extractions (85%) coming from the crowd-generated regex extractor rather than NER. The obfuscation study showed that crowd workers could filter emails effectively even under heavy obfuscation: accuracy stabilized around 80% once the visible dictionary included the top 1,000 most common English words, compared to only 37.5% with the most restrictive 100-word dictionary. Average latency was 2 minutes 18 seconds for email filtering and 6 minutes 16 seconds for information extraction. The mean reciprocal rank for successful crowd-powered extractions was 0.779, meaning the correct answer was typically the first or second result returned.
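For readers unfamiliar with the two ranking metrics, the following Python sketch shows how HIT@k and mean reciprocal rank are commonly computed. The function names and the toy data are hypothetical; the `successful_only` flag reflects the summary's statement that the reciprocal rank was averaged over successful extractions, which is an interpretation rather than a documented detail of the evaluation.

```python
def hit_at_k(ranked_lists, truths, k=3):
    """Fraction of queries whose ground truth appears in the top k results."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)


def reciprocal_rank(ranked, truth):
    """1 / position of the ground truth (1-indexed), or 0.0 if it is absent."""
    for position, item in enumerate(ranked, start=1):
        if item == truth:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(ranked_lists, truths, successful_only=True):
    """Average reciprocal rank, optionally over successful queries only."""
    scores = [reciprocal_rank(r, t) for r, t in zip(ranked_lists, truths)]
    if successful_only:
        scores = [s for s in scores if s > 0]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data: three queries, each with a ranked list of returned candidates.
ranked_lists = [["a", "b", "c"], ["x", "y", "z"], ["p", "q", "r"]]
truths = ["a", "y", "missing"]

toy_hit = hit_at_k(ranked_lists, truths, k=3)        # 2 of 3 queries hit
toy_mrr = mean_reciprocal_rank(ranked_lists, truths)  # (1 + 1/2) / 2

# The reported HIT@3 values follow directly from the reported counts.
reported = {name: round(n / 30, 3) for name, n in
            {"crowd_filter": 25, "keyword_filter": 17, "full_pipeline": 20}.items()}
```

Here `reported` evaluates to 0.833, 0.567, and 0.667 for the three conditions, matching the counts of 25, 17, and 20 out of 30 queries.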
Relevance
This research contributes to accessibility by demonstrating how privacy-preserving crowdsourcing can enable information access on constrained devices — a challenge that disproportionately affects users with disabilities who may rely on voice-based or simplified interfaces. The work was partially funded by NIDILRR (National Institute on Disability, Independent Living, and Rehabilitation Research), reflecting its disability research context. The conversational voice interface pattern is directly relevant to users who cannot easily interact with visual email interfaces, including blind users and people with motor disabilities. The privacy-preserving workflow is particularly important for accessibility applications: many assistive technologies (like VizWiz or remote sighted assistance) require sharing personal visual or textual information with helpers, and WearMail demonstrates that meaningful crowd assistance is possible even when workers see only heavily obfuscated data. This has implications for designing trustworthy assistive services where users may be reluctant to share sensitive personal information with strangers.
Tags: crowdsourcing · human computation · privacy · wearable technology · information extraction · mobile accessibility · conversational interface