WiHear — We Can Hear You with Wi-Fi!
The impact of wink (as denoted in the dashed red box).
• Filtering Out-Band Interference
• Partial Multipath Removal
• Mouth Motion Profile Construction
• Discrete Wavelet Packet Decomposition
Partial Multipath Removal
• Mouth movement: non-rigid
• Convert CSI (Channel State Information) from the frequency domain to the time domain via IFFT
• Multipath removal threshold: > 500 ns
The multipath threshold value can be adjusted to achieve better performance.
Discrete Wavelet Packet Decomposition
• A Symlet wavelet filter of order 4 is selected
WiHear framework (diagram labels): Classification & Error Correction; Learning-based Lip Reading; Vowels and consonants; Filtering; Partial Multipath Removal
• Context-based Error Correction
• Inter-word segmentation: silent interval between words
• Inner-word segmentation: words are divided into phonetic events
Feature Extraction
• Multi-Cluster/Class Feature Selection (MCFS) scheme
Extending To Multiple Targets
• MIMO: Spatial diversity via multiple Rx antennas
• ZigZag decoding: a single Rx antenna
Floor plan of the testing environment. Experimental scenario layouts: (a) line of sight; (b) non-line-of-sight; (c) through wall Tx side; (d) through wall Rx side; (e) multiple Rx; (f) multiple link pairs.
• Syllables: [æ], [e], [i], [u], [s], [l], [m], [h], [v], [ɔ], [w], [b], [j], [ʃ].
• Words: see, good, how, are, you, fine, look, open, is, the, door, thank, boy, any, show, dog, bird, cat, zoo, yes, meet, some, watch, horse, sing, play, dance, lady, ride, today, like, he, she.
Automatic Segmentation Accuracy
Automatic segmentation accuracy for (a) inner-word segmentation on commercial devices; (b) inter-word segmentation on commercial devices; (c) inner-word segmentation on USRP; (d) inter-word segmentation on USRP.
Impact of Context-based Error Correction
Performance with Multiple Receivers
Example of different views for pronouncing words
Performance for Multiple Targets
Performance of multiple users with multiple link pairs.
Performance of zigzag decoding for multiple users.
Performance of two through wall scenarios. Performance of through wall with multiple Rx.
Resistance to Environmental Dynamics
Waveform of a 4-word sentence without interference of ISM band signals or irrelevant human motions
Impact of interference from irrelevant human movements
Impact of ISM band interference
• WiHear is the first prototype in the world that tries to use Wi-Fi signals to sense and recognize human speech.
• WiHear takes the first step toward bridging human speech and wireless signals.
• WiHear introduces a new way for machines to sense more complicated human behaviors (e.g. mood).
Thank you for listening!
We Can Hear You with Wi-Fi
Guanhua Wang
Advanced Research in ISM band
• Localization
• Gesture recognition
• Object classification
They enable Wi-Fi to “SEE” target objects.
Can we enable Wi-Fi signals to HEAR people talk?
“Hearing” human speech with Wi-Fi signals (example utterance: “Hello”)
• Non-invasive and device-free
Hearing through walls and doors (example utterance: “I am upset.”)
• Understanding complicated human behavior (e.g. mood)
Hearing multiple people simultaneously
• MIMO technology
• Easy to implement in commercial Wi-Fi products
WiHear Framework
(diagram labels) Mouth Motion Profiling: MIMO Beamforming; Filtering; Remove Noise; Partial Multipath Removal; Profile Building; Wavelet Transform. Learning-based Lip Reading: Segmentation; Feature Extraction; Classification & Error Correction; Vowels and consonants.
Mouth Motion Profiling
• Locating on Mouth
• Filtering Out-Band Interference
• Partial Multipath Removal
• Mouth Motion Profile Construction
• Discrete Wavelet Packet Decomposition
Locating on Mouth
(figure labels: T1, T2, T3)
Filtering Out-Band Interference
• Signal changes caused by mouth motion: 2-5 Hz
• Adopt a third-order Butterworth IIR band-pass filter
  ◦ Cancel the DC component
  ◦ Cancel wink issue (
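The band-pass step on this slide can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the sampling rate `fs`, the function name `mouth_band_filter`, and the use of zero-phase filtering are assumptions, and the synthetic signal only stands in for a CSI amplitude stream.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def mouth_band_filter(csi_amplitude, fs=100.0, low_hz=2.0, high_hz=5.0, order=3):
    """Keep the 2-5 Hz band attributed to mouth motion.

    fs (sampling rate of the CSI stream) is an assumed value. Note that
    SciPy's band-pass design with order=3 yields a 6th-order filter overall.
    """
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    # Zero-phase filtering (an assumption) avoids shifting the waveform in time.
    return sosfiltfilt(sos, csi_amplitude)


# Synthetic stand-in: DC offset + in-band motion + slow out-of-band drift.
fs = 100.0
t = np.arange(0, 10, 1 / fs)
mouth = np.sin(2 * np.pi * 3.0 * t)        # 3 Hz, inside the pass band
slow = 0.8 * np.sin(2 * np.pi * 0.3 * t)   # 0.3 Hz, outside the pass band
raw = 5.0 + mouth + slow
filtered = mouth_band_filter(raw, fs=fs)
```

Because a band-pass filter has zero gain at 0 Hz, the DC component is removed by the same step that rejects slow, out-of-band changes.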
Partial Multipath Removal
• Mouth movement: non-rigid
• Convert CSI (Channel State Information) from the frequency domain to the time domain via IFFT
• Multipath removal threshold: > 500 ns
• Convert the processed CSI (with multipath < 500 ns) back to the frequency domain via FFT
The multipath threshold value can be adjusted to achieve better performance.
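The IFFT, threshold, FFT sequence above can be sketched with NumPy. The sketch assumes uniformly spaced subcarriers across a 20 MHz channel, so each time-domain tap spans 1/bandwidth = 50 ns; the function name, the 64-subcarrier example, and the handling of wrapped taps are illustrative assumptions, not details from the slides.

```python
import numpy as np


def remove_long_multipath(csi_freq, bandwidth_hz=20e6, max_delay_s=500e-9):
    """Drop channel taps whose delay exceeds max_delay_s.

    Assumes csi_freq holds uniformly spaced subcarriers along the last axis,
    so tap k of the inverse FFT corresponds to a delay of k / bandwidth_hz.
    """
    n = csi_freq.shape[-1]
    cir = np.fft.ifft(csi_freq, axis=-1)        # frequency domain -> time domain
    delays = np.arange(n) / bandwidth_hz        # delay of each tap in seconds
    # Taps in the upper half of the IFFT output are simply treated as long delays.
    cir_kept = np.where(delays <= max_delay_s, cir, 0)
    return np.fft.fft(cir_kept, axis=-1)        # back to the frequency domain


# Synthetic channel: direct path, a 100 ns reflection, and an 800 ns reflection.
n_sub = 64
cir_true = np.zeros(n_sub, dtype=complex)
cir_true[0], cir_true[2], cir_true[16] = 1.0, 0.5, 0.3j
csi_clean = remove_long_multipath(np.fft.fft(cir_true))
taps_after = np.fft.ifft(csi_clean)
```

Raising or lowering `max_delay_s` corresponds to adjusting the multipath threshold mentioned on the slide.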
Discrete Wavelet Packet Decomposition
• A Symlet wavelet filter of order 4 is selected
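A wavelet packet decomposition with the order-4 Symlet can be sketched using the PyWavelets library. The decomposition depth (`level=3`), the function name, and the random input are assumptions made for illustration; the slides specify only the wavelet family and order.

```python
import numpy as np
import pywt


def mouth_motion_subbands(profile, level=3):
    """Split a mouth motion profile into 2**level frequency-ordered sub-bands.

    level is a hypothetical choice; "sym4" is the order-4 Symlet named on the slide.
    """
    wp = pywt.WaveletPacket(data=profile, wavelet="sym4", mode="symmetric", maxlevel=level)
    nodes = wp.get_level(level, order="freq")
    return {node.path: node.data for node in nodes}


# Random stand-in for a filtered mouth motion profile.
rng = np.random.default_rng(0)
profile = rng.standard_normal(512)
bands = mouth_motion_subbands(profile)
```

Each dictionary key is a node path such as `aaa` (lowest band), and each value holds that sub-band's coefficients, which could then feed feature extraction.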
Lip Reading
• Segmentation
• Feature Extraction
• Classification
• Context-based Error Correction
Segmentation
• Inter-word segmentation
  ◦ Silent interval between words
• Inner-word segmentation
  ◦ Words are divided into phonetic events
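One simple way to exploit the silent interval between words is a short-time energy detector. The slides do not say how silence is detected, so everything below (frame length, energy threshold, minimum silence duration, and the function name) is a hypothetical sketch rather than the WiHear method.

```python
import numpy as np


def split_on_silence(signal, fs, frame_s=0.1, energy_ratio=0.1, min_silence_s=0.3):
    """Return (start, end) sample ranges of active segments separated by silence."""
    signal = np.asarray(signal, dtype=float)
    frame = max(1, int(frame_s * fs))
    n_frames = len(signal) // frame
    # Mean energy of each non-overlapping frame.
    energy = (signal[: n_frames * frame].reshape(n_frames, frame) ** 2).mean(axis=1)
    active = energy > energy_ratio * energy.max()
    min_silent_frames = int(round(min_silence_s / frame_s))

    segments, start, silent_run = [], None, 0
    for i, is_active in enumerate(active):
        if is_active:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                # Close the segment at the last active frame.
                segments.append((start * frame, (i - silent_run + 1) * frame))
                start, silent_run = None, 0
    if start is not None:
        segments.append((start * frame, (n_frames - silent_run) * frame))
    return segments


# Two synthetic "words" (3 Hz bursts) separated by one second of silence.
fs = 100
word = np.sin(2 * np.pi * 3 * np.arange(fs) / fs)
gap = np.zeros(fs)
stream = np.concatenate([gap, word, gap, word, gap])
segments = split_on_silence(stream, fs)
```

Short dips in energy inside a word are bridged because a segment only closes after `min_silence_s` of continuous silence.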
Feature Extraction
• Multi-Cluster/Class Feature Selection (MCFS) scheme
Vocabulary
• Syllables: [æ], [e], [i], [u], [s], [l], [m], [h], [v], [ɔ], [w], [b], [j], [ʃ].
• Words: see, good, how, are, you, fine, look, open, is, the, door, thank, boy, any, show, dog, bird, cat, zoo, yes, meet, some, watch, horse, sing, play, dance, lady, ride, today, like, he, she.
Implementation
Floor plan of the testing environment. Experimental scenario layouts: (a) line of sight; (b) non-line-of-sight; (c) through wall Tx side; (d) through wall Rx side; (e) multiple Rx; (f) multiple link pairs.
Automatic Segmentation Accuracy
Automatic segmentation accuracy for (a) inner-word segmentation on commercial devices; (b) inter-word segmentation on commercial devices; (c) inner-word segmentation on USRP; (d) inter-word segmentation on USRP.
Impact of Context-based Error Correction
Performance with Multiple Receivers
Example of different views for pronouncing words
Extending To Multiple Targets
• MIMO: Spatial diversity via multiple Rx antennas
• ZigZag decoding: a single Rx antenna
Performance for Multiple Targets
Performance of multiple users with multiple link pairs.
Performance of zigzag decoding for multiple users.
Through Wall
Performance of two through wall scenarios.
Performance of through wall with multiple Rx.
Conclusion
• WiHear is the first prototype in the world that tries to use Wi-Fi signals to sense and recognize human speech.
• WiHear takes the first step toward bridging human speech and wireless signals.
• WiHear introduces a new way for machines to sense more complicated human behaviors (e.g. mood).