Voice rec is terrifying

November 15, 2017 Brandon Lewis

In a world obsessed with Internet privacy, it’s surprising how little we talk about always-listening devices like the Amazon Echo. After all, a company that wants to learn intimate details about your life in order to sell you more stuff has a microphone permanently fired up in your kitchen.

If you own an Echo and weren’t aware of this feature, open up your Alexa app, select the “Settings” menu, and then select “History.” Take a listen. Were all of those recordings intended for the Echo?

I guess privacy is the price of convenience in modern consumerism. And things are about to get a whole lot more convenient.

Cacophonies, cocktail parties, convenience, and Christmas

XMOS is a fabless semiconductor company that spun out of the University of Bristol to focus on voice and music processing ICs. Among those ICs, devices based on the 32-bit xCORE MCU architecture have had notable success in the voice recognition market, delivering 16 programmable cores (partitioned into two tiles of eight cores with a shared address space for each) with DSP functions integrated in the same chip.

XMOS recently parlayed the xCORE architecture into the VocalFusion 4-Mic Dev Kit for Amazon’s Alexa Voice Service (AVS). The kit is designed around the VocalFusion XVF3000 integrated far-field voice processor and four high signal-to-noise-ratio (SNR) MEMS microphones from Infineon (Figure 1). XMOS claims the kit is the first far-field linear microphone array solution available on the market.


Figure 1. The XMOS VocalFusion 4-Mic Dev Kit for Amazon’s Alexa Voice Service (AVS) is based on the XVF3000 integrated far-field voice processor and a linear MEMS microphone array from Infineon.

Range aside, far-field voice processing gets really interesting when it tackles the “cocktail party” problem: situations in which a platform needs to isolate the voice of a single speaker from a noisy environment. At distances of 5 m or more, the VocalFusion 4-Mic Dev Kit uses a combination of acoustic echo cancellation (AEC), adaptive beamforming, dynamic de-reverberation, and automatic gain control (AGC) to isolate and extract the voice signal of a primary speaker. Beyond this is where things start to get spooky.
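To make the beamforming step a little less abstract, here is a minimal sketch of the simplest member of the family: a fixed delay-and-sum beamformer on a four-microphone linear array. It is not the adaptive beamformer XMOS ships, and the microphone spacing, test frequency, and the `array_gain` function name are all assumptions chosen for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def array_gain(source_deg, steer_deg, freq_hz=2000.0, num_mics=4, spacing_m=0.04):
    """Normalized response (0..1) of a delay-and-sum beamformer.

    Assumptions (not from the kit's documentation): a uniform linear array
    along one axis, a far-field source producing a single tone at freq_hz,
    and angles measured from the array axis.
    """
    wavenumber = 2 * math.pi * freq_hz / SPEED_OF_SOUND
    total = 0j
    for m in range(num_mics):
        position = m * spacing_m
        # Phase shift the wavefront picks up on its way to microphone m.
        arrival = wavenumber * position * math.cos(math.radians(source_deg))
        # Phase shift the beamformer removes when it "looks" toward steer_deg.
        steering = wavenumber * position * math.cos(math.radians(steer_deg))
        total += cmath.exp(1j * (arrival - steering))
    return abs(total) / num_mics


on_target = array_gain(source_deg=60, steer_deg=60)    # talker where we are looking
off_target = array_gain(source_deg=120, steer_deg=60)  # talker somewhere else
```

When the array is steered at the talker, the four microphone signals line up and add coherently, so `on_target` comes out at full gain. A talker at a different angle arrives out of phase across the microphones and is attenuated. An adaptive beamformer goes further by updating its steering as the talker or the noise moves.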

Earlier this year, XMOS acquired Setem Technologies, Inc. of Boston, MA, which develops massive Fourier transforms for blind-source signal separation. These blind-source separation algorithms mathematically decompose a set of mixed signals into their constituent source signals and then reconstruct those sources, either individually or as groups (Figure 2). In voice recognition this can be applied to an individual speaker, or even a conversation.

Figure 2. Setem Technologies, now a part of XMOS, develops blind-source separation algorithms that can be used to isolate a speaker or speakers in noisy environments.
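For a feel of what “blind” means here, consider a toy sketch in which two microphones each record a different blend of two sources and the code recovers the sources without being told the blend. This is not Setem’s algorithm, which isn’t described publicly in any detail I can point to. It is a bare-bones, ICA-style approach (whiten the recordings, then rotate them until they look as non-Gaussian as possible), and it assumes instantaneous mixing with no echoes or delays, which real rooms do not offer. All names and mixing weights are made up for illustration.

```python
import math


def center(xs):
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]


def std(xs):
    # Standard deviation of an already zero-mean sequence.
    return math.sqrt(sum(x * x for x in xs) / len(xs))


def whiten(x1, x2):
    """Decorrelate two zero-mean channels and scale each to unit variance."""
    n = len(x1)
    a = sum(v * v for v in x1) / n
    c = sum(v * v for v in x2) / n
    b = sum(u * v for u, v in zip(x1, x2)) / n
    theta = 0.5 * math.atan2(2 * b, a - c)  # principal-axis angle of the 2x2 covariance
    ct, st = math.cos(theta), math.sin(theta)
    p1 = [ct * u + st * v for u, v in zip(x1, x2)]
    p2 = [-st * u + ct * v for u, v in zip(x1, x2)]
    s1, s2 = std(p1), std(p2)
    return [v / s1 for v in p1], [v / s2 for v in p2]


def excess_kurtosis(ys):
    # Valid for zero-mean, unit-variance input, which whitening guarantees.
    return sum(y ** 4 for y in ys) / len(ys) - 3.0


def separate(x1, x2):
    """Try rotations in 1-degree steps; keep the most non-Gaussian pair."""
    z1, z2 = whiten(center(x1), center(x2))
    best = None
    for deg in range(90):
        c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
        y1 = [c * u + s * v for u, v in zip(z1, z2)]
        y2 = [-s * u + c * v for u, v in zip(z1, z2)]
        score = abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))
        if best is None or score > best[0]:
            best = (score, y1, y2)
    return best[1], best[2]


def correlation(xs, ys):
    xs, ys = center(xs), center(ys)
    return sum(x * y for x, y in zip(xs, ys)) / (len(xs) * std(xs) * std(ys))


# Two stand-in "speakers": a sine tone and a sawtooth (hypothetical test signals).
N = 2000
source_a = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
source_b = [2 * ((13 * t / N) % 1.0) - 1 for t in range(N)]

# Each microphone hears a different blend; the weights are arbitrary assumptions.
mic_1 = [0.6 * a + 0.4 * b for a, b in zip(source_a, source_b)]
mic_2 = [0.3 * a + 0.7 * b for a, b in zip(source_a, source_b)]

est_1, est_2 = separate(mic_1, mic_2)
```

Comparing `est_1` and `est_2` against `source_a` and `source_b` with `correlation` shows how closely each source was recovered. As with any blind method, the order and sign of the outputs are arbitrary, so the comparison has to allow for a swap or a flip.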

Now, in theory (and perhaps in practice), blind-source separation can be used to isolate the voice frequencies of multiple speakers in a room, and thereby establish a biometric identity for each. As you can imagine, the application of such technology could be widespread, and not just in the sense that Amazon wants to know what every member of your family wants for Christmas. Surveillance, for instance, immediately comes to mind.

This takes us back to the VocalFusion 4-Mic Dev Kit’s linear microphone array. While many platforms such as the Amazon Echo and Google Home use a circular array of omni-directional microphones to provide 360-degree room coverage, a linear array is designed for 180-degree arcs. This is of interest because leaders in the voice recognition space envision a future where the tower-based virtual assistants of today recede into everyday objects like TVs, refrigerators, sofas, walls – you name it.
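One commonly cited reason a linear array suits a 180-degree arc is geometric: microphones on a single line cannot tell a sound in front of the line from its mirror image behind it. The sketch below illustrates that with arrival-time differences. The spacing and the `arrival_delays` helper are assumptions for illustration, not figures from the XMOS kit.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def arrival_delays(angle_deg, num_mics=4, spacing_m=0.04):
    """Relative arrival times (seconds) of a far-field wavefront at each mic.

    Microphones are assumed to sit on the x axis; angle_deg is measured
    from that axis, positive in front of the array and negative behind it.
    """
    along_axis = math.cos(math.radians(angle_deg))
    return [-(m * spacing_m) * along_axis / SPEED_OF_SOUND for m in range(num_mics)]


front = arrival_delays(60)    # talker in front of the array
behind = arrival_delays(-60)  # mirror-image position behind the array
elsewhere = arrival_delays(120)

same_pattern = all(abs(f - b) < 1e-12 for f, b in zip(front, behind))
```

The `front` and `behind` delay patterns are identical, so the array hears both positions the same way, while `elsewhere` produces a different pattern. Mount the array against a wall, or inside a TV, and the ambiguous half of the room simply isn’t there.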

This future is designed to be ultra-convenient, delivering service by the syllable. But be careful. You probably won’t know who, or what, is listening.

 

About the Author

Brandon Lewis

Brandon is responsible for Embedded Computing Design’s IoT Design, Automotive Embedded Systems, Security by Design, and Industrial Embedded Systems brands, where he drives content strategy, positioning, and community engagement. He is also Embedded Computing Design’s IoT Insider columnist, and enjoys covering topics that range from development kits and tools to cyber security and technology business models. Brandon received a BA in English Literature from Arizona State University, where he graduated cum laude. He can be reached by email at blewis@opensystemsmedia.com.
