Voice assistant battles, part two: The strategic importance

August 3, 2018 Todd Mozer, Sensory

This is part two in a three-part series. Read part one here.

Here’s the basic motivation that I see in creating Voice Assistants: build a cross-platform user experience that makes it easy for consumers to interact with, control, and request things through their assistant. This will ease adoption and bring more power to consumers, who will use the products more and, in doing so, create more data for the cloud providers. This “data” will include all sorts of preferences, requests, searches, and purchases, and will allow the assistants to learn more and more about the users. The more the assistant knows about any given user, the BETTER the assistant can help the user in providing services such as entertainment and assisting with purchases (e.g. offering special deals on things the consumer might want). Let’s look at each of these in a little more detail:

1. Owning the cross-platform user experience and collecting user data to make a better Voice Assistant.

For thousands of years consumers interacted with products by touch. Squeezing, pressing, turning, and switching were the standard means of control. The dawn of electronics really didn’t change this; mechanical touch systems were simply augmented with electrical touch mechanisms. Devices got smarter and had more capabilities, but the means to access these capabilities got more confusing, with more complicated interfaces and a more difficult user experience. As new sensory technologies began to be deployed (such as gesture, voice, pressure sensors, etc.), companies like Apple emerged as consumer electronics leaders because of their ability to package consumer electronics in a more user-friendly manner. With the arrival of Siri on the iPhone and Alexa in the home, voice-first user experiences are driving the ease of use and naturalness of interacting with consumer products. Today we find companies like Google and Amazon investing heavily in their hardware businesses and using their Assistants as a means to improve and control the user experience.

Owning the user experience on a single device is not good enough. The goal of each of these voice assistants is to be your personal assistant across devices: on your phone, in your home, in your car, wherever you may go. This is why we see Alexa, Google, and Siri all battling for a position in, for example, automotive. Your assistant wants to be the place you turn for consistent help. In doing so it can learn more about your behaviors…where you go, what you buy, what you are interested in, who you talk to, and what your history is. This isn’t just scary big brother stuff. It’s quite practical. If you have multiple assistants for different things, they may each think of you and know you differently, and so each has a less complete picture. It’s really best for consumers to have one assistant that knows them best.

For example, let’s take the simple case of finding food when I’m hungry. I might say “I’m hungry.” The assistant’s response would be much more helpful the more it knows about me. Does it know I’m a vegetarian? Does it know where I’m located, or whether I am walking or driving? Maybe it knows I’m home and what’s in my refrigerator, and can suggest a recipe…does it know my food/taste preferences? How about cost preferences? Does it have the history of what I have eaten recently, and does it know how much variety I’d like? Maybe it should tell me something like “Your wife is at Whole Foods, would you like me to text her a request or call her for you?” It’s easy to see how a voice assistant could be quite helpful the more it knows about you. But with multiple assistants in different products and locations, the picture wouldn’t be as complete. In this example, an assistant might know I’m home but NOT know what’s in my fridge. Or it might know what’s in the fridge and know I’m home but NOT know my wife is currently shopping at Whole Foods, etc.

The more I use my assistant across more devices, in more situations, and over more time, the more data it can gather and the better it should get at servicing my needs and assisting me! It’s easy to see that once it knows me well and is helping me with this knowledge, it will get VERY sticky, and it will become difficult to get me to switch to a new assistant that doesn’t know me as well.

2. Entertainment and other service package sales.

Alexa came onto the scene in 2014 with one very special domain: Music. Amazon chose to do one thing really well, and that was to make a speaker that could accept voice commands for playing songs, albums, bands, and radio. Not long after that Alexa added new domains and moved into new platforms like Fire TV and the Fire stick controller. It’s no coincidence that an Amazon Music service and Amazon TV services both exist and you can wrap even more services into an Amazon Prime membership.

When Assistants don’t support Spotify well, there are a lot of complaints. And it’s no surprise that Spotify has been reported to be developing their own assistant and speaker. In fact Comcast has their own voice control remotes. There’s a very close tie between the voice assistants and the services that they bring. Apple is restrictive in what Siri will allow you to listen to. They want to keep you within their eco-system where they make more money. (Maybe it’s this locked-in eco-system that has given Apple a more relaxed schedule in improving Siri?) Amazon and Google are really not that different. Although they may have different means of leading us to the services they want us to use, they can still influence our choices for media.

Spotify has over 70M subscribers (20M paying), over 5 Billion in revenues and recently went public with about a $30B market cap…and Apple Music just overtook Spotify in terms of paying subscribers. Music streaming has turned the music industry into a growth business again. The market for video services is even bigger, and Amazon is one of the top content producers of video! Your assistant will have a lot of influence on the services you choose and how accessible they are. This is one reason why voice assistant providers might be willing to lose money in getting the assistants out to the market: they can make more money on services. The battle of Voice Assistants is really a battle of who controls your media and your purchases!

3. Selling and recommending products to consumers

The biggest business in the world is selling products. It’s helped make Amazon, Google and Apple the giants that they are today. Google makes its money on advertising, which is an indirect form of selling products. What if your assistant knew what you needed whenever you needed it? It would uproot the entire advertising industry. Amazon has the ability to pull this off. They have the world’s largest online store, they know our purchase histories, they have an awesome rating system that really works, and they have Alexa listening everywhere, willing to take our orders.

Because assistants use a voice interface, there will be a much more serial approach to making recommendations and selling me things. For example, if I do a text search on a device for nearby vegan restaurants, I see a map with a whole lot of choices and a long list of options. Typically these options include sidebars of advertising or “sponsored” restaurants first in the listing, but I’m still supplied a long list. If I do a voice search on a smart speaker with no display, it will be awkward to give me more than a few results…and I’ll bet the results we hear will become the “sponsored” restaurants and products.

It would be really obnoxious if Alexa or Siri or Cortana or Google Assistant suddenly suggested I buy something that I wasn’t interested in, but what if it knew what I needed? For example, it could track my vitamin usage and ask if I want more before they run out, or it could know how frequently I wear out my shoes and recommend a sale on my brand in my size when I really need them. The more my assistant knows me, the better it can “advertise” and sell to me in a way that’s NOT obnoxious but really helpful. And, of course, it makes extra money in the process!

Todd Mozer is the CEO of Sensory. He holds over a dozen patents in speech technology and has been involved in previous startups that reached IPO or were acquired by public companies. Todd holds an MBA from Stanford University, and has technical experience in machine learning, semiconductors, speech recognition, computer vision, and embedded software.
