“Credit to the team at Amazon for creating a lot of excitement in this space,” said Google CEO Sundar Pichai. He made this comment during his Google I/O speech last week when introducing Google’s new voice-controlled home speaker, Google Home, whose description sounds similar to that of Amazon’s Echo. Many interpreted this as a “thanks for getting it started, now we’ll take over” kind of comment.
Google has always been somewhat marketing-challenged in naming its voice assistant. Everyone knows Apple has Siri, Microsoft has Cortana, and Amazon has Alexa. But what is Google’s voice assistant called? Is it Google Voice, Google Now, OK Google, Voice Actions? Even those of us in the speech industry have found Google’s branding confusing. Maybe they’re clearing that up now by calling their assistant “Google Assistant.” Maybe that’s the Google way of admitting it’s an assistant without admitting they were wrong not to give it a human-sounding name.
The combination of the early announcement of Google Home and Google Assistant has caused some to comment that Amazon has BIG competition at best, and at worst, Amazon’s Alexa is in BIG trouble.
I thought I’d point out a few good reasons why Amazon is in pretty good shape:
- Google Home is not shipping. Google has a bit of a chicken-and-egg issue in that it needs to roll out a product that has industry support (for controlling third-party products by voice). How do you get industry partners without a product? You announce early! That was a smart move; now they just need to design it and ship it…not always an easy task.
- It’s about Voice Commerce. This is REALLY important. Many people think Google will own this home market because it has a better speech recognizer. Speech recognition capabilities are nice but not the end game. The value here is having a device that’s smart and trusted enough to take money out of our bank accounts and deliver us the goods and services we want when we want them. Amazon has a huge infrastructure lead here in products, reviews, shipping, and other key components of Internet commerce. Adding a convenient voice front end isn’t easy, but it’s also NOT the hardest part of enabling big-revenue voice commerce systems.
- Amazon has far-field working and devices that always “talk back.” I admit the speech recognition is important, and Google has a lot of data, experience, and technologists in machine learning, AI, and speech recognition. But most of the Google experience is through Android and mobile-phone hardware. Where Amazon has made a mark is in far-field, or longer-distance, recognition that really works, which is not easy to do. Speech recognition has always been about signal-to-noise ratios, and far-field makes the task more difficult: it requires acoustic echo cancellation, multiple microphones, plus various gain control and noise filtering/speech focusing approaches. Also, the Google recognizer was established around finding data through voice queries, with most of that data displayed on-screen (and often through search). The Google Home and Amazon Echo are no-screen devices. Having them intelligently talk back means more than just reading the text off a search. Google can handle this, of course, but it’s one more technical hurdle that needs to be cleared.
- Amazon has a head start and already is an industry standard. Amazon’s done a nice job with the Echo. Its follow-on products, Tap and Dot, were intelligent offshoots. Even its Fire TV took advantage of in-house voice capabilities. The Alexa Voice Service (AVS) works well and is already acting like a standard for voice control. Roughly three million Amazon devices have already sold, and I’d guess that in the next year, the number of Alexa-connected devices will double through both Amazon sales and third parties using AVS. This is not to mention the tens of millions of devices on the market that can be controlled by Echo or other Amazon hardware. Amazon is pretty well entrenched!
Of course, Amazon has its challenges as well, but I’ll leave that for another blog.
Todd Mozer is the CEO of Sensory. He holds over a dozen patents in speech technology and has been involved in previous startups that reached IPO or were acquired by public companies. Todd holds an MBA from Stanford University, and has technical experience in machine learning, semiconductors, speech recognition, computer vision, and embedded software.