Perceptual computing, as described in AMD’s white paper “Perceptual Computing: Inflection Point for Embedded Applications”, presents both possibilities and pitfalls for implementation within the industrial embedded sector.
Perceptual computing, whilst a relatively new term, is far from a new technology, and one engineers have struggled to truly realize for decades. It is easily defined as the ability to communicate with computers in an innately human fashion, and therein lies the difficulty: the human brain and computer processors are inherently disparate, despite prolonged attempts to more closely align the two. An example I remember fondly (and with frustration) from my youth is early voice recognition, which necessitated hours spent “training” your computer to understand your utterances. This invariably spawned an unintended robotic dialect in one’s speech, adopted to avoid the natural variation in how we pronounce words under differing emotional circumstances. Despite the days burned at the behest of this software, its accuracy in comprehending what I was attempting to convey was so laughable that its only real purpose became amusing demonstrations of its inaccuracy amongst my peers.
According to AMD’s latest white paper, times have changed, as has the technology. Two decades down the line, we mere mortals, long jaded by historical human-machine interaction attempts, are beginning to trust it. Popularized by Apple’s Siri and now increasingly embedded in game consoles and smart TVs, comprehension has improved to such a degree that voice control is becoming a genuinely convenient method to issue commands and input data. It’s critical that any change in how one interacts with technology offers a tangible benefit and isn’t just technology for technology’s sake – the much-discussed IoT toaster springs to mind for me!
Perceptual computing promises to journey well beyond vocal word recognition, encompassing secondary techniques of human communication, from the detection and interpretation of verbal tone, emotion, and verbiage to gestural cognizance – the often minute movements of limbs and facial expressions. If you consider that computers accurately understanding speech across varying languages, dialects, and pronunciations is already a highly complex task, this is a different ballgame altogether. And with those motivated to push this technology into marketing and point of sale, we can no longer rely on the involved human to truthfully provide that input if they’re not the one driving it. If you’re attempting to control your device you would never mislead it, but we’re all well practiced in telling that pushy salesman we’re “just browsing”, even when we’re not.
So how does this benefit industrial deployments of embedded computing technology? Perhaps the first question is “Will there even be a market?” Automation advances and Industrie 4.0 continue to attempt to drive humans out of the industrial chain altogether – though, to be honest, if the next step for humans in factories is to control machinery with perceptual and gestural inputs, that may not be a bad thing.
Consider AMD’s reference to an HMI example in which the machine attempts to read the operator’s emotional responses to the data he is reading and suggests policy changes as a result. In my opinion, the operator could be exhibiting such emotional responses for any number of reasons unrelated to that specific data, or could even be subconsciously exhibiting false characteristics for reasons unknown. The citation goes on to state that the machine merely suggests policy changes and requires operator confirmation, so the legal responsibility for such decisions remains with the human operator. So what is the benefit? Does this mean the operator needs less training, representing a cost saving? If so, he’s surely less equipped to confirm any course of action suggested by the HMI. If I were the individual legally responsible for my operator’s actions, I’d demand that all machine “suggestions” be based on raw data facts, not my operator’s reaction to them – and I’d demand no opportunity for that operator to blame the HMI for erroneously interpreting his emotional responses.
To me, this boils down to the claim that physiological data can prove a person’s genuine state regardless of their conscious or subconscious attempts to mask it. It would be exciting to be proved wrong, but I can’t help comparing such a claim to polygraph testing – there’s a reason polygraph results aren’t admissible in court, and there exist countless examples of their conclusions conflicting with forensic evidence directly to the contrary. On top of this, psychology has shown that the human brain can subconsciously act out scenarios that fall well beyond reality as if they were reality.
To conclude, I firmly believe that perceptual computing will deliver massive benefits: understanding demographics and emotional reactions at the point of sale to consumers is fantastic. The ramifications of misinterpretation in these “retail” applications are relatively minor; the opposite is true for any kind of industrial deployment.
Perhaps the technology will advance to such a degree that I will be proved wrong; perceptual analysis may become so indisputable that it can be used in a court of law, and should that occur it will herald exciting times across the board. Until then, let’s tread very carefully before introducing this into our factories.