Marketers want to read the minds of consumers, but they are not psychic. They do the next best thing: they marry psychology with technology in order to read consumers’ minds by analyzing their facial expressions.
Emotion recognition is a relatively recent technology, sitting at the intersection of AI and machine learning, the camera on your mobile device or desktop, and the software needed to make sense of your smile or frown. The software encodes what the camera sees, then matches the facial expression against a database of millions of examples. That matching correlates the user’s expression with known samples of a facially expressed emotion and scores it for measurement purposes.
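For readers who want to see the match-and-score idea in concrete terms, here is a minimal sketch in Python. It is not any vendor's method. It assumes the camera image has already been encoded as a short list of numbers, and it uses a simple nearest-neighbor lookup as a stand-in for whatever matching a real product performs. The sample data, the two-number encoding and the `score_expression` function are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical labeled examples: (encoded expression, emotion label).
# A real database would hold millions of samples with far richer encodings.
EXAMPLES = [
    ((0.9, 0.1), "joy"),
    ((0.8, 0.2), "joy"),
    ((0.7, 0.3), "joy"),
    ((0.1, 0.9), "anger"),
    ((0.2, 0.8), "anger"),
    ((0.5, 0.5), "surprise"),
]


def score_expression(features, examples=EXAMPLES, k=3):
    """Score an encoded expression as the share of its k closest
    labeled examples that carry each emotion label."""
    # Rank the known samples by Euclidean distance to the new expression.
    nearest = sorted(examples, key=lambda ex: math.dist(features, ex[0]))[:k]
    # Count how often each emotion appears among the k closest samples.
    counts = Counter(label for _, label in nearest)
    return {label: n / k for label, n in counts.items()}


if __name__ == "__main__":
    # An expression that sits close to the "joy" samples.
    print(score_expression((0.85, 0.15)))
```

An expression that lands between clusters gets a split score rather than a single label, which is one reason a score on its own says little without context.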
Marketers can then use this information to figure out how to make their product more engaging and appealing—at least that is how this is supposed to work.
Facial recognition and facial coding are conflated occasionally in the mainstream press, said Max Kalehoff, VP of marketing at Realeyes, the attention and emotion measurement vendor. “Facial recognition is software for the identification or verification of individuals, while facial coding is the detection of emotions through facial cues. Facial coding is not concerned with identifying individuals.”
The tool has a lot of potential, and many pitfalls.
A more precise label might be “facial coding for emotional recognition (FCER),” noted Seth Grimes, founder of Alta Plana, an IT analytical and consulting firm, and an expert in sentiment analysis. The technique tries to classify expressions and changes in expression, and to derive emotional recognition from facial recognition, he explained. But Grimes also flagged a notable caveat: emotional recognition also depends on context.
Grimes gave an example: a picture of the late basketball star Kobe Bryant smiling. “Someone who views this might be sad,” he said. Yet smiling is associated with happiness. The example shows that some associations can be problematic, he noted. Considering the association between facial expression and emotion can be insufficient if you do not also consider text and speech, he said.
“Facial emotional recognition is controversial right now,” Grimes continued. The work of psychologist Paul Ekman underlies the technology. Ekman once postulated that there were six universal emotions: fear, anger, joy, sadness, disgust and surprise. (Ekman himself is skeptical about its commercial application.)
Critics note that this premise may not be universal across all cultures. “It is feasible to attempt to use machine (learning) to model emotional expression,” Grimes said. The model can take into account the differences in emotional meaning from one culture to another. Not all smiles are alike. “You need to ensure the model is as free of bias as possible,” Grimes said. For example, “if you are selling to seniors, you don’t bother having kids in the model.” Who constructs the model is also a factor. “White male engineers may not be aware of issues in diversity.”
Still, one way to transcend the limits of a model is to broaden it by collecting more data. In the case of Realeyes, that meant recruiting psychologists and annotators to train the software to detect various human emotions, Kalehoff explained. “Our investment in Emotion and Attention AI now includes nearly 700 million AI labels, close to six million video measurement sessions, with 17 patents now awarded. Our historical archive now includes 30,000 video ads.”
The company’s approach takes into account that a smile is not the same in every culture, so recruiting psychologists and annotators in other countries helps deepen and broaden the database while mitigating bias. For example, “Emotions are more subtle in (some) cultures, so we must ensure annotations reflect those nuances,” Kalehoff said.
The whole purpose of Realeyes is to measure human attention and emotion as factors to improve digital video content. If an ad video does not work, marketers will know where to “tweak” the visuals to improve viewer attention.
Realeyes scores users across emotional states (happiness, surprise, confusion, contempt, disgust, empathy, fear) as well as engagement, negativity and valence. Facial recognition is not thrown off by beards or glasses.
“With our advertising measurement products, reading and interpreting emotional states began as a more qualitative diagnosis of individual video creatives, carried out by trained creative and media analysts,” Kalehoff recalled. “Over time, we’ve learned how to synthesize raw measurement data that can be automatically processed to predict real-world outcomes, like in-market attentiveness, video completion and even brand favorability.”
That’s not a smile, it’s a “Facial Action Unit”
Noldus Information Technology is one step removed from the customer. “We give them (the UX designers) the tools to understand the user’s mental state of mind and experience,” said company founder and trained biologist Lucas Noldus. UX designers can then apply that data towards building more effective online commerce sites. Noldus’ two relevant products are FaceReader, which can be used in a lab setting, and FaceReader Online, which can be used to test subjects anywhere.
The typical approach to analyzing facial expression is to “follow Ekman.” Researchers will check smiles and frowns against the “big six” emotions (fear, anger, joy, sadness, disgust, and surprise), Noldus explained. But those six basic expressions are “too coarse” and are insufficient to describe human expression, he said. A person uses some combination of the 43 muscles in their face to express their emotional state. Each single action is a “facial action unit,” which offers a more granular method of measuring emotional state, Noldus explained. This allows one to measure for factors like confusion, distraction, attention and boredom, all emotional states that would impact the online shopping experience.
“We humans have evolved facial expressions to communicate with each other,” Noldus said. When two people are together, smiles and frowns have meaning, while people will make different faces when dealing with inanimate objects. Here the AI must distinguish between a frown someone makes when interacting with a website, which marks concentration, and that same frown made when talking to another person, which could convey anger or disgust.
Here facial action units can offer more insight. “Confusion is important when designing interactive systems. We want the person to find what they are looking for,” Noldus said. As part of the customer journey, of course, “confusion is negative.”
Or take an online game. “You want people to be surprised or shocked, to be angry or to show fear,” Noldus said. That’s what makes the game interesting to play. But this would not be a good design feature for an online banking site aimed at senior citizens. “You want to convey trust, control and ease of use,” he said — so in this case, boring is good.
That frown is suspicious
As with any new technology, there is the potential for abuse cases as well as use cases, and ethical boundaries are still being defined. Data about emotional states can be misused. Grimes offered a hypothetical situation: What if an on-board camera mounted in a car dashboard reads “anger” in the face of a car driver stuck in traffic? Does his insurance rate go up?
Noldus draws a different boundary—he will not allow his company’s products to be used in public settings where users have not given their consent to being monitored. This differs from lab use, where subjects knowingly consent to being monitored as part of a project. Grimes quoted the old Google slogan as guidance: “Don’t be evil.”