Using emotion recognition models to find emotion

Last week we compared sentiment analysis to emotion recognition software, using VERN. We saw that sentiment analysis rated content as “positive,” “negative,” “neutral,” or “mixed.” Some software also attached a magnitude, indicating the strength of that rating. In some cases it was accurate. In others, it was not.

Emotion recognition looks for the emotional clues that people communicate with. VERN rated the same sentences and found indicators of Anger, Sadness, and Humor, each with a confidence level between 0 and 100%. It performed well in cases where there may be mixed emotional clues…which happens a lot in daily communication. Sentiment analysis, by contrast, had trouble identifying the polarity and magnitude in these cases.

Today we’re taking a look at what emotion recognition software like VERN does and does not do, so you can get a clearer picture of what to expect and get the results that our clients see firsthand. In this series we’ll be talking about the overall conceptual model, our methodology, some examples, and the limitations of the software. Today is all about the conceptual model.

Let’s start with what VERN is not.


VERN is not a neural network. We have not developed an ML model that will chew through extensive corpora and make inferences based on the prescribed data set. This would be antithetical to our conceptualization of the phenomenon of communication. Frankly, those that do implement this technique are likely going to encounter significant problems with external validity and bias.

VERN does not read users’ minds. It analyzes the words that users choose to use, and is engineered simply to mimic a finely tuned “ear” for the emotions present in most communication. If there is an emotion detected, it’s because the words the user chose carry clues to that emotion. It can be quite spooky at times, finding things you may not have realized were there…like finding sadness in angry statements or the combination of humor and anger giving clear indications of sarcasm. But it only uses what you give it.

VERN is not a noise analyzer. We do not analyze tonal shifts, grunts or inflections. We don’t analyze facial features or expressions. While those analyses may have their place, in the end it’s the words we use that carry the message. If the opposite were true, we’d be able to give driving directions using only grunts, whines and funny faces.


What VERN is: Back to the Future


VERN is “like” a classical AI that uses a Bayesian-type classification system with explicit tagging, aggregated by a fixed expert system. There is no standard set of training data. We did not use any predefined set of information, for fear of biasing the model. The way that the communication model works makes it a great deal easier to do incremental training with VERN and to avoid the problems with models derived from ML.

VERN is patented emotion detection software. It works by detecting emotions in human communication. We look for specific words and phrases, in different combinations and in specific usage patterns, in the words senders use to communicate to receivers. We use known attributes of the sender and receiver to discover and create personal frames. Personal frames are how we as individuals and groups relate to shared information. Communication requires a common understanding of meaning and usage, and some basic alignment of frames of reference. Otherwise, the signal is only noise.

The communication model of Sender and Receiver posits that communication consists of a Sender who shares information, through a medium, with a Receiver. The Sender cognitively encodes and prepares a message and sends it to the Receiver. There, the process happens in reverse: the Receiver decodes the message and interprets its meaning. Wherever there is misunderstanding in human communications, it’s because of noise.

What’s different is that VERN is a modification and improvement of the transactional model of communication, in that it includes a moderating variable on the transaction between the Sender and Receiver. There can only be what is inherent in the message, what is implied by the message, and how someone may interpret the message. VERN has patented this process. Identifying and categorizing these sub-dimensions is key to understanding what emotions really are, and how they are used.

Because science doesn’t really have an answer.

We’re still debating if emotions are really a thing.


Psychological theories have plenty of ideas. They suggest 5 emotions, 27 emotions…and some include even more exotic classifications of elements as “emotions.” Some are contradictory and some are most likely the same variables but just mislabeled. It’s a confusing mishmash of concepts and theories, and as such there’s no consensus.

VERN’s concept of emotion is a bit different.


We posit that there are three emotional states in humans, and perhaps in most creatures with brains. Our model works on the assumption that the emotional states are: Euphoria, or pleasant-feeling cognitive states that present with a neurological response; Dysphoria, or unpleasant-feeling cognitive states that also present with a neurological response; and, of course, Fear.

The emotions, archaic and modern, derive from a combination of the three. In use, emotions have been theorized as expressions of our uses and gratifications, and as such each universally recognizable emotional state contains such expressions. The first of these mental states that our team has identified, and that we’d all recognize as emotions, are relatively salient to most groups: Anger, Sadness, and Humor. (Yes, humor…believe it or not. It’s not purely subjective, for if it were, no one would understand it. As it turns out, its mechanics are fairly objective.)

Plus, incongruity.


Sharing of an incongruity is essential in communication. When we share information, we favor information that is ‘new’ or ‘different’ and therefore possibly incongruous with what our expectation of current and future outcomes would be. Why?

Think of it in terms of efficiency. The brain is a wonderful tool, honed by billions of years of evolution. But like any information system, it has limits to what can occupy its present state of focus. It can’t simply sense and respond to every input; the energy resources necessary for that likely would have prevented the evolution of certain regions of the brain. (Fish that can’t stop staring at the pretty lure don’t last long.)

So if our tribe was walking on the plains of the Serengeti, it wouldn’t make any sense for our brains to register everything in the environment. Just what is different. To illustrate, I’ll use an example: “Grass, grass, grass, grass, grass, grass, lion, grass, grass.”

The incongruity in the example just gave everyone who understood it a small dopamine boost. You’re welcome. It’s essential to reward the brain for efficiencies, and detecting and communicating incongruities has important effects on the phenomena we consider emotions.

VERN detects incongruities. We use a patented method of detecting the frequency, correlation, and collocation of words in relation to one another. In doing so, we detect when people communicate outside the expected use. This signal informs various weights and thresholds within the emotional model.

(This emotional detection tool will be released when ready).
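As a rough illustration of how word frequency and collocation can flag communication “outside the expected use,” here is a minimal Python sketch. It is not VERN’s patented method, which this post does not detail. It assumes a simple stand-in: count adjacent word pairs in a reference corpus, then score each pair in a new message by its smoothed surprisal, so that rarely seen pairs score high. The function names, the toy corpus, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter


def build_counts(corpus):
    """Count words and adjacent word pairs in a list of tokenized messages."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def surprisal(first, second, unigrams, bigrams):
    """Bits of surprise at seeing `second` right after `first`.

    Uses add-one smoothing, with one extra vocabulary slot reserved
    for words never seen in the reference corpus (an assumption).
    """
    vocab_size = len(unigrams) + 1
    probability = (bigrams[(first, second)] + 1) / (unigrams[first] + vocab_size)
    return -math.log2(probability)


def most_incongruous_pair(tokens, unigrams, bigrams):
    """Return the adjacent pair in `tokens` with the highest surprisal."""
    pairs = zip(tokens, tokens[1:])
    return max(pairs, key=lambda pair: surprisal(pair[0], pair[1], unigrams, bigrams))


# Hypothetical reference corpus: what a walk on the plains usually looks like.
reference = [
    ["grass", "grass", "grass", "tree", "grass"],
    ["grass", "tree", "grass", "grass", "grass"],
]
unigrams, bigrams = build_counts(reference)

observed = ["grass", "grass", "lion", "grass"]
flagged = most_incongruous_pair(observed, unigrams, bigrams)
print(flagged)
```

In this toy run the flagged pair is the one where “lion” follows “grass.” A fuller system would presumably feed scores like these into weights and thresholds rather than simply picking the maximum.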


Everything is personal


People express emotions to one another through a code that we all know, called language. We impart our emotional state into the code with the intent of communicating to satisfy our needs and wants, whether the medium be broadcast television or the thoughts in your own head. Whether or not the intent of the sender can truly be determined is a moot point, as the sender-receiver model is a two-sided equation. A receiver “taking” anything other than the words that were sent at face value is the result of a different frame of reference, and therefore of noise introduced into the model. The receiver hears different clues based on the “way they see things.”

That “way” is the personal frame.

Think of it this way: the personal frame is your relationship to the information stored in your memory. It’s the culmination of your knowledge, experiences, and learning throughout life. As such, it is unique. However, as a member of a community (even one as broad as a genus or species), you share common knowledge resources. What fingers do, for example. Or how a joke about a political rival might be taken. There is almost always a shared frame.

These personal frames are shared amongst people of the same or similar groups. VERN identifies personal frames through any available classifying information. It can be simply age. Or sex. Or both. Or, with more demographic information, patterns emerge and through usage show us how things are interpreted by the receivers. Here is where the machine learning system of VERN enters the process, refining the initial analysis into the final one. We use supervised and semi-supervised machine learning techniques based on a concept we call ‘floating regressions,’ or automated logistic regressions of detected factors, to determine statistical significance in relation to human coder agreement. This method determines when individuals align by agreement, and it has proven very effective in our own research. It has even been effective in identifying humor.
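To make the idea of regressing detected factors against human coder agreement a little more concrete, here is a minimal Python sketch. It is not the ‘floating regressions’ technique itself, which this post names but does not specify. It is an ordinary logistic regression fitted by gradient descent, and it does not compute statistical significance. The two factors (an incongruity score and an anger-word flag), the toy data, and every function name are hypothetical.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(factors, weights, bias):
    """Estimated probability that human coders would agree on the label."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, factors)))


def fit_logistic(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for factors, label in zip(rows, labels):
            error = predict(factors, weights, bias) - label
            for j, value in enumerate(factors):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical data: [incongruity_score, anger_word_flag] per message,
# labeled 1 when human coders agreed the message was humorous.
rows = [
    [0.9, 1], [0.8, 0], [0.7, 1], [0.6, 0],
    [0.3, 1], [0.2, 0], [0.1, 1], [0.4, 0],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(rows, labels)
print(predict([0.9, 0], weights, bias), predict([0.1, 0], weights, bias))
```

In this toy data the incongruity score separates the two groups and the anger flag does not, so the fitted model leans on the first factor. Deciding which factors are statistically significant would need a proper statistics package and real coder data.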


And humor is no laughing matter.


One of the most interesting phenomena in communication is humor. It’s omnipresent, yet it barely gets the attention it deserves. Plato said it was beneath us all to study humor. So most people don’t take it seriously. It’s not just the general public that isn’t serious about humor. Psychologists to this day still insist that humor needs a physiological response to call it humor (it doesn’t). In fact, even some of the most brilliant communication scholars of our time still conflate humor and funny.

Humor, as it turns out, is the communication of a benign incongruity. Funny is the degree to which that stimulus is appreciated, and it is not necessary for humor. A political joke is a joke, whether or not a Democrat and a Republican can agree that it was funny.

Since incongruities are the basis of humor, and the sharing of them elicits a neurochemical reward, shouldn’t science have paid a bit more attention to humor? Yeah, most likely. Silly scientists. But good for the rest of us.

In fact, as a communication phenomenon, humor spans the gamut of human emotions. We’ve found that humor is present in expressions of anger and sadness, both of which we’ve released as detectors. And we’ve found it in joy, love & affection. What is humor? It’s the detection of a benign incongruity. The appreciation for that detection is mirth.

Humor, as it turns out…is the clown car that all the emotions pile into. So buckle up, buckaroos, because we’re going to find out exactly what’s going on. (You can join us and help us discover emotions.)

OK, so what’s the methodology?

We’ll get into that in our next blog post, coming soon. Stay tuned!

