New AI facial expression recognition technology detects subtle changes in facial expression with a high degree of accuracy, giving machines the ability to correctly judge humans’ emotions.

The new technology was developed by Fujitsu Laboratories and Fujitsu Laboratories of America in collaboration with Carnegie Mellon University School of Computer Science.

One of the obstacles for facial expression recognition technology is the difficulty in providing large amounts of data required to train detection models for each facial pose, because faces are usually captured with a wide variety of poses in real-world applications.

To address the problem, Fujitsu has developed a technology that applies a different normalisation process to each facial image.

For example, when the angle of the subject’s face is oblique, the technology can adjust the image to more closely resemble the frontal image of the face, allowing the detection model to be trained with a relatively small amount of data.

The technology can accurately detect subtle emotional changes, including uncomfortable or nervous laughter and confusion, even when the subject’s face is moving in a real-world context.

Fujitsu anticipates that the new technology will find use in a variety of real-world applications, including facilitating communication to improve employee engagement and improving workplace safety for drivers and factory workers.


In recent years, technologies that detect changes in facial expression from images and read human emotions have been increasingly attracting interest.

Existing technologies have mainly been developed for detecting clear changes in facial expression (e.g. the corners of the mouth and the corners of the eyes moving widely).

These technologies have been used in some practical applications, including automatic extraction of highlight scenes in videos and enhancing robots’ reactions.

In the future, facial expression recognition technologies will be more widely utilised in a variety of situations, including patient monitoring in healthcare and analysis of customers’ responses to products in marketing campaigns.

The issues

In order to “read” human emotions more effectively, it’s critical to capture the subtle facial changes associated with emotions like understanding, bewilderment, and stress.

To accomplish this, developers have increasingly relied on action units (AUs), which express the “units” of movement corresponding to each muscle of the face based on an anatomically based classification system.

AUs are used by professionals in fields as varied as psychological research and animation. They are classified into approximately 30 types based on the movements of each facial muscle, including those that move the eyebrows and cheeks.

By integrating these AUs into its technology, Fujitsu has pioneered a new approach to detect even subtle changes in facial expression.

To detect AUs with greater accuracy, the underlying deep learning techniques require large amounts of data. However, in real-world situations, cameras usually capture faces at various angles, sizes, and positions, making it difficult to prepare large-scale training data corresponding to each visual/spatial state. As a result, detection accuracy suffers on images captured under these real-world conditions.

New technologies

In collaboration with the Carnegie Mellon University School of Computer Science, Fujitsu Laboratories and Fujitsu Laboratories of America have developed an AI facial expression recognition technology that can detect AUs with high accuracy even with limited training data.

* Normalisation process that adjusts the face to more closely resemble the frontal image – With this technology, images of the face taken at various angles, sizes, and positions are rotated, enlarged or reduced, and otherwise adjusted so that the image more closely resembles the frontal image of the face. This makes it possible to detect AUs with a small amount of training data based on the frontal view of the subject’s face.

* Analysing the regions that significantly affect detection for each AU – In the normalisation process, multiple feature points of the face in the image are transformed so that they approach the positions of the corresponding feature points in the frontal image. However, the amount of rotation, enlargement/reduction, and adjustment changes depending on which feature points of the face are selected. For example, if feature points around the eyes are selected and the rotation is performed on that basis, the area around the eyes will closely match the reference image, but other parts such as the mouth will be out of alignment.

To tackle this issue, the areas of the captured face image that have a significant influence on AU detection are analysed, and the degree of rotation, enlargement, and reduction is adjusted accordingly. By using a different normalisation process for each individual AU, the developed technology can detect AUs with greater accuracy.
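The idea of normalising differently for each AU can be illustrated with a small sketch. This is not Fujitsu's implementation, which the article does not describe in detail. It assumes a standard weighted least-squares similarity transform (rotation, uniform scaling, and translation) fitted between 2D feature points, and it uses NumPy. The six landmark coordinates, the per-AU weights, and the function names are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical frontal reference landmarks: four eye corners, two mouth corners.
FRONTAL = np.array([[30, 40], [45, 40], [55, 40], [70, 40],
                    [38, 75], [62, 75]], dtype=float)

# Hypothetical per-AU weights: how strongly each landmark drives the alignment.
AU_WEIGHTS = {
    "AU1_inner_brow_raiser": np.array([1, 1, 1, 1, 0.01, 0.01]),
    "AU12_lip_corner_puller": np.array([0.01, 0.01, 0.01, 0.01, 1, 1]),
}

def weighted_similarity_transform(src, dst, weights):
    """Fit scale, rotation and translation mapping src onto dst,
    minimising the weighted squared distance between point pairs."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mu_src, mu_dst = w @ src, w @ dst          # weighted centroids
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = (dst_c * w[:, None]).T @ src_c       # 2x2 weighted cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(2)
    if np.linalg.det(U @ Vt) < 0:              # rule out reflections
        S[1, 1] = -1.0
    R = U @ S @ Vt
    var_src = np.sum(w * np.sum(src_c ** 2, axis=1))
    scale = np.sum(D * np.diag(S)) / var_src
    t = mu_dst - scale * (R @ mu_src)
    return scale, R, t

def apply_transform(points, scale, R, t):
    """Apply y = scale * R @ x + t to each row of points."""
    return scale * np.asarray(points, dtype=float) @ R.T + t

# Simulate a captured face: rotated, enlarged and shifted, with the mouth
# displaced so that no single transform can align every region at once.
theta = np.deg2rad(20.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
captured = apply_transform(FRONTAL, 1.3, rot, np.array([12.0, -7.0]))
captured[4:] += np.array([8.0, 5.0])

for au, weights in AU_WEIGHTS.items():
    s, R, t = weighted_similarity_transform(captured, FRONTAL, weights)
    residual = np.linalg.norm(apply_transform(captured, s, R, t) - FRONTAL, axis=1)
    print(au, "eye error:", residual[:4].mean(), "mouth error:", residual[4:].mean())
```

In this toy setup, weighting the eye landmarks aligns the eye region closely and leaves the mouth misaligned, while weighting the mouth landmarks does the opposite. That is the trade-off the text describes, and it is why a separate normalisation per AU can help.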

The technology has achieved a high detection accuracy rate of 81% even with limited training data. It is also more accurate than other existing technologies according to certain facial expression recognition technology benchmarks.