
AI cannot feel emotions, but it can recognize them in an image

A study concludes that modern systems can learn sophisticated representations of emotion concepts, despite not being trained to do so

Manuel G. Pascual

Machines cannot feel or empathize with people. But large language models — particularly multimodal systems (those capable of processing data in multiple formats, such as text and images) — behave as if they understand emotions. That is the conclusion of a study published in the journal Royal Society Open Science, which found that when these models are asked to respond as a human would, they rate the emotions depicted in images very similarly to the two hundred volunteers who participated in the experiment.

While large language models (LLMs), like ChatGPT, are trained on massive amounts of text, the databases used to build multimodal systems consist of billions of images paired with plausible textual descriptions. “The resulting system is a complex probabilistic model of how words and phrases correlate with image pixels, which can answer nontrivial questions about the content of visual scenes,” explains the study.

Can these systems perceive and judge the emotional content of images? Investigating this, the researchers argue, would help determine whether the responses of these models to affective situations “are aligned with our normative set of values and, thus, mitigate risks associated with biased or inappropriate responses.”

After a series of experiments, the researchers conclude that “the AI ratings are highly correlated with the average ratings provided by humans.” This is notable, since this was not the case with AI systems that did not use LLMs. According to the study, the results suggest that “modern AI systems can learn sophisticated representations of emotion concepts via natural language, without being explicitly trained to do so.”

Experiment with machines and people

The researchers tested three of today’s most advanced multimodal systems: ChatGPT-4o (from OpenAI), Gemini Pro (from Google), and Claude Sonnet (from Anthropic). The models were shown a large number of images and given a prompt, or instruction, asking them to “pretend to be a human subject participating in a psychological experiment.” They were then asked to rate the images on a scale from 1 to 9 based on how negative or positive the scene was (valence), whether it provoked a sense of relaxation or alertness (arousal), and whether it made them want to avoid or approach the scene (motivational direction). The models were also asked to rate the extent to which the image provoked happiness, anger, fear, sadness, disgust, or surprise.
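To make the setup concrete, below is a minimal sketch of how such a prompt could be issued programmatically, using the OpenAI Python SDK and GPT-4o as one example. The prompt wording, the placeholder image URL, and the expected answer format are illustrative assumptions rather than the authors' actual protocol.

```python
# Minimal sketch: asking a multimodal model to rate an image like a human subject.
# Prompt text, image URL and output parsing are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

PROMPT = (
    "Pretend to be a human subject participating in a psychological experiment. "
    "Rate the following image on a scale from 1 to 9 for: "
    "valence (1 = very negative, 9 = very positive), "
    "arousal (1 = very relaxing, 9 = very alerting), "
    "and motivational direction (1 = strong urge to avoid, 9 = strong urge to approach). "
    "Answer with three numbers separated by commas."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": PROMPT},
            # Placeholder URL: any publicly reachable image could be rated this way.
            {"type": "image_url", "image_url": {"url": "https://example.com/scene.jpg"}},
        ],
    }],
)

print(response.choices[0].message.content)  # e.g. "7, 4, 6"
```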

These ratings were compared to those given by a sample of 204 people, who assessed the emotional charge of 362 photos using the same criteria. The images were taken from the NAPS database, which contains 1,356 photos in different categories (animals, landscapes, objects, people, and faces) and includes positive, unpleasant, or neutral content.

The results from the machines and humans were very similar. According to the study, “GPT responses correlate particularly well with those of humans,” with coefficients between 0.77 and 0.90, where 1 indicates perfect correlation. Claude also performed very well (0.63-0.90), “although this model often refuses to respond due to safety constraints” (it declined to answer 6% of the questions). “Gemini exhibits slightly lower, but still remarkable matches to human responses” (0.55-0.86).
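In practical terms, the comparison the study reports is a Pearson correlation between each model’s ratings and the average human rating for the same images. The following toy snippet, with made-up placeholder numbers rather than data from the paper, shows what that computation looks like.

```python
# Toy illustration: correlating a model's ratings with mean human ratings per image.
# The numbers below are invented placeholders, not results from the study.
import numpy as np
from scipy.stats import pearsonr

human_mean = np.array([7.1, 2.3, 5.0, 8.2, 3.4, 6.7])  # mean human rating per image (1-9)
model      = np.array([6.8, 2.9, 5.5, 7.9, 3.1, 6.2])  # model rating for the same images (1-9)

r, p = pearsonr(human_mean, model)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")  # r near 1 means the model tracks human judgments closely
```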

How is it possible for multimodal systems to match human judgment so closely? “The most plausible explanation has to do with the training data,” explains Alberto Testolin, a researcher at the departments of General Psychology and Mathematics at the University of Padua and co-author of the article. “We tend to think that image-text pairs contain purely visual semantic information, such as ‘image of a field of sunflowers.’ Our research suggests that textual descriptions are much richer, allowing us to infer the emotional status of the person who wrote the entry.”

The fact that an LLM can mimic responses to questions about subjective human judgments is striking, though it had already been documented.

“If the machine has access to data extracted from texts about typical reactions to certain stimuli — even if not exactly the same as those used by the researchers — it is entirely possible, even if the process is completely opaque, that the model can mimic judgments,” notes psychology professor José Miguel Fernández Dols of the Autonomous University of Madrid, who did not participate in the study. “It could process adverbials, adjectives, or verbs associated with the description of the type of image it is processing.”

A controversial topic

The authors underscore a key point: “The fact that AI systems can emulate average human ratings does not imply that they possess the ability to think or feel like humans.” In fact, they continue, people can have very different affective reactions to the same stimulus. “In several cases, the AI responses are not aligned with the way humans would confront emotional situations, suggesting that ‘reading about emotions’ is qualitatively different from having direct emotional experiences.”

The perception and interpretation of emotions is a controversial area in AI. While some companies market facial recognition systems that claim to detect a person’s emotions, scientific literature disputes the existence of universal physiological responses to emotional states, emphasizing instead that emotions are largely shaped by culture. In fact, Testolin and his colleague Zaira Romeo call on the scientific community to further investigate “the large cultural differences in emotion elicitation, regulation and social sharing.”

“These kinds of achievements show that psychology has relied too heavily on verbal reports, which are highly dependent on everyday language,” Fernández Dols observes. “And they provide us with an interesting topic for reflection: everyday language is a logical construct that can be perfectly coherent, persuasive, informative, and even emotional without any brain speaking.”
