Computer scientists at the University of Oulu are hard at work developing a new type of robot capable of recognising and responding to people’s facial expressions, gestures, and speech – to create a system able to handle anything from home help to acting as a museum guide.
We have already seen the arrival of things like automatic lawn mowers and vacuum cleaners, but the aim of the team at the University of Oulu is much more ambitious: to take affective human-robot interaction a major step forward. Their goal is to develop a robot that is capable of interpreting people’s spontaneous expressions, gestures, and speech to register basic human emotions such as sadness, fear, anger, excitement, and amazement – and respond accordingly.
This is no small challenge – even people often find it difficult to tell what mood those around them are in – but the project is drawing on some of the latest advances in areas such as speech animation, navigation, and machine vision, an area in which the University of Oulu is particularly strong through its work on techniques such as Local Binary Patterns.
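Local Binary Patterns describe image texture by comparing each pixel with its eight neighbours and encoding the results as an 8-bit code; histograms of these codes are widely used for face and expression analysis. As a rough illustration only (the article gives no implementation details), a minimal version of the basic 3×3 operator might look like this:

```python
import numpy as np

def lbp_codes(image):
    """Compute basic 3x3 Local Binary Pattern codes for a grayscale image.

    Each interior pixel is compared with its 8 neighbours; a neighbour
    at least as bright as the centre contributes one bit, giving an
    8-bit code (0-255) that characterises the local texture.
    """
    # Offsets of the 8 neighbours, ordered clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = image.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    centre = image[1:h-1, 1:w-1]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = image[1+dy:h-1+dy, 1+dx:w-1+dx]
        codes |= (neighbour >= centre).astype(np.uint8) << bit
    return codes

# In practice a texture descriptor is the histogram of these codes
# over an image region, not the raw code map itself.
img = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]], dtype=np.uint8)
print(lbp_codes(img))  # a single code for the one interior pixel
```

Real systems use rotation-invariant and multi-scale variants of this operator, but the core idea is the simple neighbour comparison shown above.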
The aim of the project at the University of Oulu is to achieve a significantly higher level of human-robot interaction than has been achieved before. Photo: Jukka Kontinen.
The aim is to achieve a level of interaction that resembles human interaction as closely as possible, opening up the potential for new types of robots that could be used in everything from care for the elderly to security and logistics, not to mention market research, for observing how people react to different types of advertisements or TV programmes.
Given the scale of the challenge, even the most optimistic members of the team expect their work to fully enter the commercial or consumer marketplace in ten years' time at the earliest, although some of their research has immediate commercial potential.
Work is currently focused on coordinating the prototype robot's observational (visual and aural) capabilities with its navigational ones to provide intelligent mobility, and on matching this with the ability to recognise the volume, direction, and content of what people say, and to synthesise speech and corresponding lip movement in response.
The system is being designed to recognise different people and remember previous interactions with them, and to compare the new data it receives against preprogrammed emotional and interaction models to generate appropriate sound and movement. The data in the robot's internal database is supplemented on the fly with additional information from the Internet to fine-tune its responses.
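The article does not say how the comparison against preprogrammed models works, but one common approach is to reduce an observation (facial expression, gesture, voice features) to a feature vector and pick the closest reference model. The sketch below is purely illustrative: the emotion names, feature values, and distance measure are assumptions, not details from the Oulu project.

```python
import math

# Hypothetical "preprogrammed models": one reference feature vector
# per basic emotion. The three numbers are made-up illustrative
# features (e.g. facial activity, voice energy, speech tempo).
EMOTION_MODELS = {
    "sadness":    (0.1, 0.2, 0.8),
    "anger":      (0.9, 0.7, 0.1),
    "excitement": (0.8, 0.9, 0.3),
}

def closest_emotion(observation):
    """Return the emotion whose model vector is nearest (Euclidean)."""
    def distance(name):
        return math.dist(observation, EMOTION_MODELS[name])
    return min(EMOTION_MODELS, key=distance)

# An observation close to the "excitement" reference vector.
print(closest_emotion((0.8, 0.9, 0.25)))  # prints "excitement"
```

A deployed system would of course use learned classifiers over far richer features, and could update or extend its model set on the fly, as the article suggests the robot does with data from the Internet.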