“A group of researchers – including academics from the University of Cambridge, University College London and Beihang University in Beijing – have used textile strain sensors to measure the movement of throat muscles, via vibrations, as well as the pulse in the carotid arteries, to shed light on whether a user’s emotional state is neutral, relieved or frustrated.
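
The excerpt does not say how the raw sensor signals are mapped to those three emotional states. What follows is a minimal sketch of one plausible approach, assuming windowed signals and a generic feature-based classifier; the feature set, sampling rate and model choice are all assumptions, not the paper's method:

```python
# Hypothetical sketch: classifying emotional state from windowed
# strain-sensor signals. The feature choices and the classifier
# are assumptions; the paper's actual pipeline is not described here.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

EMOTIONS = ["neutral", "relieved", "frustrated"]

def window_features(signal: np.ndarray, fs: float = 100.0) -> np.ndarray:
    """Summarise one window of sensor data: amplitude statistics for
    throat-muscle vibration, plus the dominant frequency as a crude
    stand-in for carotid pulse rate."""
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    dominant_hz = freqs[np.argmax(spectrum)]
    return np.array([signal.std(), np.ptp(signal), dominant_hz])

def train_emotion_model(X: np.ndarray, y: np.ndarray) -> RandomForestClassifier:
    """Fit a classifier on labelled windows (X: n_windows x n_samples,
    y: indices into EMOTIONS)."""
    feats = np.vstack([window_features(w) for w in X])
    return RandomForestClassifier(n_estimators=100).fit(feats, y)
```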

That data is then fed into two large language models, each based on GPT-4o-mini, the model behind some instances of ChatGPT. The first, known as the token synthesis agent (TSA), aims to tease out the intended words mouthed by the user and group them into sentences.
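
The excerpt does not give the TSA's prompt or interface, but since the agent is based on GPT-4o-mini, the stage could in principle be a single chat completion call. A minimal sketch using the OpenAI Python client, with an assumed prompt and an assumed list of decoded word candidates:

```python
# Hypothetical sketch of the token synthesis agent (TSA): decoded word
# candidates from the sensor pipeline are passed to GPT-4o-mini, which
# assembles them into a sentence. The prompt wording and the upstream
# decoding step are assumptions, not the paper's design.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def synthesise_sentence(word_candidates: list[str]) -> str:
    prompt = (
        "These word fragments were silently mouthed by a user with "
        "dysarthria, in order: "
        + ", ".join(word_candidates)
        + ". Reconstruct the most likely intended sentence."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```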

The second, the sentence expansion agent, takes sentences from the TSA and uses contextual information such as the time and weather, as well as the user’s emotional data, to expand them into what the researchers describe as ‘logically coherent, personalised expressions that better capture the patient’s true intent,’ compared with sentences generated without those contextual and emotional clues.
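
Again, the exact inputs and prompt are not described in the excerpt. A sketch of the expansion step, reusing the client from the TSA sketch above and treating the context fields (emotion, time, weather) as plain strings, which is an assumption:

```python
# Hypothetical sketch of the sentence expansion agent: it takes the
# TSA's sentence plus contextual and emotional signals and asks
# GPT-4o-mini to expand it. Field names and prompt are assumptions.
def expand_sentence(sentence: str, emotion: str,
                    time_of_day: str, weather: str) -> str:
    prompt = (
        f"Base sentence: '{sentence}'. The speaker currently feels "
        f"{emotion}; it is {time_of_day} and the weather is {weather}. "
        "Expand this into a logically coherent, personalised expression "
        "that captures the speaker's likely intent."
    )
    response = client.chat.completions.create(  # client from TSA sketch
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```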

The researchers declined to speak to New Scientist but claim in their paper that in tests with five people who have dysarthria as a result of a stroke, their system achieved sentence error rates as low as 2.9 per cent. They also found that expanding sentences with emotional and contextual clues increased user satisfaction by 55 per cent compared with straightforward sentence reconstruction.”

From New Scientist.