The Harmony project is running at Ulster University in collaboration with University College London, the Universidade Federal de Santa Maria and Fast Data Science.
Harmony is a software tool which helps researchers in psychology and the social sciences to harmonise questionnaire items. Harmony uses natural language processing to identify similar semantic content in questionnaires used by psychologists.
Why do researchers need Harmony?
Researchers in mental health can use Harmony to compare questionnaire data across studies. If you want to find the best match for a set of items, or if your questionnaires are written in different languages, Harmony can help you find matches.
Researchers across the globe can benefit from Harmony, as it will help facilitate global mental health research and collaborations. Through Harmony, researchers can draw on multiple studies to answer pressing research questions around mental health and its underlying causes. Being able to harmonise measures across studies is the first step to combining different data resources, which is needed to advance current mental health research.
Who was involved
- Dr. Eoin McElroy, Lecturer in Psychology at Ulster University, Northern Ireland
- Dr. Bettina Moltrecht, Research Fellow in Population Health and Quantitative Social Science at University College London
- Prof. George Ploubidis, Professor of Population Health and Statistics at the Social Research Institute at University College London
- Dr. Mauricio Scopel Hoffmann, Associate Professor in the Department of Neuropsychiatry at Universidade Federal de Santa Maria, Brazil
- Thomas Wood, Data scientist and natural language processing expert at Fast Data Science
Harmony is funded by the Wellcome Mental Health Data Prize
The Wellcome Data Prizes are targeted at multidisciplinary teams who are using existing data to answer important research questions around mental health and the active ingredients of existing interventions. The prizes are focused on health challenges in four areas: climate and health, infectious disease, mental health and discovery research.
What is natural language processing?
Natural language processing, or NLP, is the branch of AI concerned with giving computers the ability to understand text and spoken words in much the same way human beings can. In recent years, natural language processing has brought changes to many industries and fields of study where it is the norm for humans to analyse large volumes of unstructured text data, including insurance, finance, law, pharmaceuticals, and psychology. Aside from the obvious speed advantage, natural language processing algorithms have the advantage over human reviewers in that they are consistent, reproducible, explainable, and can quantify concepts such as text similarity, as opposed to a human’s qualitative assessment.
Recent advances in neural networks have led to the development of word embeddings and sentence embeddings: a neural network converts each text to a vector, and texts which are semantically similar end up closer together in the vector space. GPT-3 is an example of a transformer neural network which can be used to produce such embeddings.
The Harmony team had the idea of representing each questionnaire item as a vector on the surface of a multi-dimensional sphere. Items which are semantically similar lie close together and have a cosine similarity close to 1, whereas items which are completely different tend to have a similarity close to 0. The team uses the deep learning transformer model Sentence-BERT to convert texts in different languages into their vector representations.
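To make the idea of cosine similarity concrete, here is a minimal Python sketch. It is not Harmony's actual code. The three-dimensional vectors are made-up toy values standing in for real sentence embeddings, which would come from a model such as Sentence-BERT and have hundreds of dimensions. The function names (`normalise`, `cosine_similarity`, `best_matches`) are hypothetical and chosen only for illustration.

```python
import math


def normalise(vector):
    """Scale a vector to unit length, i.e. place it on the surface of the unit sphere."""
    length = math.sqrt(sum(x * x for x in vector))
    if length == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / length for x in vector]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: the dot product of their unit vectors."""
    return sum(x * y for x, y in zip(normalise(a), normalise(b)))


def best_matches(items_a, items_b):
    """For each vector in items_a, return (index of the closest vector in items_b, similarity)."""
    matches = []
    for a in items_a:
        scores = [cosine_similarity(a, b) for b in items_b]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append((best, scores[best]))
    return matches


# Made-up toy vectors (assumption): each inner list stands in for the
# embedding of one questionnaire item. Real embeddings would be produced
# by a sentence embedding model, not written by hand.
questionnaire_a = [[0.9, 0.1, 0.2], [0.1, 0.9, 0.3]]
questionnaire_b = [[0.2, 0.8, 0.3], [0.8, 0.2, 0.1]]

for i, (j, score) in enumerate(best_matches(questionnaire_a, questionnaire_b)):
    print(f"item A{i} best matches item B{j} (cosine similarity {score:.2f})")
```

Because every vector is scaled to unit length before comparison, only the direction of a vector matters, which is why the items can be pictured as points on the surface of a sphere.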