Researchers have developed a machine-learning algorithm similar to those used by Facebook and Netflix that can decode the molecular language of disease and potentially revolutionize the world of medicine.
Recommendations on social media and online entertainment platforms are derived from powerful machine-learning algorithms that monitor behavior patterns to suggest potential friends or connections, or the next series or film to watch on platforms such as Netflix. Predictive text on a smartphone also makes use of deep language learning to anticipate which words a user is likely to need next as they write a sentence.
If similar machine-learning algorithms can be trained to produce massive language models based on protein interactions within the human body, the results could prove to be revolutionary for the field of medicine, and may unlock the secret to defeating some of humanity’s most intractable and devastating diseases.
Researchers at St. John’s College, University of Cambridge, fed decades of medical research into a computer language model they say has now reached the same conclusion as human scientists did about the molecular roots of disease in the human body – specifically the consequences of (mis)behavior among proteins – but in a fraction of the time.
In other words, the machine-learning algorithm can now ‘predict’ the biological language of cancers and neurodegenerative diseases such as Alzheimer’s, and may soon allow medical experts to “correct the grammatical mistakes inside cells that cause disease.”
That’s according to Professor Tuomas Knowles, lead author of the paper and a Fellow of St. John’s College, who described the breakthrough as “an absolute game-changer” that could soon lead to the development of “targeted drugs to dramatically ease symptoms or to prevent dementia happening at all.”
The researchers were particularly interested in the language of shapeshifting biomolecular condensates – unruly clumps of proteins not dissimilar to the wax in lava lamps, which can disrupt normal biological functions with devastating consequences.
“The human body is home to thousands and thousands of proteins, and scientists don’t yet know the function of many of them. We asked a neural-network-based language model to learn the language of proteins,” said Dr. Kadi Liis Saar, the first author of the paper.
The machine-learning algorithm effectively acts as a kind of biological codebreaker that unlocks the enigma of molecular function – or malfunction – responsible for cancers and neurodegenerative conditions.
Proteins fulfil many critical roles in the human body: they give our organs their structure, carry out and regulate their functions, and protect them in the form of antibodies.
“We fed the algorithm all of the data held on the known proteins so it could learn and predict the language of proteins in the same way these models learn about human language and how WhatsApp knows how to suggest words for you to use,” Dr. Saar said, adding that the technology could soon explore avenues of research humans have yet to conceive of.
“It is a very challenging problem and unlocking it will help us learn the rules of the language of disease.”
The neural network has been made freely available to researchers across the globe, with a view to improving both the model itself and the lives of tens of millions of people around the world.