Australia’s ‘ABC Online’ highlights what few of us knew about Florence Nightingale: that she was a statistician more than a nurse.
“Upon arriving at the British military hospital in Turkey in 1856, Nightingale was horrified at the hospital’s conditions and a lack of clear hospital records.
Even the number of deaths was not recorded accurately. She soon discovered three different death registers existed, each giving a completely different account of the deaths among the soldiers. Using her statistical skills, Nightingale set to work to introduce new guidelines on how to record sickness and mortality across military hospitals.
With her improved data, Nightingale put her statistical skills to use. She discovered deaths due to disease were more than seven times the number of deaths due to combat, because of unsanitary hospital conditions.
More than 150 years after Nightingale pointed out the need to standardise datasets before comparing them, we are certain she would have something to say about the lack of standardised datasets for COVID-19.”
Her key recommendation to the British military commanders: handwashing and social distancing!
Coronaviruses are a family of viruses. There are many different kinds, and some cause disease in animals. Coronaviruses were first discovered in the 1930s when an acute respiratory infection of domesticated chickens in the US was shown to be caused by infectious bronchitis virus (IBV).
Animal coronaviruses can “spill over” into humans. Human coronaviruses were discovered in the early 1960s when it was found that two new viruses caused colds in humans. Later, in 1967, it was discovered, using electron microscopy, that these two viruses, B814 and 229E, were related to the 1930s IBV “bird virus”. Soon after, these three were shown to be related to a fourth, the OC43 virus, and a fifth, mouse hepatitis virus, and they came to be known as coronaviruses.
There are seven coronaviruses that infect human beings. The most recently discovered is SARS-CoV-2 (the new coronavirus that causes coronavirus disease 2019, now called COVID-19).
COVID-19 is a highly infectious disease. As the virus that causes it replicates in body cells, copying errors can change its genetic footprint, and it can “mutate” into a new virus. Therefore, the risk is that a new and more lethal human coronavirus could emerge in the future, especially if we lose control of COVID-19. Hence there is a sense of urgency in better understanding the coronaviruses, and in finding anti-viral treatments and vaccines for them.
There are some 50,000 plus peer-reviewed research papers on studies of coronaviruses. There is substantial additional research material available in documents and social media. Each one of these and other sources of coronavirus data is equivalent to one of Florence Nightingale’s datasets of more than 150 years ago: isolated pieces of what could be information critical to the development of treatments or vaccines. Nightingale tried to make sense of three datasets; today researchers have to investigate the contents of tens of thousands. If a single human being could read all of those, and remember their content, they might find that five of the documents had drawn conclusions that, taken together, suggest something no one had thought of. We know that such a feat is impossible for a large group of clever researchers, let alone one.
Computers, on the other hand, can read at thousands of times the speed of a human. They can “remember” everything they read. This speed-reading utilises a technique called “Deep Learning” that can extract the concepts discussed in a digital document, where a highlighted concept might be one of the proteins found in a coronavirus membrane, or an RNA mutation pattern.
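As a rough illustration of what “extracting concepts” from a document means, the sketch below tags known terms in a piece of text. It is not Deep Learning: it swaps in a plain dictionary lookup so that the idea fits in a few lines. The vocabulary, the concept labels and the sample abstract are all invented for this example.

```python
import re
from collections import Counter

# Hypothetical vocabulary mapping a surface term to a concept label.
# A real system would learn to recognise concepts rather than list them.
VOCABULARY = {
    "spike protein": "Protein",
    "membrane protein": "Protein",
    "envelope protein": "Protein",
    "rna mutation": "MutationPattern",
}


def extract_concepts(text):
    """Count each known (term, label) pair that appears in the text."""
    lowered = text.lower()
    found = Counter()
    for term, label in VOCABULARY.items():
        # Word boundaries stop a term matching inside a longer word.
        hits = re.findall(r"\b" + re.escape(term) + r"\b", lowered)
        if hits:
            found[(term, label)] += len(hits)
    return found


# Invented sample text standing in for a research abstract.
abstract = (
    "The spike protein binds the host receptor. "
    "An RNA mutation in the gene encoding the spike protein was observed."
)
concepts = extract_concepts(abstract)
for (term, label), count in sorted(concepts.items()):
    print(f"{label}: {term} (mentions: {count})")
```

Each extracted pair is the kind of item that could then be handed to a knowledge store for enrichment.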
Artificial Intelligence utilises sophisticated computer knowledge stores called “Semantic Knowledge Graphs”. Deep Learning can reference these fancy knowledge stores to enrich the extracted concept and store it as a concept within the Semantic Knowledge Graph, the heart of any Artificial Intelligence system.
Because they can store and find relationships in massive amounts of data, Artificial Intelligence systems can utilise the Semantic Knowledge Graph to reason, or to “infer” new knowledge, by finding hidden relationships between concepts and/or their properties (metadata). This power of reasoning, the closest we currently have to mimicking human brain behaviour, is currently only at the level of an infant human child, but that is compensated for by the amount of data the system can access and the speed at which it can find the hidden relationships from which it infers new information. Of course, any new knowledge proposed needs to be referred to a highly intelligent human, such as an epidemiologist.
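To make “inferring new knowledge from hidden relationships” a little more concrete, here is a minimal sketch in which facts are stored as subject-predicate-object triples and a single hand-written rule links findings from two different papers. Every name in it (the compounds, proteins, virus, predicates and the rule itself) is hypothetical, and a real Semantic Knowledge Graph would use a proper graph store and reasoner rather than a Python set.

```python
# Each fact is a (subject, predicate, object) triple. All names are invented.
triples = {
    ("compound_X", "inhibits", "protein_P"),   # finding in hypothetical paper A
    ("protein_P", "required_by", "virus_V"),   # finding in hypothetical paper B
    ("compound_Y", "inhibits", "protein_Q"),   # unrelated finding
}


def infer_candidates(facts):
    """Apply one illustrative rule:
    if C inhibits P and P is required_by V, propose C candidate_against V.
    """
    inferred = set()
    for compound, predicate, protein in facts:
        if predicate != "inhibits":
            continue
        for other_subject, other_predicate, virus in facts:
            if other_predicate == "required_by" and other_subject == protein:
                inferred.add((compound, "candidate_against", virus))
    return inferred


# Keep only the facts that were not already stated in any single paper.
new_facts = infer_candidates(triples) - triples
for fact in sorted(new_facts):
    print("Proposed for expert review:", fact)
```

Neither paper on its own states the proposed link; it only appears once the two facts sit in the same graph, which is why the result is a suggestion for a human expert rather than a conclusion.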
The potential for a combination of highly expert medical investigators and Semantic Computing techniques has been known since 2008 when Oxford University published its seminal findings on the CancerGrid study. It was largely overlooked because Semantic Computing, as it has become known, represents a new paradigm in thinking about data structures and is based on complex mathematical concepts, while other techniques appeared easier.
The more recent advances in Deep Learning extraction of knowledge from digital research documents can now be added to the mix, representing a substantial advance in Artificial Intelligence support for subject matter experts of all kinds.
The need for AI in medical research has been drawn into the spotlight by COVID-19, with Semantic Knowledge Graphs the star of the show. This is coincident with acceptance that:
- A computer can’t be intelligent if it can’t store and access knowledge in a way that lets it understand the knowledge itself, rather than following instructions provided by a human
- A new generation of “graph” databases can enrich descriptions of things so that a computer can “understand” them better
- Semantic Knowledge Graphs provide the most sophisticated graph model for representing knowledge in a computer system
- Semantic Knowledge Graphs are essential for “robust and automated artificial intelligence” (Forrester Research and others)
- Semantic Knowledge Graphs will become as prevalent a core technology in large and medium enterprise informatics infrastructure during the 2020s as websites became in the 1990s
Already, it is the vendors of Semantic Knowledge Graph based solutions that are prominent in support of the epidemiologists studying COVID-19.
We are working with the three leading universities in Sydney to improve the extraction of knowledge from multi-lingual research papers (important since Europe and China are leading sources of research data on coronaviruses).
Our Semantic Knowledge Graph will ingest that extracted knowledge and it will be made available to international collaborations of coronavirus investigators in a form that is a considerable improvement on what they have today.
It is the closest thing we have to a human researcher reading 50,000 research papers, remembering everything in them, and searching for relationships between research findings to highlight potential areas for closer study.
Hopefully other companies like ours can collaborate to share Semantic Knowledge Graph data and insights through a feature of Semantic Knowledge Graphs known as “linked open data”, to speed up research collaboration and the real-time publication of important findings.
Author: Mark Bradley is the founder and sales chief of Cognitive Software Group, the leading cognitive computing company in Australia. www.cognitivesoftware.com