This Company Uses Artificial Intelligence To Decode Genomic Diversity
Artificial intelligence (AI) has many applications and, in most cases, goes undetected. From processing data from scanned items in self-checkout kiosks to detecting soil composition for crop growth, AI’s applications are increasingly widespread.
Funding for AI in healthcare has tripled. In 2021, McKinsey reported that biotech companies raised more than $34 billion worldwide. In Q1 2024, just 20 healthcare-focused biotech companies collectively raised $2.9 billion.
Also in Q1 2024, Moonwalk Biosciences raised $57M in financing for epigenetic profiling. Epigenetics is the study of how environment and behavior can change the way genes work without changing the DNA sequence.
In 2003, the Human Genome Project completed the work it started in 1990 to generate and decode the first sequence of the human genome. That work created fundamental information about the human blueprint, which has opened the door to personalized medicine, accelerated the study of human biology, and led to medical discoveries in neurodegenerative conditions, cancer, and heart disease.
“Today, we understand how one percent of the human genome functions. That leaves 99% of the human genome that we know exists, but we don’t understand how they function,” said Dr. Jennifer Hintzsche, CEO and Co-founder of PherDal Fertility Science. “This was originally called ‘junk DNA,’ but it’s far from junk.”
“There has to be a biological reason it has been maintained throughout human evolution; we just don’t yet understand that reasoning,” said Hintzsche. “Decoding 1% of the human genome has already eradicated diseases that would previously have killed us.”
“How many other diseases are waiting to be eradicated if we could just figure out the purpose of the other 99% of the human genome? It will help change medicine and, most importantly, help save people’s lives with every percentage of knowledge we gain from our own DNA,” added Hintzsche.
Artificial Intelligence
Genialis is an RNA biomarker company that has raised $13M to date. It uses machine learning and AI to examine underlying disease biology. Rafael Rosengarten, CEO and Co-founder of Genialis, says that AI, particularly machine learning, is good at detecting patterns in massive amounts of data, a capability that is essential for genomic interpretation.
“Each human genome is ~6 billion base pairs (three billion long x two copies of each chromosome),” said Rosengarten. “In each of our bodies, we develop millions of genomic changes (mutations) over our lives, and there are millions of variations from person to person. And often, no single change is causal; rather, it is a combination of changes, so this is a data space well suited to the powers of AI.”
Rosengarten says the challenge of decoding a genome is understanding what changes are meaningful. “That could mean changes that occur within our bodies and trying to decipher which of these are pathogenic and/or druggable.”
“It could also mean differences in genome sequence between groups of people or individuals in a population and trying to decipher which of these genomic differences drives the variation we see, especially as it relates to healthspan, longevity, drug response, and other medical issues,” added Rosengarten.
The genomic divide
AI algorithms can bridge the gap in genetic healthcare accessibility
The Genialis team is analyzing large genomic datasets to uncover patterns of disease prevalence and genetic predispositions within underserved populations.
Each dataset may be as small as dozens or as large as a few thousand patient samples. “The datasets themselves range from being generated by microarray technologies in the late 2000s to various sequencing technologies since the early 2010s,” added Rosengarten.
“No person’s genomic data is exactly alike, and we see stark differences between ethnic groups and male to female,” said Rosengarten. “There is no one central genomic database for the world, but somewhere between 60-80 percent of the world’s patient data sets come from people of European descent.”
Rosengarten says this leads to huge problems when a therapy is developed for one cohort, for example in rural India, using algorithms trained on data from people of different backgrounds.
Rosengarten cites Caroline Criado Perez’s book, Invisible Women, which notes that medical researchers previously tried to avoid female subjects when possible because of biological heterogeneity relating, at least in part, to menstrual or estrous cycles.
“Such wrong-headed attempts to reduce the complexity of an experimental or clinical design purposefully fails to account for variability that often proves relevant to the scientific question or medical need at hand,” said Rosengarten.
Rosengarten notes that since 2014, the National Institutes of Health has required grant applicants to include plans to achieve sex parity in preclinical experimental models. Even so, the sex of the organism is not always evident from the metadata associated with data in public or even proprietary databases.
Genialis says it is combating this disparity by working with institutions around the world, including in Qatar, India, and more broadly across Asia, to create what it believes will be the world’s most ethno-geographically diverse cancer data sets and to find solutions that will work for the intended patients.
“Genialis is highly deliberate in sourcing datasets from our global network of clinical partners, such that the data we use to train and validate our biomarker algorithms is inclusive and reflective of the intent-to-treat population at large,” said Rosengarten. “This intentionality is critical because relying on data that represents too narrow a population will be full of gaps and biases, which the AI and machine learning algorithms will learn.”
Understanding genomic diversity
“So the first genome was conducted on one man. ONE,” said Hintzsche. “For a long time, most clinical research of things like cancer have been done comparing a tumor to the small datasets we had access to—which contain the genomes of just a handful of people, lacking genetic diversity.”
Hintzsche says we need to think about how unrepresentative those genomes must be.
“Most clinical research has focused on comparing DNA to only a handful of people’s DNA,” said Hintzsche. “For example, if we’re looking at tumor DNA and comparing it to this tiny dataset of a few people’s DNA – how do we know if a mutation in the DNA is causing cancer or maybe just common in people of certain backgrounds? Most of the time, we don’t.”
“We have almost zero understanding of the genomic diversity that could explain so many different reasons people react differently to treatments or even diseases. All of that is hidden away in the deep space of the DNA we haven’t even begun to understand yet.”