
AsianScientist (May 14, 2025) – In June 2009, a clinic was set up at the KK Women’s and Children’s Hospital in Singapore to recruit for the nation’s largest and most comprehensive cohort study to date. Titled Growing Up in Singapore Towards Healthy Outcomes (GUSTO), the study aimed to evaluate how genetics and environmental factors influence the development of a child from as early as 10 weeks of pregnancy. A total of 1,247 expectant mothers volunteered for the study, which is managed by multiple research institutes across Singapore. The women underwent a host of tests, including ultrasound scans, oral glucose tolerance testing and nutritional assessments. The study is still ongoing with 882 active participants. Since its inception, it has resulted in 366 publications, providing insights on topics such as allergies, nutrition, myopia and brain development.
In addition to gathering these useful insights, the researchers learnt an important lesson about inclusion. Among other criteria, the study required that the fetus be racially homogeneous, with both sets of grandparents of the same ethnicity.
“Sixteen years ago, we understood a lot less and that requirement very quickly became absurd,” said Michael Meaney, program director of translational neuroscience at the A*STAR Institute for Human Development and Potential (A*STAR IHDP), in an interview with Asian Scientist Magazine.
The year GUSTO was launched, 15.1 percent of marriages in Singapore were inter-ethnic and 38.7 percent were transnational. These figures have remained consistently high since then, making ancestrally limited cohort studies more difficult to conduct and less relevant to an increasingly mixed population.
Building on insights gained from GUSTO, a similar study launched in February 2015 at the KK Women’s and Children’s Hospital adopted a call for volunteers with broader and more inclusive criteria: ‘Woman, partner and family members are of Chinese, Malay, Indian ethnicities or combination thereof’. Called the Singapore PREconception Study of long-Term maternal and child Outcomes (S-PRESTO), the new study, run by the same institutes, aimed to explore the effects of nutrition, lifestyle and maternal mood prior to and during pregnancy on the health outcomes of children. It recruited 1,032 women intending to conceive.
Globally, researchers increasingly recognize the importance of expanding datasets and intentionally including diverse and marginalized groups to address the needs of a broader population. This approach improves understanding of the unique challenges these groups face and helps ensure that the benefits of research are applicable and equitable across communities.
“If research only focuses on a specific group, the findings may not apply to everyone—this could lead to interventions that are less effective or even harmful to underrepresented groups,” said Seyed Ehsan Saffari, assistant professor at the Centre for Quantitative Medicine (CQM) at Duke-NUS, National University of Singapore (NUS).
GENOMICS AND BEYOND
When it comes to genetics, most genomic studies have focused on individuals of European descent (approximately 86.3 percent as of June 2021, according to the International Health Cohorts Consortium). Researchers and healthcare organizations in Asia have noticed this underrepresentation and are working to bridge the gaps.
One example is Japan-based Precision Medicine Asia (PREMIA), an oncology innovation platform that manages Japan’s nationwide clinical-genomic lung cancer registry, LC-SCRUM. The database includes genomic data from 19,000 patients in Japan and is being expanded through partnerships with over 265 hospitals in Japan, Taiwan, Thailand and Malaysia. By doing so, PREMIA intends to gain a better understanding of how best to treat patients who are underrepresented in clinical trials.
Similarly, India launched the Genome India Project in 2020 to sequence 10,000 genetic samples from citizens across India to build a reference genome. Given India’s population of 1.3 billion and the wide ethnic diversity between its groups, extensive and diverse data collection is key. The project is being conducted by 44 scientists from 20 institutes across India. The team collected roughly 20,000 samples from 99 different population groups, over 10,000 of which have been sequenced. As the data was collected, a group of researchers worked on developing new methods to analyze it.
So far, the study has identified hundreds of millions of genetic variations that are common in Indian populations and has provided insights that were not previously detected due to the exclusion of smaller populations in previous sequencing efforts. In particular, the team has compiled a set of “benign” variants found in healthy individuals—these variants can be eliminated when researchers look to understand common diseases.
ACTIVE INCLUSION
Beyond genetics, researchers are considering a myriad of other factors to make precision medicine more effective and personalized to each individual. These factors include lifestyle, socio-economic background, physical ability, behavioral patterns, age and gender.
“When you’re talking about this kind of personalized medicine, we need to have individual data—genetic data can be gathered with blood tests, but we also look at lifestyle and environmental factors,” Saffari told Asian Scientist Magazine. “What the patient is doing consistently over the years—like drinking tea, or frequently sticking to green tea—can have an impact.”
Echoing Saffari, Creighton Heaukulani, principal data scientist at Singapore’s Ministry of Health Office for Healthcare Transformation (MOHT), said, “In modern precision medicine, we should be aiming to measure as many of these determinants as possible, and in as high a definition as possible, to increase the accuracy and reduce the uncertainty in our predictions of diagnoses and treatment outcomes.”
However, this data is not always easy to procure—particularly when it comes to marginalized groups. Such individuals may not have access to healthcare systems or organizations that researchers typically work with, or the subgroups may be too small to be statistically relevant. “It can also take time to identify and assess the effectiveness of interventions across different subgroups,” added Saffari.
Keeping such challenges in mind, a team at MOHT launched a digital mental health platform called mindline.sg in 2020, with Heaukulani coming on board a year later. The platform aims to provide targeted mental health support and stress management for a diverse set of users.
To understand their needs, Heaukulani and his colleagues worked with a broad range of individuals from different backgrounds who intended to use the service. They collected data through quantitative measurements and qualitative focus groups or interviews.
Like biomedical precision medicine research, Heaukulani’s work aims to offer specific treatment for an individual’s mental health needs. “The broader the understanding and applicability of your solution, the broader your impact,” he explained. “If there is a population not represented in your research—for example, not present in your focus group cohorts—then it is unlikely that you will understand their particular needs and therefore never have a chance to properly design solutions for those needs.”
BUILDING BETTER MODELS
While exploring ways to make their datasets more diverse, researchers in Asia are also trying to build models that ease the difficulties of including everyone in a wider population sample.
Such models could also allow for greater generalization, making research done with one target group applicable to wider communities.
One example is an admixture model that takes into account variants in an individual’s genetic data that may not be attributed to their reported ethnicity. A team from the SingHealth Duke-NUS Institute of Precision Medicine (PRISM) analyzed a cohort of 9,051 individuals from the SG10K Health project, which sequences a large cross-section of Singapore’s population.
In the study, the team aimed to identify clinically relevant genetic variants from ancestrally diverse genomes. The cohort consisted of three major ethnic groups (Chinese, Malay and Indian), classified according to self-reported ancestry. However, self-reported ancestry does not always align with genetically inferred ancestry, which refers to how far an individual’s genome aligns with an ethnicity. Such misalignment can be a logistical challenge when researchers are defining studies and analyzing data.
The team overcame that challenge by using admixture software, which analyzes the ancestral components of a genome and compares them to likely ethnic groups. This allowed the team to infer the genetic ancestry of the participants. Of the 9,051 genomes, 268 were inconsistent with reported ancestry.
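To make the comparison concrete, the Python sketch below shows one simple way that inferred and self-reported ancestry could be checked against each other. It is an illustration only, not the PRISM team’s actual method: the `Participant` record, the ancestry fractions, the 0.5 cut-off and the “Admixed” label are all hypothetical assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence

# The three groups named in the article; the order is an assumption of this sketch.
GROUPS = ("Chinese", "Malay", "Indian")


@dataclass
class Participant:
    participant_id: str          # hypothetical identifier
    self_reported: str           # ancestry the participant reported
    proportions: Sequence[float] # made-up ancestry fractions, in GROUPS order


def inferred_ancestry(p: Participant, min_fraction: float = 0.5) -> str:
    """Return the dominant ancestry, or "Admixed" if none reaches min_fraction."""
    best = max(range(len(GROUPS)), key=lambda i: p.proportions[i])
    if p.proportions[best] < min_fraction:
        return "Admixed"
    return GROUPS[best]


def find_inconsistent(participants: Sequence[Participant],
                      min_fraction: float = 0.5) -> List[str]:
    """List the IDs whose inferred ancestry differs from the self-reported one."""
    return [
        p.participant_id
        for p in participants
        if inferred_ancestry(p, min_fraction) != p.self_reported
    ]


# Invented example data, not drawn from the SG10K Health cohort.
cohort = [
    Participant("P001", "Chinese", (0.97, 0.02, 0.01)),
    Participant("P002", "Malay", (0.10, 0.85, 0.05)),
    Participant("P003", "Chinese", (0.20, 0.75, 0.05)),
    Participant("P004", "Indian", (0.40, 0.35, 0.25)),
]

flagged = find_inconsistent(cohort)
print(flagged)
```

In this toy cohort, P003 is flagged because the dominant inferred component differs from the reported ancestry, and P004 because no single component reaches the assumed cut-off. A real analysis would start from ancestry fractions estimated by dedicated software and would likely use more careful criteria.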
From there, the team analyzed the genome sequences and identified the clinical variants they were interested in. The researchers noted that given a mixed population, individuals of one identified ancestry may present with genetic variants that are usually only present in another ancestry—for example, a Chinese person may present with a genetic variant that is usually only found in the Malay population. Such findings highlight the need for diversified representation across all population groups in studies. Models like admixture enable scientists to work more effectively with these large, diverse groups.
“We need many examples with a broad variety of patient characteristics in order for our statistical methods to learn how to discriminate many different cases of patients well with all their characteristics and disease manifestations,” said Heaukulani. “If you don’t have examples of a certain type of patient, you will not be able to discriminate those cases well. And you don’t just need one of those patients. You need many.”
—
This article was first published in the print version of Asian Scientist Magazine, January 2025.
Design: Wong Wey Wen/ Asian Scientist Magazine
Copyright: Asian Scientist Magazine.
Disclaimer: This article does not necessarily reflect the views of AsianScientist or its staff.