(SLS1) Portability of Genetic Risk Scores to Underrepresented Populations
Director of Studies: Manuel Corpas email@example.com
Background and aims of project
Throughout modern history, genetic studies have drawn overwhelmingly on individuals of European ancestry. This imbalance has been exacerbated over the last century by advances in genomics: although this group represents only 16% of the global population, a 2019 study showed that around 78% of the data used in genome-wide association studies (GWAS) originated from people of European descent. As a result, genomics and medicine have been biased towards the default test subject: males of Northern European origin. Even where genetic risk prediction models exist for underrepresented populations, they risk being disregarded by clinicians because of their lower accuracy.
According to the Global Alliance for Genomics and Health, access to the benefits of genome research for all peoples is not only a moral imperative but also a question of justice. The research conducted for this PhD will improve our ability to translate European-trained genetic risk prediction models not only to African populations but also to other underrepresented populations. Our research outputs will make the fruits of human genome research more equitable, diverse and inclusive.
The aim of this PhD is to port genetic risk prediction models for some of the most common diseases, trained in European populations, to African populations. We build on our track record of large-scale implementation of polygenic risk scores, our collaborations with the Breast Cancer Association Consortium (BCAC) and our involvement in developing polygenic risk scores for lipid traits in African populations.
Overall research strategy – main methods to be employed
The project applies informatics skills to large-scale genomic data. Methodologically, it will apply advanced methods for the longitudinal analysis of whole-genome data. This approach will help to uncover the role of genetic factors in shaping polygenic risk scores for underrepresented populations. We will use training datasets from African populations drawn from public resources, complemented with datasets such as the UK Biobank, the Breast Cancer Association Consortium (BCAC) and the Million Veteran Program genetic archive.
We expect to develop predictors of genetic risk for some of the most common complex diseases, including breast cancer, type 2 diabetes, coronary artery disease, atrial fibrillation and inflammatory bowel disease. Together, these five diseases account for ~20% of deaths in industrialised nations.
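For illustration, a polygenic risk score is commonly computed as a weighted sum of an individual's effect-allele dosages. The sketch below is a minimal example under stated assumptions, not the project's pipeline: the variant identifiers, weights and function names are hypothetical and not taken from any real study. It also reports what fraction of the weighted variants were found in the genotype, one simple thing to check when applying a score trained in one population to another.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


def variant_coverage(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Fraction of weighted variants that are present in the genotype."""
    if not weights:
        return 0.0
    return len(dosages.keys() & weights.keys()) / len(weights)


# Hypothetical per-variant effect weights (e.g. from a GWAS in a training population).
example_weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}

# Hypothetical genotype: effect-allele counts (0, 1 or 2); variant_c was not genotyped.
example_dosages = {"variant_a": 2, "variant_b": 1}

score = polygenic_risk_score(example_dosages, example_weights)
coverage = variant_coverage(example_dosages, example_weights)
print(f"score={score}, coverage={coverage:.2f}")
```

Variants missing from the target genotype are skipped here for simplicity; a real analysis would need a deliberate policy for them.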
Experience that the student will gain
The student will be exposed to some of the most influential datasets and gain first-hand experience with big data analysis and statistics. They will have the opportunity to learn some of the most widely used software for high-throughput analysis of personal genomes.
The skills gained through this PhD will allow the student to feel confident with a variety of technologies, including Python, R and databases, and will equip them with the tools and resources to understand the data analysis cycle in a big data setting.
The chosen candidate will have the opportunity to collaborate with industry. There will be a strong connection with the London and Cambridge genomics ecosystem, with opportunities for networking at a variety of events and courses. The timing of the different components is flexible.
Background science/training to have in order to undertake the project
Candidates should have programming skills and an enthusiasm for the application of statistics to health and biomedical enterprises. Experience with Python, bioinformatics and machine learning would be an advantage.
This project would ideally suit an MSc student who has had some exposure to programming and the Linux environment. Knowledge of genomics would be advantageous but not essential.
How to apply
Please follow the link above to apply for the programme most appropriate to your research. Note that the form includes an option to request PhD via MPhil; this is the standard route and the one you should choose.
The Studentship title is SLS Full Research Studentship School of Life Sciences. Please include this in your application. You must also list the Project Code in order for us to allocate your application to the correct project.
Deadline for applications: 5pm 2nd September 2022.
Applications are invited for a Full Research Studentship, which is tenable for up to three years of full-time study starting in January 2023. Overseas applicants are welcome but will have to pay the difference between the Home and Overseas fee rates. The students will be offered a stipend of £17,285 (fixed to UKRI Rate) per annum and £3,000 per annum for consumables. Students will be funded full time for 3 years. Students will also be encouraged to assist with demonstrating practical classes and will be paid the rate for demonstrators.
Candidates should normally have a minimum classification of 2.1 in their Bachelor's degree or equivalent, and preferably a Master's degree. Applicants whose secondary-level education was not conducted in English should also provide evidence of appropriate English language proficiency, normally defined as IELTS 6.5 (overall score, with not less than 6.0 in any individual element).
- Corpas M, et al. (in review). Implementation of Individualised Polygenic Risk Score Analysis: A Test Case of a Family of Four. BMC Medical Genomics.
- Fatumo S, et al. (accepted). Polygenic prediction of lipid traits in sub-Saharan Africans. Nature Medicine.
- Ahearn TU, et al. (2022). Common variants in breast cancer risk loci predispose to distinct tumor subtypes. Breast Cancer Research 24 (1).
- Corpas M, et al. (2021). Whole Genome Interpretation for a Family of Five. Frontiers in Genetics 12: 535123.
- Mavaddat N, et al. (2019). Polygenic Risk Scores for Prediction of Breast Cancer and Breast Cancer Subtypes. American Journal of Human Genetics 104 (1), 21–34.