research
using machine learning to understand fundamental concepts in biology.
Summaries of previous (and occasionally ongoing) research projects.
Doctoral Thesis
A particle filter for demographic inference and an ancient back-migration to Africa
This project involved developing the SMC2 sequential Monte Carlo sampler, which samples ancestral recombination graphs from whole-genome sequencing data and infers population demographic parameters, especially directional migration. I used SMC2 to show that a particularly impactful period of directional migration occurred between 40,000 and 70,000 years ago and is present in the histories of many African populations.
Topic modelling for bulk ATAC-seq applied to MLL-r leukemia
I developed an adaptation of the cisTopic latent Dirichlet allocation topic modelling framework for bulk chromatin accessibility data and applied it to patient data in a particularly impactful form of blood cancer. A publication for this work is in preparation.
Previous Research
@ Vector Institute, University of Toronto (Quaid Morris)
I collaborated with, and visited, the Morris Lab to develop a deep learning model of RNA stability from sequence.
@ Statistics Department, University of Oxford (Simon Myers)
I spent my first research rotation with the Myers group (and Leo Speidel in particular), constructing ancestral recombination graphs from modern and archaic sequences to identify regions of putative introgression.
@ London School of Hygiene and Tropical Medicine (Frank Dudbridge)
Building on the idea that a single polygenic risk score may not capture all relevant information, I continued the work of my undergraduate thesis and developed a cardiometabolic risk score for the prediction of various complex traits. I also worked on the theoretical properties of the optimal discovery procedure (ODP), a relatively recent framework for multiple testing correction.
@ University of Toronto, Lancaster University (Jo Knight)
I wrote coRge to evaluate different methods of adjusting P values for multiple testing correction in GWAS, and also wrote goldi to find Gene Ontology terms in biomedical articles.
I also collaborated with Nuwan Hettige to construct a polygenic risk score for antipsychotic dosage and with Meghan Chenoweth to perform a GWAS for the nicotine metabolite ratio (NMR), a biomarker for nicotine metabolism.
@ University of Ottawa Heart Institute (Ruth McPherson)
I worked alongside Dr. Majid Nikpay to construct polygenic risk scores for dyslipidemia and found them to be significantly altered by adiposity. I also used a Mendelian randomization study to find links between coronary artery disease aetiology and obesity.
My undergraduate thesis focused on combining polygenic risk scores across different predisposing conditions to create a master score that would be more predictive than its constituents. This idea was formalized during my time in London with Frank Dudbridge.
@ University of Ottawa (Stephane Aris-Brosou)
I designed and implemented a rejection sampler to determine phylogeny topology in viral evolution, and started learning R!
Teaching
Stochastic Models in Mathematical Genetics
Department of Statistics, MT 2018
Teaching assistant and demonstrator for Simon Myers’ intercollegiate Stochastic Models in Mathematical Genetics course at the Department of Statistics, offered as a part of the Oxford MSc in Mathematical Sciences (OMMS) and BA/MMath Maths and Statistics.
Machine Learning for Genomics
Wellcome Centre for Human Genetics, MT 2020
Give a lecture on current applications of CNNs to genomics and assist in the delivery of tutorial sessions for Gerton Lunter’s Machine Learning for Genomics course, delivered at the Wellcome Centre for Human Genetics as a part of the Medical Sciences Doctoral Training Centre for incoming DPhil Genomic Medicine and Statistics students.
Admissions Note Taker
Somerville College, MT 2020
Assist Professor Robert Davies in conducting admissions interviews for undergraduate mathematics, statistics, and computer science: taking notes, asking questions, and critically judging responses.