Diagnosis and stratification will both be addressed using a shared triad of computational modelling approaches, covering metabolic networks, enzyme structure and genetics.
WP2. Metabolic network-based classification
Whole-body, genome-scale, sex-specific models of human metabolism will be used to predict the minimum number of metabolic gene defects required to cause the concentration changes measured by conventional metabolic screening and additional, untargeted metabolomic analysis of blood and urine samples. These predicted causal genes will be compared with genetic variants (WP4) and the 3D structure of enzyme variants (WP3). For patients with specific targeted IMDs, non-personally identifiable physiological data and metabolomic analyses will be used to generate personalised WBMs of individual patients. Since the manifestation of a clinical phenotype is often primarily a function of the residual enzyme level and its activity relative to a minimal threshold, proteomic analysis of enzyme expression as well as kinetic constraints with estimated (WP3) and measured kinetic parameters will be selectively added to model IMDs of relevance to clinical partners. These personalised models will be used to enable personalised patient management (WP6) and to predict novel therapeutic approaches for, e.g., substrate reduction therapy. Metabolic network-based classification will be disseminated via a local desktop installation that communicates with a remote server in a GDPR-compliant manner.
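The idea of predicting the minimum number of gene defects that explains a set of measured concentration changes can be illustrated with a deliberately simplified sketch. It is not the whole-body modelling method itself, which would rely on constraint-based optimisation over a genome-scale model. Here the problem is reduced to a brute-force minimum set cover over a small, hypothetical lookup table (`DEFECT_EFFECTS`) that maps each gene defect to the metabolite changes it is assumed to cause. The function name and the table contents are illustrative assumptions.

```python
from itertools import combinations

# Hypothetical toy table (illustration only): gene defect -> expected
# (metabolite, direction) changes in a biofluid sample.
DEFECT_EFFECTS = {
    "GBA1": {("glucosylceramide", "up"), ("glucosylsphingosine", "up")},
    "PAH": {("phenylalanine", "up"), ("tyrosine", "down")},
    "HEXA": {("GM2 ganglioside", "up")},
}


def minimal_defect_sets(observed, effects=DEFECT_EFFECTS):
    """Return every smallest gene set whose combined effects cover all observed changes.

    Gene sets are tried in order of increasing size, so the first size that
    yields any covering set is the minimum. Returns [] if no set covers the data.
    """
    observed = set(observed)
    genes = sorted(effects)
    for size in range(len(genes) + 1):
        hits = [
            combo
            for combo in combinations(genes, size)
            if observed <= set().union(*(effects[g] for g in combo))
        ]
        if hits:
            return hits
    return []


if __name__ == "__main__":
    changes = [("phenylalanine", "up"), ("glucosylceramide", "up")]
    print(minimal_defect_sets(changes))
```

Exhaustive enumeration is only feasible for a toy table of this size; it is used here because it makes the "minimum number of defects" criterion explicit.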
WP3. Enzyme structure-guided classification
The 3D structure of wild-type proteins, wherever available, will be obtained from the Protein Data Bank (PDB), and the structure of variant proteins corresponding to genomic variants will be predicted from amino acid sequences using state-of-the-art deep learning algorithms. A comparison of wild-type and variant enzyme structures will be used to classify genomic VUS (WP4) as having a low, medium, or high probability of causing a loss of function. Targeted experimental determination of variant protein structures by X-ray crystallography and cryo-electron microscopy will be used to test and refine computational predictions. This sequence-based, structure-guided classification approach will be made accessible to those without expertise in structural biology and disseminated via a client-server web interface. Comparison of wild-type and variant structures will also be used to identify druggable pockets within the variant’s 3D structure that are amenable to small molecule intervention. The most promising candidate enzyme variants and drug-like compounds will be subjected to detailed molecular dynamics simulations and in silico docking, and a subset will be selected for biophysical characterisation and validation of enzyme function using in vitro expression and patient cell-based experiments (WP6).
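As a minimal sketch of how a wild-type versus variant comparison could be turned into a three-level class, the example below computes the root-mean-square deviation (RMSD) between matched atom coordinates and bins it. Using RMSD alone as a proxy for loss of function is an assumption made for illustration, as are the function names and the threshold values (`medium=1.0`, `high=2.5`, in ångström); the coordinates are assumed to be already superimposed and listed in the same atom order.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) tuples.

    Assumes both structures are already superimposed and atoms are matched by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def loss_of_function_class(score, medium=1.0, high=2.5):
    """Bin a structural deviation score into low / medium / high.

    The thresholds are hypothetical placeholders, not calibrated values.
    """
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


if __name__ == "__main__":
    wild_type = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    variant = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(loss_of_function_class(rmsd(wild_type, variant)))
```

In practice a classifier of this kind would likely combine several structural features and be calibrated against the experimentally determined structures mentioned above.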
WP4. Genomic classification
To train genomic classification algorithms, we will use sequencing data (exome ± genome) from diagnosed IMD patients and reanalyse these data with an established statistical genetics pipeline to identify variants of known significance. In the test phase, the trained classification algorithms will be applied to newly generated whole genome sequences for a subset of U-IMD patients, including patients with VUS. Structural bioinformatics of the corresponding enzyme variants (WP3) will be used to predict causative genomic variants. Where structural bioinformatics predicts that a genomic variant is causative, untargeted metabolomic data will be generated from patient biofluid samples. Subsequently, metabolomic data-driven whole-body metabolic modelling will be used to predict defective genes (WP2) in a manner independent of genomic data. Genomic variants that are predicted to be causative by both structural bioinformatics and metabolic modelling will be candidates for experimental structural (WP3) or functional validation (WP6). Software for the classification of genomic variants (causative, benign, unknown), given a structured electronic health record and genomic data for each patient, will be disseminated as a local extension to established open-source pipelines for the analysis of genomic data [44]. In the validation phase, symptomatic patients at risk of an IMD will be selected to undergo genome sequencing and analysis with established statistical genetics pipelines to identify genomic variants. Patients with VUS will be selected for untargeted metabolomic analysis and personalised whole-body metabolic modelling, and the predictions will be compared with conventional biochemical gold standard diagnostic tests.
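The consensus step described above, in which a variant becomes a validation candidate only when structural bioinformatics and metabolic modelling agree, can be sketched as a simple filter. The data shapes are assumptions: structural predictions are taken to be a mapping from a variant identifier to its gene, and metabolic modelling is taken to return a set of genes predicted to be defective. The variant identifiers and the function name are hypothetical placeholders.

```python
def validation_candidates(structural_calls, metabolic_gene_calls):
    """Return variants supported by both independent lines of evidence.

    structural_calls: dict mapping variant id -> gene, for variants that
        structural bioinformatics predicts to be causative.
    metabolic_gene_calls: set of genes predicted defective by metabolic
        modelling, obtained without reference to genomic data.
    """
    return sorted(
        variant
        for variant, gene in structural_calls.items()
        if gene in metabolic_gene_calls
    )


if __name__ == "__main__":
    structural = {"GBA1:var1": "GBA1", "PAH:var7": "PAH", "HEXA:var3": "HEXA"}
    metabolic = {"GBA1", "HEXA"}
    print(validation_candidates(structural, metabolic))
```

Keeping the two inputs separate until this final step reflects the requirement that the metabolic prediction be made independently of the genomic data.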
WP5. Reconstruction of human metabolic networks
Recon3 is an established global human metabolic network reconstruction that will be enhanced by comprehensively representing all genes associated with IMDs, by adding a representation of membrane compartments, and by expanding its coverage of lipid metabolic pathways. Membrane biochemistry experts will curate the biochemical literature to create a compendium on the typical composition of cellular and organellar membranes in each organ, as well as the localisation of hydrophobic reactions to each membrane and biofluid compartment. In parallel, bioinformatic experts will systematically generate all theoretically possible lipid metabolites at sufficient resolution to enable each lipid reaction to be completely chemically and catalytically specified. The resulting enhanced global human metabolic network reconstruction, designated Recon4, will be used to generate enhanced sex-specific WBMs (WP2).
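Systematic generation of theoretically possible lipid metabolites is, at its core, a combinatorial enumeration. The sketch below illustrates this for diacyl glycerophospholipids only, assuming a lipid species is specified by a headgroup class and one acyl chain at each of the sn-1 and sn-2 positions. The function name, the input lists, and the label format are illustrative assumptions, not the procedure planned for Recon4.

```python
from itertools import product


def enumerate_diacyl_lipids(headgroups, acyl_chains):
    """Enumerate position-specific diacyl species as 'CLASS sn1/sn2' labels.

    Every ordered pair of acyl chains is generated for every headgroup, so
    'PC 16:0/18:1' and 'PC 18:1/16:0' are treated as distinct species.
    """
    return [
        f"{head} {sn1}/{sn2}"
        for head in headgroups
        for sn1, sn2 in product(acyl_chains, repeat=2)
    ]


if __name__ == "__main__":
    species = enumerate_diacyl_lipids(["PC", "PE"], ["16:0", "18:0", "18:1"])
    print(len(species))
    print(species[:3])
```

The number of species grows as the number of headgroups times the square of the number of acyl chains, which is why such lists are generated programmatically rather than curated by hand.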
WP6. Personalised disease modelling and patient management
Personalised computational modelling (WP2-4) will be used to classify an established cohort of Gaucher disease (GD) patients according to the predicted identity of the compensatory or aggravating mechanisms that are associated with clinical disease severity. Therapeutic recombinant enzymes, as well as candidate small molecules predicted to act via chaperone-type enzyme stabilisation/activation (WP3), substrate reduction (WP2), or compensatory mechanisms (T6.1), will be administered to established in vitro GD models. Established biochemical assays will be used to test for target engagement, and selective pharmacological perturbations will be used to titrate residual enzyme activity. To assess the response of metabolic pathways surrounding select target enzymes, in vitro transcriptomic data will be used to generate metabolic models of macrophage cell lines derived from GD patients. These models will be used to infer metabolic fluxes, given novel extracellular metabolomic data and mass isotopologue distribution data acquired from stable isotope labelling experiments. In silico whole-body modelling, as well as in vitro metabolomics and fluxomics, in response to personalised therapies will be compared with clinical metabolomic analysis of GD patients. Concordance will be clinically exploited to personalise patient management, and the results will be compared with the outcome of the conventional approach to patient management.