One substrate many enzymes virtual screening uncovers missing genes of carnitine biosynthesis in human and mouse

One substrate-many enzymes screening (OSMES) for PLP-dependent enzymes

Here we develop an automated procedure to perform a reverse docking screening of a substrate containing a primary amino group bound to the PLP cofactor as a Schiff base (external aldimine; substrate), against a set of 3D enzyme structures of a selected PLPome (enzyme set) (Fig. 1).

Fig. 1: One substrate-many enzymes screening (OSMES) workflow.

Scheme of OSMES. The pipeline consists of 5 main steps performed automatically: (i) AlphaFold monomeric models for selected proteins are retrieved; (ii) oligomeric structures are determined with SWISS-MODEL templates; (iii) the substrate is prepared for docking and (iv) used to determine the gridbox size at the active site; (v) finally, the pipeline performs docking analysis and the results are ranked using different methods.

As an enzyme set we used the PLPomes of Homo sapiens and Mus musculus retrieved from the B6 database (B6DB; http://bioinformatics.unipr.it/B6db), composed of 56 and 57 genes, respectively. For each RefSeq accession number we obtained the corresponding UniProt ID to download the AlphaFold monomer²⁸ and mark the position of the catalytic lysine, which is used in subsequent steps. In this first step, we discarded proteins without a conserved catalytic lysine, namely AZIN1, AZIN2, SPTLC1 and PDXDC1 in both sets and Ldc1 in the mouse set, obtaining 105 enzyme targets for our analysis (Supplementary Table 1). The vast majority of our targets have AlphaFold models of very high confidence (pLDDT > 90 over 90% of residues) for the overall (>80%) and active site (>95%) residues (Supplementary Fig. 1).
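AlphaFold coordinate files store the per-residue pLDDT in the B-factor column, so a confidence check of the kind described above can be run directly on a downloaded model. The following is a minimal sketch rather than the pipeline's own code: the function names, the use of CA atoms as one record per residue, and the toy records are illustrative assumptions.

```python
def atom_line(serial, name, resname, resseq, x, y, z, plddt):
    """Format one fixed-width PDB ATOM record (toy helper, chain A only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} A{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{plddt:6.2f}")


def fraction_high_confidence(pdb_text, cutoff=90.0):
    """Fraction of residues whose CA-atom pLDDT is above `cutoff`.

    Assumes an AlphaFold-style PDB file, where the B-factor field
    (columns 61-66) holds the pLDDT score.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Atom name sits in columns 13-16, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no CA atoms found")
    return sum(score > cutoff for score in scores) / len(scores)


# Toy model with four residues (hypothetical pLDDT values).
toy_model = "\n".join(
    atom_line(i, "CA", "ALA", i, 0.0, 0.0, float(i), plddt)
    for i, plddt in enumerate([95.0, 92.0, 85.0, 97.0], start=1)
)
print(fraction_high_confidence(toy_model))
```

Restricting the same calculation to residues lining the active site would give the active-site confidence mentioned in the text; which residues count as active site is left to the caller.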

Since most PLP-dependent enzymes belong to fold-type I, which is characterized by obligate dimeric association forming two identical active sites at the interface, oligomerization of the monomeric AlphaFold models is a crucial step of the OSMES pipeline. We therefore exploited models available in the SWISS-MODEL Repository (SMR; https://swissmodel.expasy.org/repository) as templates to assemble AlphaFold monomers into oligomeric structures (Fig. 1, step 2). In our set of enzymes, 96 structures were modeled as oligomers, mostly homomers (79 dimers, 13 tetramers) with the exception of SPTLC2 and SPTLC3, which were modeled as hetero-dimers, both associated with SPTLC1. Once the enzyme set is prepared, the procedure automatically builds the covalent adduct between PLP and a substrate molecule with a given PubChem ID, and creates a 3D coordinate file of the external aldimine for docking screening (Fig. 1, step 3). For each enzyme structure, the grid center for docking calculation is positioned at the NZ atom of the catalytic lysine, and the grid size is defined according to the size of the substrate (Fig. 1, step 4) (see Methods).
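The grid placement rule can be illustrated with a few lines of code. This sketch centers the box on the lysine NZ atom as stated in the text; the sizing rule (a cubic box whose edge is the longest substrate extent plus a fixed padding on each side) and the 4 Å default are assumptions made here for illustration, since the actual rule is given in the Methods.

```python
def docking_box(lysine_nz, substrate_coords, padding=4.0):
    """Return (center, size) of a cubic docking box.

    center : coordinates of the catalytic lysine NZ atom.
    size   : longest axis-aligned extent of the substrate plus
             `padding` angstroms on each side (assumed rule).
    """
    extents = []
    for axis in range(3):
        values = [atom[axis] for atom in substrate_coords]
        extents.append(max(values) - min(values))
    edge = max(extents) + 2.0 * padding
    return tuple(lysine_nz), (edge, edge, edge)


# Toy coordinates in angstroms (hypothetical).
center, size = docking_box(
    lysine_nz=(1.0, 2.0, 3.0),
    substrate_coords=[(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
)
print(center, size)
```

Tying the box size to the substrate rather than to the protein keeps the search volume comparable across enzymes of very different sizes.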

As a final step (Fig. 1, step 5) the pipeline runs the docking analysis of the substrate against each enzyme structure with AutoDock for Flexible Receptors (ADFR)³⁵, choosing as flexible residue the same catalytic lysine used to place the grid. The results of the screening are then parsed to rank targets according to different methods (see below).

Evaluation of catalytically favorable conformations is the best performing metric in OSMES

Before applying OSMES to our case study, we assessed the ability of different ranking methods to identify enzymes involved in particular PLP-dependent reactions. We considered 13 different substrates (Supplementary Fig. 2) against the two PLPomes (human and murine) for a total of 26 screenings evaluated with 7 ranking methods (Fig. 2). In each screening, one or more positive controls represented by enzymes known to catalyze the examined reaction (validation set) were considered. The validation set consisted of a total of 42 positive controls divided into 14 decarboxylases, 6 aldolases, 14 aminotransferases and 8 other reactions encompassing 4 ammonia-lyases, 2 γ-lyases, and 2 hydrolases (Supplementary Table 2). This set represents about 45% of the 93 human and mouse PLP enzymes with a four-digit EC number.

**Fig. 2: Evaluation of different ranking methods of OSMES with known substrates of PLP-enzymes.**

Among the pose clusters obtained from ADFR analysis, we considered both the lowest-energy cluster (best cluster, BC) and the most populated cluster (largest cluster, LC) (Fig. 2a). The lowest-energy cluster, reflecting the stability of the system, is regarded as the energetically favored one. The most populated cluster, reflecting a higher conformational entropy of the system³⁶, is regarded as the statistically favored one. For both BC and LC, we ranked the results using three different ranking methods: i) the number of conformations in the cluster (BCC and LCC); ii) the lowest binding energy of the cluster conformations (BCE and LCE); and, to discount the contribution of the constant moiety of the external aldimine, iii) the lowest binding energy of the cluster conformations without the PLP atoms, considering only the amino acid (BCaaE and LCaaE).
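The cluster-based criteria can be sketched as follows. The data layout (a dictionary mapping cluster identifiers to lists of pose energies) and the function names are hypothetical, and the BCaaE and LCaaE variants are omitted because they need a per-atom energy breakdown that this toy input does not carry.

```python
def cluster_metrics(clusters):
    """Compute BCC, BCE, LCC and LCE for one enzyme.

    clusters: dict mapping cluster id -> list of pose energies (kcal/mol).
    """
    # Best cluster (BC): the one containing the lowest-energy pose.
    best = min(clusters, key=lambda c: min(clusters[c]))
    # Largest cluster (LC): the one with the most poses.
    largest = max(clusters, key=lambda c: len(clusters[c]))
    return {
        "BCC": len(clusters[best]),
        "BCE": min(clusters[best]),
        "LCC": len(clusters[largest]),
        "LCE": min(clusters[largest]),
    }


def rank_targets(metrics_by_enzyme, metric):
    """Sort enzyme names best-first for one metric.

    Count metrics (ending in 'C') rank from high to low;
    energy metrics (ending in 'E') rank from low to high.
    """
    descending = metric.endswith("C")
    return sorted(metrics_by_enzyme,
                  key=lambda name: metrics_by_enzyme[name][metric],
                  reverse=descending)


# Toy docking output for two hypothetical enzymes.
results = {
    "enzA": cluster_metrics({1: [-8.0, -7.5], 2: [-6.0, -5.9, -5.8]}),
    "enzB": cluster_metrics({1: [-9.0], 2: [-7.0, -6.5, -6.4, -6.0]}),
}
print(rank_targets(results, "BCE"), rank_targets(results, "BCC"))
```

The toy data show why the criteria can disagree: the enzyme with the single best energy need not be the one with the best-populated lowest-energy cluster.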

In addition to these more canonical criteria, we introduced a ranking method that evaluates the number of catalytically favorable conformations (CFC) based on Dunathan's stereoelectronic hypothesis²². According to this widely accepted feature of PLP catalysis, when a compound containing a primary amine group binds covalently to the PLP cofactor to form the external aldimine, the reaction proceeds by breaking the bond more parallel to the π orbitals of the cofactor pyridine ring, or in other words, more orthogonal to the plane formed by the latter. In the case of an α-amino acid, three different cases are possible (Fig. 2b, c), represented by the breaking of the Cα-COOH (as in decarboxylases), Cα-Cβ (as in aldolases), and Cα-Hα bond (as in racemases, aminotransferases, and other lyases). On this basis, for every substrate in our screenings we considered CFC conformations in which the angle (χ) with the PLP ring is maximum for the bond cleaved during the reaction (χ₁ for Cα-COOH; χ₂ for Cα-Cβ; χ₃ for Cα-Hα; Fig. 2b, c). As an additional condition for a CFC, we set an upper threshold of 5 Å for the distance between the NZ atom of catalytic lysine and the imine carbon of external aldimine (Fig. 2c). The cluster with the maximum number of CFC is considered the "catalytic cluster" (CC) and scored by the number of CFC it contains (CC-CFC) (Fig. 2d).
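A geometric test in the spirit of this criterion might look like the sketch below. It is not the authors' implementation: here the angle is taken as the elevation of each Cα substituent bond above the pyridine ring plane (computed from the plane normal), whereas the paper's own angle definition is given in its figures and Methods. Atom labels, function names, and coordinates are hypothetical.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angle_to_plane(bond, normal):
    """Angle in degrees between a bond vector and the plane with this normal."""
    sine = abs(dot(bond, normal)) / math.sqrt(dot(bond, bond) * dot(normal, normal))
    return math.degrees(math.asin(min(1.0, sine)))


def is_cfc(ring_atoms, c_alpha, substituents, cleaved, lys_nz, imine_c, max_dist=5.0):
    """True if the cleaved bond is the most perpendicular to the ring plane
    and the lysine NZ lies within `max_dist` angstroms of the imine carbon.

    ring_atoms   : three non-collinear atoms of the pyridine ring.
    substituents : dict label -> coordinates of the atom bonded to C-alpha,
                   e.g. {"COOH": ..., "CB": ..., "HA": ...}.
    cleaved      : label of the bond broken in the reaction of interest.
    """
    normal = cross(sub(ring_atoms[1], ring_atoms[0]), sub(ring_atoms[2], ring_atoms[0]))
    angles = {label: angle_to_plane(sub(xyz, c_alpha), normal)
              for label, xyz in substituents.items()}
    most_perpendicular = max(angles, key=angles.get)
    return most_perpendicular == cleaved and math.dist(lys_nz, imine_c) <= max_dist


# Toy pose: ring in the xy plane, C-beta pointing almost straight up.
ring = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
ca = (0.0, 2.0, 1.0)
groups = {"COOH": (1.4, 2.0, 0.7), "CB": (0.2, 2.0, 2.5), "HA": (-0.5, 2.9, 0.8)}
print(is_cfc(ring, ca, groups, "CB", lys_nz=(0.0, 1.0, 3.5), imine_c=(0.0, 1.0, 0.5)))
```

For an aldolase screening the `cleaved` label would be the Cα-Cβ bond; passing the carboxylate or Hα label instead gives the decarboxylase or transaminase-type condition with the same function.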

The distribution of the validation set ranked with the 7 different ranking methods shows that with the CC-CFC method the positive controls are generally ranked higher than with other methods (Fig. 2e). Within the CC-CFC distribution, a difference in performance emerged when positive controls were categorized by reaction type, with aminotransferases (A) achieving worse results than other reactions (O) that break the Cα-Hα bond, aldolases (B), and decarboxylases (D) (Supplementary Fig. 3a, b). The good performance obtained by CC-CFC is supported by the area under the receiver operating characteristic (AUROC), which confirms CC-CFC as the best-performing ranking method, with an AUROC = 0.84 compared with 0.7 for LCE, the second-best method (Fig. 2f). As an alternative for the second step of the OSMES pipeline (the assembly of oligomeric structures), we considered the use of AlphaFold Multimer³⁷. Also in this case, CC-CFC was the best ranking method. However, a slight decrease in performance was observed with respect to the use of oligomers based on SWISS-MODEL templates (AUROC = 0.82, Supplementary Fig. 4).
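For readers unfamiliar with the metric, the AUROC of a ranking can be computed as the probability that a randomly chosen positive control scores higher than a randomly chosen non-control enzyme. The short sketch below uses this pairwise definition; the scores and labels are invented toy values, not data from the screenings.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : higher means better ranked (e.g. a CC-CFC count).
    labels : truthy for positive controls, falsy otherwise.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy screening: two positive controls among five enzymes.
print(auroc([91, 40, 35, 10, 5], [1, 0, 1, 0, 0]))
```

For energy-based methods, where lower is better, the same function applies after negating the scores.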

Application of OSMES to the identification of a missing gene in carnitine biosynthesis

Carnitine biosynthesis begins with release of N⁶-trimethyllysine (TML) from the breakdown of post-translationally modified proteins such as histones, calmodulin, cytochrome c, myosin, etc.³⁸,³⁹, and involves four enzymatic steps (Fig. 3a). Reactions 1 and 4 are catalyzed by two Fe²⁺-dependent dioxygenases: TML dioxygenase (TMLD) and γ-butyrobetaine dioxygenase (BBD), which are related by homology; reaction 3 is catalyzed by trimethylamino butyraldehyde dehydrogenase (TMABADH); reaction 2, the aldol cleavage of HTML to generate glycine and TMABA, is catalyzed by HTMLA. Although there is evidence that this activity requires PLP⁴⁰,⁴¹, the molecular identity of HTMLA in mammals and other metazoans is unknown.

**Fig. 3: HTMLA, the missing aldolase in animal carnitine biosynthesis.**

The pathway described above is not universally present in eukaryotes. For instance, it is absent in the yeast Saccharomyces cerevisiae and the darkling beetle Tenebrio molitor, which require an external supply of carnitine for fat metabolism⁴²,⁴³. The distribution of the genes encoding TMLD and BBD in eukaryotes (Supplementary Fig. 5) shows that the known pathway for carnitine biosynthesis is especially present in opisthokonts (fungi and metazoa). However, absence of TMLD and/or BBD in several species, particularly in protostomes, suggests multiple pathway losses, a suitable condition for the identification of missing genes by coevolutionary analysis. This analysis, conducted with a sensitive method of gene coevolution⁴⁴ in 1,952 eukaryotic genomes⁴⁵, did not reveal an obvious HTMLA candidate, although the best signal among PLP-dependent enzymes was found for an orthogroup annotated as threonine aldolase (Supplementary Table 3). Interestingly, a gene belonging to this group has been previously implicated in Candida albicans as HTMLA³⁰. A gene homologous to threonine aldolase (Tha1) is found in several mammals including mice, but not in humans⁴⁶ nor in other species capable of synthesizing carnitine (Supplementary Fig. 5).

Since homology and coevolutionary analysis provided inconclusive evidence on the identification of mammalian HTMLA, we decided to use OSMES on the full set of PLP-dependent enzymes of human and mouse to identify candidates on a structural basis. To this end, we modeled the external aldimine PLP-HTML complex assuming free rotations around rotatable bonds (Fig. 3b) and defined the condition for catalytically favorable conformations of the docked substrate (Fig. 3c): a distance of ≤5 Å of the PLP aldehyde carbon of the substrate from the NZ atom of catalytic lysine and a relative maximum for the χ₂ angle, as expected for the cleavage between Cα-Cβ that occurs in the HTMLA reaction (Fig. 3a).

HTMLA candidates revealed by OSMES in the human and mouse PLPome

The best performing method (CC-CFC) was used to rank the results of HTML-OSMES against human and murine PLPomes (Fig. 4a). In the two rankings orthologous enzymes are in similar positions, as confirmed by the correlation between the two sets (Spearman r = 0.83; Supplementary Fig. 6).
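The Spearman coefficient quoted here is the Pearson correlation of the rank-transformed scores. A dependency-free sketch is given below; the paired scores are invented values standing in for CC-CFC counts of orthologous human and mouse enzymes.

```python
def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation between two equally long score lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Hypothetical scores for five ortholog pairs (human, mouse).
human = [91, 60, 45, 30, 12]
mouse = [88, 40, 52, 25, 15]
print(spearman(human, mouse))
```

Because only ranks enter the calculation, the coefficient is insensitive to differences in the absolute number of CFC between the two PLPomes.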

**Fig. 4: HTMLA candidates identified by HTML-OSMES in human and mouse.**

In both rankings, the first hit is the cytosolic serine hydroxymethyltransferase (SHMT1; Shmt1); its mitochondrial version (SHMT2; Shmt2) ranks just after, in second (human) and third (mouse) position, as expected from the strong conservation of active-site residues (Supplementary Fig. 7). Interestingly, it has been shown that E. coli SHMT can act as an aldolase on β-hydroxylated amino acids, especially those with the erythro configuration⁴⁷, which is the configuration adopted by HTML, and it has been proposed that SHMT could be responsible for HTMLA activity in mammals³¹. Further down the ranking, other potential candidates with tested or predicted aldolase activity and belonging to the same KEGG Reaction Classes as HTMLA (RC00312 and RC00721) are found. These are sphingosine phosphate lyase (SGPL1, Sgpl1; EC: 4.1.2.27), an enzyme anchored to the endoplasmic reticulum that catalyzes aldol cleavage forming phosphoethanolamine, and the putative mouse L-threonine aldolase (Tha1), not characterized experimentally but traceable by homology to the yeast low specificity L-threonine aldolase (GLY1, EC: 4.1.2.48). A GLY1 paralog has been genetically characterized as HTMLA in C. albicans³⁰. Other promising candidates are the paralogous enzymes called kynurenine aminotransferases (KYAT1, Kyat1, KYAT3, Kyat3). These enzymes catalyze the transamination of kynurenine into the corresponding α-keto acid. However, they are also able to catalyze β-lyase reactions toward cysteine-S-conjugate substrates (EC: 4.4.1.13), although the reaction mechanism involves deamination unlike HTMLA⁴⁸.

In the catalytic clusters of all the mentioned candidates, ADFR is able to position the PLP cofactor in a binding mode similar to that observed in the available experimental structures of homologous enzymes in complex with PLP (Supplementary Fig. 8). In all four SHMTs and in Tha1, the lowest-energy conformations of HTML-PLP in the catalytic cluster have the Cα-Cβ bond more perpendicular than in the other enzymes (Fig. 4b, Supplementary Fig. 9). By contrast, in the case of both SGPL1 and Sgpl1 (Supplementary Fig. 9), and all KYATs (Fig. 4b, Supplementary Fig. 9), the Cα-COOH (χ₃ < χ₁ > χ₂) and Cα-Hα (χ₁ < χ₃ > χ₂) bonds, respectively, are the most perpendicular and therefore in an unfavorable conformation for aldol cleavage.

In all four KYATs and Tha1, visual inspection of the docked complexes revealed the presence of an aromatic cage (Fig. 4b, Supplementary Fig. 9), characteristic of proteins that bind N-trimethylated substrates, establishing hydrophobic and cation-π interactions with the trimethyl ammonium group⁴⁹. The constant presence of a quaternary amine group in the intermediates of carnitine biosynthesis (Fig. 3a) suggests that an aromatic cage could be a structural feature of all enzymes of the pathway, as evidenced by the BBD structure in complex with γ-butyrobetaine⁵⁰, the conservation of the corresponding residues in its homolog TMLD, and the binding mode predicted by docking of the substrate in the TMABADH active site (Supplementary Fig. 10).

Biochemical validation of HTML-OSMES candidates

For the above reasons, screening candidates KYAT1, SGPL1, SHMT1 and SHMT2 from Homo sapiens, and Kyat3 and Tha1 from Mus musculus were chosen for experimental validation. In addition, we considered screening candidates without previous evidence of aldolase or β-lyase activity: human ABAT as an example of a high-ranking hit, mouse Thnsl2 and Oat as mid-ranking hits, and human PSAT1 as a low-ranking hit.

Each protein was produced using optimized conditions in recombinant form to be assayed for HTMLA activity. We obtained soluble expression for all the proteins with the exception of ABAT. Recombinant SHMT1, SHMT2, KYAT1, PSAT1, Kyat3, Thnsl2 and Oat were obtained in pure and soluble form after overexpression in E. coli (Supplementary Fig. 11a–d; insets). In order to obtain recombinant SGPL1 and Tha1 in the soluble form (Supplementary Fig. 11e, f; insets), they were co-expressed with chaperones (GroEL/GroES) as truncated forms without the N-terminal membrane anchor and mitochondrial signal (Supplementary Fig. 12; see Methods). All the enzymes showed the typical spectrum of protein-bound pyridoxal phosphate in the ketoenamine tautomer, with a peak around 400–430 nm (Supplementary Fig. 11).

Stereospecific (2S,3S) HTML for the activity assays was obtained enzymatically from chemically-synthesized TML (see Methods) by exploiting the first reaction of the pathway (Supplementary Fig. 13). The activity assays show that SHMT1, SHMT2 and Tha1 catalyze the aldol cleavage of HTML; on the contrary, KYAT1, Kyat3, PSAT1, Thnsl2, Oat and SGPL1 are catalytically inactive towards HTML (Fig. 5).

**Fig. 5: Experimental validation of HTML-OSMES candidates.**

Human SHMTs catalyze the aldol cleavage of HTML

In the ¹H NMR spectrum of HTML after addition of SHMT1, the increase of a singlet at 3.55 ppm corresponding to glycine α-protons is visible (Fig. 5a), clearly appearing after 60 min of reaction. TMABA formation is confirmed by 2 distinctive signals at 9.63 ppm and 5.05 ppm of the carbonyl proton and its hydrated form (geminal diol), respectively (Supplementary Fig. 14a).

Kinetic characterization of HTML cleavage catalyzed by SHMT1, carried out by a continuous spectrophotometric coupled assay that exploits the NAD⁺ reduction signal at 340 nm in the presence of the third enzyme of the pathway (TMABADH), shows a dependence of the initial velocities on substrate concentrations following Michaelis-Menten kinetics (Fig. 5b). The fitting of data to the Michaelis-Menten equation reveals a catalytic efficiency (k_cat/K_m) of 32.17 ± 5.34 s⁻¹ M⁻¹ (Supplementary Table 4). We also characterized the enzymatic activity of SHMT2 by spectrophotometric assay (Supplementary Fig. 15i), and measured a lower catalytic efficiency (6.23 ± 1.26 s⁻¹ M⁻¹) compared to SHMT1 (Fig. 5c). In fact, despite a lower K_m (0.80 ± 0.16 mM vs 3.79 ± 0.44 mM), SHMT2 is penalized by a worse k_cat (0.005 ± 0.000 s⁻¹ vs 0.122 ± 0.006 s⁻¹). The aldolase activity of human SHMTs towards HTML was not affected by the presence of tetrahydrofolate, a cofactor in the hydroxymethyltransferase reaction catalyzed by the enzyme (Supplementary Fig. 16).
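As a quick consistency check, the catalytic efficiencies can be recomputed from the reported k_cat and K_m values after converting K_m from mM to M. The sketch below also includes the Michaelis-Menten rate law for reference; the function names are illustrative, and the small differences from the reported efficiencies are expected because the inputs are rounded.

```python
def michaelis_menten(substrate_mM, kcat_per_s, km_mM, enzyme_conc=1.0):
    """Initial velocity v = kcat * [E] * [S] / (Km + [S])."""
    return kcat_per_s * enzyme_conc * substrate_mM / (km_mM + substrate_mM)


def catalytic_efficiency(kcat_per_s, km_mM):
    """kcat/Km in s^-1 M^-1, with Km converted from mM to M."""
    return kcat_per_s / (km_mM * 1e-3)


# Rounded constants quoted in the text for HTML cleavage.
shmt1 = catalytic_efficiency(kcat_per_s=0.122, km_mM=3.79)
shmt2 = catalytic_efficiency(kcat_per_s=0.005, km_mM=0.80)
print(f"SHMT1: {shmt1:.1f} s^-1 M^-1, SHMT2: {shmt2:.2f} s^-1 M^-1")

# At [S] = Km the velocity is half of kcat * [E].
print(michaelis_menten(3.79, 0.122, 3.79))
```

The recomputed values agree with the reported efficiencies within their stated uncertainties, which illustrates how the lower K_m of SHMT2 fails to compensate for its much lower k_cat.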

Mouse threonine aldolase (Tha1) shows higher HTMLA activity than human SHMTs

The ¹H NMR spectrum of HTML after the addition of Tha1 shows peaks with the same chemical shift observed in the reaction with SHMT1, but in higher quantities (Supplementary Fig. 14b), suggesting the same enzymatic activity, but a different efficiency for the two enzymes. A small upfield shift is visible in the main peak of the trimethylated ammonium protons at 3.11 ppm (Supplementary Fig. 14b).

Kinetic characterization of Tha1 by the same spectrophotometric assay as SHMT1, and fitting to the Michaelis-Menten equation (Supplementary Fig. 15j), resulted in a k_cat of 2.311 ± 0.029 s⁻¹ and a K_m of 0.169 ± 0.009 mM. Comparison with SHMT1 shows better values for both Tha1 constants and a k_cat/K_m (1.36 × 10⁴ s⁻¹ M⁻¹) about 400 times greater (Fig. 5c). To test the substrate specificity of Tha1, we evaluated the activity of the enzyme with other β-hydroxylated amino acids: L-threonine and L-allo-threonine (Fig. 5d). The enzyme showed activity on both L-threonine and L-allo-threonine, but not with the D-enantiomers. However, the preferred substrate of Tha1 is HTML with a catalytic efficiency in the order of 10⁴ s⁻¹ M⁻¹, followed by L-allo-threonine (10² s⁻¹ M⁻¹) and L-threonine (10¹ s⁻¹ M⁻¹) (Fig. 5e). These results suggest that Tha1 has a catalytic preference for β-hydroxylated L-amino acids with the erythro configuration. With respect to L-allo-threonine, the reaction with HTML has a similar k_cat but a 50-fold lower K_m (Supplementary Fig. 15j, c), suggesting a higher affinity for the intermediate of the carnitine pathway. The two human SHMTs have a similar preference for substrates with the erythro (S,S) configuration, but are much more efficient with L-allo-threonine (~10⁴ for SHMT1, ~10² for SHMT2) than with HTML (Fig. 5e; Supplementary Fig. 15a, b; Supplementary Table 4), which possesses a bulkier side chain (Fig. 5d).

To verify whether the preference of Tha1 for the HTML substrate is a feature of threonine aldolase proteins of organisms with the carnitine biosynthesis pathway, we tested the activity of the low-specificity threonine aldolase eTA⁵¹ from E. coli, which, like other bacteria, does not have carnitine biosynthesis. Recombinant eTA was produced in intact form in the homologous host. Characterization of its catalytic efficiency for L-allo-threonine and HTML showed high activity with both substrates, with a slight preference for L-allo-threonine (Supplementary Fig. 15h, d).

HTML is a competitive inhibitor of KYAT1

Although KYAT1 is unable to catalyze the aldol cleavage on HTML, the good binding energies obtained with the screening suggest potential binding at the active site. We thus wanted to test if HTML can inhibit KYAT1 activity on L-kynurenine.

In the presence of an α-keto acid, L-kynurenine is converted by KYAT1 to the corresponding keto acid (4-(2-aminophenyl)-2,4-dioxobutanoate), which rapidly cyclizes to kynurenic acid (Supplementary Fig. 17a). By measuring the spectrophotometric signal at 310 nm of the final product, we were able to observe the progress of the reaction in the absence and in the presence of HTML (Supplementary Fig. 17b, c). After the addition of 0.5 mM of HTML to the reaction mixture, a slowdown of the reaction is observed (Supplementary Fig. 17c), suggesting an inhibitory action. We characterized the initial velocity of kynurenine transamination with increasing concentrations of HTML. The Lineweaver-Burk double reciprocal primary plot shows a family of straight lines intersecting on the y axis, typical of competitive inhibition with a constant V_max and an increasing apparent K_m (Fig. 5f). A K_i value of 4 mM was determined by the secondary plot (Supplementary Fig. 17d).
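The signature of competitive inhibition described above follows from the standard relation K_m,app = K_m (1 + [I]/K_i), which changes the slope of the double reciprocal line but not its intercept. The sketch below makes this explicit; the K_i of 4 mM comes from the text, whereas the V_max and K_m values are placeholders, since the kinetic constants of KYAT1 for L-kynurenine are not quoted in this section.

```python
def apparent_km(km, inhibitor_conc, ki):
    """Apparent Km under competitive inhibition: Km * (1 + [I] / Ki)."""
    return km * (1.0 + inhibitor_conc / ki)


def lineweaver_burk(vmax, km, inhibitor_conc, ki):
    """Return (slope, intercept) of the line 1/v = slope * (1/[S]) + intercept."""
    return apparent_km(km, inhibitor_conc, ki) / vmax, 1.0 / vmax


KI_HTML_MM = 4.0      # from the secondary plot reported in the text
VMAX, KM = 1.0, 1.0   # placeholder values (arbitrary units, mM)

for html_mM in (0.0, 0.5, 2.0, 4.0):
    slope, intercept = lineweaver_burk(VMAX, KM, html_mM, KI_HTML_MM)
    print(f"[HTML] = {html_mM} mM: slope = {slope:.3f}, intercept = {intercept:.3f}")
```

With these relations, 0.5 mM HTML raises the apparent K_m by a factor of 1.125, and an inhibitor concentration equal to K_i doubles it, while all lines share the same y-axis intercept 1/V_max.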

Crystal structure of mouse Tha1 improves HTML-OSMES results

Although the AlphaFold models in our screening are of high quality overall, there is a disparity in the dataset as evidenced by the different RMSD (root-mean-square deviations) with respect to the templates used for oligomer reconstruction (Supplementary Fig. 18). These differences depend on the availability of experimental structures from the same or closely related species. For instance, in the case of KYAT, SGPL, and SHMT, PDB structures are available from various mammals, including humans⁵²,⁵³,⁵⁴,⁵⁵ and mouse⁵⁶,⁵⁷, whereas in the case of Tha1, only PDB structures from distant bacterial homologs are available⁵¹,⁵⁸. To verify whether the results of our screening for Tha1 are confirmed or improved with the availability of an experimental structure, we decided to determine the crystal structure of mouse Tha1.

Mouse Tha1 crystallizes in two space groups, orthorhombic F222 and monoclinic C2, with one molecule and two molecules in the asymmetric unit (ASU), respectively. The PLP cofactor is visible only in the monoclinic structure; however, the active site is very similar in the two cases, with only minor differences. The expected tetrameric quaternary structure is formed by crystallographic symmetries, with four identical units in F222 (related by a 222 symmetry) and two identical dimers in C2 (related by a two-fold axis). The RMSD values between the single units (around 0.26–0.28 Å, Supplementary Table 5) indicate that the monoclinic and orthorhombic structures are similar. The tetrameric assembly is also conserved in the two space groups, with two main interfaces (Fig. 6a). As indicated by data from PISA analysis (Supplementary Table 6), the interface between units A and B (analogous to that between units C and D, termed "main interface") contributes more strongly to the stability of the quaternary structure than the interface between units A and C (analogous to that between units B and D, termed "secondary interface"). Hence, the tetramer can be considered a dimer (AB + CD) of dimers (A + B and C + D), with the first dissociation being ABCD to AB + CD (as determined by PISA). A comparison with the structure of the Thermotoga maritima threonine aldolase (PDB code 1M6S) returned RMSD values between 1.03 and 1.17 Å for the single units (Supplementary Table 5), indicating a significant structural difference even though the secondary structures and the whole quaternary assembly are conserved. The major structural difference is related to an insertion of 10 residues in Tha1 at positions 337–346. In the two enzymes the position of the PLP cofactor is essentially conserved (Fig. 6b).

In Tha1, the PLP cofactor, bound to Lys242, is stabilized in the active site by a network of hydrogen bonds and salt bridges with the side chains of Asp211, Arg214 and Thr98 from the same unit, and of Lys267 and Arg274 from the adjacent unit (Fig. 6b). His123 makes an aromatic stacking interaction with the PLP pyridine ring, with a relative distance between the rings of 3.7 Å. While the main interface has similar characteristics for the mouse and the T. maritima enzymes (as deduced by PISA analysis, see Supplementary Table 6), the secondary interface shows a higher degree of variability. Despite similar buried area values, around 990–1060 Å², it contributes more strongly to the stability of the tetrameric assembly in the T. maritima enzyme, with much higher values in ΔG^int (the solvation free energy gain upon formation of the assembly), ΔG^int P-value, and the Complexation Significance Score (CSS). The secondary interface has a more hydrophobic nature in the T. maritima enzyme while it is more polar in the mouse enzyme. As a consequence, the tetrameric assembly of the mouse enzyme has a lower stability, with a ΔG^diss (the free energy of assembly dissociation) value of 5–6 kcal/mol compared to the 40.9 kcal/mol of the bacterial enzyme (for the ABCD to AB + CD dissociation). We performed SEC-SAXS experiments that show the presence of a single component with a MW compatible with that of the sum of 4 units, indicating that the mouse enzyme, despite the lower stability, is tetrameric in solution (Supplementary Fig. 19).

**Fig. 6: Crystal structure of mouse Tha1 improves HTML-OSMES results.**

We repeated the docking screening by including in the data set the crystallographic structure of Tha1. The HTML-OSMES results show an increase in CFCs compared with what was obtained with the AlphaFold model. The catalytic cluster contains 91 CFC, compared with 51 in the previous analysis (Fig. 6d, f; Supplementary Table 7). By comparing the two structures, some differences are observed in the side chains of the substrate binding residues (Fig. 6c). There are minor differences in the chain containing the catalytic lysine (e.g. Arg372A), while differences in the position of the residues contributed by the other chains (Tyr168C, Tyr69B) are more pronounced, suggesting that they result mainly from subunit assembly. Most importantly, many more conformations of the entire docking analysis with the crystal structure have the relevant bond nearly perpendicular to the plane of the PLP (0°) (Fig. 6e, g), most of which have |sin(χ₂)| ≥ 0.95 (gray area). The number of CC-CFC obtained by HTML-OSMES with the experimental structure would have allowed Tha1 to place second in the mouse ranking. Although to a lesser extent, an increase of the CC-CFC value was also obtained with the AlphaFold model⁵⁹ built with the addition of the Tha1 experimental structure as template (Supplementary Fig. 20; Supplementary Table 7).
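The threshold on the sine of the angle can be applied to a list of per-pose angles as in the short sketch below. The angle values are invented, and the sketch assumes angles expressed in degrees; it does not reproduce the angle convention of the figure.

```python
import math


def count_above_sine_threshold(angles_deg, threshold=0.95):
    """Count poses whose angle satisfies |sin(angle)| >= threshold."""
    return sum(abs(math.sin(math.radians(angle))) >= threshold
               for angle in angles_deg)


# Hypothetical per-pose angles from two docking runs (degrees).
alphafold_poses = [90, 60, 45, 30, 10, -20]
crystal_poses = [90, -85, 80, 75, 72, 30]
print(count_above_sine_threshold(alphafold_poses),
      count_above_sine_threshold(crystal_poses))
```

A cutoff of 0.95 on the absolute sine corresponds to angles within about 18° of ±90°, so the filter accepts both orientations of the bond relative to the ring.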

Extension of the OSMES procedure to other enzymes

To test the possibility of extending the OSMES procedure to a different group of enzymes, we decided to apply OSMES to aldehyde dehydrogenases (EC: 1.2.1.-), a large protein family sharing a common catalytic mechanism (Supplementary Fig. 21). A member of the aldehyde dehydrogenases, TMABADH, is functionally related to HTMLA, as it catalyzes the subsequent reaction in the biosynthetic pathway (see Fig. 3a). Also in this case we modeled an initial step of the reaction mechanism involving the nucleophilic attack by an active site cysteine on the substrate aldehyde carbon (Supplementary Fig. 21a), with the formation of a covalent intermediate⁶⁰. The subsequent transfer of a hydride ion (H⁻) from this thioester intermediate results in the reduction of NAD(P)⁺ to NAD(P)H.

We considered substrate orientations in the active site as CFC when the distance between the aldehyde carbon and the catalytic cysteine thiolate was ≤3.5 Å, which is regarded as an upper limit for near attack conformation⁶⁰,⁶¹. Using this condition, we applied OSMES to human and murine aldehyde dehydrogenases encompassing two different PFAM domains (Gp_dh_N: PF00044 and Aldedh: PF00171), with a total of 20 and 22 enzymes for human and mouse, respectively (Supplementary Table 10). Since in these proteins the active site is enclosed within monomeric units, the oligomerization step was not performed.
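With a purely distance-based condition, selecting the catalytic cluster and its CC-CFC score reduces to counting qualifying poses per cluster. The sketch below assumes a hypothetical input layout in which each pose carries its cluster identifier, the aldehyde carbon coordinates, and the cysteine sulfur coordinates for that pose; none of these names come from the pipeline itself.

```python
import math


def catalytic_cluster(poses, max_dist=3.5):
    """Return (cluster_id, n_cfc) for the cluster with the most CFC poses.

    poses : iterable of (cluster_id, aldehyde_carbon_xyz, cysteine_sg_xyz).
    A pose counts as CFC when the carbon-sulfur distance is <= max_dist
    angstroms. Returns (None, 0) if no pose qualifies.
    """
    counts = {}
    for cluster_id, carbon, sulfur in poses:
        if math.dist(carbon, sulfur) <= max_dist:
            counts[cluster_id] = counts.get(cluster_id, 0) + 1
    if not counts:
        return None, 0
    best = max(counts, key=counts.get)
    return best, counts[best]


# Toy docking output with the cysteine sulfur at the origin.
sg = (0.0, 0.0, 0.0)
toy_poses = [
    (1, (3.0, 0.0, 0.0), sg),
    (1, (0.0, 3.4, 0.0), sg),
    (1, (4.0, 0.0, 0.0), sg),   # too far, not a CFC
    (2, (1.0, 1.0, 1.0), sg),
    (3, (10.0, 0.0, 0.0), sg),  # too far, not a CFC
]
print(catalytic_cluster(toy_poses))
```

The returned count plays the role of the CC-CFC score used to rank enzymes; an enzyme with no qualifying pose receives a score of zero.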

Six different aldehyde molecules, which are known substrates of eight different enzymes, were used as positive controls for validation (Supplementary Fig. 21b). We observed a performance similar to that with PLP-dependent enzymes, with CC-CFC as the best ranking method (AUROC score = 0.86) and the energy-based methods (LCE, BCE) as a close second best. Conversely, the ranking methods based on the number of conformations (LCC, BCC) seem to lack discriminative power (Supplementary Fig. 21c, d). In various instances the conformation of the best cluster (BC) corresponded to a catalytically favorable conformation (Supplementary Fig. 21e).
