Structural and biochemical analysis of family 92 carbohydrate-binding modules uncovers multivalent binding to β-glucans

Family 92 carbohydrate-binding modules are commonly appended to glycoside hydrolases

The recent establishment of CBM92 as a family is supported by sequence comparison with other families. Indeed, in our phylogenetic analysis, family CBM92 forms a distinct clade with a high bootstrap value (Supplementary Fig. 1). CBM92 domains are found in multi-modular proteins that in almost every case include at least one identifiable GH or polysaccharide lyase (PL) (Fig. 2). This suggests that CBM92 domains may assist carbohydrate-degrading activity by promoting enzyme contact with substrate. Indeed, the founding member of the family is a carrageenan-binding module appended to the κ-carrageenase enzyme Cgk16A⁶ produced by the marine bacterium W. aestuarii, a species that appears to be proficient at metabolising marine polysaccharides^10,28. Carrageenan has structural features such as variable sulphation and anhydro-sugar moieties that are not found in terrestrial glycans²⁹. Yet our preliminary phylogenetic investigations into CBM92 uncovered sequences from a broad range of non-aquatic microbes, suggesting that marine polysaccharides are not the only binding targets.

**Fig. 2: Phylogenetic depiction of the multi-modular proteins that contain CBM92 domains.**

Full protein sequences were aligned at the CBM92 domain, clustering proteins by domain architecture, and the phylogeny was analysed by maximum likelihood (IQ-TREE web server, 1000 replicates). Bootstrap value is shown as branch thickness. Eubacteria, Eukaryota, and Archaea are shaded light blue, green, and yellow, respectively. Coloured squares on the outer ring indicate the phylum: Bacteroidota, Terrabacteria, Pseudomonadota, and PVC group are shown in blue, light blue, pink, and purple, respectively. Pictograms depict the domains found in multi-modular proteins: see shape and colour key on the figure. Protein names contain abbreviated species names followed by the number of amino acids: see abbreviations and corresponding accession numbers in Supplementary Table 1. Protein names are coloured brown or blue to indicate that the host species is found in soil or water, respectively; black means unknown. Light brown and light blue indicate soil or water environments with close association to plants.

Using a CBM92 sequence from a soil bacterium as the search input, we identified 164 domains from 163 modular proteins as belonging to family CBM92, non-redundant at the genus level. Based on our analysis (Fig. 2), the family is mainly distributed among the Eubacteria, with some rare examples in Eukaryota and Archaea. Most CBM92-encoding species are found in soil, fresh water, and ocean ecosystems, including ocean sediment (Fig. 2). Among the Eubacteria, CBM92 is especially enriched in the phylum Bacteroidota, but can also be found among Pseudomonadota (formerly known as Proteobacteria), Terrabacteria, and in the PVC group (Planctomycetota, Verrucomicrobiota, and Chlamydiota). Approximately half of the CBM-containing multi-modular CAZyme proteins in our analyses are predicted to be secreted via the Bacteroidota-specific Type IX Secretion System (T9SS), as they possess the C-terminal domain that marks a protein for secretion via this pathway^30,31, which has previously been highlighted as important for the secretion of polysaccharide-degrading enzymes^22,32. A rare case of a eukaryotic CBM92-containing protein is a GH1 enzyme found in four Eudicot plant species, which carries the binding domain at its N-terminal end. The only animal genome that seems to encode a CBM92 domain is that of the wood-feeding termite Coptotermes formosanus. Indeed, both our analysis and a previous transcriptomic study³³ suggest the occurrence of a protein in that species that contains a CBM92 and a CBM13 domain linked to a putative hemicellulose-degrading enzyme.

Of note, the conserved ligand specificity we find for CBM92 proteins (discussed below) contrasts with the apparent diversity in substrates targeted by the enzymes attached to these modules, which are predicted to include GH18 chitinases, GH16 β-1,3-glucanases and carrageenases, GH25 lysozymes, GH99 α-mannanases, and GH30 β-1,6-glucanases, as well as potentially highly diverse specificities from the multi-functional family GH5³⁴. Generally, we see that the CBM92 domain is closely attached to its enzyme partner, to which it is connected in most cases via a short linker of fewer than 20 amino acids.

Sequences for CBM92 domains were extracted from full-length multi-modular protein sequences, and an independent evolutionary analysis was performed. The CBM92 domains are 125–150 amino acids long, and share an overall sequence identity of ≥ 37%. In the evolutionary tree of CBM92 (Supplementary Fig. 2), at least three distinct clades are seen, corresponding to the Eukaryota, Archaea, and Eubacteria, and within Eubacteria a distinct sub-clade of sequences derives from the Terrabacteria taxon. Since there are many Bacteroidota encoding one or more CBM92-containing protein(s), the domains likely entered Bacteroidota genomes at an early stage of evolution and then diverged. Conversely, few CBM92 domains occur in Pseudomonadota, and these do not form a distinct clade, which is inconsistent with the general evolutionary tree for these taxa³⁵, and may indicate that for these species, CBM92 domains were acquired more recently via horizontal gene transfer.
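For readers who wish to reproduce an identity figure of this kind from their own alignment, the calculation can be sketched as follows. This is a minimal illustration and not the pipeline used in this study: the function name `percent_identity`, the convention of excluding gapped columns from the denominator, and the toy aligned fragments are all assumptions made for the example (other tools normalise by alignment length or by the shorter sequence, which gives different values).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are left out of the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments (not real CBM92 sequences).
seq_1 = "GKWEAF-TNG"
seq_2 = "GRWEQFSTNG"
print(f"{percent_identity(seq_1, seq_2):.1f}% identity")
```

Applied to every pair of rows in a multiple alignment, the minimum of these values would correspond to a family-wide lower bound such as the one quoted above, provided the same denominator convention is used.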

CBM92 proteins have three repeats defined as distinct subdomains, each with a conserved motif

Twelve CBM92 domains were selected for further analysis. Targets were chosen from species found in diverse habitats, while sampling sequence diversity from around the phylogenetic tree shown in Fig. 2. Furthermore, in their native multi-modular proteins, the selected domains are appended to GH enzymes from a number of different families (Fig. 2). Seven were chosen from the reasonably well-studied soil bacterium C. pinensis, which has one of the largest genomes and the highest number of CAZyme-encoding genes among Bacteroidota sequenced to date^1,22,36. The C. pinensis domains analysed are appended to GH enzymes from families 5, 16, 18, and 99, which together cover a broad range of potential enzyme substrates³⁷. A further two domains were selected from the seawater-isolated Aquimarina aggregata³⁸, both of which are appended to putative enzymes, with an additional CBM6 module in the full-length protein that contains AaCBM92A. One CBM92 domain was selected from each of Draconibacterium mangrovi (isolated from river sediment in China³⁹) and Pyxidicoccus caerfyrddinensis (isolated from soil in Caerfyrddin/Carmarthen in Wales⁴⁰): DmCBM92A is appended to GH5 and GH25 domains, while PcCBM92A is attached to a GH16 domain. Finally, a CBM92 was selected from Euryarchaeota archaeon to explore the potential for functional binding in an archaeal representative.

From a sequence alignment of these 12 selected CBM92 domains, three repeat regions are observed and are named subdomains α, β, and γ (Fig. 3). The region of sequence highlighted in pink in Fig. 3 is conserved across all 164 CBM92 domains in our phylogeny. Secondary structure prediction suggests an enrichment in β sheets, indicating a β-trefoil structure, also found in e.g. CBM13⁴¹. A highly conserved “WExF” sequence motif is present at the C-terminal end of each subdomain (Fig. 3). Interactions between carbohydrates and aromatic amino acids such as Trp are frequently important for CBMs^27,42. We therefore speculated that the CBM92 proteins identified here have three binding sites each, centred around the three Trp residues of the “WExF” motifs. A survey of other CBM92 proteins in our phylogeny shows that the occurrence of three WExF motifs is widespread, although the Trp is lacking in one or more sites for some proteins (discussed below). Interestingly, the WExF motif is not found at all in the previously characterised carrageenan-binding protein⁶. Two Phe residues were suggested to be important for ligand binding in that protein, proposed to form a hydrophobic platform with support from a well-conserved Arg⁶. An alignment of the known and putative carrageenan-binders identified by Mei et al. with the proteins under analysis here shows that one of these Phe residues corresponds to the second WExF motif we find in almost all CBM92 proteins (Supplementary Fig. 3a, b). Our alignment further indicates that the carrageenan-binding proteins likely only have one binding site per protein, and that they represent a small sub-group within the family. These striking differences suggest that there are distinct modes of binding within the family, which warrants a further investigation of the binding specificities of CBM92.
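Counting WExF motifs in a domain sequence, as in the survey described above, amounts to a simple pattern search. The sketch below is illustrative only and assumes the lowercase "x" denotes any single residue; the function name `find_wexf` and the toy sequence are hypothetical and do not come from the proteins analysed here.

```python
import re
from typing import List, Tuple

# Assumption: 'x' in WExF is any one residue, so the motif is Trp-Glu-any-Phe.
WEXF = re.compile(r"WE[A-Z]F")


def find_wexf(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the Trp, matched tetrapeptide) for each hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(0)) for m in WEXF.finditer(seq)]


# Hypothetical toy sequence carrying three motifs (not a real CBM92 domain).
toy = "MSTGKWEAFNNGSLTAPDGRWEQFTVKDLGNGYYALKSAWEKFDLQ"
hits = find_wexf(toy)
print(hits)
print("putative Trp-containing sites:", len(hits))
```

A domain in which one repeat has lost its Trp would simply return fewer hits, which is how a count of this kind can flag naturally site-deficient family members for closer inspection.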

**Fig. 3: Sequence logo, secondary structure, and subdomains displayed on the alignment of twelve CBM92 domains.**

CBM92 domains bind to polysaccharides containing the Glc-β-1,6-Glc disaccharide unit

Gene segments encoding the 12 selected CBM92 domains were cloned and expressed as single-domain constructs in E. coli prior to purification. SDS-PAGE analysis confirmed successful production and purification for all recombinant domains (Supplementary Fig. 4). Carbohydrate binding was first investigated via pull-down assays and affinity gel electrophoresis using polysaccharides from diverse plant and microbial sources (see Materials and Methods for a full list of ligands tested). The heat map shown in Fig. 4 summarises the results of these binding assays, and the corresponding data can be found in Supplementary Fig. 5. The domains we tested show a consistent affinity for binding to polysaccharides containing the Glc-β-1,6-Glc linkage, namely pustulan (linear β-1,6-glucan), as well as laminarin, scleroglucan and yeast β-glucan (all consisting of β-1,3-glucan chains substituted with β-1,6-linked glucosyl residues). In some cases, there was some binding to lichenan, which comprises β-1,3- and β-1,4-linked glucosyl residues. Of note, DmCBM92A, which naturally lacks two of the binding-site Trp residues we suggest are necessary for binding, did not noticeably bind to any of the tested polysaccharides except laminarin in this qualitative assay, although later experiments could measure some binding to yeast β-glucan (discussed below).

**Fig. 4: Qualitative binding determination of diverse CBM92 domains (left labels) to various polysaccharide ligands (top labels).**

Structural analysis reveals a β-trefoil fold with three carbohydrate binding sites

To probe the mode of binding of CBM92 domains, we determined the structures of the C. pinensis proteins CpCBM92A and CpCBM92B by macromolecular crystallography. As predicted by sequence analysis, both proteins form a β-trefoil structure composed of 12 β-strands arranged into 3 subdomains (α, β, and γ), similar to β-trefoil domains found in Fascin and CBM13 proteins^9,41 (Fig. 5a, b). Soaking experiments of the CpCBM92B protein crystals with glucose, gentiobiose (G2: Glc-β-1,6-Glc), and sophorose (S2: Glc-β-1,2-Glc) revealed a binding cleft within each subdomain comprising a Trp-Glu binding motif, again implying three polysaccharide binding sites per protein (Fig. 5c). Adding either G2 or S2 to the protein crystals led to binding of the non-reducing end sugar in the binding cleft. The electron density for the reducing end sugar was observable but difficult to model accurately, although it notably projected away from the protein (Supplementary Fig. 6). This suggests the capacity for end-on binding to glucose monosaccharides and glucan oligo/polysaccharides of potentially any linkage type. In each ligand complex, the glucosyl unit stacks with the conserved Trp, with the O3 and O4 of the sugar positioned by hydrogen bonding with the Oε1 and Oε2 of the conserved Glu. In the binding site of CpCBM92B subdomain β, the protein further interacts with the glucosyl unit through the guanidine group of Arg955 with the O2 of the sugar, and through the carbonyl of a succinimide formed in place of Asp959 with the sugar O6 (Fig. 5c and Supplementary Fig. 7). Succinimide can form as a result of cyclising dehydration from nucleophilic attack of the main-chain N atom on the γ-carbon of Asn and Asp side chains^43,44, and is rarely seen in protein structures. Indeed, only 45 protein entries containing this chemical group are currently reported in the PDB⁴⁵. In our investigation it was found only in the β-subdomain of CpCBM92B and it may be an artefact of protein production or crystallisation. Collectively, the binding modes observed with the ligand complexes reveal the possibility for extensions from both the O1 and O6, presumably enabling binding along a β-1,6-glucan chain such as in pustulan, and additionally binding to β-1,6-linked glucosyl substitutions in, for example, scleroglucan or laminarin. The binding cleft Arg residue in the β-subdomain of CpCBM92B is found in subdomains β and γ in both CpCBM92A and -B, but is substituted with a Ser in the binding clefts of subdomain α in both proteins (Fig. 5d). This substitution in the α site leads to a substantial increase in accessibility around the glucosyl unit’s O2, which may permit binding to oligo- or polysaccharide extensions from this position. In the paper by Mei et al. describing Cgk16A, the founding member of family CBM92, the authors propose that a conserved Arg may be responsible for interacting with the sulphate groups of that protein’s carrageenan ligand⁶, but our data indicate that it contributes to binding to non-sulphated glycan ligands as well (Supplementary Figs. 3 and 6).

**Fig. 5: Structural analysis of two CBM92 domains reveals three subdomains and three potential ligand binding sites.**

Structural comparison with homologues

CpCBM92A and CpCBM92B share structural similarity with β-trefoil proteins from CBM13, a multivalent family that includes single-domain galactose- or mannose-binding plant lectins as well as CBM domains found within larger CAZymes. Structural homologues to our CBM92 domains include the ricin B-like agglutinin domain from Marasmius oreades⁴⁶, an arabinose-binding CBM domain in a GH27 β-L-arabinopyranosidase from Streptomyces avermitilis⁴⁷, the CBM domain in CEL-III from Cucumaria echinata⁴⁸, the xylose/xylan-binding CBM domain in the xylanase Xyn10A from Streptomyces olivaceoviridis E-86⁴⁹, and actinohivin from Longispora albida K97-0003T⁵⁰. Structural alignment with these proteins yields Cα root mean square deviation values of 1.5 to 2.5 Å despite low (8–20%) sequence identity. The ligand binding regions in CBM13 are also found in similar surface-exposed clefts, with each protein containing three equivalent clefts as part of the trefoil fold. All of these proteins use an aromatic residue and an acidic residue to mediate ligand binding. However, the families differ in the origin of those residues, which ultimately leads to substantially different ligand binding modes (Supplementary Fig. 8). For example, the ricin B-like agglutinin domain from M. oreades, the CBM domain in β-L-arabinopyranosidase from S. avermitilis, the CBM domain in CEL-III from C. echinata, and the CBM domain in Xyn10A from S. olivaceoviridis E-86 all contain acidic residues originating from β2 and aromatic residues originating from β3 of the subdomains, effectively shifting the principal binding site by more than 5 Å compared to CpCBM92A and CpCBM92B. Other CBM13 members, such as actinohivin from L. albida K97-0003T, also use an acidic residue from β2 but their aromatic residues reside on a loop, or small helical section, preceding β4 of the subdomain. In CBM92, the aromatic residue originates from the loop preceding β4, but distinctively the acidic residue also originates from this loop, leading to the principal binding site being perpendicular to that observed in CBM13 members such as actinohivin. Collectively, while all the proteins comprise a similar overall fold and use similar residues to mediate binding, the location of the residues leads to distinct ligand binding modes.
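The Cα root mean square deviation quoted in this comparison is the square root of the mean squared distance between paired Cα atoms. The following sketch shows only that final arithmetic step. It assumes the two structures have already been superposed and their residues paired by a structural alignment tool; the function name `ca_rmsd` and the three-atom toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD in the input units (e.g. angstroms) between paired, pre-superposed atoms."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three Ca atoms, second copy displaced by 1.5 along z.
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.5), (3.8, 0.0, 1.5), (7.6, 0.0, 1.5)]
print(f"Ca RMSD = {ca_rmsd(structure_a, structure_b):.2f}")
```

The optimal superposition itself (for example by the Kabsch algorithm) is deliberately left out of this sketch, so the value returned is only meaningful once that step has been performed.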

Exploring the functionality and ligand specificity of three putative binding sites in CBM92

The crystal structures with glucose-based ligands provide evidence for chain-end binding to the non-reducing end of a ligand, with space for potential extension at O2 and O6, which would additionally permit mid-chain binding to glycans with those linkages. According to the crystal structures, mid-chain binding to e.g. β-1,3-glucan or β-1,4-glucan would not be possible. This matches our observations from the qualitative polysaccharide binding assays described above, which suggested some linkage-based selectivity in ligand binding. We used isothermal titration calorimetry (ITC) to explore the binding affinities of CpCBM92A to glucose and glucose-based disaccharides. We were able to determine binding parameters for glucose, G2, and S2, while binding to cellobiose (C2) and laminaribiose (L2) could not be reliably measured due to low signal and non-saturating isotherms. These experiments showed stronger binding to G2 and S2 than to glucose, perhaps reflecting the dual potential orientations of the longer ligands in the binding sites. Table 1 shows the parameters of binding determined for CpCBM92A, and the corresponding data can be found in Supplementary Fig. 9.
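To compare ITC-derived affinities on an energy scale, a dissociation constant can be converted to a standard binding free energy with the general relation ΔG° = RT ln(K_D), and the entropic term then follows from ΔG° = ΔH − TΔS. The sketch below illustrates this conversion only. The function names, the temperature of 298.15 K, and the example values are assumptions for illustration and are not taken from Table 1.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy (kJ/mol) from K_D in mol/L (1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("K_D must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


def minus_t_delta_s(delta_g_kj: float, delta_h_kj: float) -> float:
    """Entropic contribution -T*dS (kJ/mol), from dG = dH - T*dS."""
    return delta_g_kj - delta_h_kj


# Hypothetical values: a 1 mM binder and a measured enthalpy of -30 kJ/mol.
dg = delta_g_from_kd(1e-3)
print(f"dG = {dg:.2f} kJ/mol, -TdS = {minus_t_delta_s(dg, -30.0):.2f} kJ/mol")
```

Because ΔG° depends on the logarithm of K_D, a tenfold difference in affinity between two ligands corresponds to only about 5.7 kJ/mol at this temperature.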

Table 1 Binding parameters of the interactions between CpCBM92A and three ligands as determined by ITC analysis

To probe the respective functions of the three putative glycan binding sites, a series of modified constructs were generated for CpCBM92A, systematically altering the Trp in each WExF motif. Variants with single (W481A α site, W523A β site, W565A γ site variants), double (W481A/W565A, W481A/W523A, W523A/W565A), and triple (W481A/W523A/W565A) binding site substitutions were produced using site-directed mutagenesis (red stars in Fig. 3 show the positions of the residues modified). The doubly substituted W481A/W523A variant showed no protein production despite optimisation attempts, while the W481A/W565A form proved to be highly unstable during protein production; as a result, these versions of the protein could unfortunately not be purified or characterised. The melting points of CpCBM92A and all successfully produced variants were investigated, and suggested that protein structure was intact in the modified forms, which all showed similar melting point profiles (Supplementary Fig. 4). Pull-down assays revealed that the single mutation variants showed the same binding specificities as the wild type, while the double and triple variants showed impaired or abolished binding (Supplementary Fig. 5a), confirming that there are no further unrecognised binding sites in the protein.

Due to weak binding, satisfactory ITC experiments could not be performed for the variant forms of CpCBM92A. Instead, a series of depletion isotherms were performed using the ligand yeast β-glucan, which comprises a backbone of β-1,3-glucan with regular extended sidechains of β-1,6-linked glucosyl units. Binding curves could not be saturated due to protein precipitation at high concentrations, so accurate K_D values could not be deduced from these data. However, lines of best fit determined using a Langmuir isotherm fitting model are shown to allow a qualitative comparison of binding strengths (Fig. 6). The wild type and all variant forms of CpCBM92A were first assessed, to investigate the relative contribution to binding made by each site (Fig. 6a). The loss of the Trp residue from either the β or γ binding site (W523A and W565A variants, respectively) caused a major shift in apparent binding ability, with the loss of the β site having the most profound effect. This indicates that for CpCBM92A, the β site likely has the strongest affinity for the ligand. We also see that the α site knockout shows only a small loss of binding ability compared to the wild type, but that there is some residual binding in the β/γ site variant W523A/W565A, suggesting that the wild type α site does make some small contribution to binding in the full protein. The α binding site of CpCBM92A differs from the other two in that it lacks an otherwise well-conserved adjacent Arg (Fig. 3) that likely supports binding by interacting with a glucose ligand and by creating a topographic “wall” for the binding site (Supplementary Fig. 5b).
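A one-site Langmuir model of the kind mentioned above has the form bound = B_max × free / (K_D + free). The sketch below shows one simple way to obtain a least-squares line of best fit in pure Python; it is not the fitting software used for Fig. 6. The function names, the search range for K_D, and the synthetic data are assumptions, and for unsaturated curves such as those described here the fitted constants should be read only as qualitative descriptors.

```python
from typing import List, Sequence, Tuple


def langmuir(free: float, b_max: float, k_d: float) -> float:
    """One-site Langmuir isotherm: bound protein as a function of free protein."""
    return b_max * free / (k_d + free)


def _bmax_and_sse(free: Sequence[float], bound: Sequence[float], k_d: float) -> Tuple[float, float]:
    # For a fixed K_D the model is linear in B_max, so B_max has a closed form.
    f: List[float] = [x / (k_d + x) for x in free]
    b_max = sum(y * v for y, v in zip(bound, f)) / sum(v * v for v in f)
    sse = sum((y - b_max * v) ** 2 for y, v in zip(bound, f))
    return b_max, sse


def fit_langmuir(free: Sequence[float], bound: Sequence[float],
                 kd_min: float = 1e-3, kd_max: float = 1e3,
                 steps: int = 4000) -> Tuple[float, float]:
    """Return (B_max, K_D) minimising squared error over a log-spaced K_D grid."""
    if len(free) != len(bound) or len(free) < 2:
        raise ValueError("need at least two paired observations")
    if any(x <= 0 for x in free):
        raise ValueError("free concentrations must be positive")
    best = None
    for i in range(steps + 1):
        k_d = kd_min * (kd_max / kd_min) ** (i / steps)
        b_max, sse = _bmax_and_sse(free, bound, k_d)
        if best is None or sse < best[2]:
            best = (b_max, k_d, sse)
    return best[0], best[1]


# Synthetic, noise-free data in arbitrary units (hypothetical, not from Fig. 6).
free_conc = [0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
bound_amt = [langmuir(x, 2.0, 5.0) for x in free_conc]
b_fit, kd_fit = fit_langmuir(free_conc, bound_amt)
print(f"B_max ~ {b_fit:.2f}, K_D ~ {kd_fit:.2f}")
```

The grid search is used here instead of a gradient-based optimiser because it has no dependence on starting values; its resolution is set by `steps`, and the K_D search range must bracket the true value.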

**Fig. 6: Depletion isotherms of CBM92 domains binding to the insoluble polysaccharide yeast β-glucan.**

Overall, the depletion isotherm data for variant forms of CpCBM92A indicate that a greater number of functional (i.e. Trp-containing) binding sites leads to stronger overall binding to the polysaccharide yeast β-glucan. From these data it is not possible to determine whether this results from merely additive or truly avid binding. As there is some natural variety within CBM92 in the number of Trp-containing binding sites within wild type proteins (Fig. 3), we were motivated to perform depletion isotherms for a series of native proteins with differing binding site sequences (Fig. 6b). We see the weakest binding from DmCBM92A, which only has Trp in the γ site, and gave an isotherm highly similar to that obtained for the β/γ variant W523A/W565A of CpCBM92A. For CpCBM92F and AaCBM92B, which both lack one functional site, binding is compromised compared to wild type CpCBM92A and CpCBM92B, which both have three binding site Trp residues. In short, these data agree with observations from the CpCBM92A variants and show that more Trp-containing binding sites lead to stronger interactions with ligand.

Finally, the label-free technique bio-layer interferometry (BLI) was employed, as this method has proven useful in measuring multivalent carbohydrate–protein interactions^51,52. BLI works best with relatively high molecular weight ligands, although these must be soluble. Previous BLI experiments on carbohydrate–protein interactions mainly used streptavidin sensors⁵³ and biotinylated Fab-conjugated glycans^53,54. In this study, we instead used Ni-NTA sensors, wherein the sensor binds to the His₆ tag on recombinant proteins. The interferometry variation during ligand association/dissociation steps was analysed in real time.

Binding to sophoropentaose (S5), laminarin, and scleroglucan was studied using BLI for CpCBM92A and its variants (Supplementary Fig. 10). Using the S5 ligand at a concentration of 10 µM enabled K_D values to be determined, as presented in Table 2. The α and γ site variants (the W481A and W565A forms, respectively) show a binding profile that is highly similar to that of the wild type CpCBM92A, indicating that the contributions of those sites to overall affinity are very minor. Conversely, the W523A β site variant shows no detectable binding to S5, again confirming that this is the strongest binding site on the protein and that it may be particularly critical with certain ligands. The polysaccharides laminarin and scleroglucan are heterogeneous and polydisperse, so molar concentrations cannot be accurately measured. As a result, K_D values could not be determined for these interactions using BLI (Supplementary Fig. 10). Nonetheless, the general trend in these data echoes that from the depletion isotherm experiments, with stronger binding interactions again correlating with a greater number of intact Trp binding sites (Supplementary Fig. 10). A response value from BLI is measured as a nm shift in the interference pattern and is proportional to the number of molecules bound to the surface of the biosensor. Comparing the maximum response values obtained with laminarin as the ligand indicates that the wild type, α site variant, and γ site variant forms of CpCBM92A saturate at roughly the same ligand concentrations, indicating highly similar binding affinities. By contrast, the β site variant reaches saturation more slowly in terms of ligand concentration, consistent with reduced binding affinity. With scleroglucan as ligand, which could be tested at higher concentrations than sophorose, there is a clear loss of binding in the W565A γ site variant, whereas loss of the α site (W481A) exerts a minimal effect on binding. In the doubly substituted variant where only subdomain α is unchanged from wild type, the binding profile is close to that of the triple variant, showing no binding to laminarin or scleroglucan. Overall, the BLI data re-confirm that the β site contributes the most to CBM affinity for ligand, and indicate that the γ and α sites make lesser contributions to overall binding. Native PAGE analysis of binding to laminarin also indicated that the β binding site is the strongest, as the W523A β site variant showed the greatest reduction in mobility retardation, while the mobility of the W481A and W565A variants more closely resembles that of the wild type protein (Supplementary Fig. 5b). Although the BLI and depletion isotherm studies presented here show that there is some loss of overall binding capacity when the α or γ site Trp is lost, the affinity of these sites for ligand is likely to be comparatively low.
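The relation between kinetic rate constants and K_D in a BLI experiment can be illustrated with the simplest 1:1 binding model, in which K_D = k_off / k_on and the association-phase response rises exponentially towards an equilibrium plateau. The sketch below assumes that model for illustration only; a multivalent protein such as CpCBM92A may not follow it, and the function names and rate constants are hypothetical rather than values from Table 2.

```python
import math


def kd_from_rates(k_on: float, k_off: float) -> float:
    """Equilibrium dissociation constant (M) from k_on (M^-1 s^-1) and k_off (s^-1)."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def association_response(t: float, conc: float, k_on: float,
                         k_off: float, r_max: float) -> float:
    """Sensor response at time t (s) for analyte concentration conc (M), 1:1 model."""
    k_obs = k_on * conc + k_off                    # observed rate constant
    r_eq = r_max * conc / (conc + k_off / k_on)    # plateau response at equilibrium
    return r_eq * (1.0 - math.exp(-k_obs * t))


# Hypothetical rate constants giving K_D = 10 uM, probed at 10 uM ligand.
k_on, k_off = 1.0e3, 1.0e-2
print(f"K_D = {kd_from_rates(k_on, k_off):.1e} M")
for t in (0, 30, 60, 120, 300):
    r = association_response(t, 1.0e-5, k_on, k_off, r_max=1.0)
    print(f"t = {t:>3} s, response = {r:.3f} (fraction of R_max)")
```

At a ligand concentration equal to K_D the plateau is half of the maximum response, which is one reason a concentration near the expected K_D is a convenient choice for kinetic measurements.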

Table 2 Kinetic parameters of the interaction between CpCBM92A variants and S5

Implications of CBM92 binding to β-1,6-glucan

By characterising 12 examples, we have shown that CBM92 domains from distinct microbial species are capable of binding to glucose, gluco-oligosaccharides with β-1,2- or β-1,6-linkages, and to long chain glucans containing β-1,6-linked glucose moieties (pustulan, scleroglucan, yeast β-glucan, and laminarin). Previously characterised examples of CBM92-containing proteins bound to β-1,3-glucan¹¹ and carrageenan⁶: both of those domains bind to the same polysaccharide as their appended enzymes can target, suggesting a likely role in enzyme potentiation². Indeed, our phylogenetic analyses show that a number of CBM92 domains are attached to predicted β-1,6-glucanases from enzyme family GH30 (sub-family 3)⁵⁵, and these may be expected to show the same kind of rate potentiation. The natural substrate for these enzymes may be polymeric pustulan as found in lichenous fungi²⁰ or it may be shorter chains of β-1,6-glucan such as can be found in the cell walls of certain oomycetes¹⁸. However, the β-1,6-glucan-binding CBM92 domains characterised in this work are appended to CAZymes with a range of different predicted activities, suggesting that not every member of the family is involved in direct binding to the substrate of an enzyme partner. As β-1,6-glucosidic linkages are found in the cell walls and secretions of marine plants and soil fungi, it may be that tethering, for example, a chitinase⁵⁶ or β-1,3-glucanase to a complex cell wall substrate matrix does have a rate-enhancing proximity effect in natural systems⁵.

In addition, the potential multivalent nature of CBM92 glycan binding might be significant, as it could lead to the formation of protein-polysaccharide networks that may stabilise enzymes in a manner conceptually similar to the use of immobilisation in industry. In a study characterising a CBM6 protein with two binding sites showing different modes of interaction with the β-1,3-glucan backbone of laminarin, Jam et al. proposed a model for CBM-mediated cross-linking of oligolaminarin chains up to 12 glucosyl units in length⁵⁷. The three binding sites of CBM92, which our data suggest all make some contribution to overall binding, may permit a similar cross-linking of ligands in soil and water environments. The biological implications of this remain unclear, but from a biotechnological perspective, it may suggest that CBM92 domains have use as fusion tags for immobilisation of recombinant proteins on polysaccharide surfaces. Pustulan in particular is a strong candidate for an immobilisation surface, as it is inert and insoluble, and easily recoverable from water by centrifugation or filtration. Additional experiments are needed to determine whether this cross-linking interaction is occurring and if it has a stabilising effect on appended enzymes. In Fig. 7 we depict hypothetical models for how CpCBM92A might interact with the various ligands analysed in this study. The model depicts two potential binding orientations for gentiobiose. If a longer oligosaccharide ligand, such as moderate chain length laminarin, were flexible enough, it may be able to sit in multiple binding sites on one protein, an interaction previously proposed for the bivalent CBM6 protein studied by Jam et al.⁵⁷. A similar phenomenon may be feasible with sophoropentaose, which might be long enough to reach two binding sites on one protein. In addition, with a very long chain ligand such as scleroglucan, a cross-linked protein-polysaccharide network may form if multiple binding sites of one protein interact with different ligand chains.

**Fig. 7: Theoretical model of CpCBM92A binding to diverse β-glucans.**
