Friday, September 20, 2024

Cryo-EM structures of the Spo11 core complex bound to DNA


Cryo-EM structure of the core complex bound to hairpin DNA

S. cerevisiae core complexes were conformationally variable in negative-stain EM but adopted a more uniform structure when bound to DNA14. We leveraged the tight binding to DNA ends to determine a cryo-EM structure of core complexes bound to a 23-bp hairpin DNA with a two-nucleotide 5′ overhang. Spo11 binds tightly to the free end of this substrate and with much lower affinity to the hairpin end14.

Core complexes were purified after expression in insect cells (Extended Data Fig. 1a) and incubated with Mg2+ and hairpin DNA (Fig. 1c and Extended Data Fig. 1b). The structure of the protein–DNA complexes was solved at 3.7-Å resolution. The cryo-EM reconstruction is shown in Extended Data Fig. 2a–d, cryo-EM statistics are provided in Table 1 and the density representation is presented in Extended Data Fig. 2e. The model was built using subunit structures predicted by AlphaFold2 (ref. 10) and manually refined by positioning of bulky amino acid side chains.

Table 1 Cryo-EM data collection, refinement and validation statistics

The core complex is monomeric (1:1:1:1 stoichiometry) and adopts a V shape with Spo11 at the base contacting the three other subunits (Fig. 1d,e). The hairpin’s overhang end is anchored in a deep cleft between the Spo11 WH and Toprim domains (Fig. 1d). The remainder of the duplex makes non-sequence-specific contacts primarily with Spo11 and fewer with Rec102 and Rec104 but not with Ski8 (Fig. 1e). The proteins form a left-handed wrap around the DNA. Protein–DNA contacts in the complex are summarized in Fig. 1c.

We could trace the base, sugar and 5′-phosphate of the first nucleotide (A2) in the overhang adjacent to the terminal base pair. By contrast, the next overhang base (T1) could not be traced because of weak density. Residue Y135 involved in DNA covalent bond formation is on the opposite side of the DNA end from the Mg2+-coordinating E233, D288 and D290 residues (Fig. 1f). This separation of active site components is expected given that, with only one copy of Spo11, the structure shows the configuration of two catalytic half-sites (Fig. 1a). It is likely that the observed structure mimics the post-DSB product complex, except that Y135 would be covalently bound to the 5′ end in a true product complex.

Cryo-EM structure of the core complex bound to gapped DNA

We also solved the structure of the core complex bound to a double-hairpin substrate with a single-stranded DNA (ssDNA) gap (Fig. 2a and Extended Data Fig. 1c) at 3.3-Å resolution. The cryo-EM reconstruction is shown in Extended Data Fig. 3, cryo-EM statistics are provided in Table 1 and the density maps are presented in Extended Data Fig. 4a–e. Two views are shown in Fig. 2b,c. The preference for binding to bent DNA14 motivated these experiments, on the premise that the ssDNA region would provide the flexibility needed to assemble dimeric complexes that resemble the pre-DSB state. However, the majority of particles again had just a single core complex (Extended Data Fig. 3a).

Fig. 2: Cryo-EM structure of Spo11 core complex bound to gapped DNA.

a, Gapped-DNA sequence and intermolecular contacts between the DNA and amino acid side chains of Spo11 (magenta), Rec102 (blue) and Rec104 (green). b, Ribbon representation of the 3.3-Å-resolution cryo-EM structure. c, As in b but rotated 90°. d, Three views of the unsharpened cryo-EM map.

We can trace the duplex of one hairpin plus the first nucleotide of the ssDNA (A39) including its 5′-phosphate. The bound hairpin is the one with a free 3′-OH and ssDNA extending from the 5′ end (Fig. 2a), which is structurally analogous to the single hairpin initially used. Weak density in the unsharpened map spanned the single-stranded nucleotides T38-T37-A36 and a portion of the second hairpin, with the two duplex segments aligned at a relative angle of 130° (Fig. 2d). Protein–DNA contacts are summarized in Fig. 2a.

The α-helical and β-strand segments for the four subunits are shown in Extended Data Fig. 5. Spo11 is a central hub making extensive protein–protein and protein–DNA contacts. The two arms of the V-shaped protein scaffold comprise the Spo11 WH domain interacting with Rec102 and Rec104 and the Toprim domain interacting with Ski8 (Fig. 2b,c). The structures with hairpin DNA and gapped DNA superpose well (root-mean-square deviation (r.m.s.d.) = 1.23 Å); hence, the following sections focus on the higher-resolution gapped-DNA structure.
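For readers unfamiliar with the r.m.s.d. figure quoted above, the following minimal Python sketch shows how such a value is computed from matched atom positions once two models have been superposed. The function name and the toy coordinates are illustrative assumptions; they are not taken from the study, and the superposition step itself (and the software used for it) is not covered here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. Å) between two
    equally long lists of matched (x, y, z) positions.

    Assumes the two coordinate sets have ALREADY been superposed; no
    rotation or translation is fitted here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty, equally long coordinate lists")
    # Sum of squared distances between matched atoms.
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (not from the deposited models):
# every atom in model_b is displaced by exactly one unit along z.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(model_a, model_b))
```

A value near 1 Å, as reported for the two DNA-bound structures, indicates that matched atoms differ on average by about that distance after superposition.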

Validation of the Spo11–Ski8 interface

The Spo11–Ski8 interface was previously modeled using a crystal structure of the Ski complex, in which two Ski8 copies contact two copies of a QRxxΦ motif in Ski3 that is also present in fungal Spo11 proteins14,21. This motif is critical for Spo11–Ski8 interaction in S. cerevisiae both in vivo and in vitro14,22. This same interface was also predicted computationally9. The cryo-EM structure matched these predictions, showing that Ski8 recognizes Q376REIFF in the Spo11 Toprim domain (Extended Data Fig. 6a). The interface involves both hydrophobic and hydrogen-bonding interactions (Extended Data Fig. 6b,c), including a computationally predicted extended interaction surface (Extended Data Fig. 6d).

Rec102 and its interaction with Spo11

The overall core complex architecture resembles Top6A–Top6B, as expected (Fig. 3a). Rec102 comprises a six-stranded β-sheet extended by multiple α-helices (Figs. 2b and 3b and Extended Data Fig. 5c). It has structural elements in common with the Top6B transducer domain, as anticipated, but with substantial differences. We consider here five main features from Top6B: a central β-sheet, a WKxY motif, a ‘switch loop’ that interacts with the ATP-binding site in the GHKL domain, the α-helical lever arm (‘stalk’) and the interface with Top6A (Fig. 3b, top). Conserved β-strands are indicated by uppercase letters ordered by tertiary-structure position, while numerical designations for β-strands correspond to primary-structure position (for example, Extended Data Fig. 5c); thus, these differ between species. Our structure agrees with crosslinking data14 where K60 and K64 in β-strand D crosslinked with K79 in nearby α-helix 1.

Fig. 3: Rec102 folding and interactions with other subunits.

a, Comparison of overall architectures of the Spo11 core complex and Topo VI. Top, ribbon representations of the cryo-EM structure from this study (left) and the Top6A–Top6B heterodimer from a crystal structure of the Methanosarcina mazei Topo VI holoenzyme (right; PDB 2Q2E)5. Bottom, schematic of tertiary and quaternary organization. b, Comparison of transducer domains of Rec102 and Top6B. Top, protein topology schematics of Rec102 and Top6B highlighting conserved structural elements. Bottom, ribbon representations of Rec102 (this study) and Top6B (PDB 2ZBK)4, with the same color settings as in the topology schematics. c, Heteroallele recombination assays with different Rec102 (left) or Spo11 (right) variants introduced into rec102∆ or spo11∆ reporter strains, respectively. Bars show the mean ± s.d. of three biological replicates; points show individual measurements. Different shading colors highlight the indicated protein elements and are matched with d and Extended Data Fig. 8c,e to facilitate comparisons. d, Quantitative β-galactosidase assays to measure Y2H interactions of mutants of Gal4AD-Spo11 and LexA-Rec102. Bars show the mean ± s.d. of three biological replicates; points show individual measurements. Immunoblots below the graph show levels of the fusion proteins in whole-cell extracts detected with either anti-HA (to detect the Gal4AD fusion) or LexA antibodies. Vertical dashed lines indicate where blot images were spliced. Gray boxes indicate select mutants that were not tested by immunoblot because they were proficient in the Y2H and/or recombination assays. e–h, Details of protein–protein contacts within the gapped-DNA-bound core complex: an overall view of the structure (e); zoomed details showing hydrophobic interactions between Rec102 and Spo11 (f); the position of the C-terminal extension of Rec102 (red; g); intermolecular hydrogen bonds and hydrophobic interactions between the N-terminal extension of Spo11 and Rec102 and Rec104 (h).

In Top6B, the central β-sheet is a scaffold supporting the stalk helix and the GHKL and H2TH domains4,5,11. Rec102 has a similar β-sheet, with the first four strands corresponding to segments of homology with Top6B (ref. 13) (Fig. 3b, strands labeled A–D, and Extended Data Fig. 7a,b). These strands are mostly encoded in REC102 exon 1, which is essential in vivo23. The sheet is extended by one (Top6B) or two (Rec102) additional β-strands but these arise differently in the sequence; the short strand in Top6B is immediately after strand D but the two longer strands in Rec102 instead occur after a conserved helix (Fig. 3b).

Another conserved element is a tryptophan within a motif that has the sequence WKxY in archaea13 and that starts at the end of a set of short α-helices and continues into the subsequent turn (Fig. 3b and Extended Data Fig. 7b). The lysine in this motif is part of a putative DNA-binding surface in Topo VI that is critical for activity24. The tryptophan is in the hydrophobic core of Saccharolobus Top6B, packing against the stalk and hydrophobic residues at the beginning of strand C (Extended Data Fig. 7c). The tryptophan is nearly invariant in eukaryotes but the rest of the motif is highly variable (W91EEQ in Rec102) (Extended Data Fig. 7b). In Rec102, W91 packs against the equivalent of the stalk helix, analogous to Top6B, but contacts different strands in the β-sheet: I59 in strand D (β4) and L113 in strand E (β6) (Extended Data Fig. 7c). This motif is ≥16 Å away from the DNA; thus, it does not appear to contribute to DNA binding in the end-bound complex.

The switch loop is a 16-residue sequence in Saccharolobus Top6B that connects strand C with the stalk helix (Fig. 3b and Extended Data Fig. 7b). It contains a lysine that contacts ATP bound by the GHKL domain’s active site and that is essential for ATPase activity in Topo VI and other topoisomerases25,26,27. Our structure confirms the earlier deduction of the position in the Rec102 sequence equivalent to the switch loop13 but the loop is smaller in Rec102 than previously appreciated (just eight residues, S157KEGNYVE). We tested the importance of this element by assessing the ability of rec102 mutant alleles to complement a rec102-null strain in a heteroallele recombination assay, in which arginine-positive prototrophs can be generated by recombination between two different arg4 mutant alleles (Extended Data Fig. 7d). Where tested, alanine substitutions across the switch loop had no effect on Rec102 activity in vivo and completely replacing the loop sequence with a flexible glycine or serine linker only reduced recombination about fourfold (Extended Data Fig. 7e). These results suggest that the specific amino acid sequence of the switch loop is not essential for Rec102 function, in keeping with the absence of a GHKL domain.

The stalk of Rec102 begins at residue I165, after the switch loop. As in Top6B, the first part of the stalk is an amphipathic helix whose hydrophobic face runs diagonally across strands D through A of the central β-sheet (Fig. 3b and Extended Data Fig. 7f). However, unlike the long, continuous α-helix that makes up the stalk of Top6B, the equivalent region of Rec102 is distorted into four helical segments to make a kinked path that is more compact and engages in more extensive intramolecular and intermolecular contacts within the core complex (Fig. 3b)9. We examined substitutions of four stalk residues predicted to contact Spo11 (I195A, L198A, R199A and W202A). Of these, R199A completely eliminated both meiotic recombination in vivo and yeast two-hybrid (Y2H) interaction with Spo11, without diminishing protein levels (green shading in Fig. 3c,d). R199 forms hydrogen bonds with multiple residues in Spo11 (Extended Data Fig. 8a). The other three substitutions reduced the Y2H interaction but did not disrupt recombination in vivo (Fig. 3c,d). Perhaps other interactions within the DSB-forming machinery can compensate when this interface with Spo11 is only partially weakened.

The Rec102 stalk winds its way to interact with the Spo11 WH domain. A helical hairpin at the end of the stalk aligns with a pair of α helices from Spo11 (Fig. 3b,e,f). The structure of this part of the Spo11–Rec102 interface resembles the computational Spo11–Rec102 model, which was earlier noted to resemble the Top6A–Top6B interface9. However, Spo11 and Rec102 have a much more extensive interface than Top6A and Top6B (1,681 Å2 for Spo11–Rec102 versus 958 Å2 for Topo VI; Extended Data Fig. 8b), mediated by both hydrophobic and hydrogen-bonding interactions (Fig. 3f and Extended Data Fig. 8a). Substitutions across the Rec102 helical hairpin (L214 and residues 220–236) were previously shown to disrupt the Y2H interaction with Spo11 and to eliminate meiotic recombination initiation in vivo14. Meiotic recombination was also compromised by single-alanine substitutions for interfacial residues L207 of Rec102 (yellow shading in Fig. 3c, left) and L60 and L112 in Spo11 (peach shading in Fig. 3c, right). Of these, Spo11-L112A was also deficient for the Y2H interaction with Rec102 despite having normal protein levels (Fig. 3d) but Rec102-L207A and Spo11-L60A retained the Y2H interaction, suggesting that their recombination defects are attributable to another biochemical property of the interface separate from stabilizing the interaction per se. Other tested residues in this interface were dispensable for meiotic recombination in vivo (Spo11 F103, which packs against Rec102 W202, and Spo11 L105) (Fig. 3c,d).

In addition to these elements that are conserved with Top6B, Rec102 sports a 34-residue C-terminal extension beginning at D231. This segment wraps around the stalk, contributes to the extended interface with Spo11 and also makes contacts with the DNA and Rec104 (Fig. 3e,g), as discussed below. Double-alanine substitutions for Rec102 D231;K232 or T235;T236 compromise both meiotic recombination and the Spo11 Y2H interaction14. Rec102 S233 and Q239 form hydrogen bonds with the Spo11 peptide backbone at G95 and L98, respectively (Extended Data Fig. 8a). Rec102-S233A, Rec102-Q239A and Spo11-G95A severely compromise meiotic recombination in vivo and diminish the Spo11–Rec102 Y2H interaction without affecting protein levels (purple shading in Fig. 3c, left, salmon shading in Fig. 3c, right and Fig. 3d).

There is also an extension (25 residues) of the N terminus of Spo11 before the WH domain, of unknown function16. This α-helical segment contacts Rec102’s C-terminal extension and stalk and the N terminus of Rec104 through hydrophobic interactions and hydrogen bonds (Fig. 3e,h). The Spo11 extension places the N termini of Spo11, Rec102 and Rec104 in close proximity, which agrees with crosslinking data and explains why an N-terminal maltose-binding protein tag on any of these proteins resulted in extra density in nearly the same position in negative-stain EM14. Several substitutions in the Spo11 extension (L3A, R7D and L20A) severely compromised meiotic recombination in vivo and diminished Y2H interactions with Rec102 or Rec104 (blue shading in Fig. 3c, right and Extended Data Fig. 8c). Most of the other tested mutants in this region also decreased the Y2H interactions without affecting recombination (Fig. 3c and Extended Data Fig. 8c).

Rec104 structure and protein interactions

Only residues 9–58 (of 182) of Rec104 are visible in the structure, forming three α-helices that interact with both Rec102 and Spo11 (Fig. 4a–c). The structure agrees with previously observed14 crosslinks between Rec104 K43 and Rec104 K10, K36 and K39 (α-carbon-to-α-carbon distances of 12–20 Å). However, the resolution is lower for this part of the structure (Extended Data Fig. 2d and Extended Data Fig. 3d); thus, the details of interfacial contacts should be viewed cautiously.

Fig. 4: Rec104 structure and interactions with other subunits.

a–c, Details of protein–protein contacts of Rec104 with Rec102 and Spo11: Helix 1 and the preceding residues of Rec104 contact both Rec102 (strand A and the stalk) and Spo11 (N-terminal extension), primarily through backbone and side chain hydrogen-bonding interactions (a). Helices 1 and 2 in Rec104, plus the intervening turn, contact the Rec102 stalk and C-terminal extension through hydrophobic interactions (b). The third α-helix of Rec104 contacts the Rec102 β-sheet (strands A, B and C) and switch loop through hydrophobic interactions (c). The gapped-DNA-bound core complex is shown in a,b, while the hairpin-DNA-bound complex is shown in c because the resolution was higher for this part of the structure. d, Conservation of Rec104 in Saccharomycetaceae. A multiple-sequence alignment of Rec104 sequences from 12 species was generated and visualized using COBALT45. Aligned residues are colored on the basis of relative entropy, with red indicating more conserved residues. The region with the strongest conservation, shown in detail above, matches the portion of Rec104 visible in the cryo-EM structures. Conserved residues mediating hydrophobic interactions are labeled in green. Topology of the cladogram is based on a previous study46. WGD, whole-genome duplication. e, Position of the structured region of Rec104 relative to the domains of S. shibatae Top6B. The core complex was superposed with Topo VI (PDB 2ZBK)4 by aligning the Cα atoms of strand C from each protein (S147–F156 of Rec102 and M409–T418 of Top6B). Only Rec104 from the core complex is shown (green); the GHKL ATPase domain of Top6B is colored pale cyan, the H2TH domain is colored cyan and the transducer domain is colored dark blue. In the superimposed ensemble, Rec104 wraps around the transducer domain, with helices 1 and 2 lying in front of the stalk and helix 3 lying against the back side of the upper corner of the β-sheet.

Previous efforts to identify Rec104 orthologs failed to find them outside of Saccharomyces species28. The structure of the core complex allowed us to revisit this because the visible parts of Rec104 correspond to a previously unrecognized conserved domain (Fig. 4d). We expanded the collection of Rec104 homologs by focusing on this domain but we were still only able to detect them in family Saccharomycetaceae (Fig. 4d and Methods). Particularly well conserved residues (Fig. 4d) line the interfaces with Rec102 and Spo11, including Rec104 Y21 (Fig. 4a), Rec104 F17, L18, V23, V27, F31 and L33 (Fig. 4b) and Rec104 I41 and F46 (Fig. 4c). The C-terminal portion of Rec104 that is not visible in the cryo-EM structure is poorly conserved (Fig. 4d).

We tested the functional importance of interfacial residues. A Rec104-Y21A mutant was strongly defective for meiotic recombination (Extended Data Fig. 8d) and the Y2H interaction with Spo11 (Extended Data Fig. 8c) but was expressed at normal levels and retained the Y2H interaction with Rec102 (Extended Data Fig. 8e). There was little if any effect for the other Rec104 mutants tested (Extended Data Fig. 8d,e) or for Rec102 β-sheet substitutions L8A, V10A, F11A and F156A (pink shading in Fig. 3c, left and Extended Data Fig. 8e). Interestingly, Rec102 stalk substitutions L179A and Q183A disrupted the Y2H interaction with Rec104 but had only a modest effect (L179A) or no effect (Q183A) in recombination (green shading in Fig. 3c and Extended Data Fig. 8e). Because this Y2H assay is performed in vegetative cells, a plausible interpretation is that these substitutions weaken the Rec102–Rec104 interaction but are compensated for in the context of the full DSB apparatus.

Prior crosslinking experiments suggested that Rec104 lies near where the GHKL domain abuts the transducer domain in Topo VI, supporting the proposal that Rec104 replaces the GHKL domain1,14. We explored this idea using our structure.

The distances between Rec104 K43 and either Rec102 K60 or K64 agree well with crosslinking data (Extended Data Fig. 8f). Rec102 K60 and K64 are in strand D of the central β-sheet, with their side chains on the opposite face of the sheet from the stalk (Extended Data Fig. 7b and Extended Data Fig. 8f). The equivalent region in Top6B is at the interface with the GHKL domain, as previously noted14. However, when we superimposed the core complex on a crystal structure of Top6B (ref. 4), the structured elements of Rec104 were strikingly distinct from the GHKL or H2TH domains of Top6B (Fig. 4e). Although we do not know the disposition of the C-terminal two thirds of Rec104, the end of helix 3 points toward the location of the Top6B GHKL domain (Fig. 4e). Moreover, Rec104 K61, K65, K78 and K79 (which lie just outside the structured segment) crosslink to Rec102 K60, K64 and K79 (ref. 14), all three of which are near the position occupied by the GHKL domain in Top6B.

We conclude that the structured part of Rec104 is not equivalent to any of the domains in Top6B but that the C-terminal part of Rec104 is near or in the GHKL domain position, consistent with the previous proposal about Rec104’s location14. However, it remains unclear whether the unstructured portion of Rec104 evolved from the GHKL domain or is a wholly unrelated fold that replaced it.

Protein–DNA contacts and the specificity of DNA binding

An electrostatic surface representation of the core complex identified three positively charged patches that line the DNA-binding channel and mediate non-sequence-specific protein–DNA contacts (Fig. 5a). Patch 1 anchors overhang nucleotide A39 and its adjacent duplex in the pocket between the Spo11 WH and Toprim domains. R131 on the second α-helix of the WH domain forms a hydrogen bond with the 5′-phosphate of G40, which is the bridging phosphate between the first base of the duplex (G40) and the beginning of the 5′ overhang (A39) (Fig. 5b). This interaction stabilizes the orientation relative to the DNA of the catalytic Y135, which is on the same α-helix (Fig. 5b). In addition, the third α-helix of the WH domain inserts into the major groove, with Q144 interacting with the same phosphate as R131 (Fig. 5b) and R143 interacting with the phosphate backbone of the opposite strand (Fig. 5c). Furthermore, residue K173 at the beginning of the flexible linker connecting the WH and Toprim domains makes one of the rare contacts of Spo11 with a base, forming a hydrogen bond with the N-3 position of G40 (Fig. 5b). This contact is interesting because G is favored at this position in vivo29. Toprim domain residues in patch 1 include R266, which forms a hydrogen bond with the phosphate of A42, and E233, which forms a hydrogen bond with the 3′-terminal hydroxyl group of C71 (Fig. 5b).

Fig. 5: Protein–DNA contacts in the gapped-DNA-bound core complex.

a, Electrostatic surface representation of the protein subunits of the Spo11–Ski8–Rec102–Rec104 complex. b–d, Details of protein–DNA contacts associated with patch 1 (b), patch 2 (c) and patch 3 (d). Side chain densities that could not be traced are shown as balls. e, Details of protein–DNA contacts associated with fingers 1 and 2. In finger 1, R344 forms base-specific hydrogen bonds with the O-4′ of T44 and O-2 of T69, further stabilized by hydrogen bonding of S347 with the phosphate of G45, while the F260 main chain nitrogen also forms a hydrogen bond with the O-2 of base C71.

These DNA interactions explain protein sequence conservation and prior experimental findings. Specifically, R131, Q144, R266 and E233 are invariant in Spo11 and Top6A and K173 is a basic residue in nearly all eukaryotes (Extended Data Fig. 9a). Moreover, the contacts of R131 and Q144 with the 5′-phosphate at the beginning of the duplex and of E233 with the 3′-hydroxyl explain the strong selectivity for a free 3′-OH end and 5′ ssDNA overhang14. R131 and E233 are essential for Spo11 function in vivo7; substituting K173 to alanine reduced DNA binding in vitro and delayed and reduced DSB formation in vivo14, while substituting E233 to alanine reduced DNA binding in vitro ~10-fold14.

Patch 2 is in the WH domain (Figs. 2a and 5a). The basic K74KKK loop surrounds the T63-T64-A65-containing strand, while K104 hydrogen bonds with the phosphate of C66 (Figs. 2a and 5c). The density is weak for the K74KKK loop, with K74 and K76 directed toward the DNA backbone. This loop is part of a β-hairpin (the ‘wing’ in the WH domain) and is cleaved by hydroxyl radicals produced by iron-chelating moieties placed nearby on the DNA14 but is highly variable in length and sequence in Spo11 proteins (Extended Data Fig. 9a).

Patch 3 is on the outer segment of the DNA-binding channel (Figs. 2a and 5a). It includes Rec102 R245 and Rec104 R26, which form hydrogen bonds with the phosphates of C50 and G58, respectively (Fig. 5d).

We also identified two sets of residues (‘fingers’) projecting from Spo11 into the minor groove. Finger 1 (F260, Y292 and R344 from the Toprim domain) tracks along the first five base pairs and finger 2 (H101 from the WH domain) forms a main chain amide hydrogen bond with the phosphate of T49 and inserts its imidazole ring into the minor groove (Fig. 5e). F260 is a bulky hydrophobic residue in Spo11 orthologs but glutamine in most Top6A proteins (Extended Data Fig. 9a). The minor groove binding by finger 1 is particularly striking because the base composition bias around Spo11 cleavage sites in vivo suggested a tendency toward a relatively wide and shallow minor groove across precisely this region29. An F260A substitution dramatically changes DSB site preference in vivo and reduces DNA-binding affinity in vitro14, while substituting Y292 to arginine disrupts DSB formation in vivo7.

Metal-ion binding

Type IIA topoisomerases (eukaryotic Topo II and bacterial gyrase and Topo IV) are thought to use a two-metal mechanism involving divalent cations bound by acidic residues of the Toprim domain30,31,32. In this model, metal ion A interacts with both bridging and nonbridging oxygens of the scissile phosphate and has a direct role in catalysis, while metal ion B interacts with an adjacent (nonscissile) phosphate and has a structural role in stabilizing protein–DNA interactions (Fig. 6a). Comparatively little is known about metal-ion use in type IIB enzymes3,33.

Fig. 6: Metal-ion binding and models of higher-order protein–DNA complexes.

a, Proposed two-metal-ion mechanism for catalysis of strand cleavage by type IIA topoisomerases30. Amino acids in the active site of S. cerevisiae Topo II are indicated, with metal-binding residues in the Toprim domain colored in green and residues from the WH domain colored in blue. Corresponding Spo11 residues are colored in magenta. The general base and acid are unknown. b, Superposition of the metal-binding pockets of Spo11 (magenta) and M. jannaschii Top6A (cyan; PDB 1D3Y)6. Only the metal ion and the side chains of the indicated residues from Top6A are shown. The magnesium in the Spo11 cryo-EM structure is coordinated by D288 and D290, while the magnesium in Top6A is coordinated by D249 and E197, which are equivalent to D288 and E233 in Spo11. The electron density map around the Spo11 triads and Mg2+ is shown as a blue mesh; note the lack of detectable density at the position of the metal (green ball) bound by Top6A. c,d, Models for two Spo11 core complexes bound to opposite ends of DNA duplexes of varying lengths. The plot compares the number of clashes between the two core complexes as a function of duplex DNA length (red) with direct measurement of the ability to form double-end-bound complexes in EMSA experiments (blue; data from ref. 35) (c). The blue points show the fraction of protein–DNA complexes that have both DNA ends bound. Example models are shown of double-end-bound core complexes with DNA duplexes of 22 bp, 24 bp, 28 bp and 31 bp (lengths do not include the ssDNA overhangs at each end) (d). One core complex is colored as in Fig. 1d. The other core complex is colored in gray. Clashes are colored red. The cartoon at the top of d illustrates that monomeric core complexes bound to opposite ends of a duplex mimic the back-to-back arrangement of two adjacent Spo11 dimers that have cut the DNA.

We observed density in both cryo-EM structures consistent with a single metal ion, most likely Mg2+, bound by D288 and D290 of Spo11 (Fig. 6b). When the Spo11 metal-binding pocket was compared with that of Methanocaldococcus jannaschii Top6A (ref. 6), the trio of conserved acidic residues was highly congruent but the single Mg2+ bound by each protein was in a different position, bound by E197 and D249 in Top6A (equivalent to Spo11 E233 and D288) (Fig. 6b). By comparison to yeast Top2 (ref. 30), we infer that site A is occupied in the Top6A structure, whereas site B is occupied in our Spo11 structure (Fig. 6a,b). In type IIA enzymes, site A has a higher affinity for metal than site B but is thought to rely on the scissile phosphate for two of the coordination contacts32. If the same is true for Spo11, the absence of an equivalent of the scissile phosphate at the 3′ end in our structure may explain why Mg2+ is not stably bound at site A. The observed binding in site B is consistent with the presumed structural role for the metal in this position. Occupancy of site B is consistently observed in postcleavage structures of eukaryotic and bacterial topoisomerases30,34, supporting the conclusion that our structure mimics the postcleavage state.

These findings provide a framework for understanding the conservation and functional importance of the metal-binding acidic residues in Spo11. E233 and D288 are invariant in Spo11 and Top6A proteins (Extended Data Fig. 9a) and are essential for DSB formation in vivo in yeast7. Extrapolating from type IIA enzymes, we propose that both residues are essential because they directly coordinate the catalysis-critical Mg2+ in site A (Fig. 6a,b). By contrast, D290 is mostly but not strictly conserved (Extended Data Fig. 9a) and is dispensable for DSB formation in vivo: substitution with asparagine has little effect on recombination activity in S. cerevisiae, whereas substitution with alanine partially decreases DSB formation7. It may be that this residue is not essential because it contributes to metal binding only in the noncatalytic site B (Fig. 6a,b).

Modeling higher-order complexes

Spacing between adjacent DSBs

Our cryo-EM structure sheds light on spatial patterns of Spo11–DNA interactions both in vivo and in vitro. Multiple Spo11 complexes can sometimes introduce two or more DSBs close together on the same DNA molecule35,36,37. When such double-cutting occurs, it has a preferred spacing between DSBs with a minimum of ~33 bp (measured from the center of each DSB’s 5′ overhang, corresponding to 31 bp of duplex DNA plus the overhangs) and increasing in steps of ~10 bp. The 10-bp periodicity has been proposed to reflect a geometric constraint, in which adjacent Spo11 dimers are co-oriented with their active sites facing in the same direction35 (Extended Data Fig. 9b). In this model, the observed minimum distance reflects steric constraints that prevent adjacent co-oriented Spo11 complexes from being fewer than three helical turns apart.
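The spacing arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the stated numbers: the function names are hypothetical, the 33-bp minimum and 10-bp step are the approximate values quoted in the text, and the two-nucleotide overhang length is an assumption carried over from the 5′ overhang of the hairpin substrate described earlier.

```python
def center_to_center_bp(duplex_bp, overhang_nt=2):
    """Distance between the centers of two DSBs flanking a duplex.

    Each DSB center sits at the midpoint of its overhang, so half an
    overhang is added at each end of the duplex. The 2-nt overhang is an
    assumption for illustration.
    """
    return duplex_bp + overhang_nt


def preferred_spacings(n, minimum_bp=33, step_bp=10):
    """First n preferred double-cut spacings: the minimum plus whole steps.

    Both defaults are the approximate values quoted in the text.
    """
    return [minimum_bp + k * step_bp for k in range(n)]


print(center_to_center_bp(31))
print(preferred_spacings(4))
```

With these assumptions, a 31-bp duplex corresponds to the 33-bp minimum, and subsequent preferred spacings follow at roughly one helical turn each.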

Electrophoretic mobility shift assay (EMSA) experiments with yeast Spo11 core complexes support this interpretation; both 5′ overhang ends of a DNA substrate are readily bound if the duplex DNA segment between the overhangs is ≥28 bp but duplex lengths of 24–27 bp can only be bound at a single end35 (Fig. 6c). Double-end binding by monomeric core complexes in vitro likely mimics how close two adjacent Spo11 dimers can be on an intact DNA molecule before cleavage (Fig. 6d, top). Interestingly, reducing the duplex length even further (to 22–23 bp) allows core complexes to again bind at both ends35 (Fig. 6c). These results suggest that steric clashes that preclude double-end binding at 24–27 bp are relieved if the two DNA ends are rotated relative to one another. Different orientations of adjacent Spo11 complexes are possible with free proteins in vitro but not with the constrained complexes presumed to be present in vivo (Extended Data Fig. 9b).
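The three binding regimes reported in this paragraph can be summarized as a simple lookup. The Python sketch below encodes only the duplex-length ranges stated in the text; the function name is hypothetical, and lengths outside the stated ranges are returned as "not reported" rather than guessed.

```python
def reported_end_binding(duplex_bp):
    """Return the EMSA outcome described in the text for a duplex length
    (bp between the two 5' overhangs). Illustrative lookup only."""
    if duplex_bp >= 28:
        return "both ends"      # readily bound at both ends
    if 24 <= duplex_bp <= 27:
        return "single end"     # clashes preclude double-end binding
    if 22 <= duplex_bp <= 23:
        return "both ends"      # double-end binding is restored
    return "not reported"       # the text gives no data for these lengths


for length in (22, 24, 28, 31):
    print(length, reported_end_binding(length))
```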

These patterns of DNA cleavage in vivo and DNA binding in vitro are well explained by the extensive left-handed wrap of the core complex around the DNA in the cryo-EM structure. Two core complexes, one on each end of a 28-bp DNA duplex, could be modeled with essentially no steric clashes, but modeling on a shorter DNA segment (24 bp) resulted in substantial overlap between the Rec102 and Rec104 moieties of the two core complexes (Fig. 6c,d). Consistent with EMSA data, the clashes could be almost entirely resolved by shortening the DNA still further to 22 bp, which rotates the core complexes relative to one another and allows them to interdigitate (Fig. 6c,d). Moreover, a 31-bp duplex is the shortest distance that can accommodate a pair of core complexes that is both co-oriented and lacking in steric clashes (Fig. 6d). This agrees well with in vivo spacing because a 31-bp duplex corresponds to a center-to-center DSB distance of 33 bp (that is, the minimum preferred distance for double cuts)35,37.
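To make the link between duplex length and relative orientation concrete, the following Python sketch estimates how far one DNA end rotates relative to the other when the duplex is shortened. It assumes an idealized B-form helix with 10.5 bp per turn, a textbook value that is not taken from this study, and it ignores protein geometry entirely, so it does not reproduce the structural modeling in Fig. 6. The function names are hypothetical.

```python
BP_PER_TURN = 10.5  # assumed canonical B-form helical repeat (not from the source)


def twist_degrees(n_bp: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Cumulative helical twist over n_bp base pairs, reduced to [0, 360)."""
    return (n_bp * 360.0 / bp_per_turn) % 360.0


def relative_rotation(from_bp: float, to_bp: float,
                      bp_per_turn: float = BP_PER_TURN) -> float:
    """Change in end-to-end twist when the duplex length changes.

    Returned as a signed angle in [-180, 180); negative values mean the
    duplex was shortened.
    """
    delta = (to_bp - from_bp) * 360.0 / bp_per_turn
    return (delta + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    for shorter in (24, 22):
        print(28, "->", shorter, round(relative_rotation(28, shorter), 1))
```

Under this assumption, each base pair removed rotates the two ends by roughly 34° relative to one another, which is why a change of only 2 bp can move bound complexes into or out of a clashing arrangement.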

Model of a pre-DSB Spo11 dimer

We attempted to model a catalytically competent Spo11 dimer by docking two copies of the monomeric DNA-bound core complex together. If the two DNA copies are aligned to make a roughly B-form duplex, the WH domain from each Spo11 clashes sterically with the Toprim domain from the other (Fig. 7a). In addition, the nonoverhang strand from one complex clashes with the C-terminal part of α-helix 8 of Spo11 from the other complex and the overhang strand collides with a loop near finger 1 (Fig. 7b). Thus, the configurations of both protein and DNA in the cryo-EM structure are incompatible with a plausible structure of a pre-DSB complex on B-form DNA.

Fig. 7: Model of a pre-DSB Spo11 dimer.

a, Predicted Spo11–Spo11 clashes when core complexes are docked together on a B-form DNA. Two copies of the DNA-bound core complexes were aligned onto a B-form DNA using the first two nucleotides of their 5′ overhangs. One Spo11 core complex is colored as in Fig. 1d and the other is colored in gray. The protein clashes are colored in red. DNA is not shown. b, Predicted clashes between protein and DNA strands. The clashing regions are highlighted in red surface representation. c, Rigid-body motion of the Spo11 WH domain relative to the Toprim domain. The Spo11 core complex cryo-EM structure is shown in a surface representation. The ribbon diagram shows the hypothetical configuration of the core complex if the WH and Toprim domains of Spo11 are superposed separately (separated between Q172 and K173; Methods) onto their cognate domains in S. shibatae Top6A from the crystal structure of the Topo VI holoenzyme4. d, Overview of the model of a Spo11 core complex dimer with inset comparing parts of the Spo11 dimer interface (purple) with the equivalent Top6A dimer interfaces from different species (cyan, M. jannaschii; orange, M. mazei; yellow, S. shibatae). e, Deformation of DNA in the hypothetical pre-DSB dimer model. Proteins of two core complexes are shown in a surface representation and the two DNA duplexes are shown as ribbon diagrams with their positions relative to the Spo11 WH domains preserved from the cryo-EM structures. A surface representation of a B-form DNA is included for comparison. The inset (rotated 30°) highlights the deviation from the B-form DNA path in the model.

Because Spo11 core complexes are flexible in solution14, we asked whether simple rigid-body motions of their domains could resolve the clashes. The segment between the WH and Toprim domains is thought to be a flexible linker6,14, so we separately aligned the Spo11 WH domain (together with DNA, Rec102 and Rec104) and the Toprim domain (together with Ski8) to the cognate domains of Top6A in the Saccharolobus shibatae Topo VI dimer structure4 (Extended Data Fig. 9c). This alignment rotates the WH domain 45° relative to the Toprim domain and, notably, eliminates all of the steric clashes (Fig. 7c). The model also matches segments of Spo11 well with the two major Top6A–Top6A interfaces in Topo VI dimer structures4,5,6: a pseudocontinuous β-sheet on the underside of the Toprim domain away from the DNA (K206 to P211 from one Spo11 interacting with its match on the other Spo11) and an interface between the WH domain of one Spo11 near the catalytic tyrosine and a C-terminal region of the other Spo11 that includes an invariant residue, E386, that is brought into close proximity to the catalytic tyrosine (Fig. 7d and Extended Data Fig. 9a). This model is, thus, a plausible representation of a dimeric pre-DSB complex. We further speculate that DSB formation is accompanied by a Spo11 conformational change into a postcleavage state resembling our cryo-EM structures and that the steric clashes described above may prevent DNA end religation, rendering DSB formation irreversible.
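The separate domain alignment described above is a rigid-body superposition problem. As a minimal sketch, assuming matched coordinate sets (for example, Cα atoms) are already available as N×3 arrays, the standard Kabsch algorithm gives the best-fit rotation for each domain, and the angle between the two fitted rotations measures how far one domain moved relative to the other. This is not the authors' procedure or software (see Methods for that); the function names are hypothetical and the demonstration uses synthetic points rather than real structures.

```python
import numpy as np


def kabsch_rotation(mobile: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Best-fit proper rotation R (3x3) mapping centered `mobile` onto `target`.

    Both inputs are (N, 3) arrays of matched points. After centering,
    target_i is approximated by R @ mobile_i.
    """
    p = mobile - mobile.mean(axis=0)
    q = target - target.mean(axis=0)
    h = p.T @ q                      # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def rotation_angle_degrees(r: np.ndarray) -> float:
    """Rotation angle encoded by a 3x3 rotation matrix."""
    cos_theta = np.clip((np.trace(r) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))


def relative_domain_rotation_degrees(r_a: np.ndarray, r_b: np.ndarray) -> float:
    """Angle by which domain A rotated relative to domain B after separate fits."""
    return rotation_angle_degrees(r_a @ r_b.T)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(20, 3))          # synthetic stand-in for a domain
    t = np.radians(45.0)
    rz = np.array([[np.cos(t), -np.sin(t), 0.0],
                   [np.sin(t), np.cos(t), 0.0],
                   [0.0, 0.0, 1.0]])
    r_moved = kabsch_rotation(points, points @ rz.T)   # domain that rotates
    r_fixed = kabsch_rotation(points, points)          # domain that stays put
    print(relative_domain_rotation_degrees(r_moved, r_fixed))
```

In the synthetic demonstration, one point set is rotated by a known angle about the z axis and the other is left in place, so the recovered relative rotation should match the angle that was applied.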

The model predicts an intriguing deformation of the DNA. We preserved the relationship of the DNA to the WH domain because this also preserves the large majority of the protein–DNA contacts from the cryo-EM structure. In doing so, the overhang ends of the two DNA segments come into close proximity (~4.8 Å separating the inferred positions of 5′ and 3′ ends of each strand) but make a V shape with a 150° angle and with the helical axes offset by ~5 Å (Fig. 7e). Empirical evidence for DNA deformation by Spo11 comes from the 130° angle of the DNA in our gapped double-hairpin structure (Fig. 2d) and from previous DNA-binding experiments that revealed bent DNA by atomic force microscopy and apparent preferential binding to bent duplex DNA (inferred from higher affinity for binding 100-bp versus 400-bp minicircles)14. Topo VI and other type II topoisomerases bend DNA before cleavage6,14,38; hence, Spo11 core complexes likely do so as well. Our model, thus, provides a framework for understanding the structural determinants of DNA bending, for which no high-resolution structural information is currently available for any dimeric Topo VI or Spo11 complex.
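For readers who want to relate the quoted angles to a deviation from straight DNA, the short Python sketch below computes the angle between two helical-axis vectors. It assumes that the reported values (150° and 130°) are interior angles between the two DNA arms, with 180° corresponding to a straight duplex; that convention is an interpretation, not something stated explicitly in the text. The function names and the example vectors are hypothetical.

```python
import math


def interior_angle_degrees(arm_a, arm_b) -> float:
    """Angle between two arm vectors that both point away from the junction."""
    dot = sum(a * b for a, b in zip(arm_a, arm_b))
    norm = math.sqrt(sum(a * a for a in arm_a)) * math.sqrt(sum(b * b for b in arm_b))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_theta))


def bend_from_straight_degrees(arm_a, arm_b) -> float:
    """Deviation from a straight duplex (0 means the arms are collinear)."""
    return 180.0 - interior_angle_degrees(arm_a, arm_b)


if __name__ == "__main__":
    # Illustrative axes only: one arm along +x, the other 150 degrees away in the xy plane.
    arm_a = (1.0, 0.0, 0.0)
    arm_b = (math.cos(math.radians(150.0)), math.sin(math.radians(150.0)), 0.0)
    print(interior_angle_degrees(arm_a, arm_b), bend_from_straight_degrees(arm_a, arm_b))
```

Under this convention, a 150° V shape corresponds to a 30° departure from a straight helical axis, and the 130° angle of the gapped double-hairpin structure to a 50° departure.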
