Monday, December 9, 2024

The proteogenomic landscape of multiple myeloma reveals insights into disease biology and therapeutic opportunities

Study cohort

A total of 138 patients were included in the proteomics study (114 NDMM, 17 PCL and 7 MGUS cases). Inclusion criteria were the availability of myeloma cells of appropriate quantity and quality for proteomic and genetic analyses and available information on FISH-based cytogenetics and clinical parameters. All patients provided written informed consent according to the Declaration of Helsinki. The study was approved by the responsible ethics committees of Ulm University (136/20, 307/08) and Charité – Universitätsmedizin Berlin (EA2/142/20). Clinical trials of the DSMM and sample collection were approved by the ethics committee of Würzburg University (2008-000007-28, 145-11).

Patient characteristics are summarized in Supplementary Table 1. One hundred out of 114 newly diagnosed patients were treated within one of the DSMM XII–XIV clinical trials and had available outcome data (NCT00925821, NCT01090089 and NCT01685814)64. All of these 100 patients were scheduled to receive a lenalidomide-based intensive therapy within a clinical trial.

DSMM XII/NCT00925821 (N = 12): induction therapy with four cycles lenalidomide/adriamycine/dexamethasone. All patients were scheduled to receive high-dose melphalan/auto-SCT, while the nature of the second SCT was determined by risk stratification: high-risk patients (cytogenetics, ISS) were scheduled to undergo allogeneic stem cell transplantation (N = 3) followed by lenalidomide maintenance while standard-risk patients received a second auto-SCT followed by lenalidomide maintenance for 1 year64.

DSMM XIII, arm A2/NCT01090089 (N = 20): induction therapy with three cycles of lenalidomide/dexamethasone followed by two cycles of high-dose melphalan/auto-SCT and lenalidomide maintenance until progression.

DSMM XIV NCT01685814 (N = 68): induction therapy randomized between four cycles lenalidomide/adriamycine/dexamethasone (N = 36) or three cycles lenalidomide/bortezomib/dexamethasone (N = 32) followed by high-dose melphalan/auto-SCT, second randomization in patients with very good partial response (VGPR) or better directly to lenalidomide maintenance until progression or a second cycle of high-dose melphalan/auto-SCT followed by lenalidomide maintenance for 3 years. Patients not achieving VGPR after the first high-dose melphalan/auto-SCT were randomized to receive a second cycle of high-dose melphalan/auto-SCT followed by lenalidomide maintenance for 3 years or allogeneic stem cell transplantation (N = 3) followed by 1 year of lenalidomide maintenance.

No significant difference was observed for PFS and OS across the patients treated in the three different trials.

Healthy control cells were obtained from orthopedic surgery patients without evidence of malignancy. The median age of the healthy donors was 63 years (range, 57–78 years). All donors provided written informed consent according to the Declaration of Helsinki and the study was approved by the responsible ethics committee of Charité – Universitätsmedizin Berlin (EA4/115/21).

Cell isolation

Except for 12 PCL samples from peripheral blood, all samples were collected from bone marrow aspiration. Mononuclear cells were isolated with a Ficoll gradient and plasma cell content was determined morphologically. The majority of samples (89/138) were enriched for CD138+ cells via magnetic-activated cell sorting (MACS) directly after mononuclear cell isolation using magnetic beads conjugated to a human CD138-specific antibody (130-051-301, Miltenyi). Non-MACS enriched samples (49/138) were selected for a plasma cell content >75% and had an average CD138+ purity of 85%. Healthy bone marrow mononuclear cells were isolated by Ficoll gradient and CD34+, CD19+ and CD138+ cells were enriched with MACS antibody bead conjugates (all Miltenyi), according to the manufacturer’s protocol. For each cell population of healthy bone marrow cells, three replicates were obtained. Replicates one and two were obtained from separate individuals and replicate three was obtained by pooling material from three different donors due to limitations in sample material.

FISH analysis

FISH in combination with immunofluorescent detection of light chain-restricted plasma cells was performed on plasma cells from patients. Genetic regions of interest for the diagnosis of MM and their translocation partners were detected. FISH was performed according to standardized protocols using commercially available probes (Abbott Laboratories and MetaSystems).

DNA preparation and nanopore sequencing

DNA was isolated with the AllPrep DNA/RNA kit (QIAGEN, 80204). RNA and DNA were extracted from the same sample while protein was extracted from a different aliquot of the same patient/time point sample. Nanopore DNA sequencing was performed with the Oxford Nanopore Technologies (ONT) platform. Libraries containing either a pool of three samples or just a single sample were prepped with the Rapid Barcoding Sequencing kit (SQK-RBK004, ONT) using approximately 350 ng starting material for each sample in a pool of three or 400 ng of starting material for a single run (Rapid Sequencing kit, SQK RAD004). A maximum amount of 850 ng library was loaded onto the flow cell (FLO-MIN106D, R 9.4.1, ONT) and sequenced on a GridION sequencer (ONT), according to the manufacturer’s instructions.

RNA sequencing library preparation and sequencing

RNA was isolated with the AllPrep DNA/RNA kit (QIAGEN, 80204). Library preparation was performed from 20 to 100 ng of input total RNA per sample using the TruSeq Stranded Exome RNA kit (Illumina), according to the manufacturer’s instructions. The pooled RNA libraries were sequenced on an Illumina HiSeq2000 with 50-bp single-end reads with an average coverage of 36.6 × 10⁶ reads per sample.

Protein extraction and digestion

Samples were lysed at 4 °C with urea lysis buffer as previously described65. Protein lysates were reduced with 5 mM dithiothreitol for 1 h and alkylated with 10 mM iodoacetamide for 45 min in the dark. Samples were subsequently diluted 1:4 with 50 mM Tris–HCl, pH 8 and sequencing grade LysC (Wako Chemicals) was added at a weight-to-weight ratio of 1:50. After 2 h, sequencing grade trypsin (Promega) was added at a weight-to-weight ratio of 1:50 and digestion was completed overnight. Samples were acidified with formic acid and centrifuged to remove precipitated material (20,000g, 15 min). The supernatant was desalted with Sep-Pak C18 cc Cartridges (Waters).

TMTpro labeling of peptides

Desalted and dried peptides were labeled with TMTpro 16 plex reagents (Thermo Scientific) according to the manufacturer’s instructions and at a sample-to-tag ratio of 1:7 (w/w). After confirming successful labeling, TMT-labeled peptides of cohort samples were randomly combined into ten TMTpro plexes (see Supplementary Table 11 for TMT channel allocation). For TMT plex 1–9, 75 µg peptides per channel were used and 45 µg of peptides per channel were used for TMT plex 10. An equal loading internal standard that consisted of a mix of all cohort samples was included in each TMT plex. Samples from healthy bone marrow donors were analyzed in an 11th TMTpro plex with 10 µg peptides per sample and an equal loading internal standard that was the same as for the cohort samples. The 11th TMT plex also contained a booster channel (500 µg peptides) that was identical to the internal standard and the two TMT channels next to it were left empty to prevent signal spillover. Combined TMT samples were dried down and resuspended in liquid chromatography sample buffer (3% acetonitrile (ACN), 0.1% formic acid) before desalting with Sep-Pak C18 cc Cartridges (Waters).

Peptide fractionation of TMT-labeled samples

Dried TMT-labeled samples were resuspended in high pH buffer A (5 mM ammonium formate, 2% ACN) before offline high pH reverse phase fractionation by high-performance liquid chromatography (HPLC) on an UltiMate 3000 HPLC (Thermo Scientific) with an XBridge Peptide BEH C18 (130 Å, 3.5 µm; 4.6 mm × 250 mm) column (Waters) as previously described (Mertins et al.65). Each fractionated TMT plex was pooled into 24 or 28 fractions and 10% of each fraction was reserved for global proteome measurements. The remaining fractions were further pooled into 12 or 14 fractions per TMT plex for phosphoproteomics. Dried global proteome fractions or immobilized metal affinity chromatography-enriched phosphopeptides were reconstituted in liquid chromatography sample buffer before mass spectrometric measurements.

Phosphopeptide enrichment

Phosphopeptide enrichment was performed with immobilized metal affinity chromatography automated on an AssayMap Bravo System (Agilent) equipped with AssayMAP Fe(III)-NTA cartridges.

Liquid chromatography–mass spectrometry

Samples were fractionated online with a 25-cm column packed in-house with C18-AQ 1.9 µm beads (Dr. Maisch Reprosil-Pur 120). Samples were separated with a gradient of mobile phase A (0.1% formic acid and 3% acetonitrile in water) and mobile phase B (0.1% formic acid, 90% acetonitrile in water) at a flow rate of 250 µl min⁻¹. TMT samples were separated with an EASY nLC 1200 HPLC system and the temperature of the column was controlled by a column oven set to 45 °C. For a 2 h gradient, mobile phase B was increased from 4% to 30% in the first 88 min, followed by an increase to 60% B in 10 min and a plateau of 90% B for 5 min, followed by 50% buffer B for 5 min. For a 4 h gradient, mobile phase B was increased from 3% to 30% in the first 192 min followed by an increase to 60% B in 10 min, a plateau of 90% B for 5 min and 5 min 50% buffer B. All TMT fractions were measured with a 2 h gradient. To boost identification in the 11th TMT plex with healthy bone marrow samples, fractions of plex 11 were additionally measured with a 4 h gradient. MS data of TMT samples were acquired in profile centroid mode with data-dependent acquisition on a Q Exactive HF-X (Thermo Fisher). MS1 scans were acquired at 60,000 resolution, scan range of 350–1,500 m/z, maximum injection time (IT) of 10 ms and automatic gain control (AGC) target value of 3e6. The 20 most abundant ion species were picked for fragmentation, normalized collision energy (NCE) was set to 32 and the isolation window was at 0.7 m/z. MS2 scans were acquired at 45,000 resolution, fixed first mass 120 m/z, AGC target value of 3e5 and maximum IT of 86 ms. Dynamic exclusion was set to 30 s and ions with charge state 1, 6 or higher were excluded from fragmentation. For analysis of phosphoproteomic fractions of TMT-labeled samples the liquid chromatography–mass spectrometry parameters were the same, with the exception of MS2 maximum IT, which was set to 120 ms.

TMT raw data search and processing

All TMT mass spectrometry raw files were analyzed together in one MaxQuant (v.2.0.3.0)66 run. Data were searched against the human reference proteome (UP000005640) downloaded from UniProt in January 2021 (https://ftp.uniprot.org/pub/databases/uniprot/previous_releases/) and default protein contaminants. TMT correction factors were applied and the minimum reporter precursor intensity fraction was set to 0.5. Fixed modifications were set to carbamidomethylation of C and variable modifications were set to M oxidation and acetylation of protein N-termini. TMT global proteome and phosphopeptide fractions were analyzed in the same MaxQuant run in separate parameter groups using the same settings, except that phospho (STY) was additionally included as a variable modification when searching phosphopeptide fractions. A maximum of five modifications per peptide was allowed. N-terminal acetylation and M-oxidation were used in protein quantification. Only unique and razor peptides were used for protein quantification. Protein FDR was set to 0.01. Protease specificity was set to Trypsin/P. MaxQuant output files were further analyzed in R (v.4.1.1) using RStudio. The protein groups file was filtered for reverse hits, potential contaminants and proteins only identified by site. Protein groups were further filtered for at least two peptides and at least one unique or razor peptide. The TMT-based phosphosite table was expanded by multiplicity, and reverse database hits and potential contaminants were removed. Corrected reporter ion intensity columns of both tables were log2 transformed and normalized by subtraction of the internal standard channel contained in each TMT plex. The resulting TMT ratios were normalized via median-median absolute deviation (MAD) normalization. Before differential expression analysis, data were filtered for detection in more than 49% of cohort samples. For comparing healthy and malignant samples, only MACS-sorted samples were compared.
Proteomic results are available in Supplementary Table 6 (global proteome) and Supplementary Table 7 (phosphoproteome).
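The normalization steps described above (log2 transformation, subtraction of the internal standard channel, median-MAD scaling and the detection filter) can be sketched as follows. This is a minimal illustration in plain Python, not the R code used in the study; the function names are hypothetical, and the sketch assumes an unscaled MAD (no 1.4826 consistency constant) applied per sample.

```python
import math
from statistics import median


def log2_ratio_to_standard(intensities, standard):
    """Log2 reporter intensities of one sample minus the log2 internal standard.

    Both lists run over the same proteins; zero or missing values become None.
    """
    ratios = []
    for value, reference in zip(intensities, standard):
        if value is None or reference is None or value <= 0 or reference <= 0:
            ratios.append(None)
        else:
            ratios.append(math.log2(value) - math.log2(reference))
    return ratios


def median_mad_normalize(ratios):
    """Center one sample's log2 ratios on their median and divide by their MAD.

    Assumption: unscaled MAD; a MAD of zero is not handled here.
    """
    observed = [r for r in ratios if r is not None]
    center = median(observed)
    mad = median(abs(r - center) for r in observed)
    return [None if r is None else (r - center) / mad for r in ratios]


def detected_in_enough_samples(row, min_fraction=0.49):
    """True if a protein is quantified in more than `min_fraction` of samples."""
    detected = sum(value is not None for value in row)
    return detected / len(row) > min_fraction
```

Each sample would be passed through `log2_ratio_to_standard` with the standard channel of its own TMT plex before `median_mad_normalize` is applied, and `detected_in_enough_samples` would then be evaluated per protein across the cohort.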

Label-free proteomic analysis of cell lines

CD138 MACS-sorted and unsorted cell line samples were fractionated online with a 2 h gradient and mass spectrometry data were acquired on a Q Exactive Plus mass spectrometer in data-dependent acquisition (DDA) mode (top ten). MS1 scans were acquired at 70,000 resolution, scan range of 350–2,000 m/z, maximum IT of 50 ms and AGC target value of 3e6. NCE was set to 26 and the isolation window was at 1.6 m/z. MS2 scans were acquired at 17,500 resolution, fixed first mass 120 m/z, AGC target value of 5e4 and maximum IT of 50 ms. Dynamic exclusion was set to 30 s and ions with charge state 1, 6 or higher were excluded from fragmentation. Label-free DDA data were analyzed in MaxQuant 2.0.1.1 using default parameters. The LFQ and match-between-runs options were enabled. Phospho (STY) was included as a variable modification for searching the phosphoproteome data. MaxQuant LFQ intensities were log2 transformed and filtered for contaminants, proteins only identified by site, and valid values (minimum three per experimental group). The missing values were imputed from a normal distribution with a width of 0.3 times the standard deviation in the sample and a downshift of 1.8 from the observed mean. LFQ intensities were median normalized before differential expression analysis and experimental groups (control and MACS) were compared using a two-sided moderated two-sample t-test.
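The imputation step can be illustrated with a short sketch. It assumes, as in the common Perseus-style procedure, that the downshift of 1.8 is expressed in units of the sample's standard deviation; the text does not state the unit explicitly. The function name and the fixed seed are illustrative choices.

```python
import random
from statistics import mean, stdev


def impute_downshifted_normal(values, width=0.3, downshift=1.8, seed=0):
    """Replace missing log2 intensities (None) in one sample with random draws.

    Draws come from a normal distribution centered `downshift` standard
    deviations below the observed mean (assumed unit), with a standard
    deviation of `width` times the observed standard deviation.
    """
    observed = [v for v in values if v is not None]
    mu, sd = mean(observed), stdev(observed)
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    return [
        rng.gauss(mu - downshift * sd, width * sd) if v is None else v
        for v in values
    ]
```

Observed values pass through unchanged, so the function can be applied column by column to a log2 LFQ matrix after the valid-value filter.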

UBE2Q1-overexpressing samples were analyzed as described previously using data-independent acquisition (DIA)67. Label-free DIA data were searched using DIA-NN 1.8.1 software against the human UniProt reference proteome68. The search was performed in library-free mode with the in silico FASTA digest parameter enabled. The peptide length range was set to 7–30, and the precursor charge range was set to 1–4. The m/z range for precursors was set to 340–1,650, and for fragment ions, it was set to 200–1,800. The remaining parameters were left at their defaults, with the reannotate and match-between-runs options enabled. LFQ protein intensities from the DIA-NN pg output table were log2 transformed and filtered for contaminants and peptides per protein (>1), as well as valid values (>70%). Imputation was performed as described above and resulting intensities were median normalized before differential expression analysis. Experimental groups (empty overexpression vector (empty OE) and UBE2Q1 overexpression (UBE2Q1 OE)) were compared using a two-sided moderated two-sample t-test.

Cell culture

All cell lines were obtained from the American Type Culture Collection (ATCC) or DSMZ German Collection of Microorganisms and Cell Cultures and were maintained in RPMI-1640 medium containing 10% fetal bovine serum (FBS) and supplemented with 1% penicillin/streptomycin and 1% L-glutamine. NCI-H929 cells were cultured in media supplemented with beta-mercaptoethanol and sodium pyruvate, and INA-6 cells were cultured in media supplemented with IL-6. Cells were maintained at 37 °C with 5% CO2 in a humidified atmosphere.

CRISPR–Cas9 activation screen

Lentiviral plasmid dCAS-VP64_Blast was a gift from Feng Zhang (Addgene plasmid #61425)69 and was used to stably transduce MM.1S cells. The human Calabrese CRISPR activation pooled library set A was a gift from David Root and John Doench (Addgene #92379)70. Lentivirus was produced using HEK293T cells via transfection of the guide library with psPAX2 and pMD2.G. Virus titration was performed to achieve a MOI of ~0.3 in MM.1S dCas-VP64 cells. A total of 1 × 10⁸ MM.1S dCas-VP64 cells were transduced, and 3 × 10⁷ cells were collected for baseline comparison. The remaining cells were maintained and the media were refreshed every 3 days. On day 28, all cells were collected for genomic DNA analysis. Genomic DNA extraction was performed with the Wizard Genomic DNA Purification Kit (A1120). The guide RNA library was amplified and cleaned up with AMPure XP beads. Library single guide (sg)RNAs were sequenced on a NextSeq 500 instrument (Illumina). The MAGeCK algorithm (https://www.bioconductor.org/packages/release/bioc/html/MAGeCKFlute.html) was utilized for analyzing normalized reads and beta scores. The beta score indicates the difference in sgRNA abundance between day 4 and day 28, with a high score indicating a survival advantage conferred by the respective gene.

Generation of UBE2Q1 overexpression cell lines

UBE2Q1 cDNA was cloned into retroviral vector pRSF91-FLAG-GW-IRES-GFP-T2A-Puro via a Gateway reaction. Retroviral vectors containing empty or UBE2Q1 constructs generated in HEK293T cells were used to stably transduce MM cell lines OPM2 and LP-1. Seven days posttransduction, cells were placed under puromycin selection. At the time of analysis, the stable cell lines were 99% GFP-positive as determined by flow cytometry.

Inhibitor treatment and viability assays

NT157 was obtained from SelleckChem (S8228), erdafitinib was purchased from Hölzel Diagnostics (HY-18708). Cells were seeded in 384-well plates with respective treatments and plates were incubated at 37 °C for 96 h. Cell viability readout was measured using CellTiter-Glo Luminescent Cell Viability Assay on a POLARstar Omega plate reader.

FACS analysis of FCRL2 expression

FCRL2 fluorescence-activated cell sorting (FACS) analysis was performed on primary cells from 14 samples from patients with MM (13 bone marrow aspirates and one ascitic fluid) and 7 healthy donor samples (6 bone marrow samples and one peripheral blood). All samples contained isolated mononuclear cells and were stained with allophycocyanin (APC) anti-FCRL2 (Miltenyi Biotec, 130-107-439). For myeloma cell identification, we used BV421 anti-BCMA (BioLegend, 357519) and FITC anti-SLAMF7 (BioLegend, 331818). The different subpopulations of immune cells were distinguished by PE anti-CD138 (BD Pharmingen, 552026), FITC anti-CD19, PE anti-CD3 (both from BioLegend, 302206 and 344806) as well as PC7 anti-CD13, PE anti-CD33 and PE anti-CD34 (all from Beckman Coulter, B19714, A07775 and A07776). All antibodies were used at a dilution of 1:40. Data analysis was performed with FlowJo v10. Unstained controls were used to set the gates for the fluorochromes.

Survival analysis with bootstrapping and risk score calculation with AIC-optimal model

The analysis was restricted to patients with MM treated with lenalidomide in induction and maintenance therapy as well as high-dose melphalan/auto-SCT within DSMM clinical trials (N = 100 patients). For each fully quantified protein and phosphopeptide, a continuous variable Cox proportional hazard model for PFS was calculated and resulting P values were corrected with Benjamini–Hochberg. We combined the FDR-controlled approach with 1,000-fold bootstrapping to identify the most reproducibly significant proteins in a cohort of the same size randomly sampled with replacement from our data, that is, allowing multiple occurrences of samples in the bootstrap cohort. The 95% confidence interval of P values from the bootstrapping was calculated. Proteins with an upper confidence interval of P values <0.1 and an FDR <0.1 (n = 32) were selected as candidates for the final risk score. A multi-protein Cox PH model was constructed by step-wise addition of optimal proteins based on the Akaike information criterion (AIC), balancing increased model performance versus increased model complexity. The final risk score was calculated on the AIC-optimal multi-protein model, by linear combination of the protein abundances scaled by the model coefficients. This resulted in a protein score containing protein-level information of eight proteins with differing weights. The inclusion of additional proteins or phosphopeptides into the model only led to marginal improvement in the survival prediction accuracy. Differences in survival were analyzed with a log-rank test. For validation, we calculated the protein risk score on untreated myeloma samples analyzed by Kropivsek et al.21 based on the provided protein quantifications (‘CD138 cells’ quantification). The term for PDSS2 was omitted from the risk score since it was not quantified in the Kropivsek et al. cohort. No other adaptations of the risk score were employed.
Survival curves were stratified by the median risk score of the respective cohort.
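The final scoring step, a linear combination of protein abundances weighted by Cox model coefficients followed by a median split, can be sketched as below. The protein names and coefficient values in the usage note are placeholders, not the eight proteins or weights of the published score, and the assignment of scores equal to the median to the low-risk group is an assumption of this sketch.

```python
from statistics import median


def risk_score(abundances, coefficients):
    """Sum of protein abundances weighted by their Cox model coefficients.

    Only proteins listed in `coefficients` contribute, so a term can be
    omitted for a validation cohort by dropping it from that dictionary.
    """
    return sum(coefficients[p] * abundances[p] for p in coefficients)


def stratify_by_median(scores):
    """Label each patient 'high' or 'low' relative to the cohort median score.

    Assumption: scores equal to the median are labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
```

For example, with hypothetical coefficients `{"PROT_A": 0.5, "PROT_B": -1.0}`, a patient with abundances `{"PROT_A": 4.0, "PROT_B": 0.0}` receives a score of 2.0, and the cohort is then split at the median of all such scores.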

RNA–protein correlation and CNV buffering analysis

For RNA–protein correlation analysis RNAseq samples were filtered for a minimum plasma cell content of 80% and a mapped read count higher than 20 million. Proteome data were collapsed to gene-level information via median and RNA and protein datasets were matched by gene name. Copy number variation (CNV) data were matched with RNA and protein data via the cytogenetic band of the corresponding gene locus. For calculating Pearson correlation across MM samples, the resulting data matrix was filtered for at least ten paired values. To estimate the buffering of CNVs from RNA to protein level we calculated a customized score with the following formula:

$$\mathrm{buffering\ score}_g=\left[\mathrm{corr}\left(\mathrm{RNA}_g,\mathrm{CN}_g\right)-\mathrm{corr}\left(\mathrm{protein}_g,\mathrm{CN}_g\right)\right]\times \left|\overline{\mathrm{CN}}_g-2\right|$$

For each gene (g) we subtracted the Pearson correlation (corr) of protein to copy number (CN) from the Pearson correlation (corr) of RNA to CN. The resulting delta was corrected with the average copy number effect diverging from a diploid genotype. Pearson correlations and buffering scores were subjected to ssGSEA analysis as described below.
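As a worked illustration of the formula, the sketch below computes the score for a single gene from paired RNA, protein and copy number values across samples. The helper names are illustrative, and the handling of missing values and the ten-pair minimum described above are left out for brevity.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists (no missing values).

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def buffering_score(rna, protein, copy_number):
    """[corr(RNA, CN) - corr(protein, CN)] * |mean(CN) - 2| for one gene."""
    mean_cn = sum(copy_number) / len(copy_number)
    delta = pearson(rna, copy_number) - pearson(protein, copy_number)
    return delta * abs(mean_cn - 2)
```

A gene whose RNA tracks copy number perfectly while its protein level is uncorrelated with copy number, at an average copy number of 3, would score (1 − 0) × |3 − 2| = 1.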

ssGSEA

The ssGSEA implementation available on https://github.com/broadinstitute/ssGSEA2.0 was used to separately project protein and phosphopeptide abundance changes to signaling pathways. The normalized ratio or fold change matrix was collapsed to gene-level information via median and subjected to ssGSEA. For ssGSEA of normalized TMT ratios, the gene set databases containing curated gene sets (c2.all.v7.0.symbols.gmt), oncogenic signature gene sets (c6.all.v7.0.symbols.gmt) and hallmark gene sets (h.all.v7.0.symbols.gmt) were used. For ssGSEA of RNA to protein correlations, the Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets (c2.cp.kegg.v7.0.symbols.gmt) were used. For ssGSEA of buffering of CNVs from RNA to protein level, databases containing positional gene sets (c1.all.v7.0.symbols.gmt) and KEGG gene sets (c2.cp.kegg.v7.0.symbols.gmt) were used. The following parameters were used for all ssGSEA analyses: sample.norm.type = ‘rank’, weight = 0.75, statistic = ‘area.under.RES’, output.score.type = ‘NES’, nperm = 1,000, min.overlap = 10, correl.type = ‘z.score’.

NMF clustering of ssGSEA enrichment scores

Normalized ssGSEA scores of phosphoproteomic data were used as input for NMF with the NMF R package (v.0.23.0)71 as previously described72. The following parameters were used: K = 2:7, method = ‘brunet’, nrun = 50. The cophenetic correlation coefficient was used to evaluate the clustering quality. After determining the optimal factorization rank k, we repeated the NMF analysis using 500 iterations with random initializations and performed partitioning of samples into clusters.

GO term analysis with Metascape

Gene Ontology (GO) term enrichment analysis of a gene list corresponding to proteins regulated in 1q gain not located on chromosome 1q was performed with the Metascape73 online tool.

Integration of Depmap data

Proteins significantly upregulated in myeloma versus healthy samples (<0.1 FDR) or selectively identified in myeloma samples were further filtered for potential therapeutic targets by integrating the DepMap CRISPR KO database (gene effect download file74). First, genes coding for proteins in our candidate list were filtered for median dependency in myeloma cell lines <−0.4 (median dependency of the myeloma therapeutic targets IKZF1 and IKZF3). Common essential genes (DepMap Public 22Q2) were excluded from the target list. In addition, genes were filtered for having a minimum difference of median dependency in myeloma versus median dependency in nonmyeloma cell lines >0.1.
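The three DepMap filters can be expressed as a single predicate per gene, sketched below. The sign convention for the selectivity filter is an assumption: the difference is taken as the nonmyeloma median minus the myeloma median, so that a more negative (stronger) dependency in myeloma gives a positive value. The function and parameter names are illustrative.

```python
from statistics import median


def is_candidate_target(myeloma_effects, other_effects, is_common_essential,
                        max_median=-0.4, min_selectivity=0.1):
    """Apply the dependency, essentiality and selectivity filters to one gene.

    `myeloma_effects` and `other_effects` are CRISPR gene effect scores in
    myeloma and nonmyeloma cell lines, respectively.
    """
    median_myeloma = median(myeloma_effects)
    median_other = median(other_effects)
    return (
        not is_common_essential
        and median_myeloma < max_median
        # Assumed direction: dependency must be stronger in myeloma lines
        and (median_other - median_myeloma) > min_selectivity
    )
```

A gene with a median effect of −0.6 in myeloma lines and −0.1 elsewhere would pass, provided it is not on the common essential list.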

RNA sequencing data analysis

RNA sequencing data were aligned and quantified with STAR and messenger RNA reads were identified using an in-house analysis pipeline detecting exons in a shuffled order. To increase comparability to TMT data, RNA gene-level transcripts per million (TPM) values were further normalized as described previously75. First, TPM gene-level data were normalized via median subtraction (by gene) and, subsequently, each sample was normalized by median-MAD normalization. The normalized data are available in Supplementary Table 2.

Nanopore DNA sequencing data analysis

After basecalling, the sequenced reads were aligned with minimap2 (ref. 76) to the University of California, Santa Cruz (UCSC) hg19 genome reference (https://www.ncbi.nlm.nih.gov/grc) without haplotype-specific scaffolds. The alignment files were converted from SAM format (https://samtools.github.io/hts-specs/SAMv1.pdf) to binary alignment format (BAM format, https://samtools.github.io/hts-specs/SAMv1.pdf), sorted and indexed with SAMtools (v.0.1.19, https://github.com/samtools/). Copy number profiles were then generated with the absolute copy number estimate package77 in R (4.2.1, https://cran.r-project.org/) with a bin size of 1 million base pairs. Errors were estimated with ‘maximum absolute error’ and only autosomes were called. The resulting copy number aberrations were reported at genomic band level, rounded to the nearest integer. Ambiguous copy numbers were called by the most prevalent copy number on the particular band. Bands with insufficient reads were marked as NA. For subclonal events, the nearest natural number was chosen, except in the vicinity of two where a deviation threshold of 0.35 was used to maximize the concordance with FISH results.
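One possible reading of the calling rules is sketched below; the text does not spell out the exact procedure, so this is an interpretation rather than the authors' implementation. It assumes that estimates within 0.35 of two are called diploid, that estimates between 0.35 and 0.5 away from two are called as a loss or gain instead of being rounded back to two, and that a band takes the most frequent call among its bins. All names are hypothetical.

```python
import math
from collections import Counter


def call_copy_number(estimate, diploid_tolerance=0.35):
    """Round a bin-level copy number estimate to an integer call.

    Assumed rule: near two, a tighter tolerance than plain rounding is used.
    """
    deviation = estimate - 2.0
    if abs(deviation) <= diploid_tolerance:
        return 2
    if abs(deviation) < 0.5:
        # Nearer to 2 than to 1 or 3, but outside the tolerance:
        # treated here as a subclonal loss or gain.
        return 1 if deviation < 0 else 3
    return max(0, math.floor(estimate + 0.5))


def call_band(bin_estimates):
    """Most prevalent integer call across a band's bins; None stands for NA."""
    calls = [call_copy_number(e) for e in bin_estimates if e is not None]
    if not calls:
        return None
    return Counter(calls).most_common(1)[0][0]
```

Under these assumptions an estimate of 2.3 is called 2, whereas 2.4 is called 3, which is where the sketch departs from plain rounding.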

Ploidy and cellularity (relevant local minimum used) of each sample in absolute copy number estimate were matched to existing FISH data. If FISH data were not available, the profiles were chosen for plausibility, minimizing the number of aberrations and avoiding scaffolds with copy number 0. The processed data are available in Supplementary Table 2. Four additional cases without 9q amplification were assigned to the hyperdiploidy group based on nanopore sequencing.

Validation by single-cell sequencing data

Expression of candidates from the proteomic analysis was further validated with single-cell RNA sequencing data of bone marrow from healthy individuals and patients with MM from Lutz et al.51. Uniform manifold approximation and projection (UMAP) plots highlighting normalized expression for genes of interest were generated in R using the FeaturePlot() function from the Seurat package78.

Statistics and reproducibility

No statistical method was used to predetermine sample size; samples were chosen based on availability. As the study focuses on newly diagnosed samples, four TMT-labeled samples corresponding to relapse cases were excluded from the analysis. In the TMT plex analyzing healthy cells, carrier channels containing the booster channel and unsorted mononuclear cells were excluded from further analysis; they were present in the TMT plex to increase coverage of low-abundance proteins. Patient samples were randomly distributed across TMT plexes. Technical replicates of eight samples were differentially labeled and included in different TMT plexes. Replicates clustered together as expected and had an average Pearson correlation coefficient of 0.8 for global proteome and 0.77 for phosphoproteomic normalized ratios, respectively. We performed four or three biological replicates of cell culture experiments for proteomics or inhibitor treatments, respectively. All attempts at replication were successful and no replicate was excluded from analysis. Differentially expressed proteins were determined with a two-sided moderated two-sample t-test (limma package). The resulting P values were corrected with the Benjamini–Hochberg method. Drug treatments of each cell line were compared to respective dimethyl sulfoxide (DMSO) controls with a Dunnett’s test. For analyzing CRISPR–Cas9 activation screen data, the MAGeCK maximum-likelihood estimation (MLE) algorithm was applied for the analysis of beta scores and P values.
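For readers unfamiliar with the Benjamini–Hochberg correction used throughout, the standard step-up adjustment can be written in a few lines. This is a generic textbook sketch in plain Python, not the code used in the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

Each raw P value is multiplied by the number of tests divided by its rank, and the running minimum from the largest rank downward keeps the adjusted values monotone.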

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
