Increased DNA:RNA hybrid formation due to defects in RNA processing pathways leads to genome instability and replication stress across species. R loops threaten genome stability and often form under abnormal conditions in which nascent mRNA is improperly processed or RNA half-life is increased, resulting in RNA that can hybridize with template DNA and displace the non-transcribed DNA strand. A recent study also found that hybrid formation can occur in trans via Rad51-mediated DNA-RNA strand exchange. Persistent R loops pose a major threat to genome stability through two mechanisms. First, the exposed non-transcribed strand is susceptible to endogenous DNA damage because its chemically reactive groups are more exposed. The second, more widespread mechanism, identified in Escherichia coli, Saccharomyces cerevisiae, Caenorhabditis elegans and human cells, involves R loops and their associated stalled transcription complexes, which block DNA replication fork progression. R loop-mediated instability is an area of great interest primarily because genome instability is considered an enabling characteristic of tumor formation. Moreover, mutations in RNA splicing/processing factors are frequently found in human cancer, in heritable diseases such as Aicardi-Goutières syndrome, and in a degenerative ataxia associated with Senataxin mutations.
To avoid the deleterious effects of R loops, cells express enzymes that remove abnormally formed DNA:RNA hybrids. In S. cerevisiae, RNH1 and RNH201, each encoding an RNase H, are responsible for one of the best-characterized mechanisms for reducing R loop formation: enzymatic degradation of the RNA in DNA:RNA hybrids. Another extensively studied anti-hybrid factor is the THO/TREX complex, which suppresses hybrid formation at the level of transcription termination and mRNA packaging. In addition, the Senataxin helicase, yeast Sen1, plays an important role in facilitating replication fork progression through transcribed regions and in unwinding RNA in hybrids to mitigate R loop formation and RNA polymerase II transcription-associated genome instability.
Several additional anti-hybrid mechanisms have also been identified, including topoisomerases and other RNA processing factors. To add to the complexity of DNA:RNA hybrid management in the cell, hybrids also occur naturally and have important biological functions. In human cells, R loop formation facilitates immunoglobulin class switching, protects against DNA methylation at CpG island promoters, and plays a key role in pause site-dependent transcription termination. Transcription of telomeres by RNA polymerase II also produces telomeric repeat-containing RNAs (TERRA), which associate with telomeres and inhibit telomere elongation in a DNA:RNA hybrid-dependent fashion. Noncoding RNAs (ncRNAs), such as antisense transcripts, perform a regulatory role in the expression of sense transcripts that may involve R loops. The proposed mechanisms of antisense transcription regulation are not clearly understood and involve different modes of action specific to each locus.
Current models include chromatin modification resulting from antisense-associated transcription, antisense transcription modulation of transcription regulators, a collision of sense and antisense transcription machinery and antisense transcripts expressed in trans interacting with the promoter for sense transcription. More recently, studies in Arabidopsis thaliana found an antisense transcript that forms R loops, which can be differentially stabilized to modulate gene regulation. Similarly, in mouse cells, the stabilization of an R loop was shown to inhibit antisense transcription.
Here we describe a genome-wide profile of DNA:RNA hybrid-prone loci in S. cerevisiae obtained by DNA:RNA immunoprecipitation followed by hybridization on tiling microarrays (DRIP-chip). In wild-type cells, DNA:RNA hybrids occurred at highly transcribed regions. DNA:RNA hybrids were significantly associated with genes that have corresponding antisense transcripts, suggesting a role for hybrid formation at these loci in gene regulation. Genes whose expression was altered by overexpression of RNase H were also significantly associated with antisense transcripts. A small-scale cytological screen found that diverse RNA processing mutants had increased hybrid formation, and additional DRIP-chip studies revealed specific hybrid-site biases in the RNase H, Sen1 and THO complex subunit Hpr1 mutants. DRIP is a technique for genome-wide profiling of R loops. The related DRIP-seq method uses a sequence-independent but structure-specific antibody for DNA-RNA immunoprecipitation (DRIP) to capture R loops for massively parallel DNA sequencing. These genome-wide analyses enhance our understanding of DNA:RNA hybrid-forming regions in vivo, highlight the role of cellular RNA processing activities in suppressing hybrid formation, and implicate DNA:RNA hybrids in the control of a subset of antisense-regulated loci.
Strains and plasmids
For RNase H overexpression experiments, recombinant human RNase H1 was expressed from plasmid p425-GPD-RNase H1 (2μ, LEU2, GPDpr-RNase H1) and compared to an empty control plasmid p425-GPD (2μ, LEU2, GPDpr).
DRIP-chip and qPCR
Briefly, cells were grown overnight, diluted to an OD600 of 0.15 and grown to an OD600 of 0.7. Crosslinking was done with 1% formaldehyde for 20 minutes. Chromatin was purified and sonicated to yield approximately 500 bp fragments. 40 μg of the anti-DNA:RNA hybrid monoclonal mouse antibody S9.6 was coupled to 60 μL of protein A magnetic beads. For ChIP-qPCR, crosslinking reversal and DNA purification were followed by qPCR analysis of the immunoprecipitated and input DNA. DNA was analyzed using real-time PCR. Samples were analyzed in triplicate on three independent DRIP samples for the wild-type and rnh1Δrnh201Δ strains.
For DRIP-chip, precipitated DNA was amplified via two rounds of T7 RNA polymerase amplification, biotin-labeled and hybridized to S. cerevisiae microarrays. Samples were normalized to a no-antibody control sample (mock) using the rMAT software, and relative occupancy scores were calculated for all probes using a 300 bp sliding window. All profiles were generated in duplicate, and replicates were quantile normalized and averaged. Spearman correlation scores were calculated between replicates.
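The replicate handling described above (quantile normalization followed by averaging) can be illustrated with a minimal sketch. This is a generic, standard-library illustration of the technique, not the rMAT pipeline or the authors' scripts; the function names `quantile_normalize` and `average_replicates` and the toy scores are assumptions made for this example, and ties are broken by probe order for simplicity.

```python
def quantile_normalize(replicates):
    """Force equal-length score lists to share one common distribution.

    Hypothetical helper: the reference distribution is the rank-wise mean
    of the sorted replicates; ties are broken by probe order.
    """
    n = len(replicates[0])
    sorted_reps = [sorted(rep) for rep in replicates]
    # Rank-wise mean across replicates gives the shared reference distribution.
    reference = [sum(rep[i] for rep in sorted_reps) / len(replicates)
                 for i in range(n)]
    normalized = []
    for rep in replicates:
        # Probe indices ordered from lowest to highest score.
        order = sorted(range(n), key=lambda i: rep[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def average_replicates(replicates):
    """Average the per-probe scores across replicates."""
    return [sum(vals) / len(vals) for vals in zip(*replicates)]


# Toy occupancy scores for three probes in two replicates (made-up values).
rep1 = [1.0, 3.0, 2.0]
rep2 = [10.0, 20.0, 40.0]
norm1, norm2 = quantile_normalize([rep1, rep2])
merged = average_replicates([norm1, norm2])
```

After normalization both replicates contain the same set of values, so averaging is not dominated by whichever array happened to have the wider dynamic range.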
Enriched features had at least 50% of the probes contained in the feature above the threshold of 1.5. Only features enriched in both replicates were reported. Transcriptional frequency, GC content and gene length were compared using the Wilcoxon rank sum test. Antisense association was analyzed in R. Statistical analysis of genomic feature enrichment was performed using a Monte Carlo simulation, which randomly generates start positions for the particular set of features and calculates the proportion of that feature that would be enriched in a given DRIP-chip profile if the feature were distributed at random. For each DRIP-chip replicate, 500 simulations were run per feature to obtain mean and standard deviation values. These values were used to calculate the cumulative probability (P) on a normal distribution of seeing a score lower than the observed value by chance.
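The enrichment call and the Monte Carlo test described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: it uses a single toy chromosome, treats "above the threshold" as strictly greater than 1.5, and all names (`is_enriched`, `simulate_enriched_fraction`, `monte_carlo_p`) are hypothetical.

```python
import random
import statistics


def is_enriched(probe_scores, threshold=1.5, min_fraction=0.5):
    """A feature is enriched if at least 50% of its probes exceed 1.5."""
    if not probe_scores:
        return False
    above = sum(1 for s in probe_scores if s > threshold)
    return above / len(probe_scores) >= min_fraction


def simulate_enriched_fraction(positions, scores, feature_lengths,
                               genome_length, rng):
    """Place each feature at a random start and return the enriched fraction."""
    enriched = 0
    for length in feature_lengths:
        start = rng.randrange(0, genome_length - length + 1)
        inside = [s for p, s in zip(positions, scores)
                  if start <= p < start + length]
        enriched += is_enriched(inside)
    return enriched / len(feature_lengths)


def monte_carlo_p(observed, positions, scores, feature_lengths,
                  genome_length, n_sim=500, seed=0):
    """Return (mean, sd, P) where P is the normal-distribution probability
    of seeing a value lower than `observed` under random placement."""
    rng = random.Random(seed)
    fractions = [simulate_enriched_fraction(positions, scores, feature_lengths,
                                            genome_length, rng)
                 for _ in range(n_sim)]
    mean = statistics.mean(fractions)
    sd = statistics.stdev(fractions)
    if sd == 0:
        # Degenerate case (assumption): all simulations gave the same value.
        return mean, sd, float(observed > mean)
    return mean, sd, statistics.NormalDist(mean, sd).cdf(observed)


# Toy profile: probes every 10 bp, hybrid signal only in the first half.
positions = list(range(0, 1000, 10))
scores = [2.0 if p < 500 else 0.0 for p in positions]
lengths = [50] * 20  # twenty hypothetical 50 bp features
mean, sd, p_high = monte_carlo_p(1.0, positions, scores, lengths, 1000, n_sim=50)
_, _, p_low = monte_carlo_p(0.0, positions, scores, lengths, 1000, n_sim=50)
```

A value of P close to 1 indicates that the observed enriched proportion is higher than almost all random placements; a value close to 0 indicates depletion.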
Relative occupancy scores for each transcript were binned into segments of 150 bp. Transcripts were sorted by their length, transcriptional frequency or GC content and aligned by their transcription start sites (TSS). Transcripts were grouped into five classes according to their transcriptional frequency and into four classes according to their GC content. Average gene, tRNA or snoRNA profiles were generated by averaging all the probes that were encompassed by the features of interest. For averaging ORFs, corresponding probes were split into 40 bins, while 1500 bp of UTRs and their probes were split into 20 bins. For smaller features such as tRNAs and snoRNAs, corresponding probes were split into only 3 bins. Average enrichment scores were calculated using in-house scripts that average the score of all the probes encompassed by the feature.
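The fixed-bin averaging described above lets features of different lengths be compared on a common axis. The sketch below is one possible way to do this and is not the in-house scripts mentioned in the text; the helper names `bin_feature_scores` and `average_profile` are hypothetical, and strand orientation is ignored for brevity.

```python
def bin_feature_scores(positions, scores, start, end, n_bins):
    """Average probe scores in n_bins equal-width bins spanning [start, end).

    Bins that contain no probe are returned as None.
    """
    width = (end - start) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pos, score in zip(positions, scores):
        if start <= pos < end:
            b = min(int((pos - start) / width), n_bins - 1)
            sums[b] += score
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


def average_profile(binned_features):
    """Average the same bin across many features, skipping empty bins."""
    n_bins = len(binned_features[0])
    profile = []
    for b in range(n_bins):
        vals = [f[b] for f in binned_features if f[b] is not None]
        profile.append(sum(vals) / len(vals) if vals else None)
    return profile


# Two toy features of different length, each reduced to 2 bins
# (the text uses 40 bins for ORFs, 20 for UTRs and 3 for tRNAs/snoRNAs).
long_feature = bin_feature_scores([0, 10, 20, 30], [1.0, 2.0, 3.0, 4.0], 0, 40, 2)
short_feature = bin_feature_scores([100, 110], [2.0, 4.0], 100, 120, 2)
profile = average_profile([long_feature, short_feature])
```

Because every feature is reduced to the same number of bins, the resulting profile describes relative position along the feature rather than absolute distance in base pairs.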
Gene expression microarray
Strains harboring the RNase H1 overexpression plasmid or empty vector were grown in SC-Leucine at 30 °C. All profiles were generated in duplicate. Total RNA was isolated from 1 OD600 of yeast cells using an RNA purification kit, amplified, labeled and fragmented using an RNA Amplification Kit, and hybridized to a GeneChip Yeast Genome microarray using the GeneChip Hybridization, Wash, and Stain Kit. Arrays were scanned by the GeneChip Scanner and expression data were extracted using Expression Console Software with the MAS5.0 statistical algorithm. All arrays were scaled to a median target intensity of 500. A p-value cutoff of 0.05 and a minimum signal strength of 100 across all samples were applied, and only transcripts that had over a 2-fold change in the RNase H overexpression strain compared to wild type were considered significant. The correlation between duplicate biological samples was: control (r = 0.9955), RNase H overexpression (r = 0.9719). For statistical analysis, GC content, transcription frequencies, and antisense association were analyzed as for the DRIP-chip analysis.
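The filtering criteria above can be expressed as a short decision rule. The sketch below is one reading of the text, not the Expression Console workflow: it assumes the p-value and signal cutoffs must be met in every sample, that fold change is the ratio of mean signals, and that boundary values are handled as shown. The function name `classify_transcript` is hypothetical.

```python
def classify_transcript(control_signals, rnaseh_signals, p_values,
                        p_cutoff=0.05, min_signal=100.0, min_fold=2.0):
    """Classify one transcript as 'up', 'down', 'unchanged' or 'filtered'.

    Assumption: every sample must have p <= p_cutoff and signal >= min_signal.
    """
    all_signals = list(control_signals) + list(rnaseh_signals)
    if any(p > p_cutoff for p in p_values):
        return "filtered"
    if any(s < min_signal for s in all_signals):
        return "filtered"
    control_mean = sum(control_signals) / len(control_signals)
    rnaseh_mean = sum(rnaseh_signals) / len(rnaseh_signals)
    ratio = rnaseh_mean / control_mean
    if ratio > min_fold:
        return "up"
    if ratio < 1.0 / min_fold:
        return "down"
    return "unchanged"


# Made-up duplicate signals for illustration only.
good_p = [0.01, 0.01, 0.01, 0.01]
up = classify_transcript([200.0, 220.0], [500.0, 520.0], good_p)
down = classify_transcript([400.0, 400.0], [150.0, 150.0], good_p)
flat = classify_transcript([200.0, 200.0], [300.0, 300.0], good_p)
weak = classify_transcript([50.0, 200.0], [500.0, 520.0], good_p)
```

Applying the signal filter before the fold-change test avoids calling large ratios that arise only from near-background intensities.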
Yeast chromosome spreads
Cells were grown to mid-log phase in YEPD rich media at 30 °C, washed in spheroplasting solution (1.2 M sorbitol, 0.1 M potassium phosphate, 0.5 M MgCl2, pH 7) and digested in spheroplasting solution with 10 mM DTT and 150 μg/mL Zymolyase 20T at 37 °C for 20 minutes. The digestion was halted by addition of ice-cold stop solution (0.1 M MES, 1 M sorbitol, 1 mM EDTA, 0.5 mM MgCl2, pH 6.4), and spheroplasts were lysed with 1% vol/vol Lipsol and fixed on slides using 4% wt/vol paraformaldehyde/3.4% wt/vol sucrose. Chromosome spread slides were incubated with the mouse monoclonal antibody S9.6 (1 μg/mL in blocking buffer of 5% BSA, 0.2% milk and 1× PBS). The slides were further incubated with a secondary Cy3-conjugated goat anti-mouse antibody (diluted 1:1000 in blocking buffer). For each replicate, at least 100 nuclei were visualized and manually counted to obtain the fraction with detectable DNA:RNA hybrids. Each mutant was assayed in triplicate. Mutants were compared to wild-type by Fisher's exact test. To correct for multiple hypothesis testing, we implemented a cutoff of p < 0.01 divided by the total number of mutants compared to wild-type, meaning that mutants with p < 0.00024 were considered significantly different from wild-type.
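The statistical comparison described above combines Fisher's exact test with a Bonferroni-style cutoff. The sketch below implements a two-sided Fisher's exact test from the hypergeometric distribution using only the standard library. The nuclei counts and the number of mutants (40) are hypothetical values chosen for illustration, since the exact counts are not given in this section, and the function names are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables (with fixed margins) that are
    no more likely than the observed one.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    denom = comb(total, row1)

    def prob(x):
        return comb(col1, x) * comb(total - col1, row1 - x) / denom

    p_obs = prob(a)
    lo = max(0, row1 - (total - col1))
    hi = min(row1, col1)
    p = sum(px for px in (prob(x) for x in range(lo, hi + 1))
            if px <= p_obs * (1 + 1e-7))  # small tolerance for float ties
    return min(1.0, p)


def bonferroni_cutoff(alpha, n_comparisons):
    """Per-test significance cutoff after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical counts: nuclei with / without S9.6 signal, pooled over replicates.
mutant_pos, mutant_neg = 60, 240
wt_pos, wt_neg = 15, 285
p_value = fisher_exact_two_sided(mutant_pos, mutant_neg, wt_pos, wt_neg)
cutoff = bonferroni_cutoff(0.01, 40)  # 40 mutants is an assumed number
is_hit = p_value < cutoff
```

Dividing the nominal cutoff by the number of mutants tested keeps the chance of any false positive across the whole screen at or below the nominal level.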
BPS sensitivity assay
Ten-fold serial dilutions of each strain were spotted on 90 μM BPS plates with FeSO4 concentrations of 0, 2.5, 20 or 100 μM and grown at 30 °C for 3 days.
The genomic distribution of DNA:RNA hybrids
DNA:RNA hybrids have previously been immunoprecipitated at specific genomic sites such as rDNA, selected endogenous loci, and reporter constructs. Subsequently, DRIP coupled with deep sequencing in human cells demonstrated the prevalence of R loops at CpG island promoters with high GC skew. To investigate the global profile of DNA:RNA hybrid-prone loci in a tractable model, we performed genome-wide DRIP-chip analysis of wild-type S. cerevisiae (ArrayExpress E-MTAB-2388) using the S9.6 monoclonal antibody, which specifically binds DNA:RNA hybrids. DRIP-chip profiles were generated in duplicate (Spearman's r = 0.78 when comparing each of over 2 million probes after normalization and data smoothing) and normalized to a no-antibody control.
DNA:RNA hybrids are significantly correlated with genes associated with antisense transcripts
Certain DNA:RNA hybrid-enriched regions identified by our DRIP-chip analysis, such as rDNA and retrotransposons, are associated with antisense transcripts. To test whether this association holds more generally, we compared our list of DNA:RNA hybrid-prone loci to a list of antisense-associated genes. Because the expression of antisense-associated transcripts may be highly dependent on environmental conditions, we based our analysis on a list of transcripts identified in S288c yeast grown to mid-log phase in rich media, which most closely mirrors the growth conditions of our cultures analyzed by DRIP-chip. DNA:RNA hybrid-enriched genes significantly overlapped with antisense-associated genes, suggesting that DNA:RNA hybrids may play a role in antisense transcript-mediated regulation of gene expression.
RNase H overexpression reduces detectable levels of DNA:RNA hybrids in cytological screens and suppresses genomic instability associated with R loop formation, presumably through the degradation of DNA:RNA hybrids. To test for a potential role of DNA:RNA hybrids in antisense-mediated gene regulation, we performed gene expression microarray analysis of an RNase H overexpression strain compared to an empty vector control (GEO GSE46652). This identified genes that had increased mRNA levels (upregulated, n = 212) or decreased mRNA levels (downregulated, n = 88) as a result of RNase H overexpression.
A significant portion of the genes with increased mRNA levels were antisense-associated (Fisher's exact test p = 2.9e-7) and tended to have high GC content, similar to DNA:RNA hybrid-enriched genes in wild-type. However, the genes with increased mRNA levels under RNase H overexpression and the antisense-associated genes enriched for DNA:RNA hybrids in our DRIP experiment both tended towards lower transcriptional frequencies. These findings suggest that antisense-associated DNA:RNA hybrids moderate the levels of gene expression. Indeed, genes that were both modulated by RNase H overexpression and enriched for DNA:RNA hybrids were all found to be antisense-associated. An alternative possibility is that the stress of RNase H overexpression triggers gene expression programs that coincidentally are antisense-regulated. Gene ontology (GO) terms enriched among genes whose expression was changed by RNase H overexpression were also analyzed.
Cytological profiling of RNA processing mutants for R loop formation
To gain a broader understanding of factors involved in R loop formation, we performed a cytological screen of RNA processing, transcription, and chromatin modification mutants for DNA:RNA hybrids using the S9.6 antibody. The screen detected hybrids in mutants affecting several pathways linked to DNA:RNA hybrid formation, such as transcription, nuclear export and the exosome. Consistent with findings in metazoan cells, hybrid formation was also observed in some splicing mutants. Several rRNA processing mutants were enriched for DNA:RNA hybrids (7 out of the 22 positive hits), likely due to DNA:RNA hybrid accumulation at rDNA genes, a sensitized hybrid formation site. It is possible that, as seen in mRNA cleavage and polyadenylation mutants, DNA:RNA hybrid formation contributes to the chromosome instability (CIN) phenotypes of these mutants. The success of this small-scale screen suggests that most RNA processing pathways suppress hybrid formation to some degree and that many DNA:RNA hybrid-forming mutants remain undiscovered.
DRIP-chip profiling of R loop forming mutants
To better understand the mechanism by which cells regulate DNA:RNA hybrids, DRIP-chip analysis of rnh1Δrnh201Δ, hpr1Δ, and sen1-1 mutants was performed to determine whether these factors contribute differentially to the DNA:RNA hybrid genomic profile. The rnh1Δrnh201Δ, hpr1Δ, and sen1-1 mutants are particularly interesting because the affected genes have well-established roles in the regulation of transcription-dependent DNA:RNA hybrid formation. DRIP-chip profiles revealed that, similar to wild-type profiles, the mutant profiles were enriched for DNA:RNA hybrids at rDNA, telomeres, and retrotransposons. The rnh1Δrnh201Δ, hpr1Δ, and sen1-1 mutants also exhibited DNA:RNA hybrid enrichment in 1206, 1490 and 1424 open reading frames (ORFs), respectively, compared to 1217 DNA:RNA hybrid-enriched ORFs identified in wild-type. Interestingly, in addition to the similarities described above, our profiles also identified differential effects of the mutants on the levels of DNA:RNA hybrids. In particular, we observed that deletion of HPR1 resulted in higher levels of DNA:RNA hybrids along the length of most ORFs, with a preference for longer genes compared to wild type. This observation is consistent with Hpr1's role in bridging transcription elongation to mRNA export and its localization to actively transcribed genes. In contrast, mutating SEN1 resulted in higher levels of DNA:RNA hybrids at shorter genes, which is consistent with Sen1's role in transcription termination, particularly for short protein-coding genes. The rnh1Δrnh201Δ mutant revealed higher levels of DNA:RNA hybrids at highly transcribed and longer genes, which is supported by a wealth of evidence for RNase H's role in suppressing R loops in long genes to prevent collisions between the transcription and replication machineries.
The analysis also revealed that the rnh1Δrnh201Δ and sen1-1 mutants, but not the hpr1Δ mutant, had increased DNA:RNA hybrids at tRNA genes (two-tailed unpaired Wilcoxon test p = 1.56e-19 in the rnh1Δrnh201Δ mutant and 1.68e-15 in the sen1-1 mutant), and this was confirmed by DRIP quantitative PCR (qPCR) of two tRNA genes in wild-type and rnh1Δrnh201Δ. Because tRNAs are transcribed by RNA polymerase III, this observation indicates that Hpr1 is primarily involved in the regulation of RNA polymerase II-specific DNA:RNA hybrids, while RNase H and Sen1 have roles across a wider range of transcripts. Mutation of SEN1 also led to increased levels of DNA:RNA hybrids at snoRNA genes (two-tailed unpaired Wilcoxon test p = 1.81e-6), consistent with its role in 3′ end processing of snoRNAs.