The RF Fold module is designed to allow transcriptome-wide reconstruction of RNA structures, starting from XML files generated using the RF Norm tool. This tool can process a single XML file or an entire directory of XML files, and produces the inferred secondary structures (in either dot-bracket notation or CT format) and their graphical representation (in either PostScript or SVG format).
Folding inference can be performed using 2 different algorithms:

1. ViennaRNA
2. RNAstructure

Prediction can be performed either on the whole transcript, or through a windowed approach (see next paragraph).

Windowed folding

The windowed folding approach is based on the original method described in Siegfried et al., 2014 (PMID: 25028896), and consists of 3 main steps, outlined below:

[Figure: RNA Framework pipeline]

In step I (optional), a window is slid along the RNA, and pseudoknotted structures are detected using the same approach employed by the ShapeKnots algorithm (Hajdin et al., 2013 (PMID: 23503844)). Our implementation of the ShapeKnots algorithm relies on the ViennaRNA package (instead of RNAstructure, as the original implementation did), and is therefore much faster:

[Figure: ShapeKnots/RNA Framework comparison]

Nonetheless, both algorithms run in a single thread. Alternatively, the multi-threaded implementation, ShapeKnots-smp, shipped with the latest RNAstructure version, can be used.
If constraints from structure probing experiments are provided, they are incorporated as soft-constraints. Predicted pseudoknotted base-pairs are retained if they appear in >50% of analyzed windows. When constraints are provided, pseudoknots are retained only if the average reactivity of the bases on both sides of the helices is below a given reactivity cutoff (set by -kc).
In step II, a window is slid along the RNA, and the partition function is calculated. If provided, soft-constraints are applied. If step I has been performed, pseudoknotted bases are hard-constrained to be single-stranded. Predicted base-pair probabilities are averaged across all windows in which they have appeared; base-pairs with >99% probability are retained, and hard-constrained to be paired in step III.
In step III, a window is slid along the RNA, and MFE folding is performed, including (where present) soft-constraints from probing data, and hard-constraints from steps I and II. Predicted base-pairs are retained if they appear in >50% of analyzed windows.
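
The window bookkeeping shared by the three steps can be sketched in Python as follows. This is only a minimal illustration, not RF Fold's actual code: all function names are hypothetical, the per-window predictions are assumed to come from an external folding program, the ">50% of analyzed windows" rule is assumed to count only windows that span both bases of a pair, and the increased sampling at the 5'/3'-ends is not reproduced.

```python
from collections import defaultdict


def window_starts(length, window=600, offset=200):
    """Yield 0-based, half-open (start, end) windows covering a transcript."""
    if length <= window:
        yield 0, length
        return
    start = 0
    while start + window < length:
        yield start, start + window
        start += offset
    # The last window is anchored to the 3' end so the tail is always covered.
    yield length - window, length


def average_pair_probabilities(per_window_probs):
    """Average each pair's probability over the windows in which it appeared.

    per_window_probs: iterable of dicts mapping (i, j) -> pairing probability.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for probs in per_window_probs:
        for pair, p in probs.items():
            totals[pair] += p
            counts[pair] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


def majority_pairs(windows, per_window_pairs, min_fraction=0.5):
    """Keep pairs predicted in more than min_fraction of the spanning windows.

    Assumption of this sketch: only windows containing both i and j count
    towards the denominator.
    """
    seen = defaultdict(int)
    for pairs in per_window_pairs:
        for pair in pairs:
            seen[pair] += 1
    kept = set()
    for (i, j), n in seen.items():
        spanning = sum(1 for s, e in windows if s <= i and j < e)
        if spanning and n / spanning > min_fraction:
            kept.add((i, j))
    return kept
```

With these helpers, the step II filter would amount to keeping the pairs whose averaged probability exceeds 0.99, while steps I and III would apply `majority_pairs` to the per-window predictions.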

Note

At all stages, increased sampling is performed at the 5'/3'-ends to avoid end biases.

At this stage, if step I has been performed, pseudoknotted base-pairs are added back to the structure, and the free energy is computed. Along with the predicted structure, the windowed method also produces a WIGGLE track file containing per-base Shannon entropies.
Regions with higher Shannon entropies are likely to form alternative structures, while those with low Shannon entropies correspond to regions with well-defined RNA structures, or persistent single-strandedness (Siegfried et al., 2014).
Shannon entropy is calculated as:

$$H_i = -p_i \log_{10} p_i$$

where $p_i$ is the probability of base $i$ being base-paired.
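
As a quick illustration of the formula above, the per-base value can be computed as in the following Python sketch. The function names are hypothetical, and setting the entropy to 0 when the pairing probability is 0 is a convention assumed here (the limit of p log p as p approaches 0), not something stated by the tool's documentation.

```python
import math


def shannon_entropy(p):
    """Per-base entropy H_i = -p_i * log10(p_i); taken as 0 when p_i is 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return 0.0 if p == 0.0 else -p * math.log10(p)


def entropy_track(pairing_probabilities):
    """Map a list of per-base pairing probabilities to per-base entropies."""
    return [shannon_entropy(p) for p in pairing_probabilities]
```

For instance, a base with a pairing probability of 0.1 gets an entropy of 0.1, while a base that is paired with certainty (probability 1) gets an entropy of 0.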

Usage

To list the required parameters, simply type:

$ rf-fold -h

| Parameter | Type | Description |
| --- | --- | --- |
| -o or --output-dir | string | Output directory for writing inferred structures (Default: structurome/) |
| -ow or --overwrite | | Overwrites the output directory if it already exists |
| -ct or --connectivity-table | | Writes predicted structures in CT format (Default: Dot-bracket notation) |
| -m or --folding-method | int | Folding method (1-2, Default: 1): 1. ViennaRNA; 2. RNAstructure |
| -p or --processors | int | Number of processors (threads) to use (Default: 1) |
| -g or --img | | Enables generation of structure representations (Default: PostScript format) |
| -s or --svg | | Structure representations are generated in SVG format (requires -g) |
| -t or --temperature | float | Temperature in degrees Celsius (Default: 37.0) |
| -sl or --slope | float | Sets the slope used with structure probing data restraints (Default: 1.8 [kcal/mol]) |
| -in or --intercept | float | Sets the intercept used with structure probing data restraints (Default: -0.6 [kcal/mol]) |
| -md or --maximum-distance | int | Maximum pairing distance (in nt) between transcript's residues (Default: 0 [no limit]) |
| -i or --ignore-reactivity | | Ignores XML reactivity data when performing folding (MFE unconstrained prediction) |
| -hc or --hard-constraint | | Besides performing soft-constraint folding, allows specifying a reactivity cutoff (specified by -f) for hard-constraining a base to be single-stranded |
| -f or --cutoff | float | Reactivity cutoff for constraining a position as unpaired (>0, Default: 0.7) |
| -w or --windowed | | Enables windowed folding |
| Folding method #1 options (ViennaRNA) | | |
| -vrf or --vienna-rnafold | string | Path to ViennaRNA RNAfold executable (Default: assumes RNAfold is in PATH) |
| -nlp or --no-lonelypairs | | Disallows lonely base-pairs (1 bp helices) inside predicted structures |
| -ngu or --no-closing-gu | | Disallows G:U wobbles at the end of helices |
| -cm or --constraint-method | int | Method for converting provided reactivities into pseudo-energies (1-2, Default: 1): 1. Deigan et al., 2009; 2. Zarringhalam et al., 2012 |
| Zarringhalam et al., 2012 method options | | |
| -cc or --constraint-conversion | int | Method for converting rf-norm reactivities into pairing probabilities (1-5, Default: 1): 1. Skip normalization step (reactivities are treated as pairing probabilities); 2. Linear mapping according to Zarringhalam et al., 2012; 3. Use a cutoff to divide nucleotides into paired and unpaired; 4. Linear model for converting reactivities into probabilities of being unpaired; 5. Linear model for converting the logarithm of reactivities into probabilities of being unpaired |
| -bf or --beta-factor | float | Sets the magnitude of penalties for deviations from the observed pairing probabilities (Default: 0.5) |
| -ms or --model-slope | float | Sets the slope used by the linear model (Default: 0.68 [Method #4], or 1.6 [Method #5]; requires -cc 4 or -cc 5) |
| -mi or --model-intercept | float | Sets the intercept used by the linear model (Default: 0.2 [Method #4], or -2.29 [Method #5]; requires -cc 4 or -cc 5) |
| Folding method #2 options (RNAstructure) | | |
| -rs or --rnastructure | string | Path to RNAstructure Fold executable (Default: assumes Fold is in PATH). Note: by default, Fold-smp will be used (if available) |
| -d or --data-path | string | Path to RNAstructure data tables (Default: assumes DATAPATH environment variable is already set) |
| Windowed folding options | | |
| -pt or --partition | string | Path to RNAstructure partition executable (Default: assumes partition is in PATH). Note: by default, partition-smp will be used (if available) |
| -pp or --probabilityplot | string | Path to RNAstructure ProbabilityPlot executable (Default: assumes ProbabilityPlot is in PATH) |
| -fw or --fold-window | int | Window size (in nt) for performing MFE folding (>=50, Default: 600) |
| -fo or --fold-offset | int | Offset (in nt) for MFE folding window sliding (Default: 200) |
| -pw or --partition-window | int | Window size (in nt) for performing partition function (>=50, Default: 600) |
| -po or --partition-offset | int | Offset (in nt) for partition function window sliding (Default: 200) |
| -wt or --window-trim | int | Number of bases to trim from both ends of the partition windows to avoid end biases (Default: 100) |
| -dp or --dotplot | | Enables generation of structure dot-plots only for base-pairs present in the final structure |
| -dpa or --dotplot-all | | Enables generation of structure dot-plots for any possible base-pair |
| -sh or --shannon-entropy | | Enables generation of a WIGGLE track file with per-base Shannon entropies |
| -pk or --pseudoknots | | Enables detection of pseudoknots (computationally intensive) |
| -kp1 or --pseudoknot-penality1 | float | Pseudoknot penalty P1 (Default: 0.35) |
| -kp2 or --pseudoknot-penality2 | float | Pseudoknot penalty P2 (Default: 0.65) |
| -kw or --pseudoknot-window | int | Window size (in nt) for performing pseudoknot detection (>=50, Default: 600) |
| -ko or --pseudoknot-offset | int | Offset (in nt) for pseudoknot detection window sliding (Default: 200) |
| -kc or --pseudoknot-cutoff | float | Reactivity cutoff for retaining a pseudoknotted helix (0-1, Default: 0.5) |
| -km or --pseudoknot-method | int | Algorithm for pseudoknot prediction (1-2, Default: 1): 1. RNA Framework; 2. ShapeKnots |
| RNA Framework pseudoknot detection algorithm options | | |
| -vrs or --vienna-rnasubopt | string | Path to ViennaRNA RNAsubopt executable (Default: assumes RNAsubopt is in PATH) |
| -ks or --pseudoknot-suboptimal | int | Number of suboptimal structures to evaluate for pseudoknot prediction (>0, Default: 1000) |
| -kh or --pseudoknot-helices | int | Number of candidate pseudoknotted helices to evaluate (>0, Default: 100) |
| ShapeKnots pseudoknot detection algorithm options | | |
| -sk or --shapeknots | string | Path to ShapeKnots executable (Default: assumes ShapeKnots is in PATH). Note: by default, ShapeKnots-smp will be used (if available) |

Information

For additional details on the ViennaRNA soft-constraint prediction methods, please refer to the ViennaRNA documentation, or to Lorenz et al., 2016 (PMID: 26353838).
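
To give an idea of how the slope (-sl) and intercept (-in) parameters act under the Deigan et al., 2009 method, the per-base pseudo-energy has the form slope * ln(reactivity + 1) + intercept. The Python sketch below only illustrates this formula with the default values listed above; the function name is hypothetical, the actual computation is carried out by the folding software, and clamping negative reactivities to 0 is an assumption made here to keep the logarithm defined.

```python
import math


def deigan_pseudo_energy(reactivity, slope=1.8, intercept=-0.6):
    """Pseudo-energy (kcal/mol) = slope * ln(reactivity + 1) + intercept."""
    # Assumption of this sketch: negative reactivities are treated as 0.
    clamped = max(reactivity, 0.0)
    return slope * math.log(clamped + 1.0) + intercept
```

With the defaults, an unreactive base (reactivity 0) receives a bonus of -0.6 kcal/mol when paired, while highly reactive bases receive an increasingly positive penalty.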

Information

For additional details on the ShapeKnots pseudoknot detection parameters, please refer to Hajdin et al., 2013 (PMID: 23503844).


Output dot-plot files

When options -dp or -dpa are provided, RF Fold produces a dot-plot file for each transcript being analyzed, with the following structure:

1549                                   # RNA's length
i       j       -log10(Probability)    # Header 
8       254     0.459355416499312
9       253     0.446335563943221
10      252     0.456738523239413
11      251     0.454733421725068
12      250     0.46965667808714
13      249     0.47837140333524
21      35      0.268192200569539
22      34      0.0183400615262171
23      33      0.0166665677814708
24      32      0.0128927546134575
25      31      0.0148601207296645
26      30      0.0252017532628297

-- cut --

1497    1510    0.0147874890078331
1498    1509    0.0102803152157546
1499    1508    0.0137510190884233
1500    1507    0.0402352346970943

where i and j are the positions (1-based) of the bases involved in a given base-pair, followed by the -log10 of their base-pairing probability.
These files can be viewed using the Integrative Genomics Viewer (IGV) (for additional details, please refer to the Broad Institute's official IGV page).
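
For downstream analyses, a file with this layout can be read back with a few lines of Python. The following is a minimal sketch based solely on the example shown above: the function name is hypothetical, the columns are assumed to be whitespace-separated, and anything after a # is treated as a comment.

```python
def parse_dotplot(text):
    """Parse dot-plot text into (rna_length, [(i, j, probability), ...])."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # First line: RNA length (anything after '#' is ignored).
    length = int(lines[0].split("#")[0])
    pairs = []
    for line in lines[2:]:  # lines[1] is the column header
        fields = line.split("#")[0].split()
        # Skip anything that is not an "i j value" data row.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        i, j, neg_log_p = int(fields[0]), int(fields[1]), float(fields[2])
        # The third column stores -log10(p), so p = 10 ** (-value).
        pairs.append((i, j, 10.0 ** -neg_log_p))
    return length, pairs
```

Calling `parse_dotplot(open(path).read())` on such a file would return the RNA length together with the list of base-pairs and their probabilities converted back from the -log10 scale.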