RF Silico computes partition function folding for a given set of RNAs using ViennaRNA, RNAstructure, or a combination of the two. The probability of each base being unpaired is then reported in an XML file, along with per-base Shannon entropies.
Regions with high Shannon entropies are likely to form alternative structures, whereas regions with low Shannon entropies correspond either to well-defined RNA structures or to persistently single-stranded regions (Siegfried et al., 2014).
Shannon entropy is calculated as:

$$H_i = -p_i \log_{10} p_i$$

where p_i is the probability of base i being base-paired.
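
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not RF Silico's own code: the function name `shannon_entropy` and the sample probabilities are made up, and treating the term as 0 when the probability is 0 is the usual convention rather than something the documentation states.

```python
import math

def shannon_entropy(p: float) -> float:
    """Per-base term H_i = -p_i * log10(p_i) for a pairing probability p_i."""
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"probability out of range: {p}")
    # Assumption: the term is taken as 0 at p = 0 (its limit); at p = 1 it is exactly 0.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log10(p)

# Illustrative pairing probabilities for four bases (not real data).
probabilities = [1.0, 0.5, 0.1, 0.0]
entropies = [shannon_entropy(p) for p in probabilities]
print(entropies)
```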

Usage

To list the required parameters, simply type:

$ rf-silico -h

| Parameter | Type | Description |
| --- | --- | --- |
| -f or --fasta | string | Path to a multi-FASTA file containing transcript sequences |
| -o or --output-dir | string | Output directory for writing probability data in XML format (Default: rf_silico/) |
| -ow or --overwrite | | Overwrites the output directory if it already exists |
| -p or --processors | int | Number of processors (threads) to use (Default: 1) |
| -t or --tmp-dir | string | Path to a directory for creating temporary files (Default: /tmp). Note: If the provided directory does not exist, it will be created |
| -m or --method | int | Partition function calculation method (1-3, Default: 1): 1. ViennaRNA; 2. RNAstructure; 3. Combined. Note: method #3 calculates base-pair probabilities using both ViennaRNA and RNAstructure, and produces an XML file containing the per-base average of the two algorithms |
| -e or --temperature | float | Temperature in degrees Celsius (Default: 37.0) |
| -md or --maximum-distance | int | Maximum pairing distance (in nt) between the transcript's residues (Default: 0 [no limit]) |
| -v or --viennarna | string | Path to ViennaRNA RNAfold executable (Default: assumes RNAfold is in PATH) |
| -pr or --partition | string | Path to RNAstructure partition executable (Default: assumes partition is in PATH) |
| -pp or --probability-plot | string | Path to RNAstructure ProbabilityPlot executable (Default: assumes ProbabilityPlot is in PATH) |
| -dp or --data-path | string | Path to RNAstructure data tables (Default: assumes DATAPATH environment variable is already set) |
| -w or --window-size | int | Window size (in nt) for base-pair probability calculation (≥3, Default: full transcript) |
| -wo or --window-offset | int | Offset for window sliding (≥1, Default: none) |
| -kb or --keep-bases | string | Bases to report in the XML file (Default: N [ACGT]). Note: This parameter accepts any IUPAC code, or a combination of codes (e.g. -kb M, or -kb AC). Any other base will be reported as NaN |
| -d or --decimals | int | Number of decimals for reporting base probabilities (1-10, Default: 3) |
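
As an illustration of how these parameters fit together, the sketch below assembles an rf-silico command line from Python. It only uses flags listed in the table; the helper name `build_rf_silico_command`, the input file name `transcripts.fa`, and the chosen values are hypothetical. The command is printed rather than executed, so the sketch does not require RF Silico to be installed.

```python
import shlex

def build_rf_silico_command(fasta, output_dir="rf_silico/", method=1,
                            processors=1, window=None, offset=None):
    """Assemble an rf-silico argument list (hypothetical helper, flags from the table)."""
    if method not in (1, 2, 3):
        raise ValueError("method must be 1 (ViennaRNA), 2 (RNAstructure) or 3 (Combined)")
    if window is not None and window < 3:
        raise ValueError("window size must be >= 3")
    if offset is not None and offset < 1:
        raise ValueError("window offset must be >= 1")
    cmd = ["rf-silico", "-f", fasta, "-o", output_dir,
           "-m", str(method), "-p", str(processors)]
    if window is not None:
        cmd += ["-w", str(window)]
    if offset is not None:
        cmd += ["-wo", str(offset)]
    return cmd

# Example: combined method, 4 threads, 600-nt windows sliding by 200 nt.
cmd = build_rf_silico_command("transcripts.fa", method=3, processors=4,
                              window=600, offset=200)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a system where rf-silico is in PATH.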

Note

When using methods #2 or #3, RF Silico uses RNAstructure partition-smp instead of partition, if possible, to speed up execution
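
For method #3, the parameter description states that the reported values are the per-base average of the two algorithms. A minimal sketch of that averaging is shown here, assuming two equal-length lists of per-base probabilities. The function name `combine_probabilities` and the input values are illustrative, and how RF Silico itself handles NaN positions is not documented here, so letting NaN propagate is an assumption.

```python
def combine_probabilities(vienna, rnastructure):
    """Per-base average of two probability lists (illustrative, not RF Silico's code)."""
    if len(vienna) != len(rnastructure):
        raise ValueError("both lists must cover the same number of bases")
    # Assumption: a NaN in either input simply propagates to the average.
    return [(a + b) / 2 for a, b in zip(vienna, rnastructure)]

# Made-up per-base probabilities from the two algorithms.
vienna_probs = [0.2, 0.8, 1.0]
rnastructure_probs = [0.4, 0.6, 0.5]
combined = combine_probabilities(vienna_probs, rnastructure_probs)
print(combined)
```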

Output XML files

RF Silico produces an XML file for each transcript being analyzed, with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<data [attributes]>
    <transcript id="Transcript ID" length="Transcript length">
        <sequence>
            Transcript sequence
        </sequence>
        <probability>
            Comma-separated list of probability values
        </probability>
    </transcript>
</data>
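
A file with this structure can be read with Python's standard library. The sketch below is one possible way to do it, not an official parser: the function name `parse_rf_silico_xml`, the sample transcript, and its values are made up, and it assumes masked bases appear as the literal string NaN in the probability list.

```python
import math
import xml.etree.ElementTree as ET

# Made-up sample following the documented structure (values are not real data).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<data tool="rf-silico" algorithm="ViennaRNA">
    <transcript id="example_tx" length="5">
        <sequence>
            ACGTA
        </sequence>
        <probability>
            0.912,0.100,NaN,0.455,0.003
        </probability>
    </transcript>
</data>
"""

def parse_rf_silico_xml(xml_bytes):
    """Return id, length, sequence and per-base probabilities of the transcript."""
    root = ET.fromstring(xml_bytes)
    transcript = root.find("transcript")
    # Remove the indentation and line breaks surrounding the sequence text.
    sequence = "".join(transcript.findtext("sequence").split())
    # float() accepts "NaN", so masked bases become float('nan').
    probabilities = [float(v.strip())
                     for v in transcript.findtext("probability").split(",")]
    if len(probabilities) != len(sequence):
        raise ValueError("sequence and probability list differ in length")
    return {
        "id": transcript.get("id"),
        "length": int(transcript.get("length")),
        "sequence": sequence,
        "probabilities": probabilities,
    }

record = parse_rf_silico_xml(SAMPLE)
print(record["id"], record["length"], record["probabilities"])
```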

The data tag’s attributes allow keeping track of the analysis performed:

| Attribute | Possible values | Description |
| --- | --- | --- |
| tool | rf-silico | The tool that generated this XML file |
| algorithm | ViennaRNA, RNAstructure, or Combined | Algorithm used for partition function calculation |
| keep | [ACGT] | Kept bases |
| win | Positive integer ≥ 3 | Window size (in nt) for partition function calculation |
| offset | Positive integer ≥ 1 | Offset for window sliding |
| maxdist | Integer ≥ 0 | Maximum distance (in nt) between base-paired residues |
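
These attributes can be read back to check how a given file was produced. The following is a small sketch under stated assumptions: the function name `read_analysis_attributes` and the sample attribute values are hypothetical, and because it is not stated whether win and offset are always present, the numeric attributes are converted only when they exist.

```python
import xml.etree.ElementTree as ET

# Made-up <data> element carrying the documented attributes.
SAMPLE_DATA = (b'<data tool="rf-silico" algorithm="Combined" keep="ACGT" '
               b'win="600" offset="200" maxdist="0">'
               b'<transcript id="example_tx" length="0"/></data>')

def read_analysis_attributes(xml_bytes):
    """Return the data tag's attributes, with numeric ones converted to int."""
    root = ET.fromstring(xml_bytes)
    attributes = dict(root.attrib)
    # Assumption: win, offset and maxdist may be absent, so convert only if present.
    for name in ("win", "offset", "maxdist"):
        if name in attributes:
            attributes[name] = int(attributes[name])
    return attributes

analysis = read_analysis_attributes(SAMPLE_DATA)
print(analysis)
```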