The RF ModCall module takes two RC files generated by the RF Count module, and performs transcriptome-wide single-base resolution calling of Ψ/2'-OMe residues from Ψ-seq/Pseudo-seq and 2OMe-seq experiments.

Information

For more details, please refer to Carlile et al., 2014 (Pseudo-seq, PMID: 25192136), Schwartz et al., 2014 (Ψ-seq, PMID: 25219674), and Incarnato et al., 2016 (2OMe-seq, PMID: 27614074).

For each position of a transcript, two measures are computed:

$$S_i = w \times \frac{n_{Ti} - n_{Ui}}{\sum_{j=i-\frac{w}{2}}^{i+\frac{w}{2}} (n_{Tj} + n_{Uj}) - n_{Ti} - n_{Ui}}$$

$$R_i = \frac{n_{Ti}}{c_{Ti}}$$
where Si and Ri are respectively the score and the ratio at position i of the transcript, w is the size (in nt) of a window centered on position i, nTi and nUi are respectively the number of RT-stops in the CMCT treated (or low dNTP) and CMCT untreated (or high dNTP) samples, and cTi is the read coverage at position i in the CMCT treated (or low dNTP) sample.
The score is a measure of the RT-stop enrichment in the CMCT treated (or low dNTP) sample at a given position, with respect to the surrounding bases, and to the CMCT untreated (or high dNTP) sample.
The ratio is a relative quantitation of the modification stoichiometry at a given position in the CMCT treated (or low dNTP) sample.
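
As an illustration, the score and ratio at a single position might be computed as in the minimal Python sketch below. It is not RF ModCall's actual code. It assumes that the window term sums the RT-stops of both samples over positions i-w/2 to i+w/2 and then subtracts the counts at position i, that w/2 is rounded down, that the window is truncated at the transcript ends, and that positions with low coverage or an empty window yield NaN. The function name `score_and_ratio` and its arguments are hypothetical, and library size scaling is left out.

```python
import math


def score_and_ratio(n_t, n_u, c_t, i, w, min_cov=10):
    """Return (score, ratio) at 0-based position i.

    n_t, n_u: per-position RT-stop counts in the treated / untreated sample
    c_t:      per-position read coverage in the treated sample
    w:        window size (in nt), centered on position i
    min_cov:  coverage threshold below which NaN is returned (assumption)
    """
    if c_t[i] < min_cov:
        return math.nan, math.nan

    # Window bounds, truncated at the transcript ends (assumption)
    half = w // 2
    start = max(0, i - half)
    end = min(len(n_t) - 1, i + half)

    # RT-stops of both samples in the window, excluding position i itself
    window = sum(n_t[j] + n_u[j] for j in range(start, end + 1))
    background = window - n_t[i] - n_u[i]

    # Guard against an empty window (assumption: report NaN)
    if background > 0:
        score = w * (n_t[i] - n_u[i]) / background
    else:
        score = math.nan

    ratio = n_t[i] / c_t[i]
    return score, ratio
```

With these assumptions, a position that carries many more RT-stops in the treated sample than both its neighbors and the untreated sample gets a high score, while the ratio depends only on the treated sample.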

Warning

When processing SAM/BAM files from Ψ-seq/Pseudo-seq or 2OMe-seq experiments with RF Count, avoid using the --no-mapped-count parameter; otherwise, RF ModCall will not be able to perform library size scaling.

Usage

To list the required parameters, simply type:

$ rf-modcall -h

| Parameter | Type | Description |
|---|---|---|
| -u or --untreated | string | Path to the RC file for the CMCT untreated (or high dNTP) sample |
| -t or --treated | string | Path to the RC file for the CMCT treated (or low dNTP) sample |
| -i or --index | string[,string] | A comma-separated (no spaces) list of RCI index files for the provided RC files. Note #1: RCI files must be provided in the order 1. Untreated, 2. Treated. Note #2: If a single RCI file is specified, it will be used for all RC files. Note #3: If no RCI index is provided, it will be generated at runtime and stored in the same folder as the untreated/treated samples |
| -p or --processors | int | Number of processors (threads) to use (Default: 1) |
| -o or --output-dir | string | Output directory for writing site scores and ratios in XML format (Default: <treated>_vs_<untreated>/) |
| -ow or --overwrite | | Overwrites the output directory if it already exists |
| -w or --window | int | Window size (in nt) for score calculation (≥3, Default: 150) |
| -ts or --to-smaller | | The larger sample will be scaled toward the smaller one (Default: scale smaller sample to the larger one) |
| -mc or --mean-coverage | float | Discards any transcript with mean coverage below this threshold (≥0, Default: 0) |
| -ec or --median-coverage | float | Discards any transcript with median coverage below this threshold (≥0, Default: 0) |
| -D or --decimals | int | Number of decimals for reporting scores/ratios (1-10, Default: 3) |
| -n or --nan | int | Transcript positions with read coverage below this threshold will be reported as NaN (>0, Default: 10) |


Output XML files

RF ModCall produces an XML file for each transcript being analyzed, with the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<data [attributes]>
    <transcript id="Transcript ID" length="Transcript length">
        <sequence>
            Transcript sequence
        </sequence>
        <score>
            Comma-separated list of scores
        </score>
        <ratio>
            Comma-separated list of ratios
        </ratio>
    </transcript>
</data>
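
A file with this structure can be read with Python's standard library, as in the sketch below. The function names `parse_values` and `read_modcall_xml` are hypothetical and not part of RNA Framework. The sketch assumes that masked positions are written as the literal NaN and that whitespace inside the tags carries no meaning.

```python
import xml.etree.ElementTree as ET


def parse_values(text):
    """Convert a comma-separated list of values (possibly NaN) to floats."""
    return [float(tok) for tok in text.strip().split(",") if tok.strip()]


def read_modcall_xml(source):
    """Read one output XML file (path or binary file object).

    Returns a dict with the attributes of the <data> tag and, for each
    transcript, its length, sequence, scores and ratios.
    """
    root = ET.parse(source).getroot()
    result = {"attributes": dict(root.attrib), "transcripts": {}}

    for tr in root.iter("transcript"):
        result["transcripts"][tr.get("id")] = {
            "length": int(tr.get("length")),
            # Drop line breaks and indentation around the sequence
            "sequence": "".join(tr.findtext("sequence", "").split()),
            "score": parse_values(tr.findtext("score", "")),
            "ratio": parse_values(tr.findtext("ratio", "")),
        }

    return result
```

Calling `read_modcall_xml` on one of the files in the output directory returns the scores and ratios as lists of floats, in which NaN entries mark masked positions.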

The data tag’s attributes allow keeping track of the analysis performed:

| Attribute | Possible values | Description |
|---|---|---|
| win | Positive integer ≥ 3 | Window size (in nt) for score calculation |
| tosmaller | TRUE/FALSE | Whether the larger dataset has been scaled to the size of the smaller one |