The RF ModCall module takes two RC files generated by the RF Count module, and performs transcriptome-wide single-base resolution calling of Ψ/2'-OMe residues from Ψ-seq/Pseudo-seq and 2OMe-seq experiments.
For each position i of a transcript, two measures are computed: a score Si and a ratio Ri.
In their definitions, w is the size (in nt) of a window centered on position i, nTi and nUi are respectively the number of RT-stops at position i in the CMCT treated (or low dNTP) and CMCT untreated (or high dNTP) samples, and cTi is the read coverage at position i in the CMCT treated (or low dNTP) sample.
The score is a measure of the RT-stop enrichment in the CMCT treated (or low dNTP) sample at a given position, with respect to the surrounding bases, and to the CMCT untreated (or high dNTP) sample.
The ratio is a relative quantitation of the modification stoichiometry at a given position in the CMCT treated (or low dNTP) sample.
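One plausible form of the two measures, consistent with the quantities defined above and modeled on window-normalized peak scores of the kind used for Pseudo-seq, is sketched below. It is an assumption, not a formula taken from the RF ModCall source. The symbol W_i is introduced here for the set of positions in the window of w nt centered on i, and the superscripts T and U stand for the treated (or low dNTP) and untreated (or high dNTP) samples.

```latex
S_i = w \cdot \frac{n^{T}_{i} - n^{U}_{i}}{\sum_{j \in W_i} \left( n^{T}_{j} + n^{U}_{j} \right)}
\qquad
R_i = \frac{n^{T}_{i}}{c^{T}_{i}}
```

Under this assumed form, the ratio is the fraction of treated-sample reads covering position i that terminate there, which is why it can be read as an estimate of stoichiometry.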
When processing SAM/BAM files from Ψ-seq/Pseudo-seq or 2OMe-seq experiments with RF Count, do not use the --no-mapped-count parameter; otherwise, RF ModCall will not be able to perform library size scaling.
To list the required parameters, simply type:
$ rf-modcall -h
|Parameter|Type|Description|
|---|---|---|
|-u or --untreated|string|Path to the RC file for the CMCT untreated (or high dNTP) sample|
|-t or --treated|string|Path to the RC file for the CMCT treated (or low dNTP) sample|
|-i or --index|string[,string]|A comma-separated (no spaces) list of RCI index files for the provided RC files. Note #1: RCI files must be provided in the order 1. Untreated, 2. Treated. Note #2: If a single RCI file is specified, it will be used for all RC files. Note #3: If no RCI index is provided, it will be generated at runtime and stored in the same folder as the untreated/treated samples|
|-p or --processors|int|Number of processors (threads) to use (Default: 1)|
|-o or --output-dir|string|Output directory for writing site scores and ratios in XML format (Default: <treated>_vs_<untreated>/)|
|-ow or --overwrite||Overwrites the output directory if it already exists|
|-w or --window|int|Window size (in nt) for score calculation (≥3, Default: 150)|
|-ts or --to-smaller||The larger sample will be scaled toward the smaller one (Default: scale smaller sample to the larger one)|
|-mc or --mean-coverage|float|Discards any transcript with mean coverage below this threshold (≥0, Default: 0)|
|-ec or --median-coverage|float|Discards any transcript with median coverage below this threshold (≥0, Default: 0)|
|-D or --decimals|int|Number of decimals for reporting scores/ratios (1-10, Default: 3)|
|-n or --nan|int|Transcript positions with read coverage below this threshold will be reported as NaN in the reactivity profile (>0, Default: 10)|
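As an illustration of how the parameters above combine, the following sketch assembles a typical invocation. The file names `untreated.rc` and `treated.rc` and the output directory `modcall_out/` are hypothetical placeholders; only flags listed in the table are used. The `-i` option is left out, so the RCI index would be generated at runtime as described in Note #3. The command is printed first and is executed only if `rf-modcall` is on the PATH and both input files exist.

```shell
# Hypothetical inputs: replace with the RC files written by RF Count.
untreated="untreated.rc"
treated="treated.rc"
out_dir="modcall_out/"

# Assemble the command: 4 threads, 150 nt window (the default),
# overwrite the output directory if it already exists.
set -- rf-modcall -u "$untreated" -t "$treated" -p 4 -w 150 -o "$out_dir" -ow

# Show the command so it can be inspected before anything runs.
echo "Command: $*"

# Run only when the tool and both inputs are actually available.
if command -v rf-modcall >/dev/null 2>&1 && [ -f "$untreated" ] && [ -f "$treated" ]; then
    "$@"
else
    echo "rf-modcall or input files not found; nothing was run" >&2
fi
```

Adding `-ts` to the argument list would scale the larger sample toward the smaller one instead of the default direction.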
Output XML files
RF ModCall produces an XML file for each transcript being analyzed, with the following structure:
<?xml version="1.0" encoding="UTF-8"?>
<data [attributes]>
    <transcript id="Transcript ID" length="Transcript length">
        <sequence>
            Transcript sequence
        </sequence>
        <score>
            Comma-separated list of scores
        </score>
        <ratio>
            Comma-separated list of ratios
        </ratio>
    </transcript>
</data>
The data tag’s attributes allow keeping track of the analysis performed:
|Attribute|Possible values|Description|
|---|---|---|
|win|Positive integer ≥ 3|Window size (in nt) for score calculation|
|tosmaller|TRUE/FALSE|Whether the larger dataset has been scaled to the size of the smaller one|
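Because the scores are stored as a comma-separated list inside the `<score>` tag, candidate sites can be pulled out of an output file with standard text tools. The sketch below is one way to do it, not part of RNA Framework: the function name `modcall_sites` and the idea of a score cutoff are assumptions made for illustration, and the function relies on each file containing a single `<score>` element, as in the structure shown above. It prints the 1-based position and score of every entry at or above the cutoff, skipping NaN entries.

```shell
# Hypothetical helper (not part of RNA Framework).
# Usage: modcall_sites <file.xml> <cutoff>
# Prints "position<TAB>score" for each position whose score is >= cutoff.
modcall_sites() {
    xml_file=$1
    cutoff=$2
    # Join the file into a single line so the <score> element can be matched
    # even when its content spans several lines.
    { tr -d '\n' < "$xml_file"; echo; } |
        # Keep only the text between <score> and </score>.
        sed -n 's:.*<score>\(.*\)</score>.*:\1:p' |
        awk -v cutoff="$cutoff" '{
            n = split($0, s, ",")
            for (i = 1; i <= n; i++) {
                gsub(/[ \t\r]/, "", s[i])                    # trim whitespace
                if (s[i] == "" || tolower(s[i]) == "nan") continue
                if (s[i] + 0 >= cutoff + 0) printf "%d\t%s\n", i, s[i]
            }
        }'
}
```

For example, `modcall_sites transcript.xml 2` would list the positions of a (hypothetical) `transcript.xml` whose score is at least 2. The same pattern applies to the `<ratio>` list by changing the tag name in the `sed` expression.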