Performance of JAFFA and four other tools on cancer RNA-Seq. (A) A ROC-style curve for the ranking of candidate fusions in the Edgren dataset. The Edgren dataset consists of between 7 and 21 million 50 bp read pairs from the BT-474, SK-BR-3, KPL-4 and MCF-7 cell lines. The number of true positive fusions is plotted against the number of other reported fusions, taken from a ranked list of fusion candidates. Probable true positives (see text for detail) are removed. Higher curves indicate a better ranking of the true positives. For each fusion detection tool, we ranked the candidates using the tool's own scoring system or, if absent, the supporting data that maximised the area under the curve. SOAPfuse ranked true positives higher than other tools, followed by FusionCatcher and JAFFA. (B) On long-read data (the ENCODE dataset, consisting of 20 million 100 bp read pairs of the MCF-7 cell line), JAFFA ranks true positives higher than any other tool. (C) JAFFA’s sensitivity is confirmed on a second long-read dataset: 13 glioma samples with read depths in the range of 15 to 35 million 100 bp read pairs. JAFFA identifies 30 of the 31 true positives (the total number of true positives is indicated by the dashed line). Downsampling the data to mimic smaller read depths indicates that JAFFA has similar sensitivity with 2 million read pairs per sample as other tools with 10 million read pairs per sample.
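The construction of the ROC-style curves in panels (A) and (B) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used to produce the figure: the function name `roc_style_curve`, its arguments and the fusion labels are hypothetical, and candidates are assumed to be supplied already ranked from best to worst.

```python
from typing import Iterable, List, Set, Tuple


def roc_style_curve(
    ranked: Iterable[str],
    true_positives: Set[str],
    probable_true_positives: Set[str] = frozenset(),
) -> List[Tuple[int, int]]:
    """Return (other reported fusions, true positives) after each ranked call.

    Hypothetical helper: walks down the ranked candidate list, counting
    true positives on the y-axis and all other reported fusions on the
    x-axis. Probable true positives are dropped and contribute to neither.
    """
    points = [(0, 0)]
    n_true = 0
    n_other = 0
    for fusion in ranked:
        if fusion in true_positives:
            n_true += 1
        elif fusion in probable_true_positives:
            continue  # removed from the curve, as described in the caption
        else:
            n_other += 1
        points.append((n_other, n_true))
    return points


# Toy example with made-up fusion labels (not real data).
ranked_calls = ["A", "B", "C", "D", "E"]
curve = roc_style_curve(ranked_calls, {"A", "C"}, {"B"})
print(curve)
```

A curve that rises steeply before moving right corresponds to a tool that places the true positives near the top of its ranked list.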