{"id":1261,"date":"2020-12-14T15:50:30","date_gmt":"2020-12-14T15:50:30","guid":{"rendered":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/?p=1261"},"modified":"2025-04-28T09:25:39","modified_gmt":"2025-04-28T09:25:39","slug":"circall","status":"publish","type":"post","link":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/","title":{"rendered":"circall"},"content":{"rendered":"<h1>A fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data<\/h1>\n<h1>Contents<\/h1>\n<p style=\"padding-left: 30px\"><a href=\"#user-content-1-introduction\">1. Introduction<\/a><br \/>\n<a href=\"#user-content-2-download-and-installation\">2. Download and installation<\/a><br \/>\n<a href=\"#user-content-3-prepare-bsj-reference-database-and-sqlite-annotation-files\">3. Prepare BSJ reference database and annotation files<\/a><br \/>\n<a href=\"#user-content-4-indexing-transcriptome-and-bsj-reference-database\">4. Indexing transcriptome and BSJ reference database<\/a><br \/>\n<a href=\"#user-content-5-run-circall-pipeline\">5. Run Circall pipeline<\/a><br \/>\n<a href=\"#user-content-6-a-practical-copy-paste-example-of-running-circall\">6. A practical copy-paste example of running Circall<\/a><br \/>\n<a href=\"#user-content-7-circall-simulator\">7. 
Circall simulator<\/a><\/p>\n<h2><a id=\"user-content-update-news\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#update-news\" aria-hidden=\"true\"><\/a>Update news<\/h2>\n<p><strong>28 April 2025: <a href=\"https:\/\/drive.google.com\/file\/d\/1sYvjPzLzBo7wR_GiaQObzRnATNxkM67k\/view?usp=drive_link\">version 1.1.0 (click to download)<\/a><\/strong><\/p>\n<ul>\n<li>Supports non-exonic circular RNAs provided by the circBase database.<\/li>\n<\/ul>\n<p><strong>11 July 2022: <a href=\"https:\/\/github.com\/datngu\/Circall\/archive\/refs\/tags\/v1.0.0.tar.gz\">version 1.0.1 (click to download)<\/a><\/strong><\/p>\n<ul>\n<li>Some changes in C++ binary (rebuild the software).<\/li>\n<\/ul>\n<h4><a id=\"user-content-30-may-2020-version-000\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#30-may-2020-version-000\" aria-hidden=\"true\"><\/a>19 June 2020: <a href=\"https:\/\/github.com\/datngu\/Circall\/archive\/refs\/tags\/v0.1.0.tar.gz\">version 0.1.0 (click to download)<\/a><\/h4>\n<ul>\n<li>First submission<\/li>\n<\/ul>\n<h2><a id=\"user-content-1-introduction\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#1-introduction\" aria-hidden=\"true\"><\/a>1. Introduction<\/h2>\n<p>Circall is a novel method for fast and accurate discovery of circular RNAs from paired-end RNA-sequencing data. The method controls false positives with a two-dimensional local false discovery rate method and employs quasi-mapping for fast and accurate alignment. The details of Circall are described in its manuscript. 
On this page, we present the Circall tool and explain how to use it.<\/p>\n<h3><a id=\"user-content-software-requirements\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#software-requirements\" aria-hidden=\"true\"><\/a>Software requirements:<\/h3>\n<p>Circall is implemented in R and C++. We acknowledge the use of materials from Sailfish, RapMap, and other tools in this software.<\/p>\n<ul>\n<li>A C++11-compliant compiler (GCC, g++ &gt;= 4.8.2)<\/li>\n<li>R version 3.6.0 or later with the following packages installed: GenomicFeatures, Biostrings, foreach, and doParallel.<\/li>\n<\/ul>\n<h3><a id=\"user-content-annotation-reference\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#annotation-reference\" aria-hidden=\"true\"><\/a>Annotation reference<\/h3>\n<p>Circall requires<\/p>\n<ol>\n<li>a fasta file of transcript sequences and a gtf file of transcript annotation: can be downloaded from public repositories such as Ensembl (ensembl.org)<\/li>\n<li>a fasta file of genome sequences: can be downloaded from public repositories such as Ensembl (ensembl.org)<\/li>\n<li>an RData file of supporting annotation: A description of how to create the RData file for new annotation versions or species is available in the following section.<\/li>\n<\/ol>\n<p>The current Circall version was tested on the human genome and transcriptome with Ensembl annotation version GRCh37.75. 
Specifically, the following files are required:<\/p>\n<ul>\n<li>Sequences of genome (ensembl website)\u00a0<a href=\"http:\/\/ftp.ensembl.org\/pub\/release-75\/fasta\/homo_sapiens\/dna\/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz\" rel=\"nofollow\">GRCh37.75 genome fasta<\/a><\/li>\n<li>Sequences of transcripts (ensembl website)\u00a0<a href=\"http:\/\/ftp.ensembl.org\/pub\/release-75\/fasta\/homo_sapiens\/cdna\/Homo_sapiens.GRCh37.75.cdna.all.fa.gz\" rel=\"nofollow\">GRCh37.75 cdna fasta<\/a><\/li>\n<li>Gtf annotation of transcripts (ensembl website)\u00a0<a href=\"http:\/\/ftp.ensembl.org\/pub\/release-75\/gtf\/homo_sapiens\/Homo_sapiens.GRCh37.75.gtf.gz\" rel=\"nofollow\">GRCh37.75 gtf annotation<\/a><\/li>\n<\/ul>\n<h3><a id=\"user-content-versions\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#versions\" aria-hidden=\"true\"><\/a>Versions<\/h3>\n<p>The latest version and information of Circall is updated at:\u00a0<a href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\">https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/<\/a><\/p>\n<h2><a id=\"user-content-2-download-and-installation\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#2-download-and-installation\" aria-hidden=\"true\"><\/a>2. 
Download and installation<\/h2>\n<p><strong>If you use the binary version of Circall:<\/strong><\/p>\n<ul>\n<li>Download the latest binary version from the Circall website<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre>wget --no-check-certificate -O Circall_v0.1.0_linux_x86-64.tar.gz https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/2021\/04\/Circall_v0.1.0_linux_x86-64.tar_.gz<\/pre>\n<\/div>\n<ul>\n<li>Uncompress the archive<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre>tar -xzvf Circall_v0.1.0_linux_x86-64.tar.gz<\/pre>\n<\/div>\n<ul>\n<li>Move to the\u00a0<em>Circall_home<\/em>\u00a0directory and configure Circall<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-c1\">cd<\/span> Circall_v0.1.0_linux_x86-64\nbash config.sh\n<span class=\"pl-c1\">cd<\/span> ..<\/pre>\n<\/div>\n<ul>\n<li>Add the paths of the lib and bin folders to LD_LIBRARY_PATH and PATH<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-k\">export<\/span> LD_LIBRARY_PATH=\/path\/to\/Circall_v0.1.0_linux_x86-64\/linux\/lib:<span class=\"pl-smi\">$LD_LIBRARY_PATH<\/span>\n<span class=\"pl-k\">export<\/span> PATH=\/path\/to\/Circall_v0.1.0_linux_x86-64\/linux\/bin:<span class=\"pl-smi\">$PATH<\/span><\/pre>\n<\/div>\n<ul>\n<li><em>Do not forget to replace &#8220;\/path\/to\/&#8221; with your local path, or use these commands, which take the path from the current directory:<\/em><\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-k\">export<\/span> LD_LIBRARY_PATH=<span class=\"pl-smi\">$PWD<\/span>\/Circall_v0.1.0_linux_x86-64\/linux\/lib:<span class=\"pl-smi\">$LD_LIBRARY_PATH<\/span>\n<span class=\"pl-k\">export<\/span> PATH=<span class=\"pl-smi\">$PWD<\/span>\/Circall_v0.1.0_linux_x86-64\/linux\/bin:<span class=\"pl-smi\">$PATH<\/span><\/pre>\n<\/div>\n<p><strong>If you want to build Circall from 
sources:<\/strong><\/p>\n<ul>\n<li>Download Circall from the <a href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/\">Circall website<\/a> and move to the <em>Circall_home<\/em>\u00a0directory<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre>wget --no-check-certificate -O Circall_v0.1.0.tar.gz https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/2021\/04\/Circall_v0.1.0.tar_.gz\ntar -xzvf Circall_v0.1.0.tar.gz\n<span class=\"pl-c1\">cd<\/span> Circall_v0.1.0\nbash config.sh<\/pre>\n<\/div>\n<ul>\n<li>Circall requires the build flags used by Sailfish, including DFETCH_BOOST, DBOOST_ROOT, DTBB_INSTALL_DIR and DCMAKE_INSTALL_PREFIX. Please refer to the Sailfish website for more details on these flags.<\/li>\n<li>Install Circall with the following command:<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre>DBOOST_ROOT=\/path\/to\/boostDir\/ DTBB_INSTALL_DIR=\/path\/to\/tbbDir\/ DCMAKE_INSTALL_PREFIX=\/path\/to\/Circall_home bash install.sh<\/pre>\n<\/div>\n<p>After the installation is finished, remember to add the paths of the lib and bin folders to LD_LIBRARY_PATH and PATH<\/p>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-k\">export<\/span> LD_LIBRARY_PATH=\/path\/to\/Circall_home\/lib:<span class=\"pl-smi\">$LD_LIBRARY_PATH<\/span>\n<span class=\"pl-k\">export<\/span> PATH=\/path\/to\/Circall_home\/bin:<span class=\"pl-smi\">$PATH<\/span><\/pre>\n<\/div>\n<p><strong>Install Circall from sources in Ubuntu<\/strong><\/p>\n<div class=\"highlight highlight-source-shell\">\n<pre>##########################\n### These commands can be copied and pasted line by line to install Circall from source\n### They have been successfully tested in Ubuntu 16, 19 and 20.\n\n##########################\n### download Circall\nwget --no-check-certificate -O Circall_v0.1.0.tar.gz 
https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/2021\/04\/Circall_v0.1.0.tar_.gz\ntar -xzvf Circall_v0.1.0.tar.gz\ncd Circall_v0.1.0\n\n#config to run Circall\nbash config.sh\n\n### install boost_1_58_0\nwget http:\/\/sourceforge.net\/projects\/boost\/files\/boost\/1.58.0\/boost_1_58_0.tar.gz\ntar -xvzf boost_1_58_0.tar.gz\ncd boost_1_58_0\n\nsudo apt-get update\nsudo apt-get install build-essential g++ python-dev autotools-dev libicu-dev build-essential libbz2-dev libboost-all-dev\nsudo apt-get install aptitude\naptitude search boost\n\n.\/bootstrap.sh --prefix=boost_1_58_0_build\n.\/b2\n.\/b2 install\n\n#The Boost C++ Libraries were successfully built!\n#add the lib and folder to paths\nexport LD_LIBRARY_PATH=$PWD\/boost_1_58_0_build\/stage\/lib:$LD_LIBRARY_PATH\nexport PATH=$PWD\/boost_1_58_0_build:$PATH\n\n\n### install tbb44_20160526oss\ncd ..\nwget https:\/\/www.threadingbuildingblocks.org\/sites\/default\/files\/software_releases\/source\/tbb44_20160526oss_src_0.tgz\ntar xvf tbb44_20160526oss_src_0.tgz\nsudo apt-get install libtbb-dev\n\n### install cmake for ubuntu: cmake 3.5.1\nsudo apt install cmake\n### install curl\nsudo apt install curl\n### install autoconf\nsudo apt-get install autoconf\n### install zlib\nsudo apt install zlib1g-dev\nsudo apt install zlib1g\n### update all installations\nsudo apt-get update\n\n### install Circall\nDBOOST_ROOT=$PWD\/boost_1_58_0\/boost_1_58_0_build\/ DTBB_INSTALL_DIR=$PWD\/tbb44_20160526oss\/ DCMAKE_INSTALL_PREFIX=Circall_0.1.0_build bash install.sh\n\n#The Circall_0.1.0 was successfully built!\n###########\n\n#add lib and bin folders to paths\nexport LD_LIBRARY_PATH=$PWD\/Circall_0.1.0_build\/lib:$LD_LIBRARY_PATH\nexport PATH=$PWD\/Circall_0.1.0_build\/bin:$PATH\n\n#done\n###########<\/pre>\n<\/div>\n<h2><a id=\"user-content-3-prepare-bsj-reference-database-and-sqlite-annotation-files\" class=\"anchor\" 
href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#3-prepare-bsj-reference-database-and-sqlite-annotation-file\" aria-hidden=\"true\"><\/a>3. Prepare BSJ reference database and annotation files<\/h2>\n<h3><a id=\"user-content-download-genome-fasta-transcript-fasta-and-gtf-annotation-files\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#download-genome-fasta-transcript-fasta-and-gtf-annotation-files\" aria-hidden=\"true\"><\/a>Download genome fasta, transcript fasta and gtf annotation files.<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>wget http:\/\/ftp.ensembl.org\/pub\/release-75\/fasta\/homo_sapiens\/dna\/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz\ngunzip Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz\nwget http:\/\/ftp.ensembl.org\/pub\/release-75\/fasta\/homo_sapiens\/cdna\/Homo_sapiens.GRCh37.75.cdna.all.fa.gz\ngunzip Homo_sapiens.GRCh37.75.cdna.all.fa.gz\nwget http:\/\/ftp.ensembl.org\/pub\/release-75\/gtf\/homo_sapiens\/Homo_sapiens.GRCh37.75.gtf.gz\ngunzip Homo_sapiens.GRCh37.75.gtf.gz<\/pre>\n<\/div>\n<h3><a id=\"user-content-create-sqlite\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#create-sqlite\" aria-hidden=\"true\"><\/a>Create sqlite<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>Rscript Circall_v0.1.0_linux_x86-64\/R\/createSqlite.R Homo_sapiens.GRCh37.75.gtf Homo_sapiens.GRCh37.75.sqlite<\/pre>\n<\/div>\n<h3><a id=\"user-content-create-bsj-reference-database\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#create-bsj-reference-database\" aria-hidden=\"true\"><\/a>Create BSJ reference database<\/h3>\n<p>A pre-built BSJ reference database for Homo_sapiens.GRCh37.75 is available for download: <a href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/files\/circall\/Homo_sapiens.GRCh37.75_BSJ_sequences.fa.gz\">Homo_sapiens.GRCh37.75_BSJ_sequences.fa<\/a>. 
This file was generated by the following command:<\/p>\n<div class=\"highlight highlight-source-shell\">\n<pre>Rscript Circall_v0.1.0_linux_x86-64\/R\/buildBSJdb.R gtfSqlite=Homo_sapiens.GRCh37.75.sqlite genomeFastaFile=Homo_sapiens.GRCh37.75.dna.primary_assembly.fa bsjDist=250 maxReadLen=150 output=Homo_sapiens.GRCh37.75_BSJ_sequences.fa<\/pre>\n<\/div>\n<h2><a id=\"user-content-4-indexing-transcriptome-and-bsj-reference-database\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#4-indexing-transcriptome-and-bsj-reference-database\" aria-hidden=\"true\"><\/a>4. Indexing transcriptome and BSJ reference database<\/h2>\n<h3><a id=\"user-content-index-transcriptome\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#index-transcriptome\" aria-hidden=\"true\"><\/a>Index transcriptome<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>Circall_v0.1.0_linux_x86-64\/linux\/bin\/TxIndexer -t Homo_sapiens.GRCh37.75.cdna.all.fa -o IndexTranscriptome<\/pre>\n<\/div>\n<h3><a id=\"user-content-index-bsj-reference-database\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#index-bsj-reference-database\" aria-hidden=\"true\"><\/a>Index BSJ reference database<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>Circall_v0.1.0_linux_x86-64\/linux\/bin\/TxIndexer -t Homo_sapiens.GRCh37.75_BSJ_sequences.fa -o IndexBSJ<\/pre>\n<\/div>\n<p><em><strong>All annotation data are now generated, and Circall is ready to run.<\/strong><\/em><\/p>\n<h2><a id=\"user-content-5-run-circall-pipeline\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#5-run-circall-pipeline\" aria-hidden=\"true\"><\/a>5. Run Circall pipeline<\/h2>\n<p>Suppose sample_01_1.fasta.gz and sample_01_2.fasta.gz are the gzipped paired-end input read files. 
For convenience, we prepared a toy example to test the pipeline, which can be downloaded here:<\/p>\n<div class=\"highlight highlight-source-shell\">\n<pre>wget -O sample_01_1.fasta.gz https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/2021\/07\/sample_01_1.fasta_.gz\nwget -O sample_01_2.fasta.gz https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/2021\/07\/sample_01_2.fasta_.gz<\/pre>\n<\/div>\n<p>Circall can be run in one command wrapped in a bash script:<\/p>\n<div class=\"highlight highlight-source-shell\">\n<pre>bash Circall_v0.1.0_linux_x86-64\/Circall.sh -genome Homo_sapiens.GRCh37.75.dna.primary_assembly.fa -gtfSqlite Homo_sapiens.GRCh37.75.sqlite -txFasta Homo_sapiens.GRCh37.75.cdna.all.fa -txIdx IndexTranscriptome -bsjIdx IndexBSJ -dep Circall_v0.1.0_linux_x86-64\/Data\/Circall_depdata_human.RData -read1 sample_01_1.fasta.gz -read2 sample_01_2.fasta.gz -p 4 -tag testing_sample -c FALSE -o Testing_out<\/pre>\n<\/div>\n<h3><a id=\"user-content-obligatory-parameters-are\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#obligatory-parameters-are\" aria-hidden=\"true\"><\/a>Inputs and parameters<\/h3>\n<p><em><strong>Annotation data:<\/strong><\/em><\/p>\n<ul>\n<li>genome &#8212; genome in fasta format<\/li>\n<li>gtfSqlite &#8212; genome annotation in Sqlite format<\/li>\n<li>txFasta &#8212; transcripts (cDNA) in fasta format<\/li>\n<li>txIdx &#8212; quasi-index of txFasta<\/li>\n<li>bsjIdx &#8212; quasi-index of BSJ reference fasta file<\/li>\n<\/ul>\n<p><em><strong>Input data:<\/strong><\/em><\/p>\n<ul>\n<li>read1 &#8212; input read1: should be in gz format<\/li>\n<li>read2 &#8212; input read2: should be in gz format<\/li>\n<\/ul>\n<p><em><strong>Other parameters:<\/strong><\/em><\/p>\n<ul>\n<li>dep &#8212; data containing depleted circRNAs: specifies the null data (depleted circRNA) for the two-dimensional local false discovery rate method. 
For convenience, null data collected from three human cell line datasets (HeLa, Hs68, and HEK293) are provided with the tool: Circall_v0.1.0_linux_x86-64\/Data\/Circall_depdata_human.RData<\/li>\n<li>p &#8212; the number of threads: Default is 4<\/li>\n<li>tag &#8212; tag name of results: Default is &#8220;Sample&#8221;<\/li>\n<li>td &#8212; generation of tandem sequences: TRUE\/FALSE value, default is TRUE<\/li>\n<li>c &#8212; clean intermediate data: TRUE\/FALSE value, default is TRUE<\/li>\n<li>o &#8212; output folder: Default is the current directory<\/li>\n<\/ul>\n<h3>Output<\/h3>\n<p>The main output of Circall is provided in *_Circall_final.txt. In this file, each row describes one circular RNA in 8 columns:<\/p>\n<ul>\n<li>chr: chromosome<\/li>\n<li>start: start position<\/li>\n<li>end: end position<\/li>\n<li>geneID: gene name that the circRNA belongs to<\/li>\n<li>circID: the ID of circRNA in the format &#8220;chr__start__end&#8221;<\/li>\n<li>junction_fragment_count: the number of fragments supporting the back-splicing-junction (BSJ)<\/li>\n<li>median_circlen: the median length of the circular RNA<\/li>\n<li>fdr: the false discovery rate computed from the two-dimensional local false discovery rate method<\/li>\n<\/ul>\n<h2><a id=\"user-content-6-a-practical-copy-paste-example-of-running-circall\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#6-a-practical-copy-paste-example-of-hs68-dataset\" aria-hidden=\"true\"><\/a>6. 
A practical copy-paste example of running Circall<\/h2>\n<p>In this section, we provide a practical example of using Circall in a copy-paste manner for a Hs68 cell line dataset.<\/p>\n<h3><a id=\"user-content-download-and-install-circall\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#download-and-install-circall\" aria-hidden=\"true\"><\/a>Download and install Circall<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>wget --no-check-certificate -O Circall_v0.1.0_linux_x86-64.tar.gz https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/2021\/04\/Circall_v0.1.0_linux_x86-64.tar_.gz<\/pre>\n<\/div>\n<ul>\n<li>Uncompress to folder<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre>tar -xzvf Circall_v0.1.0_linux_x86-64.tar.gz<\/pre>\n<\/div>\n<ul>\n<li>Move to the\u00a0<em>Circall_home<\/em>\u00a0directory and do configuration for Circall<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-c1\">cd<\/span> Circall_v0.1.0_linux_x86-64\nbash config.sh\n<span class=\"pl-c1\">cd<\/span> ..<\/pre>\n<\/div>\n<ul>\n<li>Add paths of lib folder and bin folder to LD_LIBRARY_PATH and PATH<\/li>\n<\/ul>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-k\">export<\/span> LD_LIBRARY_PATH=<span class=\"pl-smi\">$PWD<\/span>\/Circall_v0.1.0_linux_x86-64\/linux\/lib:<span class=\"pl-smi\">$LD_LIBRARY_PATH<\/span>\n<span class=\"pl-k\">export<\/span> PATH=<span class=\"pl-smi\">$PWD<\/span>\/Circall_v0.1.0_linux_x86-64\/linux\/bin:<span class=\"pl-smi\">$PATH<\/span><\/pre>\n<\/div>\n<h3><a id=\"user-content-download-genome-fasta-transcript-fasta-and-bsj-databases-and-annotation-file\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#download-genome-fasta-transcript-fasta-and-bsj-databases-and-annotation-file\" aria-hidden=\"true\"><\/a>Download genome fasta, transcript fasta and BSJ databases and annotation 
file.<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre><span class=\"pl-c\"># genome from ENSEMBL website<\/span>\nwget http:\/\/ftp.ensembl.org\/pub\/release-75\/fasta\/homo_sapiens\/dna\/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz\ngunzip Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz\n<span class=\"pl-c\">\n# cDNA (transcript) and Gene annotation (gtf) from ENSEMBL website<\/span>\nwget http:\/\/ftp.ensembl.org\/pub\/release-75\/fasta\/homo_sapiens\/cdna\/Homo_sapiens.GRCh37.75.cdna.all.fa.gz\ngunzip Homo_sapiens.GRCh37.75.cdna.all.fa.gz\nwget http:\/\/ftp.ensembl.org\/pub\/release-75\/gtf\/homo_sapiens\/Homo_sapiens.GRCh37.75.gtf.gz\ngunzip Homo_sapiens.GRCh37.75.gtf.gz\n\n<span class=\"pl-c\"># pre-built BSJ databases from Circall website<\/span>\nwget https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-content\/uploads\/sites\/4\/files\/circall\/Homo_sapiens.GRCh37.75_BSJ_sequences.fa.gz\ngunzip Homo_sapiens.GRCh37.75_BSJ_sequences.fa.gz\n<span class=\"pl-c\">\n# Generate Sqlite annotation<\/span>\nRscript Circall_v0.1.0_linux_x86-64\/R\/createSqlite.R Homo_sapiens.GRCh37.75.gtf Homo_sapiens.GRCh37.75.sqlite\n\n<\/pre>\n<\/div>\n<h3><a id=\"user-content-index-transcriptome-1\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#index-transcriptome-1\" aria-hidden=\"true\"><\/a>Index transcriptome<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>Circall_v0.1.0_linux_x86-64\/linux\/bin\/TxIndexer -t Homo_sapiens.GRCh37.75.cdna.all.fa -o IndexTranscriptome<\/pre>\n<\/div>\n<h3><a id=\"user-content-index-bsj-reference-database-1\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#index-bsj-reference-database-1\" aria-hidden=\"true\"><\/a>Index BSJ reference database<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>Circall_v0.1.0_linux_x86-64\/linux\/bin\/TxIndexer -t Homo_sapiens.GRCh37.75_BSJ_sequences.fa -o IndexBSJ<\/pre>\n<\/div>\n<h3><a 
id=\"user-content-download-hs68-rna-seq-data\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#download-hs68-rna-seq-data\" aria-hidden=\"true\"><\/a>Download Hs68 cell line RNA-seq data<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>wget ftp:\/\/ftp.sra.ebi.ac.uk\/vol1\/fastq\/SRR444\/SRR444975\/SRR444975_1.fastq.gz\nwget ftp:\/\/ftp.sra.ebi.ac.uk\/vol1\/fastq\/SRR444\/SRR444975\/SRR444975_2.fastq.gz<\/pre>\n<\/div>\n<h3><a id=\"user-content-run-circall\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#run-circall\" aria-hidden=\"true\"><\/a>Run Circall<\/h3>\n<div class=\"highlight highlight-source-shell\">\n<pre>bash Circall_v0.1.0_linux_x86-64\/Circall.sh -genome Homo_sapiens.GRCh37.75.dna.primary_assembly.fa -gtfSqlite Homo_sapiens.GRCh37.75.sqlite -txFasta Homo_sapiens.GRCh37.75.cdna.all.fa -txIdx IndexTranscriptome -bsjIdx IndexBSJ -dep Circall_v0.1.0_linux_x86-64\/Data\/Circall_depdata_human.RData -read1 SRR444975_1.fastq.gz -read2 SRR444975_2.fastq.gz -p 4 -tag testing_sample -o SRR444975<\/pre>\n<p>In our experience, the run takes around 8 hours in total on a single CPU.<\/p>\n<\/div>\n<h2><a id=\"user-content-7-circall-simulator\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#7-instruction-of-circall-simulator\" aria-hidden=\"true\"><\/a>7.\u00a0 Circall simulator<\/h2>\n<h3><a id=\"user-content-introduction\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#introduction\" aria-hidden=\"true\"><\/a>Introduction<\/h3>\n<p>Circall simulator is a tool integrated in Circall to generate RNA-seq data of both circRNA and tandem RNA. The source code is provided in R\/<span class=\"pl-s\">Circall_simulator.R of the Circall tool. The main function of the simulator is Circall_simulator(), which can be run in the R console. 
This function requires the following parameters:<\/span><\/p>\n<h3><a id=\"user-content-input-parametes\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#input-parametes\" aria-hidden=\"true\"><\/a>Parameter setting:<\/h3>\n<ul>\n<li>circInfo: a data frame that contains 6 columns: Chr, start_EXONSTART, end_EXONEND, GENEID, cCount and FPKM. Chr is the chromosome name, formatted as 1:22, X, Y, Mt. start_EXONSTART is the start position of the starting exon of the circRNA, end_EXONEND is the end position of the ending exon of the circRNA, GENEID is the ID of the gene that contains the circRNA (used to get the gene model), cCount is the number of read pairs to generate for the target circRNA, and FPKM is the Fragments Per Kilobase of transcript per Million value of the target circRNA. This data frame is used to simulate circular RNAs<\/li>\n<li>tandemInfo: a data frame similar to circInfo, used to simulate tandem RNAs. Set tandemInfo=NULL (the default value) to skip simulating tandem RNAs<\/li>\n<li>error_rate: sequencing error rate, the default value is 0.005<\/li>\n<li>set.seed: set seed for reproducibility, the default value is 2018<\/li>\n<li>gtfSqlite: path to your annotation file in Sqlite format (generated by GenomicFeatures)<\/li>\n<li>genomeFastaFile: path to your genome fasta file<\/li>\n<li>txFastaFile: path to your transcript fasta file (cDNA)<\/li>\n<li>out_name: prefix of the output folders, the default value is &#8220;Circall_simuation&#8221;<\/li>\n<li>out_dir: the directory that contains the output, the default value is the current directory<\/li>\n<li>lib_size: expected library size used when useFPKM=TRUE, the default value is NULL<\/li>\n<li>useFPKM: boolean value indicating whether to use FPKM, the default value is FALSE. 
When this useFPKM=TRUE, users need to set value for lib_size, and the simulator will use the abundance in column FPKM of circInfo\/tandemInfo for simulation<\/li>\n<\/ul>\n<h3><a id=\"user-content-example-assummed-that-your-working-directory-is-the-folder-contain-installed-circall-and-annotation\" class=\"anchor\" href=\"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/circall\/\/#example-assummed-that-your-working-directory-is-the-folder-contain-installed-circall-and-annotation\" aria-hidden=\"true\"><\/a>A toy example for using Circall simulator<\/h3>\n<p>For an illustration of using Circall simulator, we provide in this section a toy example. Suppose your current working directory contains the installed Circall and the annotation data. First, we need to load the functions of the simulator into your R console:<\/p>\n<div class=\"highlight highlight-source-r\">\n<pre>source(<span class=\"pl-s\"><span class=\"pl-pds\">\"<\/span>Circall_v0.1.0_linux_x86-64\/R\/Circall_simulator.R<span class=\"pl-pds\">\"<\/span><\/span>)\n<\/pre>\n<\/div>\n<p>Then we create objects <strong>circInfo<\/strong> and <strong>tandemInfo<\/strong> containing the information of CircRNAs and tandem RNAs<\/p>\n<div class=\"highlight highlight-source-r\">\n<pre>Chr = c(7,7,3,5,17,4,7,1,3,1,17,12,14,10,18,17,5,20,16,17)\n\nstart_EXONSTART = c(131113792,99795401,172363413,179296769,36918664,151509200,2188787,51906019,57832924,225239153,76187051,111923075,104490906,101556854,196637,21075331,74981032,60712420,56903641,80730328)\n\nend_EXONEND = c(131128461,99796580,172365904,179315312,36918758,151509336,2270359,51913807,57882659,225528403,76201599,111924628,104493276,101572901,199316,21087123,74998635,60716000,56904648,80772810)\n\nGENEID = 
c(\"ENSG00000128585\",\"ENSG00000066923\",\"ENSG00000144959\",\"ENSG00000197226\",\"ENSG00000108294\",\"ENSG00000198589\",\"ENSG00000002822\",\"ENSG00000085832\",\"ENSG00000163681\",\"ENSG00000185842\",\"ENSG00000183077\",\"ENSG00000204842\",\"ENSG00000156414\",\"ENSG00000023839\",\"ENSG00000101557\",\"ENSG00000109016\",\"ENSG00000152359\",\"ENSG00000101182\",\"ENSG00000070915\",\"ENSG00000141556\")\n\nset.seed(2021)\ncCount = sample(2:2000,20)\nFPKM = rep(0,20)\n\nBSJ_info = data.frame(Chr = Chr, start_EXONSTART = start_EXONSTART, end_EXONEND = end_EXONEND, GENEID = GENEID, cCount = cCount, FPKM = FPKM)\n\ncircSet=c(1:15)\ncircInfo = BSJ_info[circSet,]\ntandemInfo = BSJ_info[-circSet,]<\/pre>\n<\/div>\n<p>Finally, we run the simulator:<\/p>\n<div class=\"highlight highlight-source-r\">\n<pre>simulation = Circall_simulator(circInfo = circInfo, tandemInfo = tandemInfo, useFPKM=FALSE, out_name = \"Tutorial\", gtfSqlite = \"Homo_sapiens.GRCh37.75.sqlite\", genomeFastaFile = \"Homo_sapiens.GRCh37.75.dna.primary_assembly.fa\", txFastaFile = \"Homo_sapiens.GRCh37.75.cdna.all.fa\", out_dir= \".\/simulation_test\")\n<\/pre>\n<\/div>\n<p>The &#8220;.\/simulation_test&#8221; folder contains the outputs, including:<\/p>\n<ul>\n<li>simulation_setting: setting information of simulation of both circRNAs and tandem RNAs.<\/li>\n<li>circRNA_data: RNA-seq data of circRNAs<\/li>\n<li>tandem_data: RNA-seq data of tandem RNAs<\/li>\n<li>fasta sequences of tandem RNAs<\/li>\n<li>fasta sequences of circular RNAs<\/li>\n<\/ul>\n<h2>8. License<\/h2>\n<p>Circall uses GNU General Public License GPL-3.<\/p>\n<div class=\"csl-bib-body\">\n<div>\n<h2>9. References<\/h2>\n<\/div>\n<div class=\"csl-entry\">Nguyen, Dat Thanh, Quang Thinh Trac, Thi-Hau Nguyen, Ha-Nam Nguyen, Nir Ohad, Yudi Pawitan, and Trung Nghia Vu. 2021. \u201cCircall: Fast and Accurate Methodology for Discovery of Circular RNAs from Paired-End RNA-Sequencing Data.\u201d <i>BMC Bioinformatics<\/i> 22 (1): 495. 
<a href=\"https:\/\/doi.org\/10.1186\/s12859-021-04418-8\">https:\/\/doi.org\/10.1186\/s12859-021-04418-8<\/a>.<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A fast and accurate methodology for discovery of circular RNAs from paired-end RNA-sequencing data Contents 1. Introduction 2. Download and installation 3. Prepare BSJ reference database and annotation files 4. Indexing transcriptome and BSJ reference database 5. Run Circall pipeline 6. A practical copy-paste example of running Circall 7. Circall simulator Update news 28 April [&hellip;]<\/p>\n","protected":false},"author":20,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1261","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/posts\/1261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/users\/20"}],"replies":[{"embeddable":true,"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/comments?post=1261"}],"version-history":[{"count":27,"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/posts\/1261\/revisions"}],"predecessor-version":[{"id":1378,"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/posts\/1261\/revisions\/1378"}],"wp:attachment":[{"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/media?parent=1261"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/categories?post=1261"},{"tax
onomy":"post_tag","embeddable":true,"href":"https:\/\/www.meb.ki.se\/sites\/biostatwiki\/wp-json\/wp\/v2\/tags?post=1261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}