This page is only a brief introduction to PIntron. Please refer to the full documentation for a complete description of its options.
Download and install PIntron as described here.
In the rest of this example, we will assume that the PIntron executables
have been installed in the directory /home/pintron/bin.
If PIntron is installed elsewhere, replace /home/pintron/bin with the
directory where it has been installed (or omit the path if the PIntron
executables are in a directory on your executable search path).
PIntron requires two input files: the
genomic sequence (in this case
genomic.txt) and the EST/mRNA
sequences (in this case
ests.txt), specified with the options --genomic and --EST, respectively.
genomic.txt is a FASTA file containing a single sequence, while
ests.txt is a MultiFASTA file, where each sequence is considered as a
single transcript (EST or mRNA).
We strictly require a specific header format. Please refer to the documentation for a detailed description.
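As a quick sanity check before running the pipeline, one could count the FASTA header lines in each input file. The sketch below is only an illustration: the function name count_fasta_records is hypothetical and not part of PIntron, and it does not validate the header format that PIntron requires.

```shell
# Hypothetical helper (not part of PIntron): count the records in a FASTA
# file by counting the lines that start with '>'.
count_fasta_records() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when there are none, so '|| true' keeps the helper usable under 'set -e'.
    grep -c '^>' "$1" || true
}

# Example usage, with the paths assumed in this page:
#   count_fasta_records /home/pintron/doc/example/genomic.txt   # should be 1
#   count_fasta_records /home/pintron/doc/example/ests.txt      # one per transcript
```

A count other than 1 for the genomic file suggests that it is not a single-sequence FASTA file.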
After the preparation, the directory tree should be as follows (other files may exist).
/home/pintron
├── bin
│   ├── cds-annotation
│   ├── compact-compositions
│   ├── est-fact
│   ├── gene-structure
│   ├── intron-agreement
│   ├── maximal-transcripts
│   ├── min-factorization
│   └── pintron
└── doc
    └── example
        ├── ests.txt
        └── genomic.txt
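The layout above can be verified with a small shell function before the first run. This is a minimal sketch, assuming a POSIX shell; check_pintron_bin is a hypothetical name, and the list of executables is simply copied from the tree shown above.

```shell
# Hypothetical helper (not part of PIntron): report which of the expected
# executables are missing from, or not executable in, a given bin directory.
check_pintron_bin() {
    bin_dir=$1
    missing=0
    for exe in cds-annotation compact-compositions est-fact gene-structure \
               intron-agreement maximal-transcripts min-factorization pintron; do
        if [ ! -x "$bin_dir/$exe" ]; then
            echo "missing or not executable: $bin_dir/$exe" >&2
            missing=1
        fi
    done
    # Exit status 0 means every executable was found.
    return "$missing"
}

# Example usage:
#   check_pintron_bin /home/pintron/bin && echo "installation looks complete"
```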
Assuming that we want to generate all output files in the current working directory, the following command executes PIntron on the example.
/home/pintron/bin/pintron \
    --bin-dir=/home/pintron/bin \
    --genomic=/home/pintron/doc/example/genomic.txt \
    --EST=/home/pintron/doc/example/ests.txt \
    --organism=human \
    --gene=TP53 \
    --output=pintron-full-output.json \
    --gtf=pintron-cds-annotated-isoforms.gtf \
    --extended-gtf=pintron-all-isoforms.gtf \
    --logfile=pintron-pipeline-log.txt \
    --general-logfile=pintron-log.txt
Most of the options have sensible default values and a short version. Therefore, a shorter equivalent command line is:
/home/pintron/bin/pintron \
    -b /home/pintron/bin \
    -g /home/pintron/doc/example/genomic.txt \
    -s /home/pintron/doc/example/ests.txt \
    -n human \
    -e TP53
Note that the option
--gene/-e is optional
and can be omitted (in that case, the default value
unknown is assumed).
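Because the output files are generated in the current working directory, it can be convenient to run each gene in its own directory. The following is a minimal sketch of such a wrapper, assuming a POSIX shell; run_pintron_in is a hypothetical name, and only the short options shown above are used. All paths should be absolute, since the command runs from inside the output directory.

```shell
# Hypothetical wrapper (not part of PIntron): run the pipeline from inside a
# dedicated output directory so that its output files end up there.
# Usage: run_pintron_in OUT_DIR BIN_DIR GENOMIC ESTS [extra pintron options...]
run_pintron_in() {
    out_dir=$1
    bin_dir=$2
    genomic=$3
    ests=$4
    shift 4
    mkdir -p "$out_dir" || return 1
    # The subshell leaves the caller's working directory unchanged.
    ( cd "$out_dir" && "$bin_dir/pintron" -b "$bin_dir" -g "$genomic" -s "$ests" "$@" )
}

# Example usage, with the paths assumed in this page:
#   run_pintron_in "$PWD/TP53-run" /home/pintron/bin \
#       /home/pintron/doc/example/genomic.txt \
#       /home/pintron/doc/example/ests.txt \
#       -n human -e TP53
```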
PIntron produces the following output files:
pintron-full-output.json, the complete description of the results computed by PIntron in JSON format. This file is both human- and machine-readable (parsing libraries exist in all the major programming languages). Moreover, the format is almost self-documenting and can easily be adapted to future needs. Please refer to its description for additional information.
pintron-all-isoforms.gtf and pintron-cds-annotated-isoforms.gtf, the set of all (CDS-annotated, respectively) full-length isoforms computed by PIntron in standard GTF2.2 format. These files can be used for some standard downstream analyses. For example, they can be uploaded to the UCSC Genome Browser as custom tracks (as shown in the example below).
pintron-log.txt and pintron-pipeline-log.txt, the logs of the main program and of each step of the pipeline. These files may contain important information if an error has occurred. Please attach them to any issue report.
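A first look at the GTF output can be taken with standard tools. The sketch below tallies the feature types found in a GTF file, relying only on the general GTF layout (nine tab-separated columns, with the feature type in the third). The function name gtf_feature_counts is hypothetical, and which feature types PIntron actually emits is not assumed here.

```shell
# Hypothetical helper (not part of PIntron): tally the feature types
# (column 3) of a GTF file, one "feature count" pair per line.
gtf_feature_counts() {
    # Skip comment lines and lines that lack the nine GTF columns.
    awk -F '\t' '!/^#/ && NF >= 9 { n[$3]++ }
                 END { for (f in n) print f, n[f] }' "$1" | sort
}

# Example usage:
#   gtf_feature_counts pintron-cds-annotated-isoforms.gtf
```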
The output files of PIntron on the example gene TP53 are located
in the subdirectory
dist-docs/example/sample-output while a graphical
representation of the reconstructed isoforms is as follows: