rPredictorDB

Introduction

rPredictorDB is a predictive database of secondary structures of individual RNAs and their formatted plots. The structures are generated by template-based prediction of RNA secondary structure, with experimentally identified structures serving as templates. RNAs with large secondary structures are visualized with a template-based method that produces a formatted, readable layout. The rPredictorDB website also offers template-based secondary structure prediction for user-uploaded RNA sequences, using the templates stored in rPredictorDB.

The following is brief usage documentation. More detailed technical documentation is available here.

Predict page

The Predict page offers template-based RNA secondary structure prediction for user sequences, using templates stored in rPredictorDB.

The page accepts a sequence (by copy and paste) and lets you select a template. Selection is either manual or automatic; in automatic mode, the template whose sequence is most similar to the query sequence is identified and used for prediction. Templates are secondary structures extracted from experimentally identified structures, mostly from PDB.
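The automatic mode described above can be sketched as a "most similar template wins" rule. This is a minimal illustration only: the similarity measure (`difflib.SequenceMatcher`), the function names, and the toy sequences are all assumptions made for the example, since this page does not state how rPredictorDB actually measures sequence similarity.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; NOT the measure rPredictorDB uses."""
    return SequenceMatcher(None, a.upper(), b.upper(), autojunk=False).ratio()


def pick_template(query: str, templates: dict[str, str]) -> str:
    """Return the name of the template whose sequence is most similar to the query."""
    if not templates:
        raise ValueError("no templates to choose from")
    return max(templates, key=lambda name: similarity(query, templates[name]))


# Toy sequences, invented for illustration only.
templates = {
    "template_A": "GGGAAACCCUUUGGG",
    "template_B": "AUAUAUAUGCGCGCG",
}
best = pick_template("GGGAAACCCUUAGGG", templates)
print(best)
```

A real implementation would more likely rely on an alignment-based score; the sketch only shows the selection step, not the prediction itself.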

Download page

The Download page lets you download the rPredictorDB database in CSV format or as a database dump. The prediction algorithm used is also publicly available.
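After downloading the CSV export, a quick first step is to check which columns it contains and how many records it holds. The sketch below assumes nothing about the real column layout, which is not documented on this page; the function name and the sample data are invented for illustration.

```python
import csv
import io


def summarize_csv(handle):
    """Return (column names, number of data rows) for an open CSV file."""
    reader = csv.DictReader(handle)
    n_rows = sum(1 for _ in reader)  # consumes the data rows
    return list(reader.fieldnames or []), n_rows


# Stand-in for a downloaded file; these columns are invented for illustration.
sample = io.StringIO("id,family,sequence\n1,5S rRNA,GCCUGGCGGC\n2,tmRNA,GGGGCUGAUU\n")
columns, n_rows = summarize_csv(sample)
print(columns, n_rows)
```

For a file on disk, open it with `open(path, newline="")` and pass the handle to `summarize_csv`, where `path` is wherever you saved the download.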

List of families and template structures included in rPredictorDB

| RNA | Source of sequences | Templates and their source | Template sequence length (nucleotides) |
| --- | --- | --- | --- |
| 16S rRNA | Silva | E. coli 16S rRNA (PDB ID 2ZM6) [1] | 1542 |
| 18S rRNA Chordata | | H. sapiens 18S rRNA (PDB ID 4V6X) [1] | 1869 |
| 18S rRNA Diptera | | D. melanogaster 18S rRNA (PDB ID 4V6W) [1] | 1995 |
| 5S rRNA Bacteria | Rfam (RF00001) | E. coli 5S (PDB ID 1C2X) [1] | 120 |
| 5S rRNA Eukarya | | H. sapiens 5S (PDB ID 6EKO) [1] | 120 |
| 5.8S rRNA | Rfam (RF00002) | Trypanosoma cruzi 5.8S RNA (PDB ID 5T5H) | 169 |
| 6S RNA | Rfam (RF00013) | E. coli 6S RNA (Wassarman et al. 2000) [2,3] | 184 |
| | | B. subtilis 6S RNA (Ando et al. 2002) [2,3] | 187 |
| 9S rRNA | Rfam (RF02545) | Trypanosoma brucei 9S rRNA (PDB ID 6HIY) | 621 |
| Cobalamin riboswitch | Rfam (RF00174) | Symbiobacterium thermophilum (PDB ID 4GXY) | 172 |
| C-DI-AMP riboswitch | Rfam (RF00379) | Thermovirga lienii C-DI-AMP riboswitch (PDB ID 4QK9) | 123 |
| CRPV-IRES | Rfam (RF00458) | Mammalian CRPV-IRES (PDB ID 6D9J) | 190 |
| CSFV IRES | Rfam (RF00209) | Viral CSFV IRES (PDB ID 4C4Q) | 233 |
| FMN riboswitch | Rfam (RF00050) | PDB ID 3F2Y [4] | 112 |
| Fungi U3 | Rfam (RF01846) | S. cerevisiae u3 (PDB ID 5WYK) | 333 |
| gcvB | Sharma et al. 2007 [3] | S. typhimurium gcvB (Sharma et al. 2007) [3] | 206 |
| GLMS ribozyme | Rfam (RF00234) | Bacillus anthracis GLMS ribozyme (PDB ID 3L3C) | 141 |
| Group I catalytic intron | Rfam (RF00028) | Staphylococcus virus Twort (PDB ID 1Y0Q) | 192 |
| Group II intron lariat | NCBI [5] | Oceanobacillus iheyensis group II intron (PDB ID 5J02) | 418 |
| Group II intron lariat in post-catalytic state [6] | NCBI [5] | Pylaiella littoralis (PDB ID 6CIH) | 621 |
| IRES HCV | Rfam (RF00061) | H. sapiens IRES HCV (PDB ID 5A2Q) | 257 |
| Lariat capping ribozyme | Rfam (RF01807) | Didymium iridis lariat capping ribozyme (PDB ID 4P8Z) | 188 |
| Lysine riboswitch | Rfam (RF00168) | T. maritima lysine riboswitch (PDB ID 4ERL) | 161 |
| Mammalian CPEB3 ribozyme | Rfam (RF00622) | H. sapiens CPEB3 (Salehi-Ashtiani et al. 2006) [3] | 78 |
| M-box | Rfam (RF00380) | B. subtilis M-box (PDB ID 3PDR) | 161 |
| micF | Rfam (RF00033) | E. coli micF (Esterling et al. 1994) [3] | 95 |
| MLV encapsidation signal | Rfam (RF00374) | Viral MLV (PDB ID 1U6P) | 101 |
| ms1 | Hnilicova et al. 2014 [3] | M. smegmatis ms1 (Panek et al. 2011, Hnilicova et al. 2014) [3] | 304 |
| oxyS | Rfam (RF00035) | E. coli oxyS (Argaman et al. 2000) [3] | 109 |
| PHI29 PROHEAD RNA | Rfam (RF00044) | Bacteriophage PHI29 (PDB ID 1FOQ) | 117 |
| RNaseP arch | Rfam (RF00373) | Pyrococcus furiosus RNaseP | 347 |
| RNaseP bact a | NCBI [5] | T. tengcongensis RNaseP bact a (PDB ID 3Q1R) | 347 |
| RNaseP bact b | Rfam (RF00011) | PDB ID 2A64 [4] | 414 |
| RNaseP nuc | Rfam (RF00009) | H. sapiens RNaseP (Marquez et al. 2005) [3] | 341 |
| ryhB | Davis et al. 2005 [3] | E. coli ryhB (Davis et al. 2005) [3] | 90 |
| SAM I | Rfam (RF00162) | T. tengcongensis SAM I (PDB ID 2GIS) | 94 |
| spot42 | Rfam (RF00021) | E. coli spot42 (Moller et al. 2002) | 119 |
| SRP bact small | Rfam (RF00169) | E. coli SRP (SRPDB ID esccol3d-97-11-17-stretched.pdb) | 114 |
| SRP bact large | Rfam (RF01854) | B. subtilis SRP (PDB ID 4UE4) | 266 |
| SRP Metazoa | NCBI [5] | H. sapiens SRP (PDB ID 4P3E) | 301 |
| Tetrahymena ribozyme | NCBI [5] | PDB ID 1X8W [4] | 247 |
| Tetrahymena telomerase RNA | Rfam (RF00025) | Tetrahymena Telomerase RNA (PDB ID 6D6V) | 159 |
| THF riboswitch | Rfam (RF01831) | PDB ID 4LVV [4] | 89 |
| tmRNA | Rfam (RF00023) | E. coli tmRNA (PDB ID 3IZ4) | 377 |
| TPP riboswitch | NCBI [5] | E. coli TPP (PDB ID 4NYG) | 83 |
| tRNA Gly eukaryotic | Rfam (RF00005) | H. sapiens tRNA Gly (PDB ID 5E6M) [1] | 74 |
| tRNA Gly bacterial | | G. kaustophilus tRNA Gly (PDB ID 4MGM) [1] | 75 |
| u1 | Rfam (RF00003) | H. sapiens u1 (Nagai et al. 2002) [3] | 163 |
| u2 | Rfam (RF00004) | H. sapiens u2 (Nagai et al. 2002) [3] | 188 |
| u4 | Rfam (RF00015) | H. sapiens u4 (Krol et al. 1981) [3] | 144 |
| u5 | Rfam (RF00020) | H. sapiens u5 (Sievers et al. 2011) [3] | 116 |
| u6 | Rfam (RF00026) | H. sapiens u6 (PDB ID 5LQW) | 112 |
| vertebrate Telomerase RNA | Rfam (RF00024) | H. sapiens Telomerase RNA (Bentley et al. 2002) [3] | 451 |
| yeast u1 | Rfam (RF00488) | S. cerevisiae (PDB ID 5ZWN) | 565 |

[1] The template is applied to sequences according to taxonomy, i.e. a eukaryotic template to eukaryotic sequences and a prokaryotic template to prokaryotic sequences.
[2] Taxonomy alone cannot determine which template should be used, because some bacteria, e.g. Firmicutes, contain 6S RNAs of both template types. Therefore, for each 6S RNA, the template that produces the structure with the better z-score is used.
[3] Sequences and/or the template structure were taken from the paper that published the template structure.
[4] Organism not described, or a synthetic expression system was used.
[5] The sequences were obtained by an NCBI BLAST search with the "somewhat similar sequences" parameters against the nr database, using query sequences taken from PDB. This was done because the sequences in the corresponding Rfam family appeared incompatible with the PDB structure: they were either short fragments or had very low sequence similarity to the PDB sequence.
[6] This family contains several very short fragments that produce substructures which are hard to match with the template structure. We nevertheless included them in rPredictorDB because they had significant BLAST E-values (< 1 × 10⁻¹²) and because they are a good example of RNAs with extremely fragmented sequences.
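The two template-assignment rules in the notes above (by taxonomy, note 1; by z-score for 6S RNA, note 2) can be sketched as follows. This is an illustration under stated assumptions, not the rPredictorDB implementation: the function names are hypothetical, the z-score values are made up, and because the notes do not say whether a "better" z-score is the higher or the lower one, that direction is left as a parameter.

```python
from typing import Callable, Iterable


def pick_by_taxonomy(group: str, templates_by_group: dict[str, str]) -> str:
    """Note 1: use the template whose taxonomic group matches the sequence."""
    if group not in templates_by_group:
        raise ValueError(f"no template for taxonomic group {group!r}")
    return templates_by_group[group]


def pick_by_zscore(
    sequence: str,
    templates: Iterable[str],
    zscore: Callable[[str, str], float],
    higher_is_better: bool = True,  # assumption: the direction is not stated in the notes
) -> str:
    """Note 2: score the prediction from every template and keep the best one."""
    scores = {name: zscore(sequence, name) for name in templates}
    if not scores:
        raise ValueError("no templates to compare")
    choose = max if higher_is_better else min
    return choose(scores, key=scores.get)


# 5S rRNA templates as listed in the table.
five_s = {
    "Bacteria": "E. coli 5S (PDB ID 1C2X)",
    "Eukarya": "H. sapiens 5S (PDB ID 6EKO)",
}
chosen_5s = pick_by_taxonomy("Eukarya", five_s)

# Made-up z-scores standing in for real predictions with each 6S RNA template.
fake_scores = {"E. coli 6S RNA": 1.2, "B. subtilis 6S RNA": 2.5}
chosen_6s = pick_by_zscore("ACGU", fake_scores, lambda seq, name: fake_scores[name])
print(chosen_5s, "|", chosen_6s)
```

In practice the `zscore` callable would run the template-based prediction and evaluate the resulting structure; here it only looks up a placeholder value.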

Statistics

  • rPredictorDB contains 7365 structures
  • Last update: 26.02.2019
  • Data sources: