GtRNAdb Gene Symbol

In older versions of GtRNAdb, tRNA genes were named by combining the sequential tRNA number generated by tRNAscan-SE with the name of the chromosome or contig where the gene was located, for example, chr1.tRNA8-AlaAGC. However, this numbering scheme does not persist when a new genome assembly of a species becomes available. For example, chr6.tRNA166 is tRNA-Ala-AGC in human GRCh37 but tRNA-Ile-AAT in GRCh38. This happens because tRNAscan-SE is a standalone gene predictor that runs on any sequence and has no concept of genome assemblies or prior gene annotations.

To minimize confusion when referencing the genes, we introduced a new naming convention in GtRNAdb release 16 for predicted tRNA genes in model organisms such as human, yeast, and fruit fly. This convention remains persistent across genome assemblies of the same species or strain. Starting from release 18, all tRNAs in the database include this gene symbol in their annotations. Our collaboration with multiple research communities, including the HUGO Gene Nomenclature Committee (HGNC) and FlyBase, has led to the adoption of this naming convention (or slightly modified variants of it) in other data resources and promotes standardized tRNA gene symbols in the same manner as for protein-coding genes.

[Figure: GtRNAdb Gene Symbol]

The GtRNAdb gene symbol consists of five parts as shown in the figure above.

  1. Prefix - tRNA genes that are high scoring and not predicted as pseudogenes have "tRNA" as the gene symbol prefix. Otherwise, "tRX" is used as the prefix to indicate uncertainty about gene function or identity.

  2. Isotype - Three-letter amino acid code for the tRNA isotype, determined by the anticodon detected in the predicted gene sequence. When the isotype and anticodon cannot be determined, "Und" is used.

  3. Anticodon - Anticodon detected in the predicted gene sequence. When the anticodon cannot be determined, or for genes with the "tRX" prefix (as described above), "NNN" is used.

  4. Transcript ID - Numeric ID of a unique tRNA transcript or "isodecoder" with a particular isotype and anticodon. In most cases, the smaller the number, the higher the gene's prediction score.

  5. Gene locus ID - For tRNA genes that have multiple identical copies, this gene locus ID identifies the particular gene copy in the genome.
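
As an illustration, the five parts can be split out of a symbol programmatically. The following is a minimal Python sketch, not part of GtRNAdb or tRNAscan-SE. It assumes the parts are joined by hyphens in the order listed above, extending the tRNA-Ala-AGC form used earlier (for example, tRNA-Ala-AGC-2-1). The names TRNASymbol and parse_symbol are hypothetical, and the pattern is deliberately lenient about the isotype length.

```python
import re
from typing import NamedTuple


class TRNASymbol(NamedTuple):
    """Hypothetical container for the five parts of a gene symbol."""
    prefix: str         # "tRNA" or "tRX"
    isotype: str        # amino acid code, or "Und" if undetermined
    anticodon: str      # three bases, or "NNN" if undetermined
    transcript_id: int  # ID of the unique transcript (isodecoder)
    locus_id: int       # ID of the gene copy in the genome


# Assumption: parts are hyphen-separated in the order described above.
# The isotype is matched as any run of letters to stay lenient.
_SYMBOL_RE = re.compile(r"^(tRNA|tRX)-([A-Za-z]+)-([ACGTN]{3})-(\d+)-(\d+)$")


def parse_symbol(symbol: str) -> TRNASymbol:
    """Split a gene symbol into its five parts, or raise ValueError."""
    match = _SYMBOL_RE.match(symbol)
    if match is None:
        raise ValueError(f"not a recognized gene symbol: {symbol!r}")
    prefix, isotype, anticodon, transcript_id, locus_id = match.groups()
    return TRNASymbol(prefix, isotype, anticodon,
                      int(transcript_id), int(locus_id))
```

Under these assumptions, parse_symbol("tRNA-Ala-AGC-2-1") yields the prefix "tRNA", isotype "Ala", anticodon "AGC", transcript ID 2, and gene locus ID 1, while an old-style name such as chr1.tRNA8-AlaAGC is rejected with a ValueError.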