Protein identifier mapping

From irefindex
Revision as of 03:02, 21 October 2010 by Sabry (Talk | contribs)

Last edited: 2010-10-21

We provide a file that maps iRefIndex identifiers to popular external identifiers. The file is tab-delimited text, and its first row, which starts with "#", gives the column headers. The current file contains all UniProt and all RefSeq identifiers (please refer for version information) and other identifiers in selected cases. Identifiers from other sources are listed for an iRefIndex identifier only when it has no UniProt or RefSeq identifier.
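As a minimal sketch of reading a file laid out this way, the Python below takes the column names from the "#" row and yields one dictionary per data row. It assumes the header row literally carries the column names listed in the column descriptions on this page. The function name `read_mapping` is hypothetical, and the sample line uses placeholder strings for the rogid and crogid values.

```python
import csv
import io


def read_mapping(handle):
    """Yield one dict per data row, keyed by the names in the '#' header row."""
    # QUOTE_NONE: treat quote characters as ordinary data in a plain tab-delimited file.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = None
    for row in reader:
        if not row:
            continue  # skip blank lines
        if header is None:
            if not row[0].startswith("#"):
                raise ValueError("expected the first row to start with '#'")
            # Drop the leading '#' so the first column name is usable as a key.
            row[0] = row[0].lstrip("#").strip()
            header = row
            continue
        yield dict(zip(header, row))


# Assumed sample: header names follow the column table; the rogid/crogid
# strings are placeholders, not real identifiers.
sample = (
    "#db\tacc\tentrezGeneid\tirogid\trogid\ticrogid\tcrogid\n"
    "UniProt\tQ4U9M9\t-1\t3156116\tPLACEHOLDER_ROGID\t3156116\tPLACEHOLDER_ROGID\n"
)
rows = list(read_mapping(io.StringIO(sample)))
print(rows[0]["db"], rows[0]["acc"], rows[0]["irogid"])
```

For the real file, the same function can be passed an open file handle in place of the `io.StringIO` sample. All values come back as strings, so numeric columns still need converting.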

File download location:

The column descriptions:

Column number Column name Description
1 db Source of the external identifier (e.g. UniProt, RefSeq)
2 acc The external identifier (e.g. Q4U9M9)
3 entrezGeneid Entrez Gene ID. Provided only for RefSeq identifiers; for all other identifiers this field is -1.
4 irogid Integer version of the redundant object group identifier (e.g. 3156116; current maximum value = 14005379; this is a MySQL int(11) field).
5 rogid String version of the redundant object group identifier (base-64 encoding of the hash digest of the primary amino acid sequence, with the NCBI taxonomy identifier appended at the end)
6 icrogid Integer version of the canonical (1) redundant object group identifier (the irogid selected to represent the canonical group)
7 crogid String version of the canonical (1) redundant object group identifier (the rogid selected to represent the canonical group)
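The columns above can be used to collect every external accession that falls in the same canonical group. The Python sketch below is one possible approach rather than an official iRefIndex tool. It converts the integer columns, treats an entrezGeneid of -1 as missing, and groups accessions by icrogid. The function names are hypothetical, and apart from Q4U9M9 and 3156116 (the examples given in the table) all sample values are invented for illustration.

```python
from collections import defaultdict


def parse_row(line):
    """Split one tab-delimited data line into a typed record (7 columns assumed)."""
    db, acc, gene, irogid, rogid, icrogid, crogid = line.rstrip("\n").split("\t")
    gene_id = int(gene)
    return {
        "db": db,
        "acc": acc,
        # -1 marks "no Entrez Gene ID" (only RefSeq rows carry one).
        "entrezGeneid": None if gene_id == -1 else gene_id,
        "irogid": int(irogid),
        "rogid": rogid,
        "icrogid": int(icrogid),
        "crogid": crogid,
    }


def accessions_by_canonical_group(records):
    """Map each icrogid to the set of (db, acc) pairs assigned to it."""
    groups = defaultdict(set)
    for record in records:
        groups[record["icrogid"]].add((record["db"], record["acc"]))
    return groups


# Invented sample rows: ROGID_A / ROGID_B, XP_000001, 12345 and 3156117
# are placeholders, not real identifiers.
lines = [
    "UniProt\tQ4U9M9\t-1\t3156116\tROGID_A\t3156116\tROGID_A\n",
    "RefSeq\tXP_000001\t12345\t3156117\tROGID_B\t3156116\tROGID_A\n",
]
records = [parse_row(line) for line in lines]
groups = accessions_by_canonical_group(records)
print(sorted(groups[3156116]))
```

Grouping on icrogid rather than irogid treats the two sample rows as equivalent even though their irogid values differ, which is the purpose of the canonical columns.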

(1) Please refer to the following page for details on the canonicalization process.