SWISS-PROT Format Description
Source: Internet. Editor: 程序博客网. Time: 2024/06/06 02:02
SWISS-PROT Format
SWISS-PROT is a curated protein sequence database that strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and a high level of integration with other databases.
Record format
Each sequence entry is composed of lines. Different types of lines, each with their own format, are used to record the various data which make up the entry.
Each line begins with a two-character line code, which indicates the type of data contained in the line. The current line types and line codes, in the order in which they appear in an entry, are shown below:
ID - Identification.
AC - Accession number(s).
DT - Date.
DE - Description.
GN - Gene name(s).
OS - Organism species.
OG - Organelle.
OC - Organism classification.
RN - Reference number.
RP - Reference position.
RC - Reference comments.
RX - Reference cross-references.
RA - Reference authors.
RL - Reference location.
CC - Comments or notes.
DR - Database cross-references.
KW - Keywords.
FT - Feature table data.
SQ - Sequence header.
   - (blanks) sequence data.
// - Termination line.
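As a minimal sketch of how the two-character line code can be separated from the rest of a line, the Python function below takes the first two characters as the code and treats everything after them as data. The function name `split_line` is hypothetical and is not part of the program described here; stripping the surrounding blanks from the data is a simplifying assumption.

```python
def split_line(line):
    """Split one entry line into (line_code, data).

    Assumption: the line code is always the first two characters.
    Sequence data lines carry a blank code, so they come back as "  ".
    """
    line = line.rstrip("\n")
    code = line[:2]
    data = line[2:].strip()
    return code, data


if __name__ == "__main__":
    for text in ("OS   HOMO SAPIENS (HUMAN).", "     MSTESMIRDV ELAEEALPKK", "//"):
        print(split_line(text))
```

Returning the blank code unchanged lets a caller tell sequence data apart from annotated lines without a second check.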
The program ignores all other line types and uses only these: 'ID', 'DE', 'OS', 'SQ' and '//'.
- The program uses the 'ENTRY_NAME', which is the first field of the 'ID' line, as the first line of the title.
- The data of the 'DE' and 'OS' lines are collected by the program and used as the remaining lines of the title.
- The 'SQ' line identifies the beginning of the sequence. The program collects all the following lines until the termination line is found or the end of the file is reached.
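The three rules above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the program's actual code: the name `parse_entry` is hypothetical, only the first entry in the input is read, and blanks inside the sequence lines are dropped.

```python
def parse_entry(lines):
    """Return (title_lines, sequence) for the first entry in `lines`.

    Follows the rules described in the text: the entry name from 'ID'
    opens the title, 'DE' and 'OS' data follow, and everything between
    'SQ' and '//' (or the end of input) is sequence data.
    """
    title = []
    residues = []
    in_sequence = False
    for raw in lines:
        line = raw.rstrip("\n")
        code = line[:2]
        if code == "//":
            break                      # termination line ends the entry
        if in_sequence:
            residues.append("".join(line.split()))
        elif code == "ID":
            title.append(line[2:].split()[0])   # ENTRY_NAME is the first field
        elif code in ("DE", "OS"):
            title.append(line[2:].strip())
        elif code == "SQ":
            in_sequence = True         # sequence data starts on the next line
        # every other line type is ignored
    return title, "".join(residues)
```

Because the 'ID' line comes first in an entry, appending in file order is enough to keep the entry name as the first title line.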
Useful links:
More information about SWISS-PROT
THE SWISS-PROT PROTEIN SEQUENCE DATA BANK - USER MANUAL
Example of 'ID' lines:
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
ID   CYC_BOVIN      STANDARD;      PRT;   104 AA.
ID   GIA2_GIALA     PRELIMINARY;   PRT;   296 AA.
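A minimal Python sketch for pulling the fields out of an 'ID' line like the ones above. The function name `parse_id_line` and the dictionary keys are hypothetical labels chosen for this illustration. The sketch assumes the line always has the shape shown in the examples: entry name, a status word, a molecule type, and a length in 'AA'.

```python
def parse_id_line(line):
    """Parse an 'ID' line into a dict of its four fields.

    Splitting on whitespace means the exact column positions do not
    matter, only the order of the fields.
    """
    fields = line.split()
    if len(fields) != 6 or fields[0] != "ID" or fields[5] != "AA.":
        raise ValueError(f"not a recognised ID line: {line!r}")
    return {
        "entry_name": fields[1],
        "data_class": fields[2].rstrip(";"),     # e.g. STANDARD or PRELIMINARY
        "molecule_type": fields[3].rstrip(";"),  # PRT in both examples
        "length": int(fields[4]),                # number of amino acids
    }


if __name__ == "__main__":
    print(parse_id_line("ID   CYC_BOVIN      STANDARD;      PRT;   104 AA."))
```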
Example of 'DE' lines:
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
DE   NADH DEHYDROGENASE (EC 1.6.99.3).
DE   LYSOPINE DEHYDROGENASE (EC 1.5.1.16) (OCTOPINE SYNTHASE)
DE   (LYSOPINE SYNTHASE) (FRAGMENT).
Example of 'OS' lines:
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
OS   ESCHERICHIA COLI.
OS   HOMO SAPIENS (HUMAN).
OS   ROUS SARCOMA VIRUS (STRAIN SCHMIDT-RUPPIN).
OS   NAJA NAJA (INDIAN COBRA), AND NAJA NIVEA (CAPE COBRA).
Example of a sequence specification:
         1         2         3         4         5         6         7
1234567890123456789012345678901234567890123456789012345678901234567890
SQ   SEQUENCE   233 AA;  25644 MW;  666D7069 CRC32;
     MSTESMIRDV ELAEEALPKK TGGPQGSRRC LFLSLFSFLI VAGATTLFCL LHFGVIGPQR
     EEFPRDLSLI SPLAQAVRSS SRTPSDKPVA HVVANPQAEG QLQWLNRRAN ALLANGVELR
     DNQLVVPSEG LYLIYSQVLF KGQGCPSTHV LLTHTISRIA VSYQTKVNLL SAIKSPCQRE
     TPEGAEAKPW YEPIYLGGVF QLEKGDRLSA EINRPDYLDF AESGQVYFGI IAL
//
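One simple consistency check on a sequence specification like the one above is to compare the length declared in the 'SQ' header with the number of residues actually collected. The Python sketch below does that. The names `parse_sq_header` and `length_matches` are hypothetical, the regular expression assumes the header has exactly the 'AA; MW; CRC32;' shape shown in the example, and the molecular weight and checksum are only read, not verified.

```python
import re

# Assumed header shape: "SQ SEQUENCE <n> AA; <n> MW; <hex> CRC32;"
SQ_HEADER = re.compile(
    r"^SQ\s+SEQUENCE\s+(\d+)\s+AA;\s+(\d+)\s+MW;\s+([0-9A-Fa-f]+)\s+CRC32;"
)


def parse_sq_header(line):
    """Return (length, molecular_weight, crc32_hex) from an 'SQ' line."""
    match = SQ_HEADER.match(line)
    if match is None:
        raise ValueError(f"not a recognised SQ line: {line!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


def length_matches(header_line, sequence_lines):
    """True if the residue count equals the length declared in the header."""
    declared, _weight, _crc = parse_sq_header(header_line)
    residues = "".join("".join(line.split()) for line in sequence_lines)
    return len(residues) == declared


if __name__ == "__main__":
    header = "SQ   SEQUENCE   233 AA;  25644 MW;  666D7069 CRC32;"
    print(parse_sq_header(header))
```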