Common File Formats
Abstract
This appendix discusses several file formats frequently encountered in bioinformatics. Specifically, it reviews the rules for generating FASTA files and provides guidance for interpreting the NCBI descriptor lines commonly found in them. It also reviews the structure of GenBank, Phylip, MSF, and Nexus files.
Keywords: file format; FASTA; NCBI descriptor lines; GenBank; Phylip; MSF; Nexus
Table of Contents
- FASTA Files
- GenBank Flat Files
- Phylip Files
- MSF Files
- Nexus Files
- Converting between File Formats
- Disclaimer
- Figures
- Tables
Figures
- Figure a0.1B.1 A sample FASTA file that contains the sequences for two homologous proteins, actophorin and yeast cofilin. Note that a greater‐than sign (>) designates the beginning of each entry and that each line of sequence contains fewer than 80 characters.
- Figure a0.1B.2 A sample GenBank record. Circled numbers identify the fields listed in Table .
- Figure a0.1B.3 A sample PHYLIP‐formatted file. The five sequences shown are HIV‐1 and HIV‐2 gag proteins from a variety of isolates. See text for details.
- Figure a0.1B.4 A sample MSF‐formatted file. The five sequences shown are HIV‐1 and HIV‐2 gag proteins from a variety of isolates. See text for details.
- Figure a0.1B.5 A sample Nexus‐formatted file. The five sequences shown are HIV‐1 and HIV‐2 gag proteins from a variety of isolates. See text for details.
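
The two FASTA conventions noted in the first figure caption (a greater-than sign opens each entry, and sequence lines stay under 80 characters) can be illustrated with a short Python sketch. This is a minimal illustration, not the appendix's own code: the function names `parse_fasta` and `format_fasta`, the default line width of 60, and any record names passed in are assumptions chosen for the example.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (descriptor, sequence) pairs from FASTA-formatted text.

    Hypothetical helper: a line starting with '>' opens a new entry,
    and all following lines up to the next '>' are joined into one sequence.
    """
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between or within entries
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)  # emit the previous entry
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # emit the final entry


def format_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Write records as FASTA text with sequence lines of at most `width` characters.

    The default of 60 is an assumption; any value below 80 satisfies
    the line-length convention described in the caption.
    """
    if not 0 < width < 80:
        raise ValueError("width must be between 1 and 79")
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        # Slice the sequence into fixed-width pieces.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Calling `list(parse_fasta(text))` on the contents of a FASTA file returns one pair per entry, and passing those pairs to `format_fasta` writes them back out with wrapped sequence lines. Slicing is used instead of a text-wrapping utility because sequences contain no whitespace to break on.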
Literature Cited
Internet Resources
http://iubio.bio.indiana.edu/cgi-bin/readseq.cgi
ReadSeq biosequence interconversion tool.
http://www.ebi.ac.uk/clustalw
ClustalW multiple sequence alignment interface.