ABIF File Format
September 2009
Introduction
This document is intended for programmers or bioinformatics groups who wish to
perform additional analysis or other manipulation of ab1 and/or fsa files. The ab1
file is a file type produced by Data Collection software generating sequencing data,
with the extension ".ab1". The fsa file is a file type produced by Data Collection
software generating fragment analysis data, with the extension ".fsa". Both the ab1
and fsa files use the ABIF file format.
The ABIF file format specifies the general rules on how the file is constructed, and
therefore the rules on how it can be read. Elements of data stored in the file are
associated with tags, which are analogous to the keys in a (key, value) mapping.
The ABIF file format by itself does not specify the schema for the ab1 and fsa files,
i.e. which tags are written and when. These schemas are specific to the instrument and
software version which created the file.
This user bulletin describes the ABIF format. Following the ABIF specification are
the schemas for each instrument-software combination, for both the ab1 and fsa files
starting on page 21.
Schemas (tables with the valid tags) for the following instruments are given:
• ABI PRISM 3100 and 3100-Avant Genetic Analyzer tags (page 21)
• Applied Biosystems 3130/3130xl Genetic Analyzer tags (page 27)
• Applied Biosystems 3500/3500xl Genetic Analyzer tags (page 33)
• Applied Biosystems 3730/3730xl DNA Analyzer tags (page 44)
Some tags exist in the ab1 and fsa files for backward compatibility with earlier
versions of Applied Biosystems software. They are no longer used by current
versions of downstream analysis applications.
The ab1 and fsa schema documentation is provided for a specific instrument model
and software version. There is no guarantee that the tagged data described will be
consistent with files produced by earlier software releases and/or other instrument
models.
Forward compatibility
The critical data in the ab1 and fsa files is stable, and in general new data will extend
the existing schema. However, Applied Biosystems provides no guarantee that all
tagged data elements will be present, consistent, or supported in future versions of
the software, particularly for data that pertains to the details of instrument control or
software integration.
Compatibility of edited sample files
There are two ways to modify sample files (ab1 and fsa files): adding new tags or
changing existing tags. Sample files with new tags added by a user following
Applied Biosystems' instructions, as set forth in "Detailed structure of the ABIF file"
on page 7, should continue to be compatible with Applied Biosystems software. Any
modification to sample files that changes existing tags may result in the file no
longer being compatible with Applied Biosystems software.
IMPORTANT! Applied Biosystems does not recommend any modification of the
software files. Applied Biosystems does not support the editing of sample files in
any way and makes no guarantees as to the compatibility of such files with Applied
Biosystems software.
The ABIF file format is a binary file format for storing data. Elements of data stored
in the file are associated with tags, which are analogous to the keys in a (key, value)
mapping.
The ABIF format can accommodate a moderate number (<1000) of heterogeneous
data items. A data item can be a scalar value or an array. The basic data types are 8-,
16-, and 32-bit integers, 32- and 64-bit floating-point values, and ASCII characters.
There are also two compound data types: date and time. The data type of each item is
identified by an element type code.
Each data item is uniquely identified (tagged) within a file by a tag name and a tag
number. The tag name and tag number are stored internally as 32-bit integers; the tag
name is intended to be defined as a string of four 8-bit ASCII characters stored in
big-endian order. For example, the tag named ABCD is represented by the hex value
0x41424344. The ABIF format also includes a directory of all the tagged data
elements that are contained in a particular file.
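The mapping between a four-character tag name and its 32-bit integer form can be sketched as follows. This is a minimal illustration, not part of any Applied Biosystems library: the function names are hypothetical, and unsigned integers are used here for convenience although the format describes the field as a 32-bit integer.

```c
#include <stdint.h>

/* Pack a four-character tag name into its 32-bit big-endian integer form.
 * Hypothetical helper; the first character becomes the high-order byte. */
uint32_t abif_pack_tag_name(const char name[4])
{
    return ((uint32_t)(unsigned char)name[0] << 24) |
           ((uint32_t)(unsigned char)name[1] << 16) |
           ((uint32_t)(unsigned char)name[2] << 8)  |
            (uint32_t)(unsigned char)name[3];
}

/* Unpack a 32-bit tag name into four characters plus a terminating NUL. */
void abif_unpack_tag_name(uint32_t packed, char out[5])
{
    out[0] = (char)((packed >> 24) & 0xFF);
    out[1] = (char)((packed >> 16) & 0xFF);
    out[2] = (char)((packed >> 8) & 0xFF);
    out[3] = (char)(packed & 0xFF);
    out[4] = '\0';
}
```

With these helpers, the name ABCD packs to 0x41424344, matching the example in the text.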
Compatibility goals
Previous versions of ABIF libraries implicitly defined a complex format with many
features that may have been underutilized.
The goal of this user bulletin is to reduce this complexity by creating a restricted
definition that still meets the following goals:
• Provide full read and write support for all data types needed by current (2009)
  applications.
• Provide full read and write support for the thumbprint and boolean legacy data
  types.
• Provide read and write support for user types only in the form of raw byte
  arrays.
• Provide read support only for all other legacy data types, only in the form of raw
  byte arrays.
• Ensure that any file written according to this specification can be read by the
  current (2009) Applied Biosystems software.
• Ensure that any ABIF file written by an application released during or after
  1998 can be read according to this specification.
• Eliminate any ambiguity in implementation requirements.
Historical notes
Applied Biosystems modeled the ABIF format after Tag Image File Format (TIFF), a
format for graphics files, and the Macintosh OS Resource Manager. "ABIF" is an
abbreviation of Applied Biosystems, Inc. Format.
The original ABIF specification was written in an era when a typical computer had 1
MB of RAM and operated at 16 MHz. Therefore, the ABIF libraries were designed
to perform input/output operations in several small pieces to minimize the amount of
data resident in RAM at one time.
The early ABIF file was expected to also serve as a simple database or nonresident
data structure, since virtual memory was not a feature of the operating system at that
time. The format was originally implemented on the Classic Mac OS, which used
floating blocks of memory called handles. This is the origin of the datahandle field in
ABIF directory entries. The datahandle field is reserved for internal use by libraries;
it has no meaning in the file itself, but it should not be modified or used for any other
purpose. Part of the header area was reserved for managing range-locking of data
items. This was part of a plan to implement multi-user access controls, which were
never implemented.
To avoid the effort involved in rewriting a file from scratch, the original ABIF
specification allowed for multiple, linked tag directories (as does TIFF) so that data
could easily be appended to an existing file. Also like TIFF, ABIF originally allowed
for little-endian (as well as big-endian) byte ordering, which would be indicated by
the order of the letters (A B I F) in the first four bytes of the file. These features were
probably never used, and they have been eliminated from the current specification.
The ABIF format supports storage of the basic low-level data types common to most
programming languages. These include char (a single byte character), short (a two
byte integer), long (a four byte integer), float (a four byte floating point value) and
double (an eight byte floating point value). Data stored as any of these basic types
can be either scalar values or arrays.
In addition, two types of strings are supported. The cString type is a C-style string
(null terminated). The pString type is a Pascal-style string (the length of the string is
stored in the first byte of data). Values stored as pString are required to be less than
256 characters.
Some additional storage formats have been defined and are described below.
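The two string layouts described above can be sketched as a pair of decoding routines. This is an illustrative sketch only: the function names, the return convention (-1 on error), and the caller-supplied output buffer are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a pString (one length byte followed by the characters) into `out`
 * as a NUL-terminated C string. Returns the string length, or -1 if the
 * data is truncated or `out` is too small. Hypothetical helper. */
int abif_read_pstring(const uint8_t *data, size_t datasize,
                      char *out, size_t out_size)
{
    if (datasize < 1)
        return -1;
    size_t len = data[0];               /* length is stored in the first byte */
    if (len > datasize - 1 || len + 1 > out_size)
        return -1;
    if (len > 0)
        memcpy(out, data + 1, len);
    out[len] = '\0';
    return (int)len;
}

/* Copy a cString (characters followed by a NUL) into `out`. Reading stops
 * at the first NUL or at `datasize`, whichever comes first. Returns the
 * string length, or -1 if `out` is too small. Hypothetical helper. */
int abif_read_cstring(const uint8_t *data, size_t datasize,
                      char *out, size_t out_size)
{
    size_t len = 0;
    while (len < datasize && data[len] != 0)
        len++;
    if (len + 1 > out_size)
        return -1;
    if (len > 0)
        memcpy(out, data, len);
    out[len] = '\0';
    return (int)len;
}
```

Because the pString length occupies a single byte, a 256-byte output buffer is always large enough for that type.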
Data tags
Tags are used to index the data contained in the file and can be thought of as (name,
number) pairs. In practice, the names are required to be four characters and thus can
always be converted to four-byte integers.
Unique (name, number) combinations define unique tags. For example, tags with the
same name but different numbers are allowed to be of different types and contain
unrelated information.
Data storage
A designated section of the ABIF file contains a directory. The directory entries
contain the tag (name, number) information, data type, number of elements, etc. For
data values that are four bytes or less, the value is stored in the directory entry.
Otherwise, an offset to the binary data in the file is stored. Details of the directory
and binary storage formats are described below.
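The rule for where a value lives can be sketched as below. The function name and parameters are hypothetical. The sketch assumes that a value of four bytes or less occupies the leading bytes of the four-byte offset field, and that offsets are stored in big-endian order like tag names.

```c
#include <stddef.h>
#include <stdint.h>

/* Return a pointer to an item's data bytes, or NULL if the stored offset
 * does not fit inside the file. `field` points at the four raw bytes of
 * the directory entry's offset field. Hypothetical helper. */
const uint8_t *abif_item_data(const uint8_t *file, size_t file_len,
                              uint32_t datasize, const uint8_t *field)
{
    if (datasize <= 4) {
        /* Assumption: inline values start at the first byte of the field. */
        return field;
    }
    /* Otherwise the field holds a big-endian offset into the file. */
    uint32_t offset = ((uint32_t)field[0] << 24) | ((uint32_t)field[1] << 16) |
                      ((uint32_t)field[2] << 8)  |  (uint32_t)field[3];
    if (offset > file_len || datasize > file_len - offset)
        return NULL;
    return file + offset;
}
```

Checking the offset against the file length before dereferencing guards against truncated or corrupted files.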
Header
File signature
Byte 0: A, byte 1: B, byte 2: I, byte 3: F
The first four bytes of the file are the ASCII codes for A, B, I, F. Your
implementations should check these bytes to verify that a file's format is ABIF.
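The signature check can be sketched in a few lines. The function name is hypothetical, and the file is assumed to have been read into a memory buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the buffer starts with the ASCII bytes 'A', 'B', 'I', 'F',
 * and 0 otherwise (including when fewer than four bytes are available). */
int abif_has_signature(const uint8_t *file, size_t file_len)
{
    return file_len >= 4 && memcmp(file, "ABIF", 4) == 0;
}
```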
Version number
Bytes 4-5: 101
The next two bytes comprise a 16-bit integer corresponding to the version number of
the format.
The version number is listed in earlier libraries as being equal to "version number
field x100", suggesting that the current value of 101 would be interpreted as "version
1.1," i.e., the first minor variation number of the first major version.
A common interpretation of major and minor version numbers is that a major version
change indicates a break in code compatibility, while a minor version change
indicates a change only in interpretation or content. Given that, files conforming to
the specification in this document would have a version number of 102, because the
major compatibility is the same, but some obsolete features have been formally
dropped. This is not the case, however. As long as compatibility with Applied
Biosystems software is required, your implementations must continue to write a
value of 101 here.
Your implementations must read this value to check for compatibility between the
file's format and the current version of the library; perform the check by dividing this
value by 100 to get the major version number, and then comparing that value with the
major version of the library. If the values differ, your implementation must return an
error without attempting to read further. If the major version numbers are the same,
reading may continue and it is up to the client application to handle any difference in
minor version number.
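The compatibility check described above can be sketched as follows. The constant ABIF_LIBRARY_VERSION and the function name are assumptions made for this example. The version field is read as a big-endian 16-bit integer, consistent with the byte order used for tag names.

```c
#include <stddef.h>
#include <stdint.h>

/* Version value of this hypothetical library (major 1, minor 1). */
enum { ABIF_LIBRARY_VERSION = 101 };

/* Read the 16-bit version number stored at bytes 4-5 and compare its major
 * part (value / 100) with the library's. Returns 0 and stores the full
 * version in *version_out when the major versions match, -1 otherwise.
 * Any minor version difference is left for the caller to handle. */
int abif_check_version(const uint8_t *file, size_t file_len, int *version_out)
{
    if (file_len < 6)
        return -1;
    int version = (file[4] << 8) | file[5];
    if (version / 100 != ABIF_LIBRARY_VERSION / 100)
        return -1;
    if (version_out)
        *version_out = version;
    return 0;
}
```

A file carrying version 102 would pass this check, while a file carrying version 201 would be rejected before any further reading.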
Directory entry structure
The next 28 bytes comprise a single directory entry structure that points to the
directory. A directory entry is a packed structure (no padding bytes) of the following
form:
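struct DirEntry {
    SInt32 name;          // tag name
    SInt32 number;        // tag number
    SInt16 elementtype;   // element type code
    SInt16 elementsize;   // size in bytes of one element
    SInt32 numelements;   // number of elements in the item
    SInt32 datasize;      // size in bytes of the item
    SInt32 dataoffset;    // the item's data, or its offset in the file
    SInt32 datahandle;    // reserved
};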
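Header directory entry (byte offset, field, value to write):
10   tag number     1
14   element type   1023
16   element size   28
18   num elements
22   data size
26   data offset
30   data handle    0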
Your implementations that write ABIF should use the values shown above for tag
number, element type, and the other items in the DirEntry struct. The directory size
(datasize) should be exactly the size required for the entries (numelements x
elementsize).
Your implementations that read ABIF must extract the numelements field, a
32-bit integer at byte 18, and the dataoffset field, a 32-bit integer at byte 26.
These specify the number of entries in the directory and the location of the directory.
The other fields should be ignored.
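Reading the two required header fields can be sketched as below. The struct and function names are hypothetical, the fields are read in big-endian order, and unsigned types are used for convenience. The minimum length of 34 bytes is the sum of the signature (4), the version number (2), and the directory entry structure (28).

```c
#include <stddef.h>
#include <stdint.h>

/* Location of the directory as recorded in the file header. */
typedef struct {
    uint32_t num_entries;   /* numelements field, bytes 18-21 */
    uint32_t dir_offset;    /* dataoffset field, bytes 26-29 */
} AbifDirLocation;

/* Read a big-endian 32-bit integer from four consecutive bytes. */
static uint32_t abif_header_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Extract the directory entry count and directory offset from the header.
 * Returns 0 on success, or -1 if the buffer is shorter than 34 bytes. */
int abif_read_dir_location(const uint8_t *file, size_t file_len,
                           AbifDirLocation *out)
{
    if (file_len < 34)
        return -1;
    out->num_entries = abif_header_be32(file + 18);
    out->dir_offset  = abif_header_be32(file + 26);
    return 0;
}
```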
Note: Previous libraries may have reserved additional space in the directory, and
Unused space in the header
The DirEntry is followed by 47 2-byte integers, all to be ignored on input and set
to zero on output.
Bytes 34-35: 0, bytes 36-37: 0, ... (each of the 47 two-byte integers is 0)
The original spec reserved these fields to implement range-locking for a multi-user
access scheme, but that feature was never implemented.
Directory
The directory is located at the offset specified in the header, and consists of an array
of directory entries.
directory offset        directory entry #1
directory offset + 28   directory entry #2
directory offset + 56   directory entry #3
...
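Since every entry occupies 28 bytes, entry i (counting from zero) starts at directory offset + 28 x i. Decoding one entry can be sketched as below. The AbifEntry struct and function names are hypothetical, fields are read in big-endian order, and unsigned types are used for convenience in place of the signed types of the DirEntry struct.

```c
#include <stddef.h>
#include <stdint.h>

enum { ABIF_DIR_ENTRY_SIZE = 28 };

/* Decoded form of one directory entry (datahandle is reserved and skipped). */
typedef struct {
    uint32_t name;
    uint32_t number;
    uint16_t elementtype;
    uint16_t elementsize;
    uint32_t numelements;
    uint32_t datasize;
    uint32_t dataoffset;
} AbifEntry;

static uint16_t abif_entry_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t abif_entry_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode directory entry `index` (zero-based). Returns 0 on success, or -1
 * if the entry would extend past the end of the file buffer. */
int abif_read_entry(const uint8_t *file, size_t file_len,
                    uint32_t dir_offset, uint32_t index, AbifEntry *out)
{
    uint64_t start = (uint64_t)dir_offset +
                     (uint64_t)index * ABIF_DIR_ENTRY_SIZE;
    if (start + ABIF_DIR_ENTRY_SIZE > file_len)
        return -1;
    const uint8_t *p = file + start;
    out->name        = abif_entry_be32(p);
    out->number      = abif_entry_be32(p + 4);
    out->elementtype = abif_entry_be16(p + 8);
    out->elementsize = abif_entry_be16(p + 10);
    out->numelements = abif_entry_be32(p + 12);
    out->datasize    = abif_entry_be32(p + 16);
    out->dataoffset  = abif_entry_be32(p + 20);
    /* Bytes 24-27 hold datahandle, which has no meaning in the file. */
    return 0;
}
```

A reader would typically call this once for each index below the entry count obtained from the header.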
Fields in a directory entry
Tag number
Number of elements
Dataoffset value
15 0x0F000000
0x02414200
0x00010002
Data handle
Data types
This specification describes three categories of data types:
• Current types (see below)
• Legacy data types which should be supported (see page 15)
• Legacy data types which do not need to be supported (see page 16)
Current data types
Name      Element type   Element size   Description
byte                     1 byte
char                     1 byte
word                     2 bytes
short                    2 bytes
long                     4 bytes
float                    4 bytes
double                   8 bytes
date      10             4 bytes
time      11             4 bytes
pString   18             1 byte
cString   19             1 byte
Supported legacy data types
Name        Element type      Element size   Description
thumb       12                10 bytes       { SInt32 d; SInt32 u; UInt8 c; UInt8 n; }
bool        13                1 byte         One-byte boolean value, with zero meaning false and any other value meaning true.
user        1024 or greater   1 byte
Unsupported legacy data types
Name        Element type      Element size   Description
rational                      8 bytes        { SInt32 numerator; SInt32 denominator; }
BCD                           unknown
point       14                4 bytes        { SInt16 v; SInt16 h; }
rect        15                8 bytes        { SInt16 top; SInt16 left; SInt16 bottom; SInt16 right; }
vPoint      16                8 bytes        { SInt32 v; SInt32 h; }
vRect       17                16 bytes       { SInt32 top; SInt32 left; SInt32 bottom; SInt32 right; }
Tag         20                8 bytes        { SInt32 name; SInt32 number; }
deltaComp   128                              Compressed data.
LZWComp     256                              Compressed data.
deltaLZW    384                              Compressed data.
Name   Number   Element type   Element size   Num elements
FooS   42       user           12
Instead, use three separate data items, one for each field, as shown below:
Name   Number   Element type   Element size   Num elements
Alph   42       byte
Beta   42       short
Gamm   42       long
contain additional tags or the contents and/or format of the existing tags may be
modified.
Table 1: ab1 File Tags from ABI PRISM 3100/3100-Avant Analyzer Data Collection
Software v2.0 on the ABI PRISM 3100/3100-Avant Genetic Analyzer (below)
Table 2: fsa File Tags from ABI PRISM 3100/3100-Avant Analyzer Data Collection
Software v2.0 on the ABI PRISM 3100/3100-Avant Genetic Analyzer (page 24)
Optional tags are shown in italic text.
Table 1 ab1 File Tags from ABI PRISM 3100/3100-Avant Analyzer Data
Collection Software v2.0 on the ABI PRISM 3100/3100-Avant Genetic Analyzer
Name
Number
ABIF Type
Description
APFN
pString
APXV
cString
APrN
cString
APrV
cString
APrX
char
CMNT
pString
CTID
cString
CTNM
cString
CTTL
pString
Comment title
CpEP
char
Is Capillary Machine?
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
105
short
DSam
short
Downsampling factor
DySN
pString
Dye#
short
DyeN
pString
Dye 1 name
DyeN
pString
Dye 2 name
DyeN
pString
Dye 3 name
DyeN
pString
Dye 4 name
DyeW
short
Dye 1 wavelength
DyeW
short
Dye 2 wavelength
DyeW
short
Dye 3 wavelength
DyeW
short
Dye 4 wavelength
EPVt
long
EVNT
pString
EVNT
pString
EVNT
pString
EVNT
pString
FWO_
char
GTyp
pString
InSc
long
InVt
long
LANE
short
Lane/Capillary
LIMS
pString
Sample tracking ID
LNTD
short
Length to detector
LsrP
long
MCHN
pString
MODF
pString
MODL
char
Model number
NAVG
short
NLNE
short
Number of capillaries
OfSc
long
OvrI
1-N
long
One value for each dye. List of scan number indices for scans with color
data values >32767. Values cannot be greater than 32000. (optional)
OvrV
1-N
long
One value for each dye. List of color data values for the locations listed in
the OvrI tag. Number of OvrV tags must be equal to the number of OvrI
tags. (optional)
PDMF
pString
PXLB
long
RGCm
cString
RGNm
cString
RMXV
cString
RMdN
cString
RMdV
cString
RMdX
char
Base order
Gel type description
RPrN
cString
RPrV
cString
RUND
date
RUND
date
RUND
date
RUND
date
RUNT
time
RUNT
time
RUNT
time
RUNT
time
Rate
user
Scanning rate
RunN
cString
SCAN
long
SMED
pString
SMLt
pString
SMPL
pString
Sample name
SVER
pString
SVER
pString
Satd
long
Array of longs representing the scan numbers of data points, which are
flagged as saturated by data collection (optional)
Scal
float
Scan
short
TUBE
pString
Tmpr
long
User
pString
Run Name
Number of scans
Well ID
Run temperature setting
Name of user who created the plate (optional)
Table 2 fsa File Tags from ABI PRISM 3100/3100-Avant Analyzer Data Collection
Software v2.0 on the ABI PRISM 3100/3100-Avant Genetic Analyzer
Name
Number
ABIF Type
ANME
cString
CMNT
1-N
pString
CTID
cString
CTNM
cString
CTTL
pString
Comment title
CpEP
char
Is Capillary Machine?
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
105
short
DSam
short
Downsampling factor
DySN
pString
Dye#
short
Number of dyes
DyeB
char
DyeB
char
DyeB
char
DyeB
char
DyeB
char
DyeN
pString
Dye 1 name
DyeN
pString
Dye 2 name
DyeN
pString
Dye 3 name
DyeN
pString
Dye 4 name
DyeN
pString
DyeW
short
Dye 1 wavelength
DyeW
short
Dye 2 wavelength
DyeW
short
Dye 3 wavelength
DyeW
short
Dye 4 wavelength
DyeW
short
EPVt
long
EVNT
pString
EVNT
pString
EVNT
pString
EVNT
pString
GTyp
pString
InSc
long
InVt
long
LANE
short
Lane/Capillary
LIMS
pString
Sample tracking ID
LNTD
short
Length to detector
LsrP
long
MCHN
pString
MODF
pString
MODL
char
Model number
NAVG
short
NLNE
short
Number of capillaries
OfSc
long
OvrI
1-N
long
One value for each dye. List of scan number indices for scans with color
data values >32767. Values cannot be greater than 32000. (optional)
OvrV
1-N
long
One value for each dye. List of color data values for the locations listed in
the OvrI tag. Number of OvrV tags must be equal to the number of OvrI
tags. (optional)
PANL
cString
PXLB
long
RGCm
cString
RGNm
cString
RMXV
cString
RMdN
cString
RMdV
cString
RMdX
char
RPrN
cString
RPrV
cString
RUND
date
RUND
date
RUND
date
RUND
date
RUNT
time
RUNT
time
RUNT
time
RUNT
time
Rate
user
Scan rate
RunN
cString
Run Name
SCAN
long
SMED
pString
SMLt
pString
STYP
cString
SVER
pString
SVER
pString
SVER
pString
Sample File Format Version, containing the version of the sample file
format used to write the file
Satd
long
Array of longs representing the scan numbers of data points, which are
flagged as saturated by data collection (optional)
Scal
float
Scan
short
SpNm
pString
StdF
pString
TUBE
pString
Well ID
Tmpr
long
User
pString
Number of scans
Table 3: ab1 File Tags from Applied Biosystems 3130/3130xl Data Collection
Software v3.0 on the Applied Biosystems 3130/3130xl Genetic Analyzer (below)
Table 4: fsa File Tags from Applied Biosystems 3130/3130xl Data Collection
Software v3.0 on the Applied Biosystems 3130/3130xl Genetic Analyzer (page 30)
Optional tags are shown in italic text.
Table 3 ab1 File Tags from Applied Biosystems 3130/3130xl Data Collection
Software v3.0 on the Applied Biosystems 3130/3130xl Genetic Analyzer
Name
Number
ABIF Type
Description
APFN
pString
APXV
cString
APrN
cString
APrV
cString
APrX
char
CMNT
pString
CTID
cString
CTNM
cString
CTOw
cString
Container owner
CTTL
pString
Comment title
CpEP
char
Is Capillary Machine?
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
105
short
DSam
short
Downsampling factor
DySN
pString
Dye#
short
DyeN
pString
Dye 1 name
DyeN
pString
Dye 2 name
DyeN
pString
Dye 3 name
DyeN
pString
DyeW
short
Dye 1 wavelength
DyeW
short
Dye 2 wavelength
DyeW
short
Dye 3 wavelength
DyeW
short
Dye 4 wavelength
EPVt
long
EVNT
pString
EVNT
pString
EVNT
pString
EVNT
pString
FWO_
char
GTyp
pString
HCFG
cString
Instrument Class
HCFG
cString
Instrument Family
HCFG
cString
HCFG
cString
Instrument Parameters
InSc
long
InVt
long
LANE
short
Lane/Capillary
LIMS
pString
Sample tracking ID
LNTD
short
Length to detector
LsrP
long
MCHN
pString
MODF
pString
MODL
char
Model number
NAVG
short
NLNE
short
Number of capillaries
OfSc
long
OvrI
1-N
long
One value for each dye. List of scan number indices for scans with color
data values >32767. Values cannot be greater than 32000. (optional)
OvrV
1-N
long
One value for each dye. List of color data values for the locations listed in
the OvrI tag. Number of OvrV tags must be equal to the number of OvrI
tags. (optional)
PDMF
pString
PXLB
long
RGCm
cString
Dye 4 name
Base order
RGNm
cString
RMXV
cString
RMdN
cString
RMdV
a1
cString
RMdX
char
RPrN
cString
RPrV
cString
RUND
date
RUND
date
RUND
date
RUND
date
RUNT
time
RUNT
time
RUNT
time
RUNT
time
Rate
user
Scanning rate
RunN
cString
SCAN
long
SMED
pString
SMLt
pString
SMPL
pString
Sample name
SVER
pString
SVER
pString
Satd
long
Array of longs representing the scan numbers of data points, which are
flagged as saturated by data collection (optional)
Scal
float
Scan
short
TUBE
pString
Tmpr
long
User
pString
Run Name
Number of scans
Well ID
Run temperature setting
Name of user who created the plate (optional)
Table 4 fsa File Tags from Applied Biosystems 3130/3130xl Data Collection
Software v3.0 on the Applied Biosystems 3130/3130xl Genetic Analyzer
Name
Number
ABIF Type
1-N
pString
CTID
cString
CTNM
cString
CTOw
cString
Container owner
CTTL
pString
Comment title
CpEP
char
Is Capillary Machine?
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
short
DATA
105
short
DCHT
short
DSam
short
Downsampling factor
DySN
pString
Dye#
short
Number of dyes
DyeB
char
DyeB
char
DyeB
char
DyeB
char
DyeB
char
DyeN
pString
Dye 1 name
DyeN
pString
Dye 2 name
DyeN
pString
Dye 3 name
DyeN
pString
Dye 4 name
DyeN
pString
DyeW
short
Dye 1 wavelength
DyeW
short
Dye 2 wavelength
DyeW
short
Dye 3 wavelength
DyeW
short
Dye 4 wavelength
CMNT
DyeW
short
EPVt
long
EVNT
pString
EVNT
pString
EVNT
pString
EVNT
pString
GTyp
pString
HCFG
cString
Instrument Class
HCFG
cString
Instrument Family
HCFG
cString
HCFG
cString
Instrument Parameters
InSc
long
InVt
long
LANE
short
Lane/Capillary
LIMS
pString
Sample tracking ID
LNTD
short
Length to detector
LsrP
long
MCHN
pString
MODF
pString
MODL
char
Model number
NAVG
short
NLNE
short
Number of capillaries
OfSc
long
OvrI
1-N
long
One value for each dye. List of scan number indices for scans with color
data values >32767. Values cannot be greater than 32000. (optional)
OvrV
1-N
long
One value for each dye. List of color data values for the locations listed in
the OvrI tag. Number of OvrV tags must be equal to the number of OvrI
tags. (optional)
PANL
cString
PSZE
long
PTYP
cString
PXLB
long
RGCm
cString
RGNm
cString
RMXV
cString
RMdN
cString
RMdV
cString
RMdX
char
RPrN
cString
RPrV
cString
RUND
date
RUND
date
RUND
date
RUND
date
RUNT
time
RUNT
time
RUNT
time
RUNT
time
Rate
user
Scan rate
RunN
cString
Run Name
SCAN
long
SMED
pString
SMLt
pString
SVER
pString
SVER
pString
SVER
pString
Sample File Format Version, containing the version of the sample file
format used to write the file
Satd
long
Array of longs representing the scan numbers of data points, which are
flagged as saturated by data collection (optional)
Scal
float
Scan
short
SpNm
pString
TUBE
pString
Well ID
Tmpr
long
User
pString
Number of scans
Table 5: ab1 File Tags from Applied Biosystems 3500/3500xl Data Collection
Software v1.0 on the Applied Biosystems 3500/3500xl Genetic Analyzer (below)
Table 6: fsa File Tags from Applied Biosystems 3500/3500xl Data Collection
Software v1.0 on the Applied Biosystems 3500/3500xl Genetic Analyzer (page 39)
Optional tags are shown in italic text.
Table 5 ab1 File Tags from Applied Biosystems 3500/3500xl Genetic Analyzer
Data Collection Software v1.0 on the Applied Biosystems 3500/3500xl Genetic
Analyzer
Name
Number
ABIF Type
Description
AAct
boolean
ABED
cString
Anode buffer expiration date using ISO 8601 format using the patterns
YYYY-MM-DDTHH:MM:SS.ss+/-HH:MM. Hundredths of a second are
optional.
ABID
cString
ABLt
cString
ABRn
long
ABTp
cString
AEPt
short
AEPt
short
AmbT
short[]
APCN
cString
Amplicon name
APFN
pString
APrN
cString
APrV
cString
APrX
char[]
APXV
cString
ARTN
long
ASPF
short
ASPt
short
ASPt
short
AsyN
cString
AsyC
char[]
AsyV
cString
AUDT
char[]
AVld
cString
B1Pt
short
B1Pt
short
Reference scan number for mobility and spacing curves for last analysis
BcRn
long
Basecalling qc code
BcRs
cString
BcRs
cString
BCTS
pString
CAED
cString
CALt
cString
CARn
long
CASN
cString
CBED
cString
CBID
cString
CBLt
cString
CBRn
long
CBTp
cString
CkSm
cString
File checksum
CLRG
long
CLRG
long
CMNT
pString
Sample Comment
CpEP
byte
CRLn
long
CRLn
cString
CTID
cString
CTNM
cString
CTOw
cString
CTTL
pString
Comment Title
DATA
1-4, 105-199
short[]
DATA
short[]
DATA
short[]
Short Array holding measured milliAmps trace (EP current) during run
DATA
short[]
DATA
short[]
DATA
9-12, 205-299
short[]
DCEv
cString
DCHT
short
DOEv
cString
DSam
short
Downsampling rate
Dye#
short
Number Of Dyes
DyeN
1-N
pString
DyeW
1-N
short
DySN
pString
ESig
char[]
EVNT
pString
EPVt
Dye Name
Dye wavelength
Dye Set Name
long
Electronic signature record used across 3500 software
EVNT
pString
EVNT
pString
EVNT
pString
Feat
FTab
FVoc
FWO_
char[]
GTyp
pString
HCFG
cString
The Instrument Class. All upper case, no spaces. Initial valid value: CE
HCFG
cString
The Instrument Family. All upper case, no spaces. Valid values: 31XX or
37XX for UDC, 35XX (for 3500)
HCFG
cString
The official instrument name. Mixed case, minus any special formatting.
Initial valid values: 3130, 3130xl, 3730, 3730xl, 3500, 3500xl.
HCFG
cString
InjN
cString
Injection name
InSc
long
InVt
long
LANE
short
LAST
cString
LIMS
pString
Sample Tracking ID
LNTD
short
LsrP
long
MCHN
pString
MODF
pString
Run Module filename. This is redundant with the new tag RMdN
MODL
char[]
NAVG
short
NLNE
short
NOIS
float[]
The estimate of rms baseline noise (S/N ratio) for each dye for a
successfully analyzed sample. Corresponds in order to the raw data in
tags DATA 1-4. KB basecaller only.
OfSc
long[]
OvrI
1-N
long[]
One for each dye (unanalyzed and/or analyzed data). List of scan
number indexes that have values greater than 32767 but did not saturate
the camera. In Genemapper samples, this can have indexes with values
greater than 32000. In sequencing samples, this cannot have indexes
with values greater than 32000.
OvrV
1-N
long[]
One for each dye (unanalyzed and/or analyzed data). List of color data
values found at the locations listed in the OvrI tag. Optional. There must
be exactly as many numbers in this array as in the OvrI array.
P1RL
short[]
P1AM
short[]
P1WD
short[]
P2BA
char[]
P2RL
short[]
P2AM
short[]
PBAS
char[]
PBAS
char[]
PCON
char[]
PCON
char[]
PDMF
pString
PDMF
pString
phAR
float
phCH
pString
phDY
pString
phQL
short
phTR
short
phTR
float
Trim probability
PLOC
short[]
PLOC
short[]
PROJ
cString
PRJT
cString
PSZE
long
PTYP
cString
PuSc
long
PXLB
long
QcPa
cString
QcRn
long
QcRs
cString
QcRs
cString
QV20
long
QV20
cString
Rate
User type
RevC
RGCm
cString
RGNm
cString
RGOw
cString
The name entered as the Owner of a Results Group, in the Results Group
Editor. Implemented as the user name from the results group.
RInj
long
Reinjection number. The reinjection number that this sample belongs to.
Not present if there was no reinjection.
Raman normalization factor
QV20+ value
One of "Pass", "Fail", or "Check"
Scanning Rate. Milliseconds per frame.
Flag for whether the sequence has been complemented
RNmF
float
RMdN
cString
RMdV
cString
RMdX
char[]
RMXV
cString
RPrN
cString
RPrV
cString
RUND
date
RUNT
time
RUND
date
RUNT
time
RUND
date
RUNT
time
RUND
date
RUNT
time
RunN
cString
Satd
long[]
Scal
float
Rescaling Divisor for reducing the dynamic range of the color data
SCAN
long
Scan
short
ScPa
cString
ScSt
long
S/N%
short[]
SMED
pString
SMID
cString
SMLt
pString
SMPL
pString
SMRn
long
SPAC
float
SPAC
pString
SPAC
float
SPEC
cString
SpeN
cString
SVER
pString
SVER
pString
SVER
pString
SVER
pString
Tmpr
long
TrPa
cString
TrSc
long
TrSc
cString
TUBE
pString
User
pString
Trace score.
Table 6 fsa File Tags from Applied Biosystems 3500/3500xl Genetic Analyzer
Data Collection Software v1.0 on the Applied Biosystems 3500/3500xl Genetic
Analyzer
Name
Number
ABIF Type
Description
AAct
boolean
ABED
cString
Anode buffer expiration date using ISO 8601 format using the patterns
YYYY-MM-DDTHH:MM:SS.ss+/-HH:MM. Hundredths of a second are
optional.
ABID
cString
ABLt
cString
ABRn
long
ABTp
cString
AmbT
short[]
Anld
boolean
ANME
cString
APrN
cString
APrV
cString
APrX
char[]
APXV
cString
AsyN
cString
AsyC
char[]
AsyV
cString
AUDT
char[]
AVld
cString
CAED
cString
CALt
cString
CARn
long
CASN
cString
CBED
cString
CBID
cString
CBLt
cString
CBRn
long
CBTp
cString
CkSm
cString
File checksum
CMNT
pString
Sample Comment
CpEP
byte
CTID
cString
CTNM
cString
CTOw
cString
CTTL
pString
Comment Title
DATA
1-4, 105-199
short[]
DATA
short[]
DATA
short[]
Short Array holding measured milliAmps trace (EP current) during run
DATA
short[]
DATA
short[]
DATA
9-12
short[]
DCEv
cString
DCHT
short
DOEv
cString
DSam
short
Downsampling rate
Dye#
short
Number Of Dyes
DyeN
1-N
pString
Dye Name
DyeW
1-N
short
Dye wavelength
DySN
pString
long
EPVt
ESig
char[]
EVNT
pString
EVNT
pString
EVNT
pString
EVNT
pString
FWO_
char[]
GTyp
pString
HCFG
cString
The Instrument Class. All upper case, no spaces. Initial valid value: CE
HCFG
cString
The Instrument Family. All upper case, no spaces. Valid values: 31XX or
37XX for UDC, 35XX (for 3500)
HCFG
cString
The official instrument name. Mixed case, minus any special formatting.
Initial valid values: 3130, 3130xl, 3730, 3730xl, 3500, 3500xl.
HCFG
cString
InjN
cString
Injection name.
InSc
long
InVt
long
LANE
short
LIMS
pString
Sample Tracking ID
LNTD
short
LsrP
long
MCHN
pString
MODF
pString
Run Module filename. This is redundant with the new tag RMdN
MODL
char[]
NAVG
short
NLNE
short
NrmF
1-N
double
NrmS
double
OffS
cString
OfSc
long[]
OvrI
1-N
long[]
One for each dye (unanalyzed and/or analyzed data). List of scan
number indexes that have values greater than 32767 but did not saturate
the camera. In Genemapper samples, this can have indexes with values
greater than 32000. In sequencing samples, this cannot have indexes
with values greater than 32000.
OvrV
1-N
long[]
One for each dye (unanalyzed and/or analyzed data). List of color data
values found at the locations listed in the OvrI tag. Optional. There must
be exactly as many numbers in this array as in the OvrI array.
PANL
cString
PANm
cString
PDMF
pString
Peak
short[]
Peak
int[]
Peak
int[]
Peak
int[]
Peak
short[]
FWHM of peak
Peak
double[]
Peak
int[]
Peak
int[]
Peak
int[]
Peak
10
int[]
Peak
11
double[]
Peak
12
double[]
Peak
13
double[]
Peak
14
double[]
Peak
15
double[]
Peak
16
double[]
Peak  17      double[]
Peak  18      double[]
Peak  19      cString
Peak  20      cString
Peak  21      double[]
Peak  22      cString
Peak  23      cString
Peak  24      cString
Peak  25      cString
PSZE          long
PTYP          cString
PXLB          long
Rate          User type
RGCm          cString
RGNm          cString
RGOw          cString    The name entered as the Owner of a Results Group, in the Results Group Editor. Implemented as the user name from the results group.
RInj          long       Reinjection number. The reinjection number that this sample belongs to. Not present if there was no reinjection.
RNmF          float
RMdN          cString
RMdV          cString
RMdX          char[]
RMXV          cString
RPrN          cString
RPrV          cString
RUND          date
RUNT          time
RUND          date
RUNT          time
RUND          date
RUNT          time
RUND          date
RUNT          time
RunN          cString
SamQ          cString
Satd          long[]
Scal          float      Rescaling Divisor for reducing the dynamic range of the color data.
SCAN          long
Scan          short
ScPa          cString
ScSt          long
SnpS          pString
SMAP          int[]
SMAP          double[]
SMED          pString
SMID          cString
SMLt          pString
Sm#P          int
SMPL          pString
SMRn          long
SpeN          cString
SpNm          pString
StdF          pString
STYP          cString
SVER          pString
SVER          cString    Sizecaller version
SVER          pString
SVER          pString
SzFt          double[]
SzFt          double
SZTS          cString
Tmpr          long
TUBE          pString
UDEF  1-10    cString
User          pString
UsrE          boolean    The flag indicating whether size match results have been edited by the user. Unused for 3500. Reserved for backward compatibility.
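The OvrI and OvrV descriptions above pair up: OvrI lists positions in a dye's trace and OvrV lists the color data values found at those positions, with both arrays the same length. The following is a minimal sketch of how a reader might merge them back into a trace. It assumes the tag payloads have already been parsed into Python lists. The function name `apply_overflow` and the zero-based reading of the OvrI indexes are assumptions for illustration, not part of the specification.

```python
from typing import List, Sequence


def apply_overflow(data: Sequence[int], ovr_i: Sequence[int],
                   ovr_v: Sequence[int]) -> List[int]:
    """Return a copy of one dye's trace with the OvrV values patched in.

    Assumption: OvrI entries are zero-based positions into `data`.
    """
    # The schema requires exactly as many OvrV values as OvrI indexes.
    if len(ovr_i) != len(ovr_v):
        raise ValueError("OvrI and OvrV must have the same number of entries")
    patched = list(data)
    for index, value in zip(ovr_i, ovr_v):
        if not 0 <= index < len(patched):
            raise IndexError(f"OvrI index {index} is outside the trace")
        patched[index] = value
    return patched


# Hypothetical trace in which two scans were clipped at 32767.
trace = [120, 32767, 450, 32767]
print(apply_overflow(trace, [1, 3], [40000, 51234]))
```

Returning a patched copy rather than editing in place keeps the stored values available for comparison.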
Table 7: ab1 File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v2.0 on the Applied Biosystems 3730/3730xl DNA Analyzer (below)
Table 8: fsa File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v2.0 on the Applied Biosystems 3730/3730xl DNA Analyzer (page 47)
Table 9: ab1 File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v3.0 on the Applied Biosystems 3730/3730xl DNA Analyzer (page 50)
Table 10: fsa File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v3.0 on the Applied Biosystems 3730/3730xl DNA Analyzer (page 53)
Optional tags are shown in italic text.
Table 7 ab1 File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v2.0 on the Applied Biosystems 3730/3730xl DNA Analyzer
Name  Number  ABIF Type  Description
APFN          pString
APXV          cString
APrN          cString
APrV          cString
APrX          char
BufT          short
CMNT          pString
CTID          cString
CTNM          cString
CTTL          pString    Comment title
CpEP          char       Is Capillary Machine?
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DSam          short      Downsampling factor
DySN          pString
Dye#          short      Number of dyes
DyeN          pString    Dye 1 name
DyeN          pString    Dye 2 name
DyeN          pString    Dye 3 name
DyeN          pString    Dye 4 name
DyeW          short      Dye 1 wavelength
DyeW          short      Dye 2 wavelength
DyeW          short      Dye 3 wavelength
DyeW          short      Dye 4 wavelength
EPVt          long
EVNT          pString
EVNT          pString
EVNT          pString
EVNT          pString
FWO_          char       Base order
GTyp          pString    Gel type description
InSc          long
InVt          long
LANE          short      Lane/Capillary
LIMS          pString    Sample tracking ID
LNTD          short      Length to detector
LsrP          long
MCHN          pString
MODF          pString
MODL          char       Model number
NAVG          short
NLNE          short      Number of capillaries
OfSc          long
OvrI  1-N     long       One value for each dye. List of scan number indices for scans with color data values >32767. Values cannot be greater than 32000. (optional)
PDMF          pString
PXLB          long
RGCm          cString
RGNm          cString
RMXV          cString
RMdN          cString
RMdV          cString
RMdX          char
RPrN          cString
RPrV          cString
RUND          date
RUND          date
RUND          date
RUND          date
RUNT          time
RUNT          time
RUNT          time
RUNT          time
Rate          user       Scan rate.
RunN          cString    Run Name
SCAN          long       Number of scans
SMED          pString
SMLt          pString
SMPL          pString    Sample name
SVER          pString
SVER          pString
Satd          long       Array of longs representing the scan numbers of data points which are flagged as saturated by data collection (optional)
Scal          float
Scan          short
TUBE          pString    Well ID
Tmpr          long       Run temperature setting
User          pString    Name of user who created the plate (optional)
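The Satd tag in Table 7 is described as an optional array of scan numbers that data collection flagged as saturated. A minimal sketch of turning that array into a per-scan mask follows, which may be convenient when filtering a trace. The function name `saturated_mask` and the zero-based reading of the scan numbers are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def saturated_mask(num_scans: int,
                   satd: Optional[Sequence[int]]) -> List[bool]:
    """Build one boolean per scan: True where Satd flags saturation.

    Assumption: Satd scan numbers are zero-based. Satd is optional,
    so None (tag absent) yields a mask with no saturated scans.
    """
    mask = [False] * num_scans
    for scan in satd or ():
        # Ignore scan numbers that fall outside the trace.
        if 0 <= scan < num_scans:
            mask[scan] = True
    return mask


# Hypothetical values: a 5-scan trace with scans 1 and 3 flagged.
print(saturated_mask(5, [1, 3]))
```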
Table 8 fsa File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v2.0 on the Applied Biosystems 3730/3730xl DNA Analyzer
Name  Number  ABIF Type  Description
ANME          cString    GeneMapper software analysis method name
BufT          short      Buffer tray heater temperature (degrees C)
CMNT  1-N     pString
CTID          cString
CTNM          cString
CTTL          pString    Comment title
CpEP          char       Is Capillary Machine?
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA  105     short
DSam          short      Downsampling factor
DySN          pString
Dye#          short      Number of dyes
DyeB          char
DyeB          char
DyeB          char
DyeB          char
DyeB          char
DyeN          pString    Dye 1 name
DyeN          pString    Dye 2 name
DyeN          pString    Dye 3 name
DyeN          pString    Dye 4 name
DyeN          pString    Dye 5 name
DyeW          short      Dye 1 wavelength
DyeW          short      Dye 2 wavelength
DyeW          short      Dye 3 wavelength
DyeW          short      Dye 4 wavelength
DyeW          short
EPVt          long
EVNT          pString
EVNT          pString
EVNT          pString
EVNT          pString
GTyp          pString
InSc          long
InVt          long
LANE          short      Lane/Capillary
LIMS          pString    Sample tracking ID
LNTD          short      Length to detector
LsrP          long
MCHN          pString
MODF          pString
MODL          char       Model number
NAVG          short
NLNE          short      Number of capillaries
OfSc          long
OvrI  1-N     long       One value for each dye. List of scan number indices for scans with color data values >32767. Values cannot be greater than 32000. (optional)
OvrV  1-N     long       One value for each dye. List of color data values for the locations listed in the OvrI tag. Number of OvrV tags must be equal to the number of OvrI tags. (optional)
PANL          cString
PXLB          long
RGCm          cString
RGNm          cString
RMXV          cString
RMdN          cString
RMdV          cString
RMdX          char
RPrN          cString
RPrV          cString
RUND          date
RUND          date
RUND          date
RUND          date
RUNT          time
RUNT          time
RUNT          time
RUNT          time
Rate          user       Scan rate.
RunN          cString    Run Name
SCAN          long       Number of scans
SMED          pString
SMLt          pString
STYP          cString
SVER          pString
SVER          pString
SVER          pString    Sample File Format Version, containing the version of the sample file format used to write the file
Satd          long       Array of longs representing the scan numbers of data points which are flagged as saturated by data collection (optional)
Scal          float
Scan          short
SpNm          pString
StdF          pString
TUBE          pString    Well ID
Tmpr          long
User          pString
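Table 8 lists RUND tags of type date and RUNT tags of type time. As a minimal sketch, one RUND/RUNT pair can be combined into a single timestamp. This assumes the ABIF layouts of a 4-byte date (signed 16-bit year, then month and day as single bytes) and a 4-byte time (hour, minute, second, hundredths of a second), all big-endian. The function name `decode_run_timestamp` is hypothetical, and which RUND number pairs with which RUNT number is left to the caller.

```python
import struct
from datetime import datetime


def decode_run_timestamp(rund: bytes, runt: bytes) -> datetime:
    """Combine one raw RUND (date) and one raw RUNT (time) payload.

    Assumed layouts (big-endian):
      date: int16 year, uint8 month, uint8 day
      time: uint8 hour, uint8 minute, uint8 second, uint8 hundredths
    """
    year, month, day = struct.unpack(">hBB", rund)
    hour, minute, second, hsecond = struct.unpack(">BBBB", runt)
    # One hundredth of a second is 10,000 microseconds.
    return datetime(year, month, day, hour, minute, second,
                    hsecond * 10_000)


# Hypothetical payloads built with the same layout for demonstration.
raw_date = struct.pack(">hBB", 2009, 9, 15)
raw_time = struct.pack(">BBBB", 14, 30, 5, 25)
print(decode_run_timestamp(raw_date, raw_time).isoformat())
```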
Table 9 ab1 File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v3.0 on the Applied Biosystems 3730/3730xl DNA Analyzer
Name  Number  ABIF Type  Description
APFN          pString
APXV          cString
APrN          cString
APrV          cString
APrX          char
BufT          short
CMNT          pString
CTID          cString
CTNM          cString
CTOw          cString    Container owner
CTTL          pString    Comment title
CpEP          char       Is Capillary Machine?
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA  105     short
DCHT          short
DSam          short      Downsampling factor
DySN          pString
Dye#          short
DyeN          pString    Dye 1 name
DyeN          pString    Dye 2 name
DyeN          pString    Dye 3 name
DyeN          pString    Dye 4 name
DyeW          short      Dye 1 wavelength
DyeW          short      Dye 2 wavelength
DyeW          short      Dye 3 wavelength
DyeW          short      Dye 4 wavelength
EPVt          long
EVNT          pString
EVNT          pString
EVNT          pString
EVNT          pString
FWO_          char       Base order
GTyp          pString
HCFG          cString    Instrument Class
HCFG          cString    Instrument Family
HCFG          cString
HCFG          cString    Instrument Parameters
InSc          long
InVt          long
LANE          short      Lane/Capillary
LIMS          pString    Sample tracking ID
LNTD          short      Length to detector
LsrP          long
MCHN          pString
MODF          pString
MODL          char       Model number
NAVG          short
NLNE          short      Number of capillaries
OfSc          long
OvrI  1-N     long       One value for each dye. List of scan number indices for scans with color data values >32767. Values cannot be greater than 32000. (optional)
OvrV  1-N     long       One value for each dye. List of color data values for the locations listed in the OvrI tag. Number of OvrV tags must be equal to the number of OvrI tags. (optional)
PDMF          pString
PSZE          long
PTYP          cString
PXLB          long
RGCm          cString
RGNm          cString
RMXV          cString
RMdN          cString
RMdV          cString
RMdX          char
RPrN          cString
RPrV          cString
RUND          date
RUND          date
RUND          date
RUND          date
RUNT          time
RUNT          time
RUNT          time
RUNT          time
Rate          user       Scan rate.
RunN          cString    Run Name
SCAN          long       Number of scans
SMED          pString
SMLt          pString
SMPL          pString    Sample name
SVER          pString
SVER          pString
Satd          long       Array of longs representing the scan numbers of data points which are flagged as saturated by data collection (optional)
Scal          float
Scan          short
TUBE          pString    Well ID
Tmpr          long       Run temperature setting
User          pString    Name of user who created the plate (optional)
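Table 9 adds the HCFG tags for Instrument Class, Instrument Family and instrument name. The 3500 schema earlier in this document lists their initial valid values (CE; 31XX, 37XX or 35XX; and 3130, 3130xl, 3730, 3730xl, 3500, 3500xl). A minimal sketch of a sanity check against those lists follows. The function name `check_hcfg` is hypothetical, and the strings are assumed to be already decoded from their cString payloads. Because the source calls these "initial" values, unexpected strings are reported as warnings rather than treated as errors.

```python
from typing import List

# Initial valid values as listed in the HCFG descriptions.
VALID_CLASSES = {"CE"}
VALID_FAMILIES = {"31XX", "37XX", "35XX"}
VALID_NAMES = {"3130", "3130xl", "3730", "3730xl", "3500", "3500xl"}


def check_hcfg(instrument_class: str, family: str, name: str) -> List[str]:
    """Return warnings for HCFG strings outside the documented lists."""
    warnings = []
    if instrument_class not in VALID_CLASSES:
        warnings.append(f"unexpected instrument class: {instrument_class!r}")
    if family not in VALID_FAMILIES:
        warnings.append(f"unexpected instrument family: {family!r}")
    if name not in VALID_NAMES:
        warnings.append(f"unexpected instrument name: {name!r}")
    return warnings


# Hypothetical values: the family is lower case, so one warning is raised.
print(check_hcfg("CE", "37xx", "3730xl"))
```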
Table 10 fsa File Tags from Applied Biosystems 3730/3730xl Data Collection
Software v3.0 on the Applied Biosystems 3730/3730xl DNA Analyzer
Name  Number  ABIF Type  Description
ANME          cString    GeneMapper analysis method name
BufT          short      Buffer tray heater temperature (degrees C)
CMNT  1-N     pString
CTID          cString
CTNM          cString
CTOw          cString    Container owner
CTTL          pString    Comment title
CpEP          char       Is Capillary Machine?
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA          short
DATA  105     short
DCHT          short
DSam          short      Downsampling factor
DySN          pString
Dye#          short      Number of dyes
DyeB          char
DyeB          char
DyeB          char
DyeB          char
DyeB          char
DyeN          pString    Dye 1 name
DyeN          pString    Dye 2 name
DyeN          pString    Dye 3 name
DyeN          pString    Dye 4 name
DyeN          pString
DyeW          short      Dye 1 wavelength
DyeW          short      Dye 2 wavelength
DyeW          short      Dye 3 wavelength
DyeW          short      Dye 4 wavelength
DyeW          short
EPVt          long
EVNT          pString
EVNT          pString
EVNT          pString
EVNT          pString
GTyp          pString
HCFG          cString    Instrument Class
HCFG          cString    Instrument Family
HCFG          cString
HCFG          cString    Instrument Parameters
InSc          long
InVt          long
LANE          short      Lane/Capillary
LIMS          pString    Sample tracking ID
LNTD          short      Length to detector
LsrP          long
MCHN          pString
MODF          pString
MODL          char       Model number
NAVG          short
NLNE          short      Number of capillaries
OfSc          long
OvrI  1-N     long       One value for each dye. List of scan number indices for scans with color data values >32767. Values cannot be greater than 32000. (optional)
OvrV  1-N     long       One value for each dye. List of color data values for the locations listed in the OvrI tag. Number of OvrV tags must be equal to the number of OvrI tags. (optional)
PANL          cString
PSZE          long
PTYP          cString
PXLB          long
RGCm          cString
RGNm          cString
RMXV          cString
RMdN          cString
RMdV          cString
RMdX          char
RPrN          cString
RPrV          cString
RUND          date
RUND          date
RUND          date
RUND          date
RUNT          time
RUNT          time
RUNT          time
RUNT          time
Rate          user       Scan rate.
RunN          cString    Run Name
SCAN          long       Number of scans
SMED          pString
SMLt          pString
STYP          cString
SVER          pString
SVER          pString
SVER          pString    Sample File Format Version, containing the version of the sample file format used to write the file
Satd          long       Array of longs representing the scan numbers of data points which are flagged as saturated by data collection (optional)
Scal          float
Scan          short
SpNm          pString
StdF          pString
TUBE          pString    Well ID
Tmpr          long
User          pString
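Most text tags in the tables above are typed either pString or cString. A minimal sketch of decoding the two follows, assuming a pString stores its length in the first byte and a cString ends at the first null byte. The helper names `decode_pstring` and `decode_cstring` and the choice of ASCII with replacement for undecodable bytes are assumptions made for illustration.

```python
def decode_pstring(raw: bytes) -> str:
    """Decode a Pascal-style string: one length byte, then the text."""
    if not raw:
        return ""
    length = raw[0]
    # Slice by the stored length so trailing padding is ignored.
    return raw[1:1 + length].decode("ascii", errors="replace")


def decode_cstring(raw: bytes) -> str:
    """Decode a C-style string: text up to the first null byte."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


# Hypothetical payloads: a TUBE-style well ID and a RunN-style run name.
print(decode_pstring(b"\x03A01"))
print(decode_cstring(b"Run_1\x00"))
```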
Information in this document is subject to change without notice. Applied Biosystems assumes no
responsibility for any errors that may appear in this document.
Headquarters
850 Lincoln Centre Drive
Foster City, CA 94404 USA
Phone: +1 650.638.5800
Toll Free (In North America): +1 800.345.5224
Fax: +1 650.638.5884
TRADEMARKS:
The trademarks mentioned herein are the property of Life Technologies Corporation or their respective
owners.
All other trademarks are the sole property of their respective owners.
www.appliedbiosystems.com
09/2009