Arc Info Binary Coverage Format Analysis


 

Arc/Info Binary Coverage Format Analysis

 

Last Update: 2006-06-14, Daniel Morissette, dmorissette@mapgears.com

 


 

TABLE OF CONTENTS

 

  • 1. Introduction
    • 1.1 PC Arc/Info and other variants
    • 1.2 Byte Ordering
  • 2. ARC Coverage Files
    • 2.1 File Header
    • 2.2 Index Files
    • 2.3 ARC
    • 2.4 PAL
    • 2.5 LAB
    • 2.6 CNT
    • 2.7 PRJ
    • 2.8 LOG
    • 2.9 TOL
    • 2.10 TX6/TX7 Annotations
    • 2.11 TXT Annotations
    • 2.12 RXP - Specific to Region coverages
    • 2.13 RPL - Specific to Region coverages
  • 3. The Attribute Files
    • 3.1 INFO Files in V7.x Coverages
      • 3.1.1 INFO/ARC.DIR
      • 3.1.2 INFO/ARC####.DAT
      • 3.1.3 INFO/ARC####.NIT
      • 3.1.4 Table Data files (.adf, ...)
      • 3.1.5 Name and location of TABLE DATA files
    • 3.2 INFO Files in "Weird" Coverages
    • 3.3 DBF Files in PC Coverages

 

 

1. INTRODUCTION

This is an attempt to document the binary vector coverage files used by Arc/Info V7.x for Unix and Windows NT. Since the coverage file format is not documented by ESRI, this document is mainly based on the analysis of binary dumps of the files. This implies that the information may be incomplete (or even inaccurate!) in some cases. As with any document of this type, it is expected that it will evolve as we learn more.

Another great source of information to help understand the format is the (world famous) "ANALYSIS OF ARC EXPORT FILE FORMAT FOR ARC/INFO (REV 6.1.1)" (from which I "borrowed" some extracts ;-)... you can find it at:

 

http://www.geocities.com/~vmushinskiy/fformats/files/e00.txt

Since the contents of the E00 and binary coverage files are very close, the current document will often refer you to an updated version of the E00 Analysis Document mentioned above instead of duplicating the details about a specific file.

The first section of this document covers the coverage vector files (ARC, PAL, CNT, LAB, ...) and the second section covers the INFO files.

 

 

1.1 PC ARC/INFO COVERAGES AND VARIANTS

Even though this document covers mainly Arc/Info V7.x for Unix coverages, some notes have been included to document the differences between the V7.x Unix coverages and some variants.

In each section, the Unix V7.x coverage format is always discussed first (sometimes referred to as "V7 Coverages"), and when applicable, the following variants will also be discussed:

 

  • "PC Coverages V1": Coverages produced by the 16-bit version of Arc/Info for PCs (DOS or Windows?).

     

  • "PC Coverages V2": Look like a hybrid between PC Coverages V1 and V7.x Coverages. Probably produced on Unix systems with Motorola byte ordering. They use DBF files for the info tables (located in the coverage directory) just like PC Coverages V1, but use .adf files for the other files like V7.x Coverages. They also use the same byte ordering as V7.x Coverages.

     

  • "Weird Coverages": Refers to some kind of hybrid between V7 and PC coverages. Probably produced by an early version of Arc/Info for Unix.
    These coverages use the same byte ordering as V7 Coverages.
    The attribute files in these coverages are located in an INFO directory and have names similar (but not identical) to V7 Coverages. The coverage files (ARC, PAL, etc.) are named the same way as in PC Coverages, but they do not have the first 256 bytes header of PC Coverages and they are not padded to a multiple of 256 bytes at the end.
    The name "Weird coverages" refers to our reaction when we saw those coverages for the first time. ;-)

 

1.2 BYTE ORDERING

V7.x Coverages always use MSB-First (Motorola) byte ordering for both the ARC coverage files and the INFO tables. This is true even for coverages produced by Arc/Info V7.x for Windows NT on an Intel platform.

PC Arc/Info coverages V1 always use LSB-First (Intel) byte ordering.

PC Arc/Info coverages V2 always use MSB-First (Motorola) byte ordering.

The Weird coverages use MSB-First (Motorola) byte ordering (Same as V7 Coverages).

 


 

2. ARC COVERAGE FILES

All the vector (ARC) coverage files are stored in the same directory. The name of this directory is the name of the coverage.

The name of the coverage directory (and thus the name of the coverage) appears to be limited to 13 characters.

 

2.1 File Header

 

     

    2.1.1 V7.x Coverage File Header

    Most of these files have a 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - Constant for a given file type
    4-7     int32   Precision - Usually > 0 for single precision,
                    and < 0 for double precision, but there are
                    exceptions.
    8-11    int32   Record size, for files with fixed size records
                    (or 0 for variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros
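The header layout above can be decoded with a few lines of code. The following is a minimal sketch, assuming a V7.x coverage (MSB-first byte ordering, see section 1.2); the function name `parse_header` and the sample bytes are made up for illustration and do not come from a real coverage.

```python
import struct

HEADER_SIZE = 100  # bytes, as described in the table above


def parse_header(buf: bytes) -> dict:
    """Decode the documented fields of the 100-byte header (big-endian)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("header is shorter than 100 bytes")
    signature, precision, record_size = struct.unpack_from(">3i", buf, 0)
    (size_words,) = struct.unpack_from(">i", buf, 24)
    return {
        "signature": signature,
        "precision": precision,
        "record_size": record_size,
        "file_size_bytes": size_words * 2,  # stored as a count of 2-byte words
    }


# Synthetic header for illustration only.
sample = struct.pack(">3i", 9994, 1, 0) + bytes(12) + struct.pack(">i", 60) + bytes(72)
header = parse_header(sample)
print(header)
```

Since the sign of the precision field has known exceptions, a reader should probably not rely on it alone to decide between single and double precision.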

     

     

    2.1.2 PC Coverage File Header

    Files in PC Coverages V1 start with a 256 bytes header, followed by the 100 bytes header described above, for a total of 356 bytes of header. All the files that have this 256 bytes header have an actual size which is a multiple of 256 bytes, padded with junk at the end. So it is very important to take the size specified in the header into account when reading these files.

    Here is what we find in this 256 bytes header specific to PC coverages:

     

    Bytes   Type    Description
    0-1     int16   Signature ??? 0x0400 or 0x0000
    2-5     int32   File size (in 2 byte words), including the 100 bytes
                    header size, but not including this 256 bytes header.
                    This same value will be repeated in bytes 24-27 of
                    the 100 bytes header.
    6-255           All zeros

    Also note that PC Coverages are ALWAYS SINGLE PRECISION, no matter what the value in bytes 4-7 in the 100 bytes header is. (i.e. the precision flag value is sometimes negative, but the data is really always single precision.)
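Because PC Coverages V1 files are padded with junk, a reader needs the size from the 256 bytes header to know where valid data stops. Here is a small sketch of that calculation, assuming LSB-first byte ordering as stated in section 1.2; `pc_v1_data_end` is a hypothetical helper name and the sample file is synthetic. The signature is read but not checked, since its meaning is unknown.

```python
import struct

PC_HEADER_SIZE = 256  # extra header found only in PC Coverages V1


def pc_v1_data_end(buf: bytes) -> int:
    """Return the byte offset where valid data ends in a PC V1 file."""
    # PC Coverages V1 are LSB-first (Intel), hence the "<" prefix.
    _signature, size_words = struct.unpack_from("<hi", buf, 0)
    # The size counts the 100-byte header plus the records, in 2-byte
    # words, but not the 256-byte header itself.
    return PC_HEADER_SIZE + size_words * 2


# Synthetic file: 256-byte header, 100-byte header, then 156 bytes of padding.
sample = struct.pack("<hi", 0x0400, 50) + bytes(250) + bytes(100) + b"\xff" * 156
data_end = pc_v1_data_end(sample)
print(len(sample), data_end)
```

Everything from `data_end` to the end of the file would be treated as padding and ignored.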

     

    2.1.3 PC V2 and Weird Coverage File Header

    PC V2 and Weird Coverages have only one 100 bytes header, just like V7 Coverages.

    These coverages can exist in both single and double precision form.

     

 

2.2 Index files

 

    The files that contain variable length records (i.e. ARC, PAL, CNT, etc.) are accompanied by an index file. The name of the index file (when present) will be specified in the documentation for each file type below.

    All index files have the same 100 bytes header as the file that they correspond to (it is identical, except for the size value at byte 24).

    Then starting at byte 100 in the file, you have one index entry for each object from the master file:

     

    Bytes   Type    Description
    0-3     int32   Start position of the record in the file.  This
                    value is the number of 2 byte words from the beginning
                    of the file.  So the position for the first object
                    is always 50.
    4-7     int32   Record length, excluding the first 8 bytes of the
                    record (number of 2 byte words - 4).  This value
                    is the same value that we usually find at byte 4 in
                    the corresponding object record.  See

    PC Coverage V1 index files start with the 256 bytes header followed by the usual 100 bytes header, followed by index entries.

    PC Coverage V2 index files are identical to V7 indexes.

    Weird Coverage index files are identical to V7 indexes, except that they have a different filename.
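As an illustration of the index entry layout, the sketch below converts each entry of a V7.x index file into a byte offset and a total record size. It assumes the whole index file has been read into memory and contains no trailing padding; the function name and sample values are invented for the example.

```python
import struct


def read_index_entries(buf: bytes, header_size: int = 100):
    """Return (byte_offset, total_record_bytes) for each index entry."""
    entries = []
    for pos in range(header_size, len(buf) - 7, 8):
        start_words, length_words = struct.unpack_from(">2i", buf, pos)
        # Offsets and lengths are stored in 2-byte words; the length
        # excludes the 8 bytes holding the record id and the length itself.
        entries.append((start_words * 2, length_words * 2 + 8))
    return entries


# Synthetic index: empty 100-byte header followed by two entries.
sample = bytes(100) + struct.pack(">2i", 50, 14) + struct.pack(">2i", 68, 20)
entries = read_index_entries(sample)
print(entries)
```

In this sample the first record starts at byte 100 (word 50) and occupies 36 bytes, so the second record starts at byte 136 (word 68), which is consistent with the second entry.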

 

2.3 ARC.ADF

 

    The "arc.adf" file contains the arcs definitions and their vertices.

    It comes with an index file called "arx.adf".

     

    2.3.1 ARC.ADF file in V7.x Coverages

    The file starts with the usual 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - 9994
    4-7     int32   Precision - +1 for single precision,
                    and -1 for double precision.
    8-11    int32   Record size (always 0: variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    Then variable length arc records follow:

     

    Bytes   Type    Description
    0-3     int32   Arc_Id
    4-7     int32   Record length, number of 2 byte words that follow
                    the current value.
                    (= (24 + size of vertices list in bytes)/2)
    8-11    int32   Arc_UserId
    12-15   int32   From_Node
    16-19   int32   To_Node
    20-23   int32   Left_Poly
    24-27   int32   Right_Poly
    28-31   int32   Num_Vertices
    32+             Vertices list (see below)

    Then the vertices follow (Num_Vertices pairs of x,y values).

    For SINGLE PRECISION:

     

    Bytes   Type    Description
    32-35   float   x1
    36-39   float   y1
    40-43   float   x2
    44-47   float   y2
    ...     ...     ...

    For DOUBLE PRECISION:

     

    Bytes   Type    Description
    32-39   double  x1
    40-47   double  y1
    48-55   double  x2
    56-63   double  y2
    ...     ...     ...
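Putting the arc record and the vertices list together, a reader for one arc could look like the sketch below. It assumes V7.x byte ordering and that the caller already knows the precision; `parse_arc_record` and the sample record are illustrative only.

```python
import struct


def parse_arc_record(buf: bytes, offset: int, double_precision: bool):
    """Decode one arc record; return (arc, offset_of_next_record)."""
    (arc_id, rec_words, user_id, from_node, to_node,
     left_poly, right_poly, num_vertices) = struct.unpack_from(">8i", buf, offset)
    code = "d" if double_precision else "f"
    coords = struct.unpack_from(">%d%s" % (2 * num_vertices, code), buf, offset + 32)
    arc = {
        "arc_id": arc_id,
        "user_id": user_id,
        "from_node": from_node,
        "to_node": to_node,
        "left_poly": left_poly,
        "right_poly": right_poly,
        "vertices": list(zip(coords[0::2], coords[1::2])),
    }
    # The record length counts the 2-byte words after the first 8 bytes.
    return arc, offset + 8 + rec_words * 2


# Synthetic single precision arc with two vertices (24 + 16 bytes = 20 words).
sample = struct.pack(">8i", 1, 20, 101, 1, 2, 0, 3, 2) + struct.pack(">4f", 0.0, 0.0, 10.0, 5.0)
arc, next_offset = parse_arc_record(sample, 0, double_precision=False)
print(arc["vertices"], next_offset)
```

Using the record length to find the next record, rather than the number of vertices, keeps the reader in step even if a record carries unexpected trailing bytes.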

     

    2.3.2 ARC file in PC Coverages V1

    In PC Coverages V1, the main file is called "ARC" and the index "ARX".

    They both start with the 256 bytes header specific to PC Coverages, followed by the 100 bytes header and the data records as described above.

    Note that PC Coverages files are ALWAYS single precision.

     

    2.3.3 ARC file in PC Coverages V2

    Identical to V7.x

     

     

    2.3.4 ARC file in Weird Coverages

    Same as V7 Coverages, except that the main file is called "ARC" and the index "ARX".

 

2.4 PAL.ADF

 

    The "pal.adf" file contains the polygon definitions. It is present only inside coverages with clean polygon topology.

    It comes with an index file called "pax.adf".

     

    2.4.1 PAL.ADF file in V7.x Coverages

    The file starts with the usual 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - 9994
    4-7     int32   Precision - +11 for single precision,
                    and -11 or 1011 (yep!) for double prec.
    8-11    int32   Record size (always 0: variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    Then variable length polygon records follow:

    For SINGLE PRECISION:

     

    Bytes   Type    Description
    0-3     int32   Polygon Id
    4-7     int32   Record Length, number of 2 byte words that follow
                    the current value.
    8-11    float   Min. X coordinate
    12-15   float   Min. Y coordinate
    16-19   float   Max. X coordinate
    20-23   float   Max. Y coordinate
    24-27   int32   Number of Arcs
    28+     int32   List of Arc records (see below)

    For DOUBLE PRECISION:

     

    Bytes   Type    Description
    0-3     int32   Polygon Id
    4-7     int32   Record Length, number of 2 byte words that follow
                    the current value.
    8-15    double  Min. X coordinate
    16-23   double  Min. Y coordinate
    24-31   double  Max. X coordinate
    32-39   double  Max. Y coordinate
    40-43   int32   Number of Arcs
    44+     int32   List of Arc records (see below)

    For each arc in the arc list, we have a fixed length record:

     

    Bytes   Type    Description
    0-3     int32   Arc_Id
    4-7     int32   From_Node_Id
    8-11    int32   Adjacent_Polygon_Id

    • Arc_Id will be negative if the direction of the arc is reversed
    • From_Node_Id is the arc's FNODE#. If the arc is reversed, then From_Node_Id will be the arc's TNODE#.
    • Adjacent_Polygon_Id is the Id of the polygon that shares this arc with the current polygon.
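The polygon record and its arc list can be read as sketched below. This is only an illustration under the assumptions of this section (V7.x byte ordering, precision known in advance); the function name, the `reversed` key and the sample record are not part of the format.

```python
import struct


def parse_pal_record(buf: bytes, offset: int, double_precision: bool):
    """Decode one polygon record; return (polygon, offset_of_next_record)."""
    poly_id, rec_words = struct.unpack_from(">2i", buf, offset)
    pos = offset + 8
    bbox_fmt = ">4d" if double_precision else ">4f"
    bbox = struct.unpack_from(bbox_fmt, buf, pos)
    pos += struct.calcsize(bbox_fmt)
    (num_arcs,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    arcs = []
    for _ in range(num_arcs):
        arc_id, from_node, adjacent_poly = struct.unpack_from(">3i", buf, pos)
        pos += 12
        arcs.append({
            "arc_id": abs(arc_id),
            "reversed": arc_id < 0,  # a negative Arc_Id marks a reversed arc
            "from_node": from_node,
            "adjacent_poly": adjacent_poly,
        })
    polygon = {"poly_id": poly_id, "bbox": bbox, "arcs": arcs}
    return polygon, offset + 8 + rec_words * 2


# Synthetic single precision polygon with two arcs (44 bytes = 22 words).
sample = (struct.pack(">2i", 2, 22) + struct.pack(">4f", 0.0, 0.0, 10.0, 10.0)
          + struct.pack(">i", 2) + struct.pack(">3i", 5, 1, 1)
          + struct.pack(">3i", -7, 3, 4))
polygon, next_offset = parse_pal_record(sample, 0, double_precision=False)
print(polygon, next_offset)
```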

     

     

    2.4.2 PAL file in PC Coverages V1

    In PC Coverages, the main file is called "PAL" and the index "PAX".

    They both start with the 256 bytes header specific to PC Coverages, followed by the 100 bytes header and the data records as described above.

    Note that PC Coverages files are ALWAYS single precision.

     

    2.4.3 PAL file in Weird Coverages

    Same as V7 Coverages, except that the main file is called "PAL" and the index "PAX".

 

2.5 LAB.ADF

 

    The "lab.adf" file contains label point records.

    This file has no associated index since it has fixed size records.

     

    2.5.1 LAB.ADF file in V7.x Coverages

    The file starts with the usual 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - 9993
    4-7     int32   Precision - +2 for single precision,
                    and -2 for double precision.
    8-11    int32   Label Record size, in 2 byte words
                    (16 for single prec. and 28 for double prec.)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    Then fixed size label point records follow:

    For SINGLE PRECISION:

     

    Bytes   Type    Description
    0-3     int32   Label Value
    4-7     int32   Polygon_Id
    8-11    float   Label X coord.
    12-15   float   Label Y coord.
    16-19   float   Label X coord.
    20-23   float   Label Y coord.
    24-27   float   Label X coord.
    28-31   float   Label Y coord.

    For DOUBLE PRECISION:

     

    Bytes   Type    Description
    0-3     int32   Label Value
    4-7     int32   Polygon_Id
    8-15    double  Label X coord.
    16-23   double  Label Y coord.
    24-31   double  Label X coord.
    32-39   double  Label Y coord.
    40-47   double  Label X coord.
    48-55   double  Label Y coord.
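Since label records have a fixed size, the whole file can be read with a simple loop. The sketch below assumes a V7.x file loaded in memory with no trailing padding, and keeps only the first X,Y pair of each record because the role of the two other pairs is not described here; `read_labels` is a hypothetical name.

```python
import struct


def read_labels(buf: bytes, double_precision: bool, header_size: int = 100):
    """Return (label_value, polygon_id, x, y) for each fixed size record."""
    fmt = ">2i6d" if double_precision else ">2i6f"
    rec_size = struct.calcsize(fmt)  # 56 bytes (double) or 32 bytes (single)
    labels = []
    for off in range(header_size, len(buf) - rec_size + 1, rec_size):
        value, poly_id, x, y = struct.unpack_from(fmt, buf, off)[:4]
        labels.append((value, poly_id, x, y))
    return labels


# Synthetic single precision file: empty header plus one label record.
sample = bytes(100) + struct.pack(">2i6f", 1, 2, 1.5, 2.5, 1.5, 2.5, 1.5, 2.5)
labels = read_labels(sample, double_precision=False)
print(labels)
```

The record sizes computed by `struct.calcsize` (32 and 56 bytes) match the 16 and 28 words given in the header description.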

     

    2.5.2 LAB file in PC Coverages V1

    In PC Coverages, this file is called "LAB".

    It starts with the 256 bytes header specific to PC Coverages, followed by the 100 bytes header and the data records as described above.

    Note that PC Coverages files are ALWAYS single precision.

     

    2.5.3 LAB file in Weird Coverages

    Same as V7 Coverages, except that the file is called "LAB".

 

2.6 CNT.ADF

 

    The "cnt.adf" file contains polygon centroid information.

    It comes with an index file called "cnx.adf".

     

    2.6.1 CNT.ADF file in V7.x Coverages

    The file starts with the usual 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - 9994
    4-7     int32   Precision - +14 for single precision,
                    and -14 for double precision.
    8-11    int32   Record size (always 0: variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    Then variable length centroid records follow:

    For SINGLE PRECISION:

     

    Bytes   Type    Description
    0-3     int32   Polygon Id
    4-7     int32   Record Length, number of 2 byte words that follow
                    the current value.
    8-11    float   Centroid X coordinate
    12-15   float   Centroid Y coordinate
    16-19   int32   Num_Labels ( >= 0 )
    20+     int32   List of Label Ids (Only if Num_Labels > 0 )

    For DOUBLE PRECISION:

     

    Bytes   Type    Description
    0-3     int32   Polygon Id
    4-7     int32   Record Length, number of 2 byte words that follow
                    the current value.
    8-15    double  Centroid X coordinate
    16-23   double  Centroid Y coordinate
    24-27   int32   Num_Labels ( >= 0 )
    28+     int32   List of Label Ids (Only if Num_Labels > 0 )
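A centroid record can be decoded in the same way as the other variable length records. The following sketch assumes V7.x byte ordering and a known precision; the function name and the sample data are made up for the example.

```python
import struct


def parse_cnt_record(buf: bytes, offset: int, double_precision: bool):
    """Decode one centroid record; return (centroid, offset_of_next_record)."""
    poly_id, rec_words = struct.unpack_from(">2i", buf, offset)
    pos = offset + 8
    xy_fmt = ">2d" if double_precision else ">2f"
    x, y = struct.unpack_from(xy_fmt, buf, pos)
    pos += struct.calcsize(xy_fmt)
    (num_labels,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    # The list of label ids is present only when num_labels > 0.
    label_ids = list(struct.unpack_from(">%di" % num_labels, buf, pos))
    centroid = {"poly_id": poly_id, "x": x, "y": y, "label_ids": label_ids}
    return centroid, offset + 8 + rec_words * 2


# Synthetic single precision record with one label id (16 bytes = 8 words).
sample = (struct.pack(">2i", 3, 8) + struct.pack(">2f", 4.0, 6.0)
          + struct.pack(">i", 1) + struct.pack(">i", 9))
centroid, next_offset = parse_cnt_record(sample, 0, double_precision=False)
print(centroid, next_offset)
```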

     

    2.6.2 CNT file in PC Coverages V1

    In PC Coverages, the main file is called "CNT" and the index "CNX".

    They both start with the 256 bytes header specific to PC Coverages, followed by the 100 bytes header and the data records as described above.

    Note that PC Coverages files are ALWAYS single precision.

     

    2.6.3 CNT file in Weird Coverages

    Same as V7 Coverages, except that the main file is called "CNT" and the index "CNX".

 

2.7 PRJ.ADF - Projection file

 

     

    2.7.1 PRJ.ADF file in V7.x Coverages

    The PRJ.ADF file is a simple ASCII file with one line for each piece of projection information. The lines have a variable length and are terminated by a newline character.

    Here is an example of a prj.adf file:

         Projection    GEOGRAPHIC
         Zunits        NO
         Units         DD
         Spheroid      CLARKE1866
         Xshift        0.0000000000
         Yshift        0.0000000000
         Parameters

     

    2.7.2 PRJ file in PC Coverages V1

    PC Coverages do not appear to carry a PRJ file... or at least we never encountered any.

     

    2.7.3 PRJ file in Weird Coverages

    Just like for PC Coverages... they do not appear to carry a PRJ file... or at least we never encountered any.

 

2.8 LOG - Coverage history

 

     

    2.8.1 LOG file in V7.x Coverages

    The LOG file (named "log", not "log.adf"!) is an ASCII file with variable length lines each terminated with a newline.

    The lines have no known length limit; they can certainly be longer than 80 characters.

     

    2.8.2 LOG file in PC Coverages V1

    Nothing special... the file is called LOG as well.

     

    2.8.3 LOG file in Weird Coverages

    Probably the same... but we've never encountered any.

 

2.9 TOL - Coverage Tolerances

 

    2.9.1 TOL.ADF file in V7.x Coverages

    The TOL file contains the tolerance values to use when processing a polygon coverage. It usually contains 10 tolerance entries. For each entry, we have a tolerance type, a tolerance status, and a tolerance value. The tolerance types are:

    • 1. fuzzy
    • 2. generalize (unused)
    • 3. node match (unused)
    • 4. dangle
    • 5. tic match
    • 6. undefined
    • 7. undefined
    • 8. undefined
    • 9. undefined
    • 10. undefined


    The tolerance status "is set to 1 if the tolerance is verified (been applied to operations of the coverage) and to 2 if the tolerance is not verified (been set by the TOLERANCE command, but not yet used in processing)."

    In a SINGLE PRECISION coverage, the file is named "tol.adf", it has no header, and for each tolerance value, we have:

     

    Bytes   Type    Description
    0-3     int32   Tolerance type (usually goes from 1 to 10)
    4-7     int32   Tolerance status
    8-11    float   Tolerance value

    In DOUBLE PRECISION coverages, the file is named "par.adf", and it DOES have the usual 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - Always 9993
    4-7     int32   Value of 40 (this should be the precision field???)
    8-11    int32   Tolerance record size, in 2 byte words (always 8)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    Then for each double precision tolerance value, we have:

     

    Bytes   Type    Description
    0-3     int32   Tolerance type (usually goes from 1 to 10)
    4-7     int32   Tolerance status
    8-15    double  Tolerance value

    Note: The double precision file ("par.adf") header does not seem to follow the general rule for the header... its precision field value is > 0 while this value is negative for all other double precision files(???). Also, the third field has a value of 8, while it is 0 in all other headers.
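The difference between "tol.adf" (no header, float values) and "par.adf" (100 bytes header, double values) can be handled with one small reader. This is a sketch only, assuming V7.x byte ordering; the names `TOLERANCE_TYPES` and `read_tolerances` are invented here, and types 6 to 10 are simply reported as "undefined".

```python
import struct

# Tolerance type names taken from the list in this section.
TOLERANCE_TYPES = {1: "fuzzy", 2: "generalize", 3: "node match",
                   4: "dangle", 5: "tic match"}


def read_tolerances(buf: bytes, double_precision: bool):
    """Return (type_name, status, value) for each tolerance entry."""
    # "par.adf" (double) has a 100-byte header, "tol.adf" (single) has none.
    fmt, start = (">2id", 100) if double_precision else (">2if", 0)
    size = struct.calcsize(fmt)  # 16 bytes (double) or 12 bytes (single)
    entries = []
    for off in range(start, len(buf) - size + 1, size):
        tol_type, status, value = struct.unpack_from(fmt, buf, off)
        entries.append((TOLERANCE_TYPES.get(tol_type, "undefined"), status, value))
    return entries


# Synthetic single precision "tol.adf" content with two entries.
sample = struct.pack(">2if", 1, 1, 0.5) + struct.pack(">2if", 4, 2, 0.0)
tolerances = read_tolerances(sample, double_precision=False)
print(tolerances)
```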

     

    2.9.2 TOL file in PC Coverages V1

    In PC Coverages, this file is called "TOL".

    Contrary to most other files, the TOL file in PC Coverages does not have any header... it starts immediately with the tolerance entries like the "tol.adf" in single precision V7.x coverages.

    Note that PC Coverages files are ALWAYS single precision.

     

    2.9.3 TOL file in Weird Coverages

    Same as "tol.adf" in single precision V7 Coverages except that the file is called "TOL".

 

2.10 TX6/TX7 - Annotations

 

    2.10.1 TX6/TX7 files in V7.x Coverages

    TXT, TX6, and TX7 are 3 variations of text annotations that we find in E00 files. TX6 and TX7 annotations usually come with a .TAT info table, or a set of .TAT tables.

    It seems that you can have several "subclasses" of annotations. In the E00 file they are sub-sections of the main TX6/TX7 section, and in a binary coverage they are stored in separate files.

    There is no difference between the binary files for a TX6 and the files for a TX7. However, in the E00 format, there is an additional value in the first line of a TX7 entry (that is not present in a TX6), this value is very often 0 (or 1 in some cases), but even when it is set, it is not present in the binary file... I have no idea where it comes from!?!

    For the subclass of annotations called TEST, you will find 3 files in the coverage directory:

    • test.txt - The actual annotations file
    • test.txx - Index file
    • test.tat - INFO table for this set of annotations


    The file "test.txt" has the usual 100 bytes header, followed by variable length records for each piece of text.

    File Header:

     

    Bytes   Type    Description
    0-3     int32   Signature - Always 9994
    4-7     int32   Precision - +67 for single precision,
                    and -67 for double precision.
    8-11    int32   Record size (always 0: variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    Followed by records of data for each piece of text:

     

    Bytes    Type    Description
    0-3      int32   System ID (TEST#)
    4-7      int32   Record Length, number of 2 byte words that follow
                     the current value.
    8-11     int32   User ID (TEST-ID)
    12-15    int32   ??? LEVEL
    16-19    float   ??? Defaults to -1e+02 but is sometimes different
                     (this value is always a 4 bytes float value,
                     even for double-prec. coverages)
    20-23    int32   SYMBOL (Text font)
    24-27    int32   num_vertices1: for the line along which the text
                     is drawn.
    28-31    int32   ??? n28: Always 0 (Verified that it corresponds to the
                     6th value of 1st line in a TX7-E00)
    32-35    int32   Number of chars in text string
    36-39    int32   num_vertices2: for the text arrow.  If this value is
                     negative then the arrow is reversed.
    40-41    int16   ??? Always 1  - Corresponds to the second set of
    42-43    int16   ??? Always 0    20 values in a E00 TX7 entry
    44-45    int16   ??? Always 0
    ...              ...
    78-79    int16   ??? Always 0
    80-81    int16   Text justification  - Corresponds to the first set of
    82-83    int16   ??? Always 0          20 values in a E00 TX7 entry
    84-85    int16   ??? Always 0
    ...              ...
    118-119  int16   ??? Always 0

    The rest of the record depends on the precision. For SINGLE PRECISION, we have:

     

    Bytes    Type    Description
    120-123  float   ??? v1, Text Height ???
    124-127  float   ??? v2 (always 0)
    128-131  float   ??? v3 (always 0)
    132+     chars   Text String (padded with spaces to the
                     next 4 bytes boundary)
             float   x1 - Vertices list
             float   y1   (num_vertices1+num_vertices2) vertex pairs
             float   x2
             float   y2
             float   ...
             int32   ??? Unused ???  - The last 8 bytes look like junk
             int32   ??? Unused ???  - See note below.

    And for DOUBLE PRECISION, we would have:

     

    Bytes    Type    Description
    120-127  double  ??? v1, Text Height ???
    128-135  double  ??? v2 (always 0)
    136-143  double  ??? v3 (always 0)
    144+     chars   Text String (padded with spaces to the
                     next 4 bytes boundary)
             double  x1 - Vertices list
             double  y1   (num_vertices1+num_vertices2) vertex pairs
             double  x2
             double  y2
             double  ...
             int32   ??? Unused ???  - The last 8 bytes look like junk
             int32   ??? Unused ???  - See note below.

    Note:
    The last 8 bytes of junk appear to be always present in V7 coverages. However, they are sometimes present and sometimes not present in Weird coverages. Thus, the only safe way to know whether there is junk to skip at the end of a TX6 record is to use the record length value in bytes 4-7.
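The note above suggests relying on the record length rather than on the record contents to find the next record. The sketch below extracts only the text string and skips to the next record that way; it assumes V7.x byte ordering, ignores the vertices and all unknown fields, and uses an invented function name and a synthetic record.

```python
import struct


def parse_tx6_text(buf: bytes, offset: int, double_precision: bool):
    """Return (system_id, text, offset_of_next_record) for one TX6/TX7 record."""
    sys_id, rec_words = struct.unpack_from(">2i", buf, offset)
    (n_chars,) = struct.unpack_from(">i", buf, offset + 32)
    # The text starts after the three v1..v3 values (floats or doubles).
    text_start = offset + (144 if double_precision else 132)
    text = buf[text_start:text_start + n_chars].decode("ascii", errors="replace")
    # Trust the record length: trailing junk may or may not be present.
    return sys_id, text, offset + 8 + rec_words * 2


# Synthetic single precision record: text "AB", 2 vertices, 8 bytes of junk.
fixed = struct.pack(">4if5i", 1, 76, 10, 0, -100.0, 1, 2, 0, 2, 0)  # bytes 0-39
sample = (fixed + bytes(80) + struct.pack(">3f", 1.0, 0.0, 0.0)
          + b"AB  " + struct.pack(">4f", 0.0, 0.0, 5.0, 0.0) + bytes(8))
sys_id, text, next_offset = parse_tx6_text(sample, 0, double_precision=False)
print(sys_id, repr(text), next_offset)
```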

     

     

    2.10.2 TX6/TX7 files in PC Coverages V1

    PC Coverages probably can't have TX6/TX7 files but they can have TXT files though... see below.

     

    2.10.3 TX6/TX7 files in Weird Coverages

    Weird coverages can have TX6/TX7 files, and they work the same way as for V7 coverages, except that the name ends with "txt", instead of ".txt". (e.g. we have "testtxt" instead of "test.txt")

 

2.11 TXT - Annotations

 

    2.11.1 TXT.ADF file in V7.x Coverages

    TXT type of annotations use the exact same file structure as TX6/TX7 above, with the following differences:

    • The file names will be "txt.adf" and "txx.adf" for the Index file
    • The values in bytes 40-119 of each entry look like junk... there appears to be absolutely no correlation with what you find in the corresponding TXT section of an E00 file.
    • When the binary TXT structure is converted to E00-TXT, the first vertex of the vertices list for the text's polyline is always ignored (the first and second vertices in the vertices list are always the same).
      For instance, if num_vertices1==3 in the binary file, then we should ignore the first vertex, and the corresponding E00-TXT entry would have num_vertices1=2 (corresponding to vertices 2 and 3 in the vertices list).


     

    2.11.2 TXT file in PC Coverages V1

    In PC Coverages, the main file is called "TXT" and the index "TXX".

    They both start with the 256 bytes header specific to PC Coverages, followed by the 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - Always 9994
    4-7     int32   Precision - Always 1 (always single precision)
    8-11    int32   Record size (always 0: variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    However, contrary to what we find with most other file types, the data records in the TXT file are different from what we find in V7.x TXT.ADF files.

    PC Coverage TXT entries are always single precision. For each piece of text, we have:

     

    Bytes   Type    Description
    0-3     int32   System ID (TEST#)
    4-7     int32   Record Length, number of 2 byte words that follow
                    the current value.
    8-11    int32   ??? LEVEL (Corresponds to bytes 12-15 in a V7 TXT)
    12-15   int32   Number of vertex pairs that are valid ( [1..4] )
    16-19   float   x1 - (1st float value in a E00 TXT section)
    20-23   float   y1 - (5th float value in a E00 TXT section)
    24-27   float   x2 - (2nd float value in a E00 TXT section)
    28-31   float   y2 - (6th float value in a E00 TXT section)
    32-35   float   x3
    36-39   float   y3
    40-43   float   x4
    44-47   float   y4
    48-75   float   Always 0 ??? Probably corresponds to the other
                    float values in the E00 TXT section
    76-79   float   ??? Text Height ???
                    Corresponds to the 15th float value in a E00 TXT
    80-83   float   ??? Defaults to -1e+02 but is sometimes different
    84-87   int32   SYMBOL (Text font)
    88-91   int32   Number of chars in text string
    92+     chars   Text String (padded with spaces to the
                    next 4 bytes boundary... it was also
                    noted that strings that are a multiple
                    of 4 chars in length are also padded
                    with 4 spaces)

     

     

    2.11.3 TXT file in Weird Coverages

    Weird coverages can have their TXT files stored using either the PC structure or the V7 structure. In both cases the filenames are the same ("TXT" and "TXX"). The only way to tell if the file is in PC TXT format or in V7 TXT/TX6/TX7 format is by looking at the precision field in the 100 bytes header:

     

    Bytes   Type    Description
    0-3     int32   Signature - Always 9994
    4-7     int32   Precision - +16 for single precision in PC TXT format,
                    +67 for single precision in V7 format,
                    and -67 for double precision in V7 format.
    8-11    int32   Record size (always 0: variable length records)
    12-23           All zeros
    24-27   int32   File size (in 2 byte words), including header size
    28-99           All zeros

    When the V7 structure is used, the files are identical to the V7 TXT/TX6/TX7 files described above except for the filename.

    When the PC TXT structure is used, the files are similar to PC Coverage TXT files, except for the byte ordering and the fact that there is no 256 byte header in the Weird Coverage ones. Another minor difference: in Weird Coverages, when a text string has a length that is a multiple of 4 chars, it won't be padded with 4 spaces as it would have been in a PC Coverage file. This is a minor detail, but it is interesting to notice that this bug has been fixed between PC Arc/Info and the version of Arc/Info that produced the weird coverages.

     

 

2.12 RXP - Specific to Regions

 

    RXP sections define the list of polygons from the PAL section that form each region, and they occur only in region coverages. There is one .rxp file for each region in the coverage.

    Example of an E00 RXP section:

      RXP  2
      OLD
               1       120
               2        11
               3        12
               4        13
               4       202
               5        16
               6        19
               7        14
               7        20
               7        21
               7       125
               8        22
      ......

    RXP files were never encountered in PC Coverages and Weird Coverages.

    .rxp files have no header, and they contain fixed size records:

     

    Bytes   Type    Description
    0-3     int32   Region Polygon ID
    4-7     int32   PAL Polygon ID

    Regions that consist of multiple polygons will have several records with the same Region Polygon Id and differing PAL Polygon Ids.
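Grouping the fixed size records by Region Polygon Id gives the list of PAL polygons of each region. A minimal sketch, assuming V7.x byte ordering and a header-less file as described above; `read_rxp` is a hypothetical name and the sample pairs are invented.

```python
import struct
from collections import defaultdict


def read_rxp(buf: bytes) -> dict:
    """Map each Region Polygon ID to the list of its PAL Polygon IDs."""
    regions = defaultdict(list)
    for off in range(0, len(buf) - 7, 8):
        region_id, pal_id = struct.unpack_from(">2i", buf, off)
        regions[region_id].append(pal_id)
    return dict(regions)


# Synthetic content: region 4 is made of two PAL polygons.
pairs = [(1, 120), (2, 11), (3, 12), (4, 13), (4, 202)]
sample = b"".join(struct.pack(">2i", r, p) for r, p in pairs)
regions = read_rxp(sample)
print(regions)
```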

     

 

2.13 RPL - Specific to Regions

 

    E00-RPL are also specific to region coverages. In the binary coverage, they correspond to files with a ".pal" extension. There is one .pal file for each region in the coverage and they use the exact same structure as "pal.adf" files. Each .pal file probably contains the definition of the polygons that belong to that region.

    RPL files come with an index file with a ".pax" extension.

    See the "pal.adf" description...

    RPL files were never encountered in PC Coverages and Weird Coverages.

 


 

3. THE ATTRIBUTE FILES

Each type of coverage has a different way to store attribute information:

 

  • V7.x Coverages maintain a "../info" directory with the attribute files for all the coverages that are located in the same parent directory.

     

  • PC Coverages V1 and V2 store their attribute information in regular DBF files inside the coverage directory.

     

  • Weird Coverages also use a "../info" directory shared by a number of coverages, but the organization of that info directory differs a little from what we find in V7.x Coverages.

 

3.1 INFO FILES IN V7.x COVERAGES

 

    The INFO files are tables with the attribute information attached to an Arc/Info coverage. The data files themselves are stored in the coverage directory, but the definition of the table fields are stored in the "../info" directory.

    The ../info directory is shared by all the coverages stored in its parent directory, and contains the following files:

     

            arc.dir
            arc0000.dat
            arc0000.nit
            arc0001.dat
            arc0001.nit
            ...

 

3.1.1 INFO/ARC.DIR

 

    Contains one record for each attribute table (arc*.*) in this info directory. The file has no header, and each record has a fixed size of 380 bytes.

     

    Bytes    Type    Description
    0-31     char    Table name (as shown by Arc/Info) padded with spaces
    32-39    char    Internal Name ("ARC#### " file name)
    40-41    int16   Number of fields in table (valid fields... see below)
    42-43    int16   Table Record size (rounded up to a multiple of 2 bytes)
    44-59    char    ??? 16 spaces
    60-61    int16   ??? Always 132
    62-63    int16   ??? Always 0
    64-67    int32   Number of records  (may also be only an int16???)
    68-77            ??? All zeros
    78-79    char    External flag ("  " or "XX", see note below)
    80-317           ??? All zeros
    318-325  char    ??? 8 spaces
    326-379          ??? All zeros

    Note that the Arc/Info table name (first field above) is always the coverage name followed by an extension (ex: TEST.AAT, TEST.TIC, TEST.BND, TEST.PAT, TEST.PATCOUNTRY, etc.). So this name can be used to search the arc.dir for all tables related to a given coverage.

    The arc.dir entry contains the number of valid fields in the table, but the arc####.NIT file can contain deleted field definitions (and these deleted field definitions are even exported in the E00 table headers produced by Arc/Info). In this case the number of field entries in the arc####.nit file will be bigger than the number of fields found here. Unfortunately there does not appear to be anything in the arc.dir that would allow us to tell if the table has deleted fields (or not) until we go and read the arc####.nit file.

    In some cases, the "number of records" field for a table in the arc.dir does not correspond to the real number of records in the data file. In this kind of situation, the number of records returned by Arc/Info in the corresponding E00 file will be based on the real data file size (obtained with stat()), and not on the value from the arc.dir. (i.e. use num_records = physical_data_file_size/record_size)

    The external flag tells where the data file is located.

    • A value of "  " indicates an internal table, i.e. the data is stored directly in the info/arc####.dat file.
    • A value of "XX" indicates an external table, i.e. the data is stored in a file outside of the info directory. In this case the arc####.dat file contains one 80 chars string with the path to this external data file relative to the info directory (padded with spaces).


    When the value for "number of records" in the arc.dir is 0, then the data file for this table may not exist yet.

    The value for "record size" in the arc.dir is the real size used by each record in the data file, and thus must be a multiple of 2, since data records are padded at the end to be aligned on a 2-byte boundary.

    There does not appear to be any difference between single and double precision table entries.

 

3.1.2 INFO/ARC*.DAT

 

    For internal tables (see external flag in the arc.dir entry), this file contains the table data.

    For external tables, it is an 80-character ASCII file that contains the relative path of the file that holds the table data. The path is padded with spaces to 80 characters.

     

    Ex: ../test/tic.adf
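    As a rough sketch of how a reader might locate the table data, the Python below decodes the 80-character path and resolves it relative to the info directory. The function names are hypothetical, and the sketch assumes the caller already knows from the arc.dir external flag whether the table is external.

```python
import os


def decode_external_path(raw: bytes) -> str:
    """Strip the space padding from the 80-character path record."""
    return raw[:80].decode("ascii", "replace").rstrip(" \x00\r\n")


def resolve_table_data_path(info_dir: str, dat_name: str, external: bool) -> str:
    """Return the path of the file that actually holds the table data."""
    dat_path = os.path.join(info_dir, dat_name)
    if not external:
        # Internal table: the arc####.dat file itself contains the data.
        return dat_path
    with open(dat_path, "rb") as f:
        relative = decode_external_path(f.read(80))
    # The stored path is relative to the info directory.
    return os.path.normpath(os.path.join(info_dir, relative))
```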

 

3.1.3 INFO/ARC*.NIT

 

    Contains the table fields definition. The file has no header, and each field definition record has a fixed size of 144 bytes.

    The meaning of the items marked with a question mark is unknown, but they could be recognized in the following E00 IFO table header.

     

      ----------------------------------------------------------------------
      FNODE#            4-1   14-1   5-1 50-1  -1  -1-1                   1
      ----------------------------------------------------------------------

    Bytes     Type    Description
    0-15      char    (FNODE#) Field name padded with spaces
    16-17     int16   (4) Storage size in bytes
    18-19     int16   (-1) ?
    20-21     int16   (1) 1-based offset of the field in a record
    22-23     int16   (4) ? (always 4 !!!)
    24-25     int16   (-1) ?
    26-27     int16   (5) Display format: width
    28-29     int16   (-1) Display format: number of decimals or -1 if not applicable.
    30-31     int16   (5) First digit of field type
    32-33     int16   (0) 2nd digit of field type (always 0)
    34-35     int16   (-1) ?
    36-37     int16   (-1) ?
    38-39     int16   (-1) ?
    40-41     int16   (-1) ?
    42-57     char    ? Alternate Name (always blank!) ?
    ...
    114-115   int16   (1) 1-based field index (-1 if field is deleted)
    116-143           ? All zeros
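    The field definition record above could be decoded along these lines. This is a sketch under assumptions: NitField and parse_nit_record are hypothetical names, the big-endian default byte order is an assumption, and the items whose meaning is unknown are simply skipped.

```python
import struct
from typing import NamedTuple

NIT_RECORD_SIZE = 144  # fixed record size stated above


class NitField(NamedTuple):
    name: str       # field name, e.g. "FNODE#"
    size: int       # storage size in bytes
    offset: int     # 1-based offset of the field in a record
    width: int      # display width
    decimals: int   # display decimals, -1 if not applicable
    type_code: int  # 10, 20, 30, 40, 50 or 60
    index: int      # 1-based field index, -1 if the field is deleted


def parse_nit_record(record: bytes, byte_order: str = ">") -> NitField:
    """Decode one field definition. byte_order is an assumption."""
    if len(record) != NIT_RECORD_SIZE:
        raise ValueError("a .nit record must be exactly 144 bytes")
    name = record[0:16].decode("ascii", "replace").rstrip(" \x00")
    # Bytes 16-41 hold thirteen consecutive int16 values.
    v = struct.unpack_from(byte_order + "13h", record, 16)
    (index,) = struct.unpack_from(byte_order + "h", record, 114)
    return NitField(
        name=name,
        size=v[0],
        offset=v[2],
        width=v[5],
        decimals=v[6],
        type_code=v[7] * 10 + v[8],  # two stored digits, e.g. 5 and 0 -> 50
        index=index,
    )
```

    A reader would typically keep only the records whose index is not -1, which should leave as many fields as the arc.dir entry announces.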

    The field type is specified by the following codes:

     

    10 (D) Date (stored as 8 bytes, display width must be either
           8 chars (12/31/99) or 10 chars (12/31/1999) )
    20 (C) Character string
    30 (I) Integer with fixed number of digits (1 byte storage per digit)
    40 (N) Numeric value with decimals and fixed number of digits
           (1 byte storage per digit, value is right-justified)
    50 (B) Binary integer (2 or 4 bytes)
    60 (F) Binary float (4 or 8 bytes, depends on coverage precision)
    Ref: Understanding GIS... p.6-5, 6-6

    When exported to E00, here is the form that each type takes:

     

    10 (D) 8 characters
    20 (C) Nbr of chars = field storage size.
    30 (I) Nbr of chars = field storage size, value is right-justified
    40 (N) stored as single prec. floats = 14 chars, ex: "-1.7735416E+00"
           (Uses 1 byte storage per digit internally, but always stored
           as single precision floats in both single and double
           precision E00 tables.)
    50 (B) 32 bits integer use 11 chars, right-justified
           16 bits integer????? Never saw an E00 that contained any!
           but it would probably be 6 chars since the biggest
           value to store would be "-32767"
    60 (F) single prec. = 14 chars total, ex: "-1.7735416E+00"
           double prec. = 24 chars total, ex: "-2.60358875000000000E+05"
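    The E00 widths listed above can be summarized in a small lookup function. The function name is hypothetical, and the 6-character width for 16-bit integers is only the guess stated above, not an observed value.

```python
def e00_field_width(type_code: int, storage_size: int) -> int:
    """Number of characters a field occupies in an E00 table record."""
    if type_code == 10:           # date
        return 8
    if type_code in (20, 30):     # character string, fixed-digit integer
        return storage_size
    if type_code == 40:           # numeric, always written as a single prec. float
        return 14
    if type_code == 50:           # binary integer
        if storage_size == 4:
            return 11
        if storage_size == 2:
            return 6              # unverified guess from the text above
    if type_code == 60:           # binary float
        if storage_size == 4:
            return 14
        if storage_size == 8:
            return 24
    raise ValueError(f"unsupported type {type_code} with size {storage_size}")
```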

 

3.1.4 TABLE DATA files (.adf, ...)

 

    The table data itself is stored in binary files inside the coverage directory. They usually have a .adf extension in simple coverages, (ex: tic.adf, bnd.adf, aat.adf, pat.adf, ...) but it may not always be the case for coverages with regions, etc.

    These files do not have any header, and they have fixed size records of the size specified in the corresponding ../info/arc####.dat and ../info/arc####.nit files.

     

 

3.1.5 Name and location of TABLE DATA files

 

    When reading a coverage, the information found in the arc.dir (and in the arc####.dat for external tables) is sufficient to establish the location of the actual data file.

    However, when the time comes to create a new coverage, one needs to know how to name the data files and where to place them.

    For internal tables, the data goes directly in the info directory, inside the arc####.dat, so there is not much to worry about.

    For external tables, the table name (first field in the arc.dir, and in an E00 table header) is composed of 3 parts:

             [COVERNAME].[EXT][SUBCLASSNAME]

    • [COVERNAME]:
      The first part of the table name (before the '.') is the name of the coverage to which the table belongs, and the data file will be created in this coverage's directory... so it is assumed that the directory "../[covername]" already exists and is writable.
    • [EXT]:
      The coverage name is followed by a 3 chars extension that will be used to build the name of the external table to create.
    • [SUBCLASSNAME]:
      For some table types, the extension is followed by a subclass name.

    When [SUBCLASSNAME] is present, then the data file name will be:

                "../[covername]/[subclassname].[ext]"

    e.g. The table named "TEST.PATCOUNTY" would be stored in the file "../test/county.pat" (this path is relative to the info directory)

    When the [SUBCLASSNAME] is not present, then the name of the data file will be:

                "../[covername]/[ext].adf"

    e.g. The table named "TEST.PAT" would be stored in the file "../test/pat.adf"

    Of course, it would be too easy if there were no exceptions to these rules! Single precision ".TIC" and ".BND" follow the above rules and will be named "tic.adf" and "bnd.adf" but in double precision coverages, they will be named "dbltic.adf" and "dblbnd.adf".
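    The naming rules for external tables, including the TIC/BND exception, could be expressed as follows. The function name is hypothetical, and lower-casing the path components is an assumption based on the examples above.

```python
def external_data_file_path(table_name: str, double_precision: bool = False) -> str:
    """Build the data file path (relative to the info directory) for a table."""
    cover, dot, rest = table_name.partition(".")
    if not dot or not cover or len(rest) < 3:
        raise ValueError("expected a name of the form COVERNAME.EXT[SUBCLASSNAME]")
    cover = cover.lower()
    ext = rest[:3].lower()        # the 3-character extension
    subclass = rest[3:].lower()   # empty when there is no subclass name
    if subclass:
        return f"../{cover}/{subclass}.{ext}"
    if double_precision and ext in ("tic", "bnd"):
        # Exception: double precision TIC and BND tables get a "dbl" prefix.
        return f"../{cover}/dbl{ext}.adf"
    return f"../{cover}/{ext}.adf"
```

    For example, external_data_file_path("TEST.PATCOUNTY") gives "../test/county.pat", and external_data_file_path("TEST.TIC", double_precision=True) gives "../test/dbltic.adf".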

     

 

3.2 INFO FILES IN "WEIRD" COVERAGES

 

    Weird coverages use the same method to store their INFO tables as V7.x Coverages except for the file names used.

     

          V7 Filename            Corresponding Weird Filename
          info/arc.dir           info/arcdr9
          info/arc0000.dat       info/arc000dat
          info/arc0000.nit       info/arc000nit
          covername/aat.adf      covername/aat

    Weird coverage filenames and directory names are often in upper case. We've also observed some coverages in which the DAT/NIT filenames were truncated to 8 characters, e.g. "ARC000DA", "ARC000NI", ...

    Another difference that was noted is that the "ARCDR9" file can contain multiple entries for the same table name, and the only way to tell which one is valid is by looking for the corresponding DAT and NIT files.

    V7 coverages will overwrite old tables in the arc.dir, but weird coverages seem to always append to the end of the index.

 

3.3 DBF FILES IN PC COVERAGES V1 and V2

 

    PC Coverages store their attribute information in regular DBF files inside the coverage directory.

    File names:

    There is no equivalent to the arc.dir with the list of tables for each coverage. You have to look for "???.DBF" in the coverage directory to get the list of tables.

    Here are the most common .DBF table filenames we can find:

    • AAT.DBF
    • PAT.DBF
    • TIC.DBF
    • BND.DBF
    • LUT.DBF

    Double precision PC Coverages V2 may contain:

    • AAT.DBF
    • PAT.DBF
    • DBLTIC.DBF
    • DBLBND.DBF
    • LANDUSE.DBF

    Field Names:

    Because of restrictions in the DBF specs for attribute names, some special attribute names have to be repaired when they are read from a DBF file. For instance, in a coverage named "TEST", the following DBF field names will contain "_" characters in place of some characters that are not permitted in DBF field names:

        .DBF Attribute Name    Arc/Info Name
        TEST_                  TEST#
        TEST_ID                TEST-ID
        FNODE_                 FNODE#
        TNODE_                 TNODE#
        LPOLY_                 LPOLY#
        RPOLY_                 RPOLY#

    It is also important to note that DBF field names are limited to 10 characters while Arc/Info field names can have up to 15 characters.
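    A minimal sketch of the field name repair, covering only the names listed in the table above (the function name is hypothetical, and any other name is returned unchanged):

```python
def dbf_to_arcinfo_field_name(dbf_name: str, cover_name: str) -> str:
    """Map a DBF field name back to its Arc/Info name for a given coverage."""
    name = dbf_name.upper()
    cover = cover_name.upper()
    repairs = {
        f"{cover}_": f"{cover}#",       # e.g. TEST_   -> TEST#
        f"{cover}_ID": f"{cover}-ID",   # e.g. TEST_ID -> TEST-ID
        "FNODE_": "FNODE#",
        "TNODE_": "TNODE#",
        "LPOLY_": "LPOLY#",
        "RPOLY_": "RPOLY#",
    }
    return repairs.get(name, name)
```

    This sketch does not try to handle coverage names long enough for the DBF 10-character limit to truncate the field name.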

    Field Data Types:

    DBF and INFO files do not use the same code for field data types. The DBF data types have to be mapped to Arc/Info data types:

     

        Arc/Info Data Type      DBF Field Data Type
        10 (D) Date             ??? Never seen any... probably 'D' (date)
        20 (C) Char             'C' - char
        30 (I) Integer          'N' - Numeric, decimals=0
        40 (N) Numeric          ??? Never seen any... probably 'N'
        50 (B) Binary int.      'N' - Numeric, decimals=0
        60 (F) Binary float     'N' - Numeric, see note below

    Note: Floating point values (type 60) are stored inside the DBF file using exponent notation, in 13-character numeric ('N') fields with 0 significant digits before the point. (e.g. -110.333300 is stored as -.1103333E+03, and 65.277460 is stored as 0.6527746E+02)

    Note2: What is the difference between types 30 and 50 when stored in DBF files? It seems that all system attributes (TEST#, TEST-ID, FNODE#, etc...) are always stored as type 50, and all user-defined integer fields would always be stored as type 30.
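    The exponent notation described in the first note can be read with a plain float conversion, and written back with a little string manipulation. The sketch below uses hypothetical function names; the representation chosen for zero is an assumption, and only finite values with a two-digit exponent produce the expected 13 characters.

```python
def decode_dbf_float(text: str) -> float:
    """Parse a value such as '-.1103333E+03' or '0.6527746E+02'."""
    return float(text.strip())


def encode_dbf_float(value: float) -> str:
    """Format a finite value with 7 digits, all of them after the point."""
    if value == 0:
        return "0.0000000E+00"  # assumption: zero was not shown in the examples
    # Standard form first, e.g. 110.3333 -> '1.103333E+02'.
    mantissa, exponent = f"{abs(value):.6E}".split("E")
    digits = mantissa.replace(".", "")
    # Moving the point one place to the left raises the exponent by one.
    lead = "-" if value < 0 else "0"
    return f"{lead}.{digits}E{int(exponent) + 1:+03d}"
```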

     
