EXCEL FILE FORMAT 1

Source: Internet  Posted: 火灾数据2014  Editor: 程序博客网  Date: 2024/06/04 19:25

Microsoft Excel is a popular spreadsheet program.  It uses a file format called BIFF
(Binary File Format).  There are many types of BIFF records.  Each has a 4-byte header.
The first two bytes are an opcode that specifies the record type.  The second two bytes
specify the length of the record body.  Header values are stored in byte-reversed
(little-endian) form, with the less significant byte first.  The rest of the record is
the data itself (Figure 2-1).

Figure 2-1.  BIFF record header.
                    |  Record Header    |  Record Body
Byte Number         |  0    1    2    3 |  0    1   ...
                    -----------------------------------
Record Contents     | XX | XX | XX | XX | XX | XX | ...
                    -----------------------------------
                    | opcode  | length  | data

Each X represents a hexadecimal digit, and two X's form a byte.  The least significant
(low) byte of the opcode is byte 0 and the most significant (high) byte is byte 1.
Similarly, the low byte of the record length field is byte 2 and the high byte is byte 3.
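
The header layout described above can be sketched in a few lines of Python.  This is
only an illustration, not a complete BIFF reader: the function names and the sample
bytes (opcode 1234h, length 2) are made up for the example.  The "<HH" format string
tells struct to read two unsigned 16-bit values with the low byte first.

```python
import struct

def read_record_header(buf, offset=0):
    """Return (opcode, length) from the 4-byte header found at `offset`."""
    # "<" = little-endian (low byte first), "H" = unsigned 16-bit integer.
    opcode, length = struct.unpack_from("<HH", buf, offset)
    return opcode, length

def iter_records(buf):
    """Yield (opcode, body) for each record in a buffer of back-to-back records."""
    offset = 0
    while offset + 4 <= len(buf):
        opcode, length = read_record_header(buf, offset)
        end = offset + 4 + length
        if end > len(buf):
            raise ValueError("record body runs past the end of the buffer")
        yield opcode, buf[offset + 4:end]
        offset = end

# Hypothetical record: bytes 34 12 give opcode 1234h, bytes 02 00 give length 2.
sample = bytes([0x34, 0x12, 0x02, 0x00, 0xAA, 0xBB])
opcode, length = read_record_header(sample)
print(hex(opcode), length)
```

Because every header states the body length, a reader can skip record types it does not
understand by advancing 4 + length bytes, which is what iter_records does.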

BOF (Beginning of File)
The first record in every spreadsheet file is always of the BOF type (Figure 2-2).

Figure 2-2.  BOF record.
           |  Record Header    |    Record Body    |
Byte       |  0    1    2    3 |  0    1    2    3 |
           -----------------------------------------
Contents   | 09 | 00 | 04 | 00 | 02 | 00 | 10 | 00 |
           -----------------------------------------
           | opcode  | length  | version |  file   |
           |         |         |  number |  type   |
The first two bytes, arranged with the low byte first, show that the opcode for BOF is
09h.  The second two bytes indicate that the record body is 4 bytes long.  The first two
bytes of the body are the version number (2 for the initial version of Excel).  The last
two bytes are the file type.  Type 10h is a worksheet file.
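
As a rough sketch, the BOF record of Figure 2-2 could be decoded as follows.  The
function and constant names are assumptions chosen for this example.  The check accepts
any body of at least 4 bytes, since only the 4-byte form is described here.

```python
import struct

BOF_OPCODE = 0x0009           # opcode for BOF, as shown in Figure 2-2
FILE_TYPE_WORKSHEET = 0x0010  # file type 10h = worksheet file

def parse_bof(record):
    """Decode a BOF record (header plus body) into (version, file_type)."""
    opcode, length = struct.unpack_from("<HH", record, 0)
    if opcode != BOF_OPCODE:
        raise ValueError("not a BOF record: opcode %04Xh" % opcode)
    if length < 4 or len(record) < 8:
        raise ValueError("BOF body is too short")
    # The body starts right after the 4-byte header.
    version, file_type = struct.unpack_from("<HH", record, 4)
    return version, file_type

# The exact bytes from Figure 2-2.
bof = bytes.fromhex("09 00 04 00 02 00 10 00")
version, file_type = parse_bof(bof)
print(version, file_type == FILE_TYPE_WORKSHEET)  # prints: 2 True
```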

Relating Spreadsheet Cells to Record Data Bytes
A spreadsheet appears on a screen or printout as a matrix of rectangular cells.  Each
column is identified by a letter at its top, and each row is identified by a number.
Thus cell A1 is in the first column and the first row.  Cell C240 is in the third column
and the 240th row.  This scheme identifies cells in a way easily understood by people.
However, it is not particularly convenient for computers, which work most naturally with
binary numbers rather than letters.  Thus, Excel stores cell identifiers as binary
numbers, which people can read as hexadecimal.  Numbering starts at 0 rather than 1.
Figure 2-3, which shows the form of an INTEGER record, illustrates the storage of column
and row information.
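
The mapping between stored numbers and on-screen cell names can be illustrated with a
short Python sketch.  The function names are hypothetical.  The handling of columns
beyond Z (AA, AB, ...) follows the usual spreadsheet convention and is an assumption
here, since the text above only discusses single-letter columns.

```python
def column_letters(col):
    """Convert a zero-based column index to letters: 0 -> A, 2 -> C, 26 -> AA."""
    letters = ""
    col += 1  # switch to one-based so the base-26 arithmetic has no zero digit
    while col > 0:
        col, rem = divmod(col - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def cell_name(row, col):
    """Build the name people see from the zero-based row and column stored in a record."""
    return column_letters(col) + str(row + 1)

print(cell_name(0, 0))    # prints: A1
print(cell_name(239, 2))  # prints: C240
```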


Figure 2-3.  INTEGER record.
      |  Record Header    |  Record Body
Byte  |  0    1    2    3 |  0    1    2    3    4    5    6    7    8 |
      ------------------------------------------------------------------
Value | 02 | 00 | 09 | 00 | 00 | 00 | 02 | 00 | 00 | 00 | 00 | 39 | 00 |
      ------------------------------------------------------------------
      | opcode  | length  |   row   | column  |   rgbAttr    |    w    |
Opcode 2 indicates an INTEGER record.  The length bytes show that the record body is 9
bytes long.  Row 0 in the body corresponds to spreadsheet row 1.  Row 1 corresponds to
spreadsheet row 2, and so on.  Column 2 corresponds to spreadsheet column C.  Thus,
Figure 2-3 deals with cell C1.  The next three bytes, labeled "rgbAttr," specify cell
attributes (Table 2-3).  The final pair of bytes (labeled "w") holds the integer's
value.  Here it is 39h, or 57 decimal.  Thus the record specifies that cell C1 of the
spreadsheet contains an integer with the value 57.
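
To make the field layout concrete, here is a minimal Python sketch that decodes the
record from Figure 2-3.  The names are invented for the example, the rgbAttr bytes are
returned undecoded, and reading "w" as an unsigned 16-bit value is an assumption.

```python
import struct

INTEGER_OPCODE = 0x0002  # opcode for an INTEGER record, as shown in Figure 2-3

def parse_integer_record(record):
    """Decode an INTEGER record into (row, column, rgb_attr, value)."""
    opcode, length = struct.unpack_from("<HH", record, 0)
    if opcode != INTEGER_OPCODE:
        raise ValueError("not an INTEGER record: opcode %04Xh" % opcode)
    if length != 9 or len(record) < 13:
        raise ValueError("INTEGER record body must be 9 bytes long")
    row, column = struct.unpack_from("<HH", record, 4)  # body bytes 0-3
    rgb_attr = record[8:11]                             # body bytes 4-6, left raw
    (value,) = struct.unpack_from("<H", record, 11)     # body bytes 7-8, field "w"
    return row, column, rgb_attr, value

# The exact bytes from Figure 2-3.
rec = bytes.fromhex("02 00 09 00 00 00 02 00 00 00 00 39 00")
row, column, rgb_attr, value = parse_integer_record(rec)
print(row, column, value)  # prints: 0 2 57
```

Row 0 and column 2 are the zero-based forms of cell C1, and 57 is the decimal form of
39h, matching the walkthrough above.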
