Thought bubbles - Little End and Big End


Little-endian and big-endian are computer terms; more precisely, they name two byte orders, "little-endian order" and "big-endian order".

What are they, and how did they come into the computer world?

As we know, memory is addressed byte by byte, and we normally use several consecutive bytes to form larger data types such as signed/unsigned integers or single-precision floats. For example, on a 32-bit platform an integer is usually 4 bytes wide, which raises the question of how those 4 bytes are laid out in memory. Suppose an integer i = 0x01020304; how is i stored? There are two ways to arrange its 4 bytes in memory:

form L: 0x04  0x03  0x02  0x01, where 0x04 sits at the lowest memory address.

form B: 0x01  0x02  0x03  0x04, where 0x01 sits at the lowest memory address.

Arranging the bytes of a multi-byte value as in form L is called "little-endian order", and form B is called "big-endian order". Where do the names come from? They trace back to the novel "Gulliver's Travels", in which two factions quarrel over whether a boiled egg should be cracked at its big end or its little end. At first I struggled with these names and kept forgetting which one maps to which layout, until I worked out the sense behind them: "little-endian" means the little end, the least significant byte, comes first, at the lowest memory address; "big-endian" means the big end, the most significant byte, comes first, so the least significant byte comes last. With this clue it is easy to remember which name maps to which form.
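To see this on a real machine, here is a minimal C sketch (not from the original post) that stores 0x01020304 in an unsigned integer and dumps its bytes from the lowest address upward; the order of the output tells you whether the machine is little-endian or big-endian.

```c
#include <stdio.h>

/* Print the bytes of an integer in memory order (lowest address first)
 * to reveal which byte order the current machine uses. */
int main(void)
{
    unsigned int i = 0x01020304;
    unsigned char *p = (unsigned char *)&i;   /* view the same 4 bytes one by one */

    for (size_t k = 0; k < sizeof i; k++)
        printf("address +%zu : 0x%02x\n", k, p[k]);

    /* Little-endian machine (e.g. x86): 0x04 0x03 0x02 0x01
       Big-endian machine:               0x01 0x02 0x03 0x04 */
    return 0;
}
```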

Different CPUs made by different vendors may use different byte orders. For example, Intel's x86 CPUs are little-endian, while some other processor families (SPARC and Motorola 68k, for instance) are big-endian. Before the Internet emerged this was not a problem, but it becomes one as soon as data has to travel from one computer to another. Data crosses the network byte by byte, and the network layer does not care about the byte order used on the source host, so the destination host has no way to tell which order the bytes it received were in. For example, suppose the integer i = 0x01020304 is sent, as raw memory, from a little-endian source host to a big-endian destination host: the byte 0x04 goes out first, followed by 0x03, 0x02 and 0x01. The destination receives 0x04, 0x03, 0x02, 0x01 and, being big-endian, interprets them as the integer i = 0x04030201. So we need a contract here: data transferred over the network must be in big-endian order, which is why big-endian is also known as network byte order.
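On POSIX systems this contract is served by the standard htonl()/ntohl() functions from <arpa/inet.h>, which convert a 32-bit value between host order and network order. A small illustrative sketch:

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <arpa/inet.h>   /* htonl, ntohl (POSIX) */

int main(void)
{
    uint32_t host = 0x01020304;

    /* Host-to-network: swaps the bytes on a little-endian machine,
       a no-op on a big-endian machine. */
    uint32_t wire = htonl(host);

    /* Network-to-host on the receiving side restores the original value. */
    uint32_t back = ntohl(wire);

    printf("host = 0x%08" PRIx32 ", on the wire = 0x%08" PRIx32 ", back = 0x%08" PRIx32 "\n",
           host, wire, back);
    return 0;
}
```

On a little-endian host the "on the wire" value prints as 0x04030201, which is exactly the reinterpretation problem described above, now made deliberate and reversible.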

Since we know this contract, and the OS knows it too, why do we always do the byte-order conversion ourselves in application code instead of leaving it to the OS? Because the OS wants to keep its APIs simple and clean: the socket API only accepts and delivers streams of bytes, and it has no idea whether four particular bytes form an integer, a float, or part of a string, so it cannot know what to swap. Plugging the conversion logic into it would only make it more complex, so the job stays with the application.
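As an illustration of where that leaves the application, here is a hypothetical pair of helpers (send_u32/recv_u32 are made-up names, and fd is assumed to be a connected socket) that do the conversion right at the edge of the byte-stream API; partial reads and writes are ignored for brevity.

```c
#include <stdint.h>
#include <unistd.h>      /* write, read */
#include <arpa/inet.h>   /* htonl, ntohl */

/* The application, not the OS, fixes the byte order
 * immediately before write() and immediately after read(). */
int send_u32(int fd, uint32_t value)
{
    uint32_t wire = htonl(value);              /* host order -> network order */
    return write(fd, &wire, sizeof wire) == (ssize_t)sizeof wire ? 0 : -1;
}

int recv_u32(int fd, uint32_t *value)
{
    uint32_t wire;
    if (read(fd, &wire, sizeof wire) != (ssize_t)sizeof wire)
        return -1;
    *value = ntohl(wire);                      /* network order -> host order */
    return 0;
}
```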
