Data Types

Source: Internet. Editor: 程序博客网. Time: 2024/04/27 14:38

Java is a strongly typed language. This means that every variable must have a declared type. There are eight primitive types in Java. Four of them are integer types; two are floating-point number types; one is the character type char, used for code units in the Unicode encoding scheme (see the section on the char type); and one is a boolean type for truth values.

NOTE

Java has an arbitrary precision arithmetic package. However, "big numbers," as they are called, are Java objects and not a new Java type. You see how to use them later in this chapter.

Integers

The integer types are for numbers without fractional parts. Negative values are allowed. Java provides the four integer types shown in Table 3-1.

Table 3-1. Java Integer Types

Type     Storage Requirement   Range (Inclusive)
int      4 bytes               -2,147,483,648 to 2,147,483,647 (just over 2 billion)
short    2 bytes               -32,768 to 32,767
long     8 bytes               -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807
byte     1 byte                -128 to 127

In most situations, the int type is the most practical. If you want to represent the number of inhabitants of our planet, you'll need to resort to a long. The byte and short types are mainly intended for specialized applications, such as low-level file handling, or for large arrays when storage space is at a premium.

Under Java, the ranges of the integer types do not depend on the machine on which you will be running the Java code. This alleviates a major pain for the programmer who wants to move software from one platform to another, or even between operating systems on the same platform. In contrast, C and C++ programs use the most efficient integer type for each processor. As a result, a C program that runs well on a 32-bit processor may exhibit integer overflow on a 16-bit system. Because Java programs must run with the same results on all machines, the ranges for the various types are fixed.
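
As a rough illustration of these fixed ranges, the following sketch prints the limits of each integer type and shows an int wrapping around on overflow. The class and method names are made up for this example, and the population figure is an assumed round number.

```java
// Illustrative sketch; the class and method names are not from the text.
final class IntegerRanges {
    // Adding 1 to the largest int silently wraps around to the smallest int.
    static int incrementPastMax() {
        int max = Integer.MAX_VALUE; // 2,147,483,647 on every platform
        return max + 1;
    }

    // A value of several billion needs a long; 8 billion is an assumed round figure.
    static long populationEstimate() {
        return 8000000000L;
    }

    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);
        System.out.println(incrementPastMax() == Integer.MIN_VALUE);  // true
        System.out.println(populationEstimate() > Integer.MAX_VALUE); // true
    }
}
```

The limits come from the standard wrapper classes, so the same numbers should appear on any machine that runs the program.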

Long integer numbers have a suffix L (for example, 4000000000L). Hexadecimal numbers have a prefix 0x (for example, 0xCAFE). Octal numbers have a prefix 0. For example, 010 is 8. Naturally, this can be confusing, and we recommend against the use of octal constants.
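
The literal forms just described can be tried out with a short sketch such as the one below. The class and constant names are hypothetical and chosen only for this example.

```java
// Illustrative sketch; the class and constant names are not from the text.
final class IntegerLiterals {
    static final long BIG = 4000000000L; // L suffix: the value does not fit in an int
    static final int HEX = 0xCAFE;       // hexadecimal literal, decimal 51966
    static final int OCTAL = 010;        // octal literal, decimal 8 (not 10)

    public static void main(String[] args) {
        System.out.println(BIG);   // 4000000000
        System.out.println(HEX);   // 51966
        System.out.println(OCTAL); // 8
    }
}
```

The last line shows why octal constants are easy to misread: the leading 0 changes the value from ten to eight.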

C++ NOTE

In C and C++, int denotes the integer type that depends on the target machine. On a 16-bit processor, like the 8086, integers are 2 bytes. On a 32-bit processor like the Sun SPARC, they are 4-byte quantities. On an Intel Pentium, the integer type of C and C++ depends on the operating system: for DOS and Windows 3.1, integers are 2 bytes. When 32-bit mode is used for Windows programs, integers are 4 bytes. In Java, the sizes of all numeric types are platform independent.

Note that Java does not have any unsigned types.
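
One common workaround, shown here only as a sketch, is to mask a signed byte into a wider int when the bits should be read as a value from 0 to 255. The class and method names are hypothetical.

```java
// Illustrative sketch; the class and method names are not from the text.
final class UnsignedBytes {
    // Reinterpret the 8 bits of a signed byte as a value from 0 to 255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 200;               // does not fit in -128..127, so it wraps
        System.out.println(raw);             // -56
        System.out.println(toUnsigned(raw)); // 200
    }
}
```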

Floating-Point Types

The floating-point types denote numbers with fractional parts. The two floating-point types are shown in Table 3-2.

Table 3-2. Floating-Point Types

Type     Storage Requirement   Range
float    4 bytes               approximately ±3.40282347E+38F (6-7 significant decimal digits)
double   8 bytes               approximately ±1.79769313486231570E+308 (15 significant decimal digits)

The name double refers to the fact that these numbers have twice the precision of the float type. (Some people call these double-precision numbers.) Here, the type to choose in most applications is double. The limited precision of float is simply not sufficient for many situations. Seven significant (decimal) digits may be enough to precisely express your annual salary in dollars and cents, but it won't be enough for your company president's salary. The only reasons to use float are in the rare situations in which the slightly faster processing of single-precision numbers is important or when you need to store a large number of them.
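
To make the precision limit concrete, the sketch below stores a nine-digit integer in a float and in a double. The class and method names are hypothetical, and the sample value is an arbitrary choice for this example.

```java
// Illustrative sketch; the class and method names are not from the text.
final class FloatPrecision {
    // Widening to float keeps only about 7 significant decimal digits.
    static float asFloat(int value) {
        return (float) value;
    }

    // A double has enough precision to hold every int exactly.
    static double asDouble(int value) {
        return value;
    }

    public static void main(String[] args) {
        int n = 123456789;
        System.out.println((long) asFloat(n));  // 123456792
        System.out.println((long) asDouble(n)); // 123456789
    }
}
```

The float result is off in the last digits because the nearest representable float to 123456789 is 123456792.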

Numbers of type float have a suffix F (for example, 3.402F). Floating-point numbers without an F suffix (such as 3.402) are always considered to be of type double. You can optionally supply the D suffix (for example, 3.402D).

As of JDK 5.0, you can specify floating-point numbers in hexadecimal. For example, 0.125 is the same as 0x1.0p-3. In hexadecimal notation, you use a p, not an e, to denote the exponent.
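
The floating-point literal forms can be compared in a small sketch like this one. The class and constant names are hypothetical.

```java
// Illustrative sketch; the class and constant names are not from the text.
final class FloatingPointLiterals {
    static final float SINGLE = 3.402F;       // F suffix makes the literal a float
    static final double PLAIN = 3.402;        // no suffix: double
    static final double SUFFIXED = 3.402D;    // optional D suffix: also double
    static final double HEX_EIGHTH = 0x1.0p-3; // 1.0 times 2 to the power -3

    public static void main(String[] args) {
        System.out.println(PLAIN == SUFFIXED); // true
        System.out.println(HEX_EIGHTH);        // 0.125
        System.out.println(SINGLE);
    }
}
```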

All floating-point computations follow the IEEE 754 specification. In particular, there are three special floating-point values:

· positive infinity

· negative infinity

· NaN (not a number)

to denote overflows and errors. For example, the result of dividing a positive number by 0 is positive infinity. Computing 0/0 or the square root of a negative number yields NaN.
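
The three special values can be produced with ordinary floating-point arithmetic, as in the following sketch. The class and method names are hypothetical.

```java
// Illustrative sketch; the class and method names are not from the text.
final class SpecialValues {
    // Floating-point division never throws; it yields a special value instead.
    // (Integer division by zero is different: it throws ArithmeticException.)
    static double divide(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        System.out.println(divide(1.0, 0.0));  // Infinity
        System.out.println(divide(-1.0, 0.0)); // -Infinity
        System.out.println(divide(0.0, 0.0));  // NaN
        System.out.println(Math.sqrt(-1.0));   // NaN
    }
}
```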

NOTE

The constants Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, and Double.NaN (as well as the corresponding Float constants) represent these special values, but they are rarely used in practice. In particular, you cannot test

if (x == Double.NaN) // is never true

to check whether a particular result equals Double.NaN. All "not a number" values are considered distinct. However, you can use the Double.isNaN method:

if (Double.isNaN(x)) // check whether x is "not a number"
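
The two tests can be placed side by side in a short runnable sketch. The class and method names are hypothetical.

```java
// Illustrative sketch; the class and method names are not from the text.
final class NaNCheck {
    // Comparing with == is always false, even when x itself is NaN.
    static boolean equalsNaN(double x) {
        return x == Double.NaN;
    }

    // Double.isNaN is the reliable test.
    static boolean isNaN(double x) {
        return Double.isNaN(x);
    }

    public static void main(String[] args) {
        double x = 0.0 / 0.0;             // produces NaN
        System.out.println(equalsNaN(x)); // false
        System.out.println(isNaN(x));     // true
    }
}
```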

 

 

CAUTION

Floating-point numbers are not suitable for financial calculation in which roundoff errors cannot be tolerated. For example, the command System.out.println(2.0 - 1.1) prints 0.8999999999999999, not 0.9 as you would expect. Such roundoff errors are caused by the fact that floating-point numbers are represented in the binary number system. There is no precise binary representation of the fraction 1/10, just as there is no accurate representation of the fraction 1/3 in the decimal system. If you need precise numerical computations without roundoff errors, use the BigDecimal class, which is introduced later in this chapter.
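
As a brief preview, and only as a sketch, the subtraction from this caution can be repeated with BigDecimal. The class and method names are hypothetical; the BigDecimal values are built from strings so that no binary rounding occurs before the subtraction.

```java
import java.math.BigDecimal;

// Illustrative sketch; the class and method names are not from the text.
final class ExactSubtraction {
    // Binary floating-point arithmetic: the result is close to 0.9 but not equal to it.
    static double withDouble() {
        return 2.0 - 1.1;
    }

    // Decimal arithmetic on values constructed from strings: the result is exact.
    static BigDecimal withBigDecimal() {
        return new BigDecimal("2.0").subtract(new BigDecimal("1.1"));
    }

    public static void main(String[] args) {
        System.out.println(withDouble());     // 0.8999999999999999
        System.out.println(withBigDecimal()); // 0.9
    }
}
```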

The char Type

To understand the char type, you have to know about the Unicode encoding scheme. Unicode was invented to overcome the limitations of traditional character encoding schemes. Before Unicode, there were many different standards: ASCII in the United States, ISO 8859-1 for Western European languages, KOI-8 for Russian, GB18030 and BIG-5 for Chinese, and so on. This causes two problems. A particular code value corresponds to different letters in the various encoding schemes. Moreover, the encodings for languages with large character sets have variable length: some common characters are encoded as single bytes, others require two or more bytes.

Unicode was designed to solve these problems. When the unification effort started in the 1980s, a fixed 2-byte width code was more than sufficient to encode all characters used in all languages in the world, with room to spare for future expansion—or so everyone thought at the time. In 1991, Unicode 1.0 was released, using slightly less than half of the available 65,536 code values. Java was designed from the ground up to use 16-bit Unicode characters, which was a major advance over other programming languages that used 8-bit characters.

Unfortunately, over time, the inevitable happened. Unicode grew beyond 65,536 characters, primarily due to the addition of a very large set of ideographs used for Chinese, Japanese, and Korean. Now, the 16-bit char type is insufficient to describe all Unicode characters.

We need a bit of terminology to explain how this problem is resolved in Java, beginning with JDK 5.0. A code point is a code value that is associated with a character in an encoding scheme. In the Unicode standard, code points are written in hexadecimal and prefixed with U+, such as U+0041 for the code point of the letter A. Unicode has code points that are grouped into 17 code planes. The first code plane, called the basic multilingual plane, consists of the "classic" Unicode characters with code points U+0000 to U+FFFF. Sixteen additional planes, with code points U+10000 to U+10FFFF, hold the supplementary characters.

The UTF-16 encoding is a method of representing all Unicode code points in a variable-length code. The characters in the basic multilingual plane are represented as 16-bit values, called code units. The supplementary characters are encoded as consecutive pairs of code units. Each of the values in such an encoding pair falls into a range of 2048 unused code values in the basic multilingual plane, called the surrogates area (U+D800 to U+DBFF for the first code unit, U+DC00 to U+DFFF for the second code unit). This is rather clever, because you can immediately tell whether a code unit encodes a single character or whether it is the first or second part of a supplementary character. For example, the mathematical symbol for the set of integers has code point U+1D56B and is encoded by the two code units U+D835 and U+DD6B. (See http://en.wikipedia.org/wiki/UTF-16 for a description of the encoding algorithm.)
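
The pairing for U+1D56B can be reproduced with a few lines of arithmetic. The sketch below is a hand-written version of the standard UTF-16 calculation, checked against the library method Character.toChars; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch; the class and method names are not from the text.
final class SurrogatePairs {
    // Split a supplementary code point (U+10000 and above) into two code units.
    static char[] encode(int codePoint) {
        int offset = codePoint - 0x10000;              // a 20-bit value
        char first = (char) (0xD800 + (offset >> 10));    // upper 10 bits
        char second = (char) (0xDC00 + (offset & 0x3FF)); // lower 10 bits
        return new char[] { first, second };
    }

    public static void main(String[] args) {
        char[] units = encode(0x1D56B);
        System.out.println(Integer.toHexString(units[0])); // d835
        System.out.println(Integer.toHexString(units[1])); // dd6b
        // The library method should agree with the manual calculation.
        System.out.println(Arrays.equals(units, Character.toChars(0x1D56B))); // true
    }
}
```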

In Java, the char type describes a code unit in the UTF-16 encoding.

Our strong recommendation is not to use the char type in your programs unless you are actually manipulating UTF-16 code units. You are almost always better off treating strings (which we will discuss starting on page 51) as abstract data types.
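
One reason for this advice is that the number of char values in a string need not match the number of characters. The following sketch counts both for a string holding the single supplementary character U+1D56B; the class and method names are hypothetical.

```java
// Illustrative sketch; the class and method names are not from the text.
final class CodePointCounting {
    // Build a string that contains only the supplementary character U+1D56B.
    static String doubleStruckZ() {
        return new String(Character.toChars(0x1D56B));
    }

    // Number of UTF-16 code units (char values) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String z = doubleStruckZ();
        System.out.println(codeUnits(z));  // 2
        System.out.println(codePoints(z)); // 1
        System.out.println(Integer.toHexString(z.codePointAt(0))); // 1d56b
    }
}
```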

Having said that, there will be some cases when you will encounter char values. Most commonly, these will be character constants. For example, 'A' is a character constant with value 65. It is different from "A", a string containing a single character. Unicode code units can be expressed as hexadecimal values that run from \u0000 to \uFFFF. For example, \u2122 is the trademark symbol (™) and \u03C0 is the Greek letter pi (π).

Besides the \u escape sequences that indicate the encoding of Unicode code units, there are several escape sequences for special characters, as shown in Table 3-3. You can use these escape sequences inside quoted character constants and strings, such as '\u2122' or "Hello\n". The \u escape sequence (but none of the other escape sequences) can even be used outside quoted character constants and strings. For example,

public static void main(String\u005B\u005D args)

is perfectly legal: \u005B and \u005D are the UTF-16 encodings of the Unicode code points for [ and ].

Table 3-3. Escape Sequences for Special Characters

Escape Sequence   Name              Unicode Value
\b                Backspace         \u0008
\t                Tab               \u0009
\n                Linefeed          \u000a
\r                Carriage return   \u000d
\"                Double quote      \u0022
\'                Single quote      \u0027
\\                Backslash         \u005c
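
The escape sequences can be exercised in a small program like the sketch below, which also uses the unusual main signature from the example above. The class and constant names are hypothetical, and whether the two symbols display correctly depends on the console.

```java
// Illustrative sketch; the class and constant names are not from the text.
final class EscapeSequences {
    static final char TRADEMARK = '\u2122'; // the trademark symbol
    static final char PI = '\u03C0';        // the Greek letter pi
    static final String GREETING = "Hello\n"; // five letters plus a linefeed

    // The escapes for [ and ] in the signature are translated before parsing.
    public static void main(String\u005B\u005D args) {
        System.out.println((int) 'A');         // 65
        System.out.println(GREETING.length()); // 6
        System.out.println(TRADEMARK);
        System.out.println(PI);
        System.out.println("Tab:\there, quote: \", backslash: \\");
    }
}
```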

NOTE

Although you can use any Unicode character in a Java application or applet, whether you can actually see it displayed depends on your browser (for applets) and (ultimately) on your operating system for both.

The boolean Type

The boolean type has two values, false and true. It is used for evaluating logical conditions. You cannot convert between integers and boolean values.

C++ NOTE

In C++, numbers and even pointers can be used in place of boolean values. The value 0 is equivalent to the bool value false, and a non-zero value is equivalent to true. This is not the case in Java. Thus, Java programmers are shielded from accidents such as

if (x = 0) // oops...meant x == 0

 

In C++, this test compiles and runs, always evaluating to false. In Java, the test does not compile because the integer expression x = 0 cannot be converted to a boolean value.
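
Because no conversion exists, a Java program has to state the intended relationship between a number and a truth value explicitly. The sketch below shows one way to do this; the class and method names are hypothetical.

```java
// Illustrative sketch; the class and method names are not from the text.
final class BooleanConversions {
    // Turn an int into a boolean with an explicit comparison.
    static boolean isNonZero(int x) {
        return x != 0;
    }

    // Turn a boolean into an int with an explicit choice of values.
    static int toInt(boolean flag) {
        return flag ? 1 : 0;
    }

    public static void main(String[] args) {
        int x = 0;
        // Writing "if (x = 0)" here would be rejected by the compiler.
        if (x == 0) {
            System.out.println("x is zero");
        }
        System.out.println(isNonZero(5)); // true
        System.out.println(toInt(true));  // 1
    }
}
```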
