Learning R - 01

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 07:46

A Scientific Calculator

R is at heart a supercharged scientific calculator, so it has a fairly comprehensive set of mathematical capabilities built in. This chapter will take you through the arithmetic operators, common mathematical functions, and relational operators, and show you how to assign a value to a variable.

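In short: this chapter covers the arithmetic operators, common mathematical functions, and relational operators, and shows how to assign a value to a variable.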

Mathematical Operations and Vectors

The + operator performs addition, but it has a special trick: as well as adding two numbers together, you can use it to add two vectors. A vector is an ordered set of values. Vectors are tremendously important in statistics, since you will usually want to analyze a whole dataset rather than just one piece of data. The colon operator, :, which you have seen already, creates a sequence from one number to the next, and the c function concatenates values, in this case to create vectors (concatenate is a Latin word meaning “connect together in a chain”).

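A vector is an ordered set of values. Vectors are very important in statistics, because you will more often want to analyze a whole set of values than a single value.

The ":" operator creates a sequence of numbers.

The "c" function concatenates a series of values into a vector.

The "+" operator works not only on two numbers but also on two vectors.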

Variable names are case sensitive in R, so we need to be a bit careful in this next example. The C function does something completely different to c.

Note: because names in R are case sensitive, take extra care here. The "C" function is a different function from "c".

> 1:5 + 6:10
[1]  7  9 11 13 15

> c(1, 3, 6, 10, 15) + c(0, 1, 3, 6, 10)
[1]  1  4  9 16 25

Both the ":" operator and the "c" function can create a vector.
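As a small sketch of the points above, the following builds the same values with the colon operator and with c, adds two vectors element by element, and checks that c and C are distinct functions. The variable names are illustrative and not part of the original text.

```r
# Build the same sequence two ways; the names below are illustrative only.
from_colon <- 1:5
from_c <- c(1, 2, 3, 4, 5)

# 1:5 is stored as integer and c(1, 2, ...) as double,
# so compare the values element by element with ==.
same_values <- all(from_colon == from_c)
print(same_values)

# + adds the two vectors element by element.
vector_sum <- from_colon + 6:10
print(vector_sum)

# Names are case sensitive: c and C refer to different functions.
c_equals_C <- identical(c, C)
print(c_equals_C)
```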

Vectorized has several meanings in R, the most common of which is that an operator or a function will act on each element of a vector without the need for you to explicitly write a loop. (This built-in implicit looping over elements is also much faster than explicitly writing your own loop.) A second meaning of vectorization is when a function takes a vector as an input and calculates a summary statistic:

> sum(1:5)
[1] 15

> median(1:5)
[1] 3
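To make the first meaning of vectorization concrete, here is a minimal sketch that adds two vectors once with an explicit loop and once with the vectorized + operator. The variable names are assumptions chosen for illustration.

```r
x <- c(1, 3, 6, 10, 15)
y <- c(0, 1, 3, 6, 10)

# Explicit loop: add the vectors one element at a time.
loop_result <- numeric(length(x))
for (i in seq_along(x)) {
  loop_result[i] <- x[i] + y[i]
}

# Vectorized: the + operator loops over the elements implicitly.
vectorized_result <- x + y

print(loop_result)
print(identical(loop_result, vectorized_result))
```

Both approaches give the same values; the vectorized form is shorter and leaves the looping to R.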

A third, much less common case of vectorization is vectorization over arguments. This is when a function calculates a summary statistic from several of its input arguments. The sum function does this, but it is very unusual. median does not:

> sum(1,2,3,4,5)
[1] 15

> median(1,2,3,4,5)
Error in median(1, 2, 3, 4, 5) : unused arguments (3, 4, 5)

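Vectorization has several meanings in R. The first is that an operator or function acts on every element of a vector without you writing an explicit loop (the built-in implicit loop usually runs faster than an explicit one). The second is that a function takes a whole vector as input and computes a summary statistic, for example sum for the total or median for the median. The third is vectorization over arguments, where a function computes a summary statistic from several input arguments: sum supports this, but median does not and raises the error shown above.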

All the arithmetic operators in R, not just plus (+), are vectorized. The following examples demonstrate subtraction, multiplication, exponentiation, and two kinds of division, as well as remainder after division:

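In R, addition, subtraction, multiplication, division, remainder, and exponentiation on a vector are all applied to each element of the vector.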

Subtraction example:

> c(2,3,5,7,11,13) - 2 # subtraction
[1]  0  1  3  5  9 11

Multiplication example:

> -2:2 * -2:2   #multiplication
[1] 4 1 0 1 4

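Exponentiation example:

> 2^3
[1] 8

Exponentiation example (with the alternative ** operator):

> 2**3
[1] 8

Exact equality (identical function) example:

> identical(2^3,2**3)
[1] TRUE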

Floating point division example:

> 1:10/3
 [1] 0.3333333 0.6666667 1.0000000 1.3333333 1.6666667 2.0000000 2.3333333
 [8] 2.6666667 3.0000000 3.3333333

Integer division example:

> 1:10%/%3
 [1] 0 0 1 1 1 2 2 2 3 3

Remainder example:

> 1:10%%3
 [1] 1 2 0 1 2 0 1 2 0 1
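Integer division and remainder fit together: multiplying the quotient by the divisor and adding the remainder gives back the original values. The sketch below checks this for the same inputs as the examples above; the variable names are illustrative.

```r
x <- 1:10
y <- 3L

quotient <- x %/% y    # integer division
remainder <- x %% y    # remainder after division

# Rebuild the original values from quotient and remainder.
reconstructed <- quotient * y + remainder

print(reconstructed)
print(all(reconstructed == x))
```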

R also contains a wide selection of mathematical functions. You get trigonometry (sin, cos, tan, and their inverses asin, acos, and atan), logarithms and exponents (log and exp, and their variants log1p and expm1 that calculate log(1 + x) and exp(x) - 1 more accurately for very small values of x), and almost any other mathematical function you can think of. The following examples provide a hint of what is on offer. Again, notice that all the functions naturally operate on vectors rather than just single values:

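In short, R has a large collection of mathematical functions: trigonometric and inverse trigonometric functions, logarithmic and exponential functions, and variants of the logarithmic and exponential functions. If the argument is a vector, the function is applied to every value in the vector in turn.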

Example of the cos function:

> cos(c(0,pi/4,pi/2,pi))
[1]  1.000000e+00  7.071068e-01  6.123234e-17 -1.000000e+00

Example of Euler's formula:

> exp(pi * 1i) + 1
[1] 0+1.224647e-16i
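The benefit of log1p and expm1 only shows up for very small inputs. The following sketch compares them with the naive expressions log(1 + x) and exp(x) - 1; the value of x and the variable names are assumptions chosen for illustration.

```r
x <- 1e-20

# Naive forms: 1 + x rounds to exactly 1, so the small value is lost.
naive_log <- log(1 + x)
naive_exp <- exp(x) - 1

# Dedicated functions keep the small value.
accurate_log <- log1p(x)
accurate_exp <- expm1(x)

print(c(naive_log, accurate_log))
print(c(naive_exp, accurate_exp))
```

The naive results collapse to zero, while log1p and expm1 return values close to x itself.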

To compare integer values for equality, use ==. Don’t use a single = since that is used for something else, as we’ll see in a moment. Just like the arithmetic operators, == and the other relational operators are vectorized. To check for inequality, the “not equals” operator is !=. Greater than and less than are as you might expect: > and < (or >= and <= if equality is allowed). Here are a few examples:

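To test whether two values are equal, use "=="; for not equal, use "!="; for greater than, ">"; for greater than or equal, ">="; for less than, "<"; and for less than or equal, "<=".

Example of comparing a vector with a single value:

> c(1,2,3,4,5,6) == 3
[1] FALSE FALSE  TRUE FALSE FALSE FALSE

Comparing two different vectors:

> 1:3 != 3:1
[1]  TRUE FALSE  TRUE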
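Because relational operators return a logical vector, the result can be summarized with other functions. The sketch below uses any, all, and sum, which are not discussed in the text above and are shown here only as one possible follow-up; the variable names are illustrative.

```r
values <- c(1, 2, 3, 4, 5, 6)

# Element-by-element comparison gives a logical vector.
is_three <- values == 3
print(is_three)

# any: is at least one element TRUE? all: is every element TRUE?
has_three <- any(is_three)
all_positive <- all(values > 0)

# TRUE counts as 1 in arithmetic, so sum counts the matches.
count_at_least_three <- sum(values >= 3)

print(has_three)
print(all_positive)
print(count_at_least_three)
```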

Comparing nonintegers using == is problematic. Most of the numbers we have dealt with so far are floating point numbers. That means that they are stored in the form a * 2^b, for two numbers a and b. Since this whole form has to be stored in 64 bits, the resulting number is only an approximation of what you really want. This means that rounding errors often creep into calculations, and the answers you expected can be wildly wrong. Whole books have been written on this subject; there is too much to cover here.

Consider these two numbers, which should be the same:

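In short: using "==" to test whether two values or two vectors are equal can be problematic. "==" is reliable for integer comparisons, but floating point arithmetic causes trouble because the computer stores numbers in binary. Many values, such as sqrt(2), cannot be represented exactly, so applying a calculation and then its inverse may not return the original value.

In the inverse calculation below, the results are not equal.

> sqrt(2)^2 == 2
[1] FALSE
> sqrt(2)^2 - 2
[1] 4.440892e-16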
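The same effect appears with simple decimal fractions. This sketch, using illustrative variable names, shows that 0.1 + 0.2 is not exactly 0.3 and compares the size of the rounding error with .Machine$double.eps, the smallest positive number x for which 1 + x differs from 1.

```r
# 0.1, 0.2 and 0.3 have no exact binary representation.
sum_is_exact <- 0.1 + 0.2 == 0.3
rounding_error <- 0.1 + 0.2 - 0.3

print(sum_is_exact)
print(rounding_error)

# The error is tiny, on the order of the machine precision.
print(.Machine$double.eps)
print(abs(rounding_error) < 2 * .Machine$double.eps)
```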

R also provides the function all.equal for checking equality of numbers. This provides a tolerance level (by default, about 1.5e-8), so that rounding errors less than the tolerance are ignored:

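For this situation R provides all.equal, a function that tolerates a certain amount of difference. Differences within the tolerance (about 1.5e-8 by default) are treated as no difference.

> identical(sqrt(2)^2, 2)
[1] FALSE
> all.equal(sqrt(2)^2, 2)
[1] TRUE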

If the values to be compared are not the same, all.equal returns a report on the differences. If you require a TRUE or FALSE value, then you need to wrap the call to all.equal in a call to isTRUE:

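all.equal has two outcomes. If the values are identical or differ by less than the tolerance, it returns TRUE. If the difference exceeds the tolerance, it returns a description of the difference instead of FALSE. To get a plain TRUE or FALSE, wrap the call in isTRUE.

> all.equal(sqrt(2)^2,3)
[1] "Mean relative difference: 0.5"
> isTRUE(all.equal(sqrt(2)^2,3))
[1] FALSE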
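The isTRUE(all.equal(...)) pattern can be wrapped in a small helper. The function name nearly_equal below is hypothetical, not something R provides; it is a minimal sketch that also passes through the tolerance argument of all.equal.

```r
# Hypothetical helper: always returns TRUE or FALSE, never a difference report.
nearly_equal <- function(x, y, tolerance = 1.5e-8) {
  isTRUE(all.equal(x, y, tolerance = tolerance))
}

print(nearly_equal(sqrt(2)^2, 2))   # rounding error is within tolerance
print(nearly_equal(sqrt(2)^2, 3))   # genuinely different values

# A looser tolerance accepts a larger relative difference.
print(nearly_equal(1, 1.001))
print(nearly_equal(1, 1.001, tolerance = 0.01))
```

Keeping the tolerance as a parameter makes the acceptable error explicit at each call site.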

We can also use == to compare strings. In this case the comparison is case sensitive, so the strings must match exactly. It is also theoretically possible to compare strings using greater than or less than (> and <):

R is case sensitive. We can test whether two strings are equal, or compare their order. Strings have no notion of tolerance, so "==" can be used directly.

> c('can','you','cana','canb') == 'can'
[1]  TRUE FALSE FALSE FALSE
> c('can','you','cana','canb') < 'can'
[1] FALSE FALSE FALSE FALSE
> c('can','you','cana','canb') <= 'can'
[1]  TRUE FALSE FALSE FALSE
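Since == on strings is case sensitive, a comparison that should ignore case needs the strings folded to a common case first. The sketch below uses tolower as one possible way to do that; the values and variable names are illustrative.

```r
# Illustrative value only.
word <- "Can"

# Case sensitive: "Can" and "can" are different strings.
exact_match <- word == "can"

# Fold to lower case before comparing to ignore case.
folded_match <- tolower(word) == "can"

print(exact_match)
print(folded_match)
```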

0 0