R Study Notes 02 (Creating Datasets)
Source: Internet. Editor: 程序博客网. Time: 2024/05/29 12:31
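- Notes
1. TRUE and FALSE are case-sensitive and must be written in uppercase.
2. R does not support multi-line comments; every comment line starts with its own #.
3. Variables cannot be declared; they come into existence on first assignment.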
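The three notes above can be checked with a small sketch like the following. The variable names (flag, x_demo, existed_before, existed_after) are made up for illustration and are not from the original notes.

```r
# TRUE/FALSE must be uppercase; lowercase `true` would be treated as an
# ordinary (undefined) variable name.
flag <- TRUE

# R has no block-comment syntax, so a "multi-line" comment
# is simply several lines that each start with '#'.

# There is no declaration step: x_demo does not exist until it is assigned.
existed_before <- exists("x_demo")
x_demo <- 42
existed_after <- exists("x_demo")
```

In a fresh session, existed_before should be FALSE and existed_after TRUE.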
vector
Note: vectors are one-dimensional arrays; scalars are one-element vectors.
- Creating
Use the c() function:
a <- c(1, 2, 3)                # a numeric vector
b <- c("one", "two", "three")  # a character vector
c <- c(TRUE, TRUE, FALSE)      # a logical vector
f <- 3                         # scalars are used to hold constants
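As a quick check of the note that scalars are one-element vectors, the sketch below inspects a scalar and also shows what happens when modes are mixed inside c(). The name mixed is hypothetical; the coercion to character is standard R behaviour rather than something stated in the original notes.

```r
f <- 3
f_is_vector <- is.vector(f)   # a scalar is just a vector...
f_length <- length(f)         # ...of length 1

# A vector holds a single mode, so mixed inputs are coerced to a
# common mode (character here).
mixed <- c(1, "two", TRUE)
mixed_mode <- mode(mixed)
```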
matrices
- Creating
mymatrix <- matrix(vector, nrow=number_of_rows, ncol=number_of_columns, byrow=logical_value, dimnames=list(char_vector_rownames, char_vector_colnames))
Default: byrow=FALSE (the matrix is filled column by column).
> y <- matrix(1:20, nrow=5, ncol=4)
> y
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20
> cells <- c(1,2,3,4)
> rnames <- c("R1","R2")
> cnames <- c("C1","C2")
> mymatrix <- matrix(cells, nrow=2, ncol=2, byrow=TRUE, dimnames=list(rnames,cnames))
> mymatrix
   C1 C2
R1  1  2
R2  3  4
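To make the effect of byrow concrete, here is a minimal side-by-side sketch. The names by_col and by_row are illustrative only.

```r
# Default (byrow=FALSE): values fill the matrix column by column.
by_col <- matrix(1:6, nrow = 2, ncol = 3)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# byrow=TRUE: values fill the matrix row by row.
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
```

The same six values end up in different cells, so byrow matters whenever the input vector was written out row-wise.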
arrays
Arrays are similar to matrices but can have more than two dimensions.
Like matrices, they must be of a single mode.
- Creating
myarray <- array(vector, dimensions, dimnames)
dimensions is a numeric vector giving the maximal index for each dimension.
dimnames is an optional list of dimension labels.
> dim1 <- c("a1","a2")
> dim2 <- c("b1","b2","b3")
> dim3 <- c("c1","c2","c3","c4")
> z <- array(1:24, c(2,3,4), dimnames=list(dim1,dim2,dim3))
> z
, , c1

   b1 b2 b3
a1  1  3  5
a2  2  4  6

, , c2

   b1 b2 b3
a1  7  9 11
a2  8 10 12

, , c3

   b1 b2 b3
a1 13 15 17
a2 14 16 18

, , c4

   b1 b2 b3
a1 19 21 23
a2 20 22 24
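Individual cells of such an array can be picked out by position or by dimension name. The sketch below rebuilds the same z and reads two cells; the names by_position and by_name are illustrative.

```r
dim1 <- c("a1", "a2")
dim2 <- c("b1", "b2", "b3")
dim3 <- c("c1", "c2", "c3", "c4")
z <- array(1:24, c(2, 3, 4), dimnames = list(dim1, dim2, dim3))

# One index per dimension: row, column, layer.
by_position <- z[1, 2, 3]        # a1, b2, c3 -> 15
by_name <- z["a2", "b3", "c4"]   # last cell  -> 24

# dim() returns the maximal index of each dimension.
z_dims <- dim(z)
```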
data frames
A data frame is more general than a matrix in that different columns can contain different modes of data.
- Creating
mydata <- data.frame(col1, col2, col3, …)
> ID <- c(1,2,3,4)
> age <- c(25,34,28,52)
> diabetes <- c("type1","type2","type1","type1")
> status <- c("poor","improved","excellent","poor")
> patientdata <- data.frame(ID,age,diabetes,status)
> patientdata
  ID age diabetes    status
1  1  25    type1      poor
2  2  34    type2  improved
3  3  28    type1 excellent
4  4  52    type1      poor
Each column must have only one mode, but you can put columns of different modes together to form the data frame.
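The "one mode per column, different modes across columns" rule can be inspected directly. This is a minimal sketch: stringsAsFactors=FALSE is passed explicitly (an addition, not in the original notes) so the result does not depend on the R version, and col_classes and second_status are illustrative names.

```r
ID <- c(1, 2, 3, 4)
age <- c(25, 34, 28, 52)
diabetes <- c("type1", "type2", "type1", "type1")
status <- c("poor", "improved", "excellent", "poor")

# Keep strings as character columns regardless of the R version's default.
patientdata <- data.frame(ID, age, diabetes, status, stringsAsFactors = FALSE)

# Each column has its own class; together they form one data frame.
col_classes <- sapply(patientdata, class)

# Cells can be addressed by row index and column name.
second_status <- patientdata[2, "status"]
```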
factor
Variables can be described as nominal, ordinal, or continuous.
Categorical (nominal) and ordered categorical (ordinal) variables in R are called factors.
nominal: diabetes (type1, type2) in the previous example; its values have no order.
ordinal: status (poor, improved, excellent) in the previous example; its values are ordered but do not express an amount.
factor()
myfactor <- factor(factor_vector, ordered=TRUE, levels=level_vector)
> ID <- c(1,2,3,4)
> age <- c(25,34,28,52)
> diabetes <- c("type1","type2","type1","type1")
> status <- c("poor","improved","excellent","poor")
> diabetes <- factor(diabetes)
> status <- factor(status, ordered=TRUE)
> patientdata <- data.frame(patientdata,age,diabetes,status)
> str(patientdata)
'data.frame':	4 obs. of  7 variables:
 $ ID        : num  1 2 3 4
 $ age       : num  25 34 28 52
 $ diabetes  : Factor w/ 2 levels "type1","type2": 1 2 1 1
 $ status    : Factor w/ 3 levels "excellent","improved",..: 3 2 1 3
 $ age.1     : num  25 34 28 52
 $ diabetes.1: Factor w/ 2 levels "type1","type2": 1 2 1 1
 $ status.1  : Ord.factor w/ 3 levels "excellent"<"improved"<..: 3 2 1 3
> summary(patientdata)
       ID            age         diabetes       status
 Min.   :1.00   Min.   :25.00   type1:3   excellent:1
 1st Qu.:1.75   1st Qu.:27.25   type2:1   improved :1
 Median :2.50   Median :31.00             poor     :2
 Mean   :2.50   Mean   :34.75
 3rd Qu.:3.25   3rd Qu.:38.50
 Max.   :4.00   Max.   :52.00
     age.1       diabetes.1      status.1
 Min.   :25.00   type1:3    excellent:1
 1st Qu.:27.25   type2:1    improved :1
 Median :31.00              poor     :2
 Mean   :34.75
 3rd Qu.:38.50
 Max.   :52.00
The data.frame() call in this transcript passes the existing patientdata rather than ID, so the three new vectors are appended as age.1, diabetes.1, and status.1; that is why str() reports 7 variables. Calling data.frame(ID,age,diabetes,status) would give the intended 4 columns.
Note: specifying ordered=TRUE together with levels makes the factor's ordering match the logical order. By default, levels are created in alphabetical order, which is why status above is ordered excellent < improved < poor. (ordered is the full argument name; order=TRUE works only through partial argument matching.)
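The difference between the default alphabetical levels and explicitly supplied levels can be sketched as follows. The names status_alpha and status_logical are illustrative, and the chosen order poor < improved < excellent is an assumption about what "logical order" means for this variable.

```r
status <- c("poor", "improved", "excellent", "poor")

# Without levels, the ordering is alphabetical: excellent < improved < poor.
status_alpha <- factor(status, ordered = TRUE)

# With explicit levels, the ordering follows the intended scale.
status_logical <- factor(status, ordered = TRUE,
                         levels = c("poor", "improved", "excellent"))

alpha_levels <- levels(status_alpha)
logical_codes <- as.integer(status_logical)  # internal codes: 1 2 3 1

# Ordered factors support comparisons: is "poor" below "excellent"?
poor_lt_excellent <- status_logical[1] < status_logical[3]
```

With the alphabetical version the same comparison would come out the other way round, which is the practical reason to pass levels explicitly.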
list
A list allows you to gather a variety of objects under one name. For example, a list may contain a combination of vectors, matrices, data frames, and even other lists.
- Creating
mylist <- list(object1, object2, …)
# Optionally, you can name the objects in a list
mylist <- list(name1=object1, name2=object2, …)
> g <- "my first list"
> h <- c(1,2,3,4)
> j <- matrix(1:10, nrow=5)
> k <- c("one","two","three")
> mylist <- list(title=g, ages=h, j, k)
> mylist
$title
[1] "my first list"

$ages
[1] 1 2 3 4

[[3]]
     [,1] [,2]
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

[[4]]
[1] "one"   "two"   "three"

> mylist[[2]]
[1] 1 2 3 4
> mylist[["ages"]]
[1] 1 2 3 4
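The transcript uses [[ ]] to extract an element. The sketch below rebuilds the same list and contrasts that with $ and with single brackets, which return a sub-list instead of the element itself. The names by_dollar, by_name, by_index, and sub_list are illustrative.

```r
g <- "my first list"
h <- c(1, 2, 3, 4)
j <- matrix(1:10, nrow = 5)
k <- c("one", "two", "three")
mylist <- list(title = g, ages = h, j, k)

# Three equivalent ways to extract the element itself.
by_dollar <- mylist$ages
by_name <- mylist[["ages"]]
by_index <- mylist[[2]]

# Single brackets keep the list wrapper: the result is a one-element list.
sub_list <- mylist["ages"]

# Unnamed elements get empty names.
list_names <- names(mylist)
```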
R refers to case identifiers as rownames and to categorical variables (nominal, ordinal) as factors.
A dataset is usually a rectangular array of data with rows representing observations and columns representing variables.
R has a wide variety of objects for holding data, including scalars, vectors, matrices, arrays, data frames, and lists. (From "R in Action".)