Basic Statistical Analysis in R

Source: Internet  Editor: 程序博客网  Time: 2024/04/30 20:46

Descriptive Statistics

# Using the mtcars dataset, we extract miles per gallon (mpg), horsepower (hp), and weight (wt)
> myvars <- c("mpg","hp","wt")
> head(mtcars[myvars])
                   mpg  hp    wt
Mazda RX4         21.0 110 2.620
Mazda RX4 Wag     21.0 110 2.875
Datsun 710        22.8  93 2.320
Hornet 4 Drive    21.4 110 3.215
Hornet Sportabout 18.7 175 3.440
Valiant           18.1 105 3.460
> summary(mtcars[myvars])
      mpg              hp              wt
 Min.   :10.40   Min.   : 52.0   Min.   :1.513
 1st Qu.:15.43   1st Qu.: 96.5   1st Qu.:2.581
 Median :19.20   Median :123.0   Median :3.325
 Mean   :20.09   Mean   :146.7   Mean   :3.217
 3rd Qu.:22.80   3rd Qu.:180.0   3rd Qu.:3.610
 Max.   :33.90   Max.   :335.0   Max.   :5.424

# Use sapply(x, FUN, options). FUN can be any function; if options are given, they are passed on to FUN.
# Typical functions here are mean(), sd(), var(), min(), max(), median(), length(), range(), and quantile().
> sapply(mtcars[myvars],mean)
      mpg        hp        wt
 20.09062 146.68750   3.21725

# The special function fivenum() returns a five-number summary
# (summary() returns six values; fivenum() leaves out the mean)
> fivenum(mtcars[myvars]$mpg, na.rm = TRUE)
[1] 10.40 15.35 19.20 22.80 33.90

# Build your own function to use with sapply
> myfun <- function(x,na.omit=FALSE){
+               if(na.omit)
+               x <- x[!is.na(x)]  # drop NA values, then reassign
+               m <- mean(x)
+               n <- length(x)
+               s <- sd(x)
+               skew <- sum((x-m)^3/s^3)/n
+               kurt <- sum((x-m)^4/s^4)/n-3
+               return(c(n=n,mean=m,stdev=s,skew=skew,kurtosis=kurt))
+             }
> myvars <- c("mpg","hp","wt")
> sapply(mtcars[myvars], myfun)
               mpg          hp          wt
n        32.000000  32.0000000 32.00000000
mean     20.090625 146.6875000  3.21725000
stdev     6.026948  68.5628685  0.97845744
skew      0.610655   0.7260237  0.42314646
kurtosis -0.372766  -0.1355511 -0.02271075

# mpg has a mean of 20.1 and a standard deviation of 6.0. Its distribution is right-skewed
# (skewness +0.61) and somewhat flatter than a normal distribution (kurtosis -0.37).
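
The sapply(x, FUN, options) form described above forwards any extra arguments to FUN. The following is a minimal sketch of that behavior, assuming only the built-in mtcars dataset. The name cars_na and the injected missing value are hypothetical, introduced purely to make the effect of na.rm visible; they are not part of the original walkthrough.

```r
# Minimal sketch: extra arguments given to sapply() are passed on to FUN.
myvars <- c("mpg", "hp", "wt")

# Hypothetical copy of the data with one missing value injected for illustration
cars_na <- mtcars[myvars]
cars_na$mpg[1] <- NA

# Without na.rm, the mean of a column that contains NA is NA
means_default <- sapply(cars_na, mean)

# na.rm = TRUE is forwarded to mean(), so the NA is dropped before averaging
means_narm <- sapply(cars_na, mean, na.rm = TRUE)

# probs is forwarded to quantile(); each column yields two values,
# so the result is a matrix with one column per variable
quartiles <- sapply(mtcars[myvars], quantile, probs = c(0.25, 0.75))

print(means_default)
print(means_narm)
print(quartiles)
```

Because sapply() simplifies its result, a FUN that returns a single value gives a named vector, while a FUN that returns several values (such as quantile() with two probabilities) gives a matrix.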