Getting and Cleaning Data - Week 1 Quiz
Source: Internet  Editor: 程序博客网  Time: 2024/06/08 17:57
Coursera open course quiz
Question 1
Problem
The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv
and load the data into R. The code book, describing the variable names is here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdf
How many properties are worth $1,000,000 or more?
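Solution
Download the housing data table from this URL:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv
fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
# method = "curl" requires the curl command-line tool; recent R versions can usually omit it
download.file(fileUrl, destfile = "data.csv", method = "curl")
# Record when the file was fetched, since the remote copy may change
dateDownloaded <- date()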
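A detailed description of this dataset is in this PDF:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FPUMSDataDict06.pdf
The question asks how many properties are worth $1,000,000 or more. According to the PDF, VAL is a coded category rather than a dollar amount, and code 24 (the highest) means $1,000,000 or more, so VAL >= 24 selects those properties.
hd <- read.csv("data.csv")
# Count rows in the top VAL category; na.rm = TRUE skips rows where VAL is missing
sum(hd$VAL >= 24, na.rm = TRUE)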
Result
[1] 53
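The NA handling in that count is easy to get wrong, so here is a minimal sketch on a toy data frame. The name hd_toy and its values are made up for illustration and are not taken from the survey file; only the idea that VAL holds category codes comes from the code book.

```r
# Toy stand-in for the housing table (hypothetical values).
# VAL holds code book category codes, not dollar amounts.
hd_toy <- data.frame(VAL = c(24, 12, NA, 24, 3, NA, 24))

# A comparison against NA yields NA, so na.rm = TRUE is needed
# to drop those rows from the count instead of returning NA.
n_million <- sum(hd_toy$VAL == 24, na.rm = TRUE)
print(n_million)

# table() excludes NA by default and shows the count per code,
# which is a quick way to cross-check the number above.
print(table(hd_toy$VAL))
```

On the real data, table(hd$VAL) should show the same count under the label 24 as the sum() call reports.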
Question 2
Problem
Use the data you loaded from Question 1. Consider the variable FES in the code book. Which of the “tidy data” principles does this variable violate?
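Solution
Tidy data requires that each column hold a single variable. FES (family type and employment status) packs more than one variable into one column, so it violates that principle.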
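To make the principle concrete, here is a rough sketch of splitting a combined column into one column per variable. The labels and the "|" separator below are hypothetical and only mimic the style of a combined "family type plus employment" variable; the real FES column stores numeric codes defined in the code book.

```r
# Hypothetical combined labels (not the real FES codes).
fes_toy <- data.frame(
  id  = 1:3,
  FES = c("married couple|both in labor force",
          "married couple|husband only in labor force",
          "female householder|in labor force"),
  stringsAsFactors = FALSE
)

# Split each combined label at the separator into its two parts.
parts <- strsplit(fes_toy$FES, "|", fixed = TRUE)

# Build a tidy table: one column per variable.
tidy <- data.frame(
  id          = fes_toy$id,
  family_type = vapply(parts, `[`, character(1), 1),
  employment  = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)
print(tidy)
```

With the variables separated, each one can be filtered or grouped on its own, which is the practical payoff of the one-variable-per-column rule.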
Question 3
Problem
Download the Excel spreadsheet on Natural Gas Acquisition Program here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx
Read rows 18-23 and columns 7-15 into R and assign the result to a variable called:
dat
What is the value of:
sum(dat$Zip*dat$Ext,na.rm=T)
(original data source: http://catalog.data.gov/dataset/natural-gas-acquisition-program)
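Solution
Download the spreadsheet:
fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FDATA.gov_NGAP.xlsx"
# mode = "wb" writes the file in binary mode so the .xlsx is not corrupted on Windows
download.file(fileUrl, destfile = "gas.xlsx", mode = "wb")
dateDownloaded <- date()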
Read rows 18-23 and columns 7-15, and assign the result to dat:
library(xlsx)
rowIndex <- 18:23
colIndex <- 7:15
dat <- read.xlsx("gas.xlsx", sheetIndex = 1, rowIndex = rowIndex, colIndex = colIndex, header = TRUE)
Evaluate the expression given in the problem:
sum(dat$Zip * dat$Ext, na.rm = TRUE)
Result
36534720
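The role of na.rm in that expression can be seen on a small example. This is only a sketch: dat_toy and its numbers are invented for illustration and are not the values in the spreadsheet.

```r
# Hypothetical stand-in for dat with a missing Ext value.
dat_toy <- data.frame(
  Zip = c(10001, 20002, 30003),
  Ext = c(NA, 2, 10)
)

# Element-wise product: any NA in either column makes that product NA.
products <- dat_toy$Zip * dat_toy$Ext
print(products)

# Without na.rm the whole sum would be NA; with it, NA products are skipped.
total <- sum(products, na.rm = TRUE)
print(total)
```

Here the first product is NA and is dropped, so the total is 20002 * 2 + 30003 * 10.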
Question 4
Problem
Read the XML data on Baltimore restaurants from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml
How many restaurants have zipcode 21231?
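Solution
Code
library(XML)
# http rather than https: the XML package's parser generally cannot fetch https URLs directly
fileUrl <- "http://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Frestaurants.xml"
doc <- xmlTreeParse(fileUrl, useInternalNodes = TRUE)
rootNode <- xmlRoot(doc)
# Extract every <zipcode> value and count the matches
sum(xpathSApply(rootNode, "//zipcode", xmlValue) == "21231")
Result
127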
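The XPath counting step can be tried offline on an inline document. This sketch assumes the XML package is installed; the miniature XML below is hypothetical, and apart from zipcode its tag names are illustrative rather than copied from the restaurants file.

```r
library(XML)  # assumes the XML package is installed

# Hypothetical miniature document; only <zipcode> mirrors the real file.
xml_text <- paste0(
  "<response>",
  "<row><name>A</name><zipcode>21231</zipcode></row>",
  "<row><name>B</name><zipcode>21201</zipcode></row>",
  "<row><name>C</name><zipcode>21231</zipcode></row>",
  "</response>"
)

# asText = TRUE parses the string itself instead of treating it as a file name.
doc <- xmlParse(xml_text, asText = TRUE)

# "//zipcode" matches zipcode nodes at any depth; xmlValue returns their text.
zips <- xpathSApply(doc, "//zipcode", xmlValue)

# The values are character strings, so compare against "21231", not 21231L.
n_match <- sum(zips == "21231")
print(n_match)
```

Because "//zipcode" searches the whole tree, the query does not depend on how deeply the rows are nested.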
Question 5
Problem
The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here:
https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06pid.csv
using the fread() command load the data into an R object
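As a rough illustration of what loading with fread() looks like, here is a sketch that reads a tiny temporary CSV instead of the survey file. It assumes the data.table package is installed; the column names group and value are hypothetical and are not columns of ss06pid.csv.

```r
library(data.table)  # assumes the data.table package is installed

# Write a tiny stand-in CSV to a temporary file (hypothetical columns).
tmp <- tempfile(fileext = ".csv")
writeLines(c("group,value", "1,10", "2,50", "1,30"), tmp)

# fread() detects the separator and column types and returns a data.table.
DT <- fread(tmp)
print(class(DT))

# data.table syntax: compute a per-group mean inside the brackets.
means <- DT[, .(mean_value = mean(value)), by = group]
print(means)
```

For the quiz file, the same call would take the path passed as destfile to download.file().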