Time Series Analysis [Using R]
Source: Internet · Editor: 程序博客网 · Time: 2024/06/04 18:52
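[I will need time series analysis in the near future, so I am organizing my notes here for later reference.]
Basic workflow of time series analysis
Time series analysis in practice with R
- #### Importing data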
    # Get the working directory
    getwd()
    # Import data from a local file
    Data <- read.csv('~/Documents/data.csv', fill = TRUE, header = TRUE)
    # Use a dataset that ships with R
    Data <- AirPassengers
    # Generate data
    t <- ts(seq(1, 30))
    Date_List <- seq(from = as.Date('2016-9-1'), by = 1, length.out = 30)
    Data <- data.frame(Date_List, t)
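For seasonal data it usually helps to tell R the sampling frequency when building the series. The following is a minimal sketch, assuming monthly observations that start in September 2016; the `values` vector is made up purely for illustration.

```r
# Hypothetical monthly observations (illustrative numbers only)
values <- c(112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118)

# start = c(year, month) and frequency = 12 mark the series as monthly
monthly <- ts(values, start = c(2016, 9), frequency = 12)
print(monthly)

# window() extracts a sub-period by calendar position
q1 <- window(monthly, start = c(2017, 1), end = c(2017, 3))
print(q1)
```

With the frequency set, functions such as `decompose()` and the `seasonal` argument of `arima()` know what one seasonal cycle is.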
Visualizing the data
The purpose of visualizing time series data is to examine its trend, its seasonality, and its random behavior.
    plot(AirPassengers)
    # Overlay a fitted linear trend line
    abline(reg = lm(AirPassengers ~ time(AirPassengers)))
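To look at trend, seasonality and the random part separately, one option is the classical decomposition in base R. This is only a sketch: choosing `type = "multiplicative"` is my assumption, based on the seasonal swings of AirPassengers growing with the level of the series.

```r
# Classical decomposition into trend, seasonal and random components
parts <- decompose(AirPassengers, type = "multiplicative")

# One seasonal factor per calendar month (values above 1 mean busier months)
print(round(parts$figure, 3))

# The trend is a centered moving average, so both ends of it are NA
print(head(na.omit(parts$trend)))
```

Calling `plot(parts)` draws the observed series and the three components in stacked panels.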
Making the time series stationary
There are three basic criteria for judging whether a time series is stationary:
The mean of the series should not be a function of time; it should be constant.
The variance of the series should not be a function of time. This property is known as homoscedasticity.
The covariance of the i-th term and the (i + m)-th term should not be a function of time.
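As a rough, informal check of the first two criteria (not a substitute for a formal test), one could compare the mean and variance of the two halves of AirPassengers. The split point and the variable names are my own choice for illustration.

```r
# Split the 1949-1960 series into two 6-year halves
first_half  <- window(AirPassengers, end = c(1954, 12))
second_half <- window(AirPassengers, start = c(1955, 1))

# If the series were stationary, these pairs would be roughly equal
halves <- data.frame(
  half     = c("1949-1954", "1955-1960"),
  mean     = c(mean(first_half), mean(second_half)),
  variance = c(var(first_half), var(second_half))
)
print(halves)
```

Both the mean and the variance are larger in the second half, which already suggests the raw series is not stationary.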
    # Dickey-Fuller test of stationarity
    # AR and MA models are not applicable to non-stationary series.
    install.packages('fUnitRoots')
    library(fUnitRoots)
    adfTest(AirPassengers)
    # Result
    # Title:
    #  Augmented Dickey-Fuller Test
    # Test Results:
    #   PARAMETER:
    #     Lag Order: 1
    #   STATISTIC:
    #     Dickey-Fuller: -0.3524
    #   P VALUE:
    #     0.5017
Three basic techniques for making a time series stationary
Detrending
Here, we simply remove the trend component from the time series (if we know the trend component).
Differencing
Seasonality
Seasonality can easily be incorporated in the ARIMA model directly
    adfTest(diff(log(AirPassengers)))
    # Result
    # Title:
    #  Augmented Dickey-Fuller Test
    # Test Results:
    #   PARAMETER:
    #     Lag Order: 1
    #   STATISTIC:
    #     Dickey-Fuller: -8.8157
    #   P VALUE:
    #     0.01
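The techniques listed above can be sketched with base R alone. This is a minimal illustration under my own assumptions: the trend is taken to be linear in time on the log scale, and the seasonal step uses a lag-12 difference rather than the seasonal ARIMA terms used later.

```r
y <- log(AirPassengers)

# Detrending: fit a linear trend and keep what is left over
trend_fit <- lm(y ~ time(y))
detrended <- ts(residuals(trend_fit), start = start(y), frequency = frequency(y))

# Differencing: model x(t) - x(t-1) instead of x(t)
differenced <- diff(y)

# Seasonal differencing: additionally subtract the value 12 months earlier
seasonal_diff <- diff(differenced, lag = 12)

print(c(original = length(y),
        differenced = length(differenced),
        seasonal_diff = length(seasonal_diff)))
```

Each differencing step shortens the series, by 1 observation for the first difference and by 12 more for the seasonal difference.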
Choosing suitable parameters from the ACF and PACF
Once we have got the stationary time series, we must answer two primary questions:
Q1. Is it an AR or MA process?
Q2. What order of AR or MA process do we need to use?
Simple Example:
- AR: x(t) = alpha * x(t-1) + error(t)
- MA: x(t) = beta * error(t-1) + error(t)
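The two recurrences above can be simulated directly. This is just a sketch: the coefficients `alpha = 0.7` and `beta = 0.7`, the seed, and the series length are arbitrary values I picked for illustration.

```r
set.seed(42)
n     <- 200
alpha <- 0.7   # hypothetical AR(1) coefficient
beta  <- 0.7   # hypothetical MA(1) coefficient
error <- rnorm(n)

# AR(1): each value depends on the previous value of the series
ar1 <- numeric(n)
ar1[1] <- error[1]
for (t in 2:n) {
  ar1[t] <- alpha * ar1[t - 1] + error[t]
}

# MA(1): each value depends on the previous error only
ma1 <- error
ma1[2:n] <- beta * error[1:(n - 1)] + error[2:n]

print(head(cbind(error, ar1, ma1)))
```

In the AR series a shock keeps echoing through later values, while in the MA series it disappears after one step.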
    acf(diff(log(AirPassengers)))   # ACF: suggests the MA order q
    pacf(diff(log(AirPassengers)))  # PACF: suggests the AR order p
The ACF of the differenced log series cuts off after the first lag. A cut-off in the ACF points to an MA process, so p should be 0, while q should be 1 or 2. After a few iterations, (0,1,1) as (p,d,q) turned out to be the combination with the lowest AIC and BIC.
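To see why a cut-off in the ACF points to MA while a cut-off in the PACF points to AR, one can compare the theoretical correlations of the two simple models with `ARMAacf()` from base R. The coefficient 0.7 is an arbitrary value chosen for illustration.

```r
phi   <- 0.7   # hypothetical AR(1) coefficient
theta <- 0.7   # hypothetical MA(1) coefficient

# Theoretical ACF (lags 0..4) and PACF (lags 1..4) of each model
ar_acf  <- ARMAacf(ar = phi, lag.max = 4)
ar_pacf <- ARMAacf(ar = phi, lag.max = 4, pacf = TRUE)
ma_acf  <- ARMAacf(ma = theta, lag.max = 4)
ma_pacf <- ARMAacf(ma = theta, lag.max = 4, pacf = TRUE)

# Drop lag 0 from the ACFs so every row covers lags 1..4
summary_table <- rbind(
  AR1_acf  = unname(ar_acf)[-1],
  AR1_pacf = unname(ar_pacf),
  MA1_acf  = unname(ma_acf)[-1],
  MA1_pacf = unname(ma_pacf)
)
colnames(summary_table) <- paste0("lag", 1:4)
print(round(summary_table, 3))
```

For AR(1) the ACF decays geometrically while the PACF is zero beyond lag 1; for MA(1) it is the other way round, with the ACF zero beyond lag 1 and the PACF tailing off.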
Building the ARIMA model
The value found in the previous section might be an approximate estimate and we need to explore more (p,d,q) combinations. The one with the lowest BIC and AIC should be our choice.
    # d = 1 because the series was differenced once
    fit <- arima(log(AirPassengers), order = c(0, 1, 1),
                 seasonal = list(order = c(0, 1, 1), period = 12))
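Exploring more (p,d,q) combinations can be automated with a small loop. The sketch below makes some assumptions of mine: it only tries p and q in 0..1, keeps d = 1 and the seasonal part (0,1,1) fixed, and the `candidates` table is a helper introduced just for this example.

```r
y <- log(AirPassengers)

# Every combination of p and q to try (d stays fixed at 1)
candidates <- expand.grid(p = 0:1, q = 0:1)
candidates$AIC <- NA_real_
candidates$BIC <- NA_real_

for (i in seq_len(nrow(candidates))) {
  fit_i <- arima(y,
                 order = c(candidates$p[i], 1, candidates$q[i]),
                 seasonal = list(order = c(0, 1, 1), period = 12))
  candidates$AIC[i] <- AIC(fit_i)
  candidates$BIC[i] <- BIC(fit_i)
}

# Lowest AIC first
print(candidates[order(candidates$AIC), ])
```

Widening the ranges in `expand.grid()` extends the search, at the cost of more fits and a higher chance that some of them fail to converge.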
Forecasting with the model
    pred <- predict(fit, n.ahead = 10*12)
    # The model was fitted on the log scale, so back-transform with exp()
    ts.plot(AirPassengers, exp(pred$pred), log = "y", lty = c(1, 3))
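`predict()` also returns standard errors, which can be turned into an approximate interval around the forecast. This is a sketch under my own assumptions: a 24-month horizon, a 1.96 multiplier for a roughly 95% interval with normal errors on the log scale, and variable names invented for the example.

```r
fit <- arima(log(AirPassengers), order = c(0, 1, 1),
             seasonal = list(order = c(0, 1, 1), period = 12))
pred <- predict(fit, n.ahead = 24)

# Build the interval on the log scale, then back-transform each bound
point_forecast <- exp(pred$pred)
lower <- exp(pred$pred - 1.96 * pred$se)
upper <- exp(pred$pred + 1.96 * pred$se)

print(head(cbind(lower, point_forecast, upper)))
```

The interval widens as the horizon grows, which is worth keeping in mind when reading a forecast that runs 10 years ahead.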
Reference: A Complete Tutorial on Time Series Modeling in R