20140612_Time-Series Data Mining


ABSTRACT

In almost every scientific field, measurements are performed over time. These observations lead to a collection of organized data called time series. The purpose of time-series data mining is to try to extract all meaningful knowledge from the shape of data. Even if humans have a natural capacity to perform these tasks, it remains a complex problem for computers. In this article, we intend to provide a survey of the techniques applied for time-series data mining. The first part is devoted to an overview of the tasks that have captured most of the interest of researchers. Considering that in most cases time-series tasks rely on the same components for implementation, we divide the literature depending on these common aspects, namely representation techniques, distance measures, and indexing methods. The study of the relevant literature has been categorized for each individual aspect. Four types of robustness could then be formalized and any kind of distance could then be classified. Finally, the study submits various research trends and avenues that can be explored in the near future. We hope that this article provides a broad and deep understanding of the time-series data mining research field.

INTRODUCTION

Time-series data mining stems from the desire to reify our natural ability to visualize the shape of data.

Major time-series-related tasks:

1)       query by content;

2)       anomaly detection;

3)       motif discovery;

4)       prediction;

5)       clustering;

6)       classification.

The research has not been driven so much by actual problems but by an interest in proposing new problems. However, with the ever-growing maturity of time-series data mining techniques, this statement seems to have become obsolete.

A wide range of real-life problems:

1)       Economic forecasting

2)       Intrusion detection

3)       Gene expression analysis

4)       Medical surveillance

5)       Hydrology

Three issues:

1)       Data representation

2)       Similarity measurements

3)       Indexing method
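
As a rough illustration of how these three components fit together in a query-by-content task, the sketch below uses piecewise aggregate approximation (PAA) as the representation, Euclidean distance as the similarity measure, and a plain linear scan standing in for a real index. All three choices, along with the function names `paa`, `euclidean`, and `query_by_content` and the toy data, are assumptions made for this example; the notes above do not prescribe any particular technique.

```python
import math


def paa(series, segments):
    """Reduce a series to `segments` values by averaging equal-width chunks.

    Assumes len(series) is divisible by `segments` to keep the sketch short.
    """
    if len(series) % segments != 0:
        raise ValueError("series length must be divisible by segments")
    width = len(series) // segments
    return [
        sum(series[i * width:(i + 1) * width]) / width
        for i in range(segments)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_content(query, database, segments):
    """Return (index, distance) of the database series closest to `query`.

    A linear scan over the reduced representations stands in for an
    index structure, which a real system would use to avoid comparing
    the query against every stored series.
    """
    reduced_query = paa(query, segments)
    best_index, best_distance = -1, math.inf
    for i, series in enumerate(database):
        d = euclidean(reduced_query, paa(series, segments))
        if d < best_distance:
            best_index, best_distance = i, d
    return best_index, best_distance


# Hypothetical toy data: three stored series and one noisy query.
database = [
    [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0],  # rising
    [3.0, 3.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0],  # falling
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # flat
]
query = [0.1, 0.0, 1.1, 0.9, 2.0, 2.1, 2.9, 3.0]
best_index, best_distance = query_by_content(query, database, segments=4)
print(best_index, best_distance)
```

With this toy data the query is closest to the rising series at index 0. Swapping in a different representation or distance only requires replacing `paa` or `euclidean`, which is one way to see why the three issues can be studied separately.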

Forecasting is the most blatant example of a topic that requires more advanced analysis processes, as it is more closely related to statistical analysis. It may require the use of a time-series representation and a notion of similarity (mostly used to measure prediction accuracy), whereas model selection and statistical learning are also at the core of forecasting systems.
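
To make the role of similarity in forecasting concrete, here is a minimal sketch in which a deliberately naive model (a moving-average forecast, chosen only for illustration) is scored by the root-mean-square error between its predictions and the observed values. The function names, the sample series, and the window size are all assumptions rather than anything the article specifies; a real forecasting system would add the model selection and statistical learning mentioned above.

```python
import math


def moving_average_forecast(history, window):
    """Predict the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(history):
        raise ValueError("window must be between 1 and len(history)")
    return sum(history[-window:]) / window


def rmse(predicted, observed):
    """Root-mean-square error: a Euclidean-style distance scaled by length."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def rolling_forecasts(series, window):
    """One-step-ahead forecasts for every point that has enough history."""
    return [
        moving_average_forecast(series[:t], window)
        for t in range(window, len(series))
    ]


# Hypothetical sample series, used only to exercise the functions.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
window = 2
predictions = rolling_forecasts(series, window)
error = rmse(predictions, series[window:])
print(predictions, error)
```

Here the similarity measure is not used to retrieve series but to compare a predicted sequence with the one actually observed. Repeating the evaluation for several values of `window` and keeping the one with the lowest error would be a very simple form of model selection.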

 

