Data Analysis
Source: Internet  Editor: 程序博客网  Time: 2024/05/01 21:12
Introduction to Advanced Data Analysis
Given a dataset to analyze, there are basically two approaches that can be applied to the task. For datasets of relatively small size, the statistical approach is a good option: it provides a number of summary statistics and visualization methods that give you an overview of the data. Visualization is an important technique because it gives you a direct intuition for the data. For very large datasets, however, satisfactory graphs are generally hard to produce, and valuable information is hard to extract. Data mining is defined as the process of generating actionable information and interesting patterns from large and complex datasets. It provides enhanced techniques from machine learning and statistics to handle datasets with large volumes and complex formats.
Statistical Methods
There are many plotting methods: some are for univariate analysis, some for bivariate analysis, and some for multivariate analysis.
The aim of the analysis is to visualize and summarize the data from a descriptive perspective. An ideal method should:
1) contain quantitative information; 2) not lose information; 3) be intuitive.
- dot plot
- jittering plot (univariate)
- mosaic plot (multivariate)
- contingency table (bivariate, for binary categories)
- multiplot (scatter-plot matrix and co-plot)
- false color plot (multivariate)
- stacked plot (built from dot plots; a solution for composition problems)
- kernel density estimate
- linear regression
- histogram
- cumulative distribution function
- correlation
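
Two of the methods listed above, the kernel density estimate and the cumulative distribution function, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the Gaussian kernel, the fixed bandwidth of 0.5, the sample values, and the function names `gaussian_kde` and `ecdf` are all chosen here for the example and do not come from the original post.

```python
import bisect
import math


def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the density of `sample` at a point x.

    Assumption: a Gaussian kernel with a fixed, hand-picked bandwidth.
    """
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centred on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density


def ecdf(sample):
    """Return the empirical CDF: the fraction of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect.bisect_right(ordered, x) / n

    return cdf


# Hypothetical univariate sample, made up for illustration.
sample = [1.0, 2.0, 2.5, 3.0, 7.0]
density = gaussian_kde(sample, bandwidth=0.5)
cdf = ecdf(sample)

print(density(2.5))  # estimated density near the bulk of the data
print(cdf(2.5))      # fraction of observations at or below 2.5
```

Unlike a histogram, neither function depends on a choice of bin edges, although the density estimate does depend on the bandwidth, which in practice is usually tuned rather than fixed.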
A statistical summary is generally required:
- The size of the dataset
- The maximum and minimum values of an attribute
- The mode (most frequent value) of a categorical attribute
- The distribution of a numerical attribute: is it symmetric or asymmetric?
- The spread of the data
- Any clusters
- Any outliers
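
Most items on this checklist can be computed with Python's standard `statistics` module. The sketch below is a rough illustration, not a prescribed procedure: the function names, the sample data, the use of mean minus median as a crude asymmetry hint, and the 1.5 × IQR rule for flagging outliers are assumptions made here. Cluster detection is not covered by it.

```python
import statistics
from collections import Counter


def summarize_numeric(values):
    """Summarize one numerical attribute (size, range, centre, spread, outliers)."""
    ordered = sorted(values)
    q1, _, q3 = statistics.quantiles(ordered, n=4)  # quartile cut points
    iqr = q3 - q1
    # Assumption: the common 1.5 * IQR fence is used to flag outliers.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    mean = statistics.fmean(ordered)
    median = statistics.median(ordered)
    return {
        "size": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": mean,
        "median": median,
        "stdev": statistics.stdev(ordered),  # spread
        "iqr": iqr,                          # spread, robust to outliers
        # Crude asymmetry hint: positive suggests a longer right tail.
        "mean_minus_median": mean - median,
        "outliers": [v for v in ordered if v < low or v > high],
    }


def summarize_categorical(values):
    """Summarize one categorical attribute (size, mode, frequency table)."""
    counts = Counter(values)
    mode, mode_count = counts.most_common(1)[0]
    return {
        "size": len(values),
        "mode": mode,
        "mode_count": mode_count,
        "counts": dict(counts),
    }


# Hypothetical data, made up for illustration.
numeric = summarize_numeric([2, 3, 3, 4, 4, 5, 5, 6, 30])
categorical = summarize_categorical(["a", "b", "a", "c"])
print(numeric)
print(categorical)
```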
Data Mining
Data mining mainly comprises four tasks: 1) classification; 2) clustering; 3) association rule mining; 4) anomaly detection.
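
As one concrete instance of these tasks, clustering can be sketched with a small k-means routine. The post does not name any particular algorithm, so k-means, the farthest-first choice of initial centroids, the function name `kmeans`, and the sample points are all assumptions made for this example.

```python
import math


def kmeans(points, k, iterations=20):
    """Minimal k-means sketch: returns (labels, centroids).

    Assumption: initial centroids are picked deterministically by
    farthest-first traversal, starting from the first point.
    """
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        # Add the point that is farthest from its nearest existing centroid.
        farthest = max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)
        )
        centroids.append(tuple(farthest))

    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

On real data the result depends on the choice of k and on the initial centroids, so k-means is usually run several times or paired with a smarter initialization.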