Data Mining (I)
Source: Internet | Editor: 程序博客网 | Time: 2024/06/06 19:22
Learning notes from Dr. Bo Yuan (THU), "Data: Theory and Algorithm", Part I
- Definition: Data mining is the process of automatically extracting interesting and useful hidden patterns from data that is usually massive, incomplete, and noisy.
It is not a fully automatic process.
From data to intelligence:
data, information, knowledge, decision support
- Classification
Algorithms: Decision Tree, KNN (k-nearest neighbors), Neural Networks, SVM (support vector machine)
Overfitting
Cross validation: split the samples into training data and test data
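The note above only names cross validation and the training/test split. As a minimal sketch, assuming the common k-fold variant (the notes do not say which variant the course uses), the split can be written in plain Python. The function name `k_fold_indices` is hypothetical and chosen only for this illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) index lists for k-fold cross validation.

    Assumption: plain k-fold without shuffling or stratification.
    Each sample lands in the test fold exactly once.
    """
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Example: 10 samples, 3 folds.
folds = list(k_fold_indices(10, 3))
```

A model would be trained on `train_idx` and scored on `test_idx` for each fold, and the scores averaged. Evaluating on data the model has not seen is what lets this procedure expose overfitting.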
Confusion matrix: TP (True Positive), FP (False Positive), FN (False Negative), TN (True Negative), TPR (True Positive Rate), TNR (True Negative Rate), Accuracy
TP + FP + FN + TN = number of samples
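The confusion-matrix quantities listed above can be made concrete with a small sketch. It uses the standard definitions TPR = TP / (TP + FN), TNR = TN / (TN + FP), and Accuracy = (TP + TN) / (TP + FP + FN + TN). The function names and the toy labels are assumptions made for illustration, not taken from the course.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # predicted positive, actually positive
        elif pred == positive:
            fp += 1          # predicted positive, actually negative
        elif truth == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return tp, fp, fn, tn


def rates(tp, fp, fn, tn):
    """Derive TPR, TNR and accuracy from the four counts."""
    total = tp + fp + fn + tn
    return {
        "TPR": tp / (tp + fn) if (tp + fn) else 0.0,
        "TNR": tn / (tn + fp) if (tn + fp) else 0.0,
        "Accuracy": (tp + tn) / total if total else 0.0,
    }


# Toy labels (hypothetical data, 8 samples).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
metrics = rates(tp, fp, fn, tn)
```

For these toy labels the four counts add up to 8, the number of samples, which matches the identity noted above.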
ROC: Receiver Operating Characteristic
AUC: Area Under the ROC Curve (an AUC near 1 is good; 0.5 is no better than random guessing)
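One way to see why an AUC near 1 is good is to build the ROC curve by hand. The sketch below assumes a classifier that outputs a score per sample. It sweeps the decision threshold from high to low, records (FPR, TPR) points, and integrates them with the trapezoidal rule. The function names and the toy scores are hypothetical.

```python
def roc_points(y_true, scores):
    """Return ROC points (FPR, TPR) obtained by lowering the threshold.

    Assumes labels are 0/1 and that both classes are present.
    """
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    # Highest scores first: they are the first to be predicted positive.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy scores (hypothetical): one negative is ranked above one positive.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(y_true, scores))
```

In this toy example 8 of the 9 positive/negative pairs are ranked correctly, so the area is 8/9. A classifier that ranks every positive above every negative reaches exactly 1.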
Cost-sensitive learning
Lift analysis
- Clustering
Difference: clustering is unsupervised learning; classification is supervised learning.
- Association Rule
- Regression
Underfitting
Overfitting
- Data Preprocessing
Garbage in, garbage out
Cloud Computing
Parallel Computing