Weka Filter Classes: Selecting Instances [5]
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 20:27
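Three filters are worth a closer look: RemovePercentage, RemoveRange, and RemoveWithValues. All three work by removing instances; to keep the matching instances instead, invert the selection.
RemovePercentage: as the name suggests, removes a given percentage of the instances
RemoveRange: removes instances by their index (1-based)
RemoveWithValues: removes instances according to the value of a chosen attribute
To avoid repeating the same code, let's first look only at the part that differs between these filters, namely how each one is constructed and configured: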
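(1) RemovePercentage

    // Method 1: configure via options (-P is the percentage of instances to remove)
    RemovePercentage remove1 = new RemovePercentage();
    remove1.setOptions(new String[]{"-P", "45"});

    // Method 2: configure via the setter
    RemovePercentage remove2 = new RemovePercentage();
    remove2.setPercentage(80);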
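To make the effect of the percentage concrete, the row arithmetic can be sketched in plain Java without Weka. This is only a sketch: it assumes the filter drops the leading P% of the rows and rounds the cut-off to the nearest whole instance. The names `PercentageSketch`, `cutOff`, and `keptCount` are hypothetical and not part of Weka.

```java
// Plain-Java sketch of the row arithmetic; not Weka's implementation.
final class PercentageSketch {

    // Number of leading rows dropped for a given percentage
    // (assumption: the cut-off is rounded to the nearest integer).
    static int cutOff(int numInstances, double percentage) {
        return (int) Math.round(numInstances * percentage / 100.0);
    }

    // Rows left after filtering; with the selection inverted,
    // the dropped rows are the ones that are kept instead.
    static int keptCount(int numInstances, double percentage, boolean invert) {
        int cut = cutOff(numInstances, percentage);
        return invert ? cut : numInstances - cut;
    }

    public static void main(String[] args) {
        int n = 14; // weather.numeric.arff contains 14 instances
        System.out.println("-P 45 keeps " + keptCount(n, 45, false) + " of " + n);
        System.out.println("-P 80 keeps " + keptCount(n, 80, false) + " of " + n);
    }
}
```

Under these assumptions, a 45% setting on 14 rows drops 6 rows and an 80% setting drops 11; the real filter's output is the authority if the two disagree.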
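(2) RemoveRange

    // Method 1: configure via options (-R is the 1-based range of instances to remove)
    RemoveRange remove1 = new RemoveRange();
    remove1.setOptions(new String[]{"-R", "1,3-10"});

    // Method 2: configure via the setter
    RemoveRange remove2 = new RemoveRange();
    remove2.setInstancesIndices("1,3-5");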
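The range strings used above, such as "1,3-10", combine single indices and inclusive spans. As an illustration of how such a string expands into individual 1-based indices, here is a small plain-Java sketch. It is not Weka's parser: `RangeSketch` and `expand` are hypothetical names, and keywords such as "first" and "last" are deliberately not handled.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Plain-Java sketch of expanding a 1-based range string; not Weka's parser.
final class RangeSketch {

    // Expands comma-separated entries, where each entry is either
    // a single index ("1") or an inclusive span ("3-10").
    static SortedSet<Integer> expand(String range) {
        SortedSet<Integer> indices = new TreeSet<>();
        for (String part : range.split(",")) {
            String[] bounds = part.trim().split("-");
            int from = Integer.parseInt(bounds[0].trim());
            int to = bounds.length > 1 ? Integer.parseInt(bounds[1].trim()) : from;
            for (int i = from; i <= to; i++) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(expand("1,3-10")); // [1, 3, 4, 5, 6, 7, 8, 9, 10]
        System.out.println(expand("1,3-5"));  // [1, 3, 4, 5]
    }
}
```

So "1,3-10" names nine instances and "1,3-5" names four; those are the rows the filter would remove unless the selection is inverted.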
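(3) RemoveWithValues (the split point shown here applies to numeric attributes only; for nominal attributes the filter selects by label index instead, via -L or setNominalIndices)

    // Method 1: configure via options (-C attribute index, -S split point)
    RemoveWithValues remove1 = new RemoveWithValues();
    remove1.setOptions(new String[]{"-C", "2", "-S", "80"});

    // Method 2: configure via setters
    RemoveWithValues remove2 = new RemoveWithValues();
    remove2.setAttributeIndex("2");
    remove2.setSplitPoint(75);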
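The split-point rule is easy to get backwards, so here is a plain-Java sketch of it on a single numeric column: values strictly below the split point are selected and removed, and inverting the selection removes the others instead. The class `SplitPointSketch`, the method `filter`, and the sample values are illustrative assumptions, not Weka code or real data.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of the numeric split-point rule; not Weka's implementation.
final class SplitPointSketch {

    // Returns the values that survive the filter.
    // Default: values below splitPoint are removed.
    // Inverted: values at or above splitPoint are removed.
    static List<Double> filter(double[] values, double splitPoint, boolean invert) {
        List<Double> kept = new ArrayList<>();
        for (double v : values) {
            boolean selected = v < splitPoint; // "selected" rows are the ones removed
            if (selected == invert) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] temps = {85, 80, 83, 70, 68, 75}; // illustrative values only
        System.out.println(filter(temps, 80, false)); // [85.0, 80.0, 83.0]
        System.out.println(filter(temps, 80, true));  // [70.0, 68.0, 75.0]
        System.out.println(filter(temps, 75, false)); // [85.0, 80.0, 83.0, 75.0]
    }
}
```

Note that a value equal to the split point is kept in the default mode, since only strictly smaller values are selected for removal.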
Having seen how to configure the three filters, we use RemoveWithValues for the demonstration. The complete code:
    import java.io.FileReader;

    import weka.core.Instances;
    import weka.filters.Filter;
    import weka.filters.unsupervised.instance.RemoveWithValues;

    public class Filter4 {

        public static void main(String[] args) throws Exception {

            // Method 1: configure via options (-C attribute index, -S split point)
            RemoveWithValues remove1 = new RemoveWithValues();
            remove1.setOptions(new String[]{"-C", "2", "-S", "80"});

            // Method 2: configure via setters
            RemoveWithValues remove2 = new RemoveWithValues();
            remove2.setAttributeIndex("2");
            remove2.setSplitPoint(75);

            // Load the data set
            Instances data = new Instances(new FileReader("data/weather.numeric.arff"));

            // Print the original data
            System.out.println("Original data: " + data.numInstances() + " instances");
            for (int i = 0; i < data.numInstances(); i++) {
                System.out.println(data.instance(i));
            }

            System.out.println("================================");

            // Apply the first filter: instances with attribute 2 below 80 are removed
            remove1.setInputFormat(data);
            Instances newdata = Filter.useFilter(data, remove1);
            System.out.println("Method 1: removed instances with attribute 2 below 80, "
                    + newdata.numInstances() + " instances remain");
            for (int i = 0; i < newdata.numInstances(); i++) {
                System.out.println(newdata.instance(i));
            }

            // Invert the selection to get the instances the first pass removed
            remove1.setInvertSelection(true);
            remove1.setInputFormat(data);
            newdata = Filter.useFilter(data, remove1);
            System.out.println("Instances removed by method 1: " + newdata.numInstances() + " instances");
            for (int i = 0; i < newdata.numInstances(); i++) {
                System.out.println(newdata.instance(i));
            }

            System.out.println("================================");

            // Apply the second filter: instances with attribute 2 below 75 are removed
            remove2.setInputFormat(data);
            newdata = Filter.useFilter(data, remove2);
            System.out.println("Method 2: removed instances with attribute 2 below 75, "
                    + newdata.numInstances() + " instances remain");
            for (int i = 0; i < newdata.numInstances(); i++) {
                System.out.println(newdata.instance(i));
            }
        }
    }
The result is shown in the figure: