Example code for the PrefixSpan algorithm in Spark MLlib
Source: Internet  Editor: 程序博客网  Time: 2024/04/29 13:45
http://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html
PrefixSpan
import java.util.Arrays;
import java.util.List;
import org.apache.spark.mllib.fpm.PrefixSpan;
import org.apache.spark.mllib.fpm.PrefixSpanModel;
JavaRDD
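The PrefixSpan listing above stops after its imports, so the following is only a minimal plain-Java sketch of what "support" means for a sequential pattern. It is not the PrefixSpan algorithm and does not use the Spark API. It shows only the containment check: a pattern is counted for a sequence when each of its itemsets is a subset of some itemset of the sequence, in the same order. The class name `SequenceSupportSketch`, the sample data, the threshold value and the `ceil(minSupport * size)` rule are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SequenceSupportSketch {

    // True if `pattern` occurs in `sequence`: every pattern itemset must be a subset
    // of a distinct sequence itemset, and the matches must appear in the same order.
    public static boolean contains(List<List<Integer>> sequence, List<List<Integer>> pattern) {
        int next = 0; // index of the next pattern itemset still to be matched
        for (List<Integer> itemset : sequence) {
            if (next < pattern.size() && itemset.containsAll(pattern.get(next))) {
                next++; // greedy: take the earliest itemset that can match
            }
        }
        return next == pattern.size();
    }

    // Number of sequences in the database that contain the pattern.
    public static long support(List<List<List<Integer>>> database, List<List<Integer>> pattern) {
        return database.stream().filter(seq -> contains(seq, pattern)).count();
    }

    // Illustrative data only (an assumption, not taken from the listing above).
    public static List<List<List<Integer>>> sampleDatabase() {
        return Arrays.asList(
            Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3)),
            Arrays.asList(Arrays.asList(1), Arrays.asList(3, 2), Arrays.asList(1, 2)),
            Arrays.asList(Arrays.asList(1, 2), Arrays.asList(5)),
            Arrays.asList(Arrays.asList(6)));
    }

    public static void main(String[] args) {
        List<List<List<Integer>>> db = sampleDatabase();
        double minSupport = 0.5; // hypothetical threshold, as a fraction of all sequences
        List<List<Integer>> pattern = Arrays.asList(Arrays.asList(1), Arrays.asList(3));

        long count = support(db, pattern);
        // Assumed rule: a pattern is frequent if it occurs in at least ceil(minSupport * size) sequences
        boolean frequent = count >= Math.ceil(minSupport * db.size());
        System.out.println(pattern + " support=" + count + " frequent=" + frequent);
    }
}
```

A real miner such as PrefixSpan avoids testing every candidate pattern against every sequence in this way; the sketch is only meant to make the role of the minimum-support parameter concrete.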
FP-Growth
http://spark.apache.org/docs/latest/mllib-frequent-pattern-mining.html
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import org.apache.spark.mllib.fpm.AssociationRules;
import org.apache.spark.mllib.fpm.FPGrowth;
import org.apache.spark.mllib.fpm.FPGrowthModel;
public final class mysparktest {
    private static final Pattern SPACE = Pattern.compile(" ");

    public static void main(String[] args) throws Exception {
        // Backslashes must be escaped inside a Java string literal
        String data_path = "E:\\sample_fpgrowth.txt";
        SparkConf sparkConf = new SparkConf().setAppName("Java_FP-Growth");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Read the input file into an RDD (a local path here; an HDFS URI also works)
        JavaRDD<String> data = sc.textFile(data_path);

        // Convert each line into a transaction (a list of items)
        JavaRDD<List<String>> transactions = data.map(
            new Function<String, List<String>>() {
                public List<String> call(String line) {
                    String[] parts = line.split(" ");
                    return Arrays.asList(parts);
                }
            }
        );

        // Create the FPGrowth algorithm object
        FPGrowth fpg = new FPGrowth().setMinSupport(0.2).setNumPartitions(10);

        // Run FPGrowth on the transactions
        FPGrowthModel<String> model = fpg.run(transactions);

        // Print the frequent itemsets
        for (FPGrowth.FreqItemset<String> itemset : model.freqItemsets().toJavaRDD().collect()) {
            System.out.println(itemset.javaItems() + ", " + itemset.freq());
        }

        // Print the association rules
        double minConfidence = 0.8;
        for (AssociationRules.Rule<String> rule
                : model.generateAssociationRules(minConfidence).toJavaRDD().collect()) {
            System.out.println(
                rule.javaAntecedent() + " => " + rule.javaConsequent() + ", " + rule.confidence());
        }

        System.exit(0);
    }
}
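To make the `minConfidence` threshold in the listing above more concrete, here is a small plain-Java sketch of how the confidence of a rule A => B is usually defined: the number of transactions containing both A and B, divided by the number containing A. It does not use Spark and is not how MLlib computes rules internally. The class name `RuleConfidenceSketch` and the sample transactions are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RuleConfidenceSketch {

    // Number of transactions that contain every item in `items`.
    public static long freq(List<List<String>> transactions, Set<String> items) {
        return transactions.stream().filter(t -> t.containsAll(items)).count();
    }

    // confidence(A => B) = freq(A union B) / freq(A).
    // Returns NaN if no transaction contains the antecedent.
    public static double confidence(List<List<String>> transactions,
                                    Set<String> antecedent, Set<String> consequent) {
        Set<String> union = new HashSet<>(antecedent);
        union.addAll(consequent);
        return (double) freq(transactions, union) / freq(transactions, antecedent);
    }

    // Hypothetical transactions, one list of items per line of an input file.
    public static List<List<String>> sampleTransactions() {
        return Arrays.asList(
            Arrays.asList("a", "b", "c"),
            Arrays.asList("a", "b"),
            Arrays.asList("a", "c"),
            Arrays.asList("b", "c"),
            Arrays.asList("a", "b", "d"));
    }

    public static void main(String[] args) {
        List<List<String>> tx = sampleTransactions();
        double minConfidence = 0.8; // same threshold value as in the listing above

        Set<String> a = new HashSet<>(Arrays.asList("a"));
        Set<String> b = new HashSet<>(Arrays.asList("b"));
        Set<String> d = new HashSet<>(Arrays.asList("d"));

        double aToB = confidence(tx, a, b); // 3 of the 4 transactions with "a" also have "b"
        double dToA = confidence(tx, d, a); // the only transaction with "d" also has "a"

        System.out.println("a => b: " + aToB + ", kept: " + (aToB >= minConfidence));
        System.out.println("d => a: " + dToA + ", kept: " + (dToA >= minConfidence));
    }
}
```

With these sample transactions the rule a => b falls below the 0.8 threshold and would be dropped, while d => a passes it. In practice the antecedent must also be a frequent itemset, which the single transaction containing "d" would only satisfy at a low minimum support.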