A WordCount Implementation with Spark pipe + PHP

Source: Internet  Editor: 程序博客网  Time: 2024/06/07 17:28

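The PHP script's job is to split each input line into words and print one word per line, so that every word becomes one element of the RDD returned by pipe.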

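<?php
// Read lines from stdin and print each word on its own line.
$in = fopen('php://stdin', 'r');
while (($line = fgets($in)) !== false) {
    // Split on whitespace; PREG_SPLIT_NO_EMPTY drops empty tokens,
    // so the trailing newline no longer produces a blank "word".
    $words = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        printf("%s\n", $word);
    }
}
fclose($in);
?>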
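The Spark driver program (Scala):

package test

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object PipeTest {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("pipe Test")
    val sc = new SparkContext(sparkConf)

    // Read the input file into 3 partitions.
    val a = sc.textFile("/home/gt/wordcount.txt", 3)

    // Pipe each partition through the PHP script, then count the words it emits.
    val result = a.pipe("php /home/gt/spark/bin/test.php")
      .map(x => (x, 1))
      .reduceByKey(_ + _)

    // Note: foreach runs on the executors, so on a cluster this output
    // appears in the executor logs rather than on the driver console.
    result.foreach { x => println("!!!!! " + x) }

    sc.stop()
  }
}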
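To make the data flow easier to follow, here is a minimal local sketch that needs neither Spark nor PHP. It relies on one fact about RDD.pipe: the command is started once per partition, each element is written to its stdin as one line, and each line it prints to stdout becomes one element of the resulting RDD. Everything else is an assumption made for illustration: the names PipeWordCountModel, splitWords, and wordCount are hypothetical, splitWords merely stands in for test.php, and plain Scala collections stand in for RDDs. It models the pipeline; it is not how Spark implements it.

```scala
// Local model of the pipeline's data flow (hypothetical names; no Spark or PHP).
object PipeWordCountModel {

  // Stand-in for test.php: receives the lines written to stdin and returns
  // the lines the script would print to stdout (one word per line).
  def splitWords(stdinLines: Seq[String]): Seq[String] =
    stdinLines.flatMap(_.trim.split("\\s+")).filter(_.nonEmpty)

  // Models a.pipe(...).map(x => (x, 1)).reduceByKey(_ + _) over partitions.
  def wordCount(partitions: Seq[Seq[String]]): Map[String, Int] =
    partitions
      .flatMap(splitWords)        // pipe: the "script" runs once per partition
      .map(word => (word, 1))     // map(x => (x, 1))
      .groupBy(_._1)              // group pairs by word ...
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // ... and sum, like reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    // Two "partitions" of input lines (sample data made up for this sketch).
    val partitions = Seq(
      Seq("hello spark", "hello php"),
      Seq("spark pipe")
    )
    wordCount(partitions).toSeq.sortBy(_._1).foreach(println)
    // prints:
    // (hello,2)
    // (php,1)
    // (pipe,1)
    // (spark,2)
  }
}
```

Because the external command only sees lines of text, any language that can read stdin and write stdout can take the place of splitWords; in the real job that role is played by test.php.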