Spark SQL syntax: reading a txt file via reflection-based schema mapping

Inferring the Schema Using Reflection (specifying the schema via a case-class mapping)

-- Declare a case class
scala> case class Person(name: String, age: Int)
defined class Person

-- Read the txt file, map each line onto the class fields, and convert to a DataFrame
scala> val people = sc.textFile("hdfs://node1:8020/test/input/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)).toDF()
people: org.apache.spark.sql.DataFrame = [name: string, age: int]

-- Register the DataFrame as a temporary table named "people"
scala> people.registerTempTable("people")

-- Run a SQL query
scala> val teenagers = sqlContext.sql("select name,age from people where age>=13 and age<=19")
teenagers: org.apache.spark.sql.DataFrame = [name: string, age: int]

scala> teenagers.show
+------+---+
|  name|age|
+------+---+
|Justin| 19|
+------+---+

scala> teenagers.select("name").show
+------+
|  name|
+------+
|Justin|
+------+

scala> teenagers.count
res19: Long = 1

-- Access columns by position
scala> teenagers.map(t => "Name: " + t(0)).collect().foreach(println)
Name: Justin

-- Access columns by field name
scala> teenagers.map(t => "Name: " + t.getAs[String]("name")).collect().foreach(println)
Name: Justin

-- Convert each row to a Map of column name -> value
scala> teenagers.map(_.getValuesMap[Any](List("name", "age"))).collect().foreach(println)
Map(name -> Justin, age -> 19)
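The map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)) pipeline above is ordinary Scala collection logic; only sc.textFile and toDF() involve Spark. As a minimal sketch that needs no cluster, the same parse-and-filter step can be run on a plain List (the sample lines below are an assumption standing in for the contents of people.txt):

```scala
// Minimal sketch of the parse-and-filter logic from the transcript,
// run on a plain Scala collection instead of an RDD (no Spark needed).
case class Person(name: String, age: Int)

object PeopleDemo {
  // Assumed sample contents of people.txt (the real file lives on HDFS)
  val lines: List[String] = List("Michael,29", "Andy,30", "Justin,19")

  // Same mapping as the RDD pipeline: split each line, build a Person
  val people: List[Person] =
    lines.map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))

  // Equivalent of: select name,age from people where age>=13 and age<=19
  val teenagers: List[Person] =
    people.filter(p => p.age >= 13 && p.age <= 19)

  def main(args: Array[String]): Unit =
    teenagers.foreach(p => println("Name: " + p.name)) // prints "Name: Justin"
}
```

Because Person is a case class, Spark's reflection can read its field names and types to build the DataFrame schema automatically, which is why no explicit StructType is declared anywhere in the transcript.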
