Spark 1.4 reading from HBase 0.96 fails with java.io.NotSerializableException: org.apache.hadoop.hbase.io.ImmutableBytes

Source: Internet · Published by: 天天动听mac · Editor: 程序博客网 · Time: 2024/05/01 01:22

Reading data from HBase with Spark:

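    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.SparkContext

    // sparkConf is the application's SparkConf, defined elsewhere
    val sc = new SparkContext(sparkConf)

    // Point the HBase client at ZooKeeper and choose the table to scan
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.property.clientPort", "port..")
    conf.set("hbase.zookeeper.quorum", "ip..")
    conf.set(TableInputFormat.INPUT_TABLE, "table1..")

    // Each record is a (row key, Result) pair
    val hBaseRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
      classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
      classOf[org.apache.hadoop.hbase.client.Result])

    val count = hBaseRDD.count()
    println("HBase RDD Count:" + count)
    hBaseRDD.cache()

    // Iterate over the rows and print them
    hBaseRDD.collect().foreach { case (_, result) =>
      val key = Bytes.toInt(result.getRow)
      val name = Bytes.toString(result.getValue("pd".getBytes, "name".getBytes))
      val age = Bytes.toInt(result.getValue("pd".getBytes, "age".getBytes))
      println("Row key:" + key + " name:" + name + " age:" + age)
    }

    sc.stop()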

Submitting the job to the cluster with spark-submit fails with the java.io.NotSerializableException named in the title.

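Posts I found online mentioned switching to the serializer that ships with Spark, which made me suspect that no serializer class had been configured and the job was still using the default Java serialization.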

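So I checked the spark-defaults.conf file under the conf/ directory and, sure enough, no serializer was set there, so I modified the file to set one.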
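The exact change made to spark-defaults.conf is not reproduced here. As a rough sketch, assuming the serializer meant is Spark's Kryo serializer, the same `spark.serializer` property can also be set programmatically on the `SparkConf`. The application name below is hypothetical, and registering the HBase classes is optional.

```scala
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: Kryo is the serializer the post switched to.
// "HBaseReadExample" is a hypothetical application name.
val sparkConf = new SparkConf()
  .setAppName("HBaseReadExample")
  // Equivalent to this line in conf/spark-defaults.conf:
  //   spark.serializer  org.apache.spark.serializer.KryoSerializer
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Optional: register the HBase classes that travel between executors and driver
  .registerKryoClasses(Array(classOf[ImmutableBytesWritable], classOf[Result]))

val sc = new SparkContext(sparkConf)
```

Setting the property in code keeps the fix with the application, while spark-defaults.conf applies it to every job submitted from that installation.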

After making the change and restarting Spark, the data could be read from HBase.
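A different approach, not taken from the original post, is to avoid shipping HBase objects to the driver at all. `collect()` has to serialize every `(ImmutableBytesWritable, Result)` pair, and `ImmutableBytesWritable` does not implement `java.io.Serializable`, which is most likely where the exception comes from. The sketch below assumes `hBaseRDD` is the RDD built earlier and that the table has the same `pd` column family with `name` and `age` qualifiers. It extracts plain values on the executors first:

```scala
import org.apache.hadoop.hbase.util.Bytes

// Assumes hBaseRDD is the RDD[(ImmutableBytesWritable, Result)] created earlier.
// Convert each Result into a tuple of plain values on the executors.
val rows = hBaseRDD.map { case (_, result) =>
  val key = Bytes.toInt(result.getRow)
  val name = Bytes.toString(result.getValue(Bytes.toBytes("pd"), Bytes.toBytes("name")))
  val age = Bytes.toInt(result.getValue(Bytes.toBytes("pd"), Bytes.toBytes("age")))
  (key, name, age) // (Int, String, Int) is serializable with the default Java serializer
}

// Only the plain tuples are sent back to the driver
rows.collect().foreach { case (key, name, age) =>
  println("Row key:" + key + " name:" + name + " age:" + age)
}
```

This works regardless of which serializer is configured, at the cost of deciding up front which columns the driver needs.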

0 0