Creating an HBase Table with the Traditional HBase API (Scala)

Source: Internet | Editor: 程序博客网 | Date: 2024/05/21 10:18

Running a main class locally to operate on HBase tables.


Part 1: Creating an HBase table with the traditional HBase API (Scala class run locally; the cluster does not require Kerberos authentication)

1. Environment: IntelliJ IDEA 16 + Scala 2.10.4 + CDH Spark 1.6.1 + JDK 1.7 + HBase 1.2.0-cdh5.8.0

2. Add the required jar dependencies; see pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com</groupId>
    <artifactId>enn.hbase</artifactId>
    <version>1.0-SNAPSHOT</version>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.7</java.version>
        <cdh.version>1.2.0-cdh5.8.0</cdh.version>
        <!-- Maven release settings -->
        <maven.compiler.source>1.7</maven.compiler.source>
        <maven.compiler.target>1.7</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <scala.tools.version>2.10</scala.tools.version>
        <scala.version>2.10.4</scala.version>
        <!-- google-collections version -->
        <google-collections>1.0</google-collections>
    </properties>
    <!-- Repository for resolving the CDH dependencies -->
    <repositories>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
    </repositories>
    <dependencies>
        <!-- hbase-server -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-server</artifactId>
            <version>${cdh.version}</version>
        </dependency>
        <!-- hbase-client -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-client</artifactId>
            <version>${cdh.version}</version>
        </dependency>
        <!-- hbase-common -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-common</artifactId>
            <version>${cdh.version}</version>
        </dependency>
        <!-- hbase-spark -->
        <dependency>
            <groupId>org.apache.hbase</groupId>
            <artifactId>hbase-spark</artifactId>
            <version>${cdh.version}</version>
        </dependency>
        <!-- scala -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <!-- google-collections -->
        <dependency>
            <groupId>com.google.collections</groupId>
            <artifactId>google-collections</artifactId>
            <version>${google-collections}</version>
        </dependency>
    </dependencies>
</project>

 

3. Write the implementation classes as follows:

(1)HbaseConnectionUtil.scala

 

package util

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.security.UserGroupInformation

/**
  * Purpose: obtain an HBase connection
  * User: yangjf
  * Date: 2016/8/18  17:40
  */
object HbaseConnectionUtil {

  // Obtain a connection the conventional (non-Kerberos) way
  def getHbaseConn(ipStr: String): HBaseAdmin = {
    // Build the configuration
    val conf = HBaseConfiguration.create()
    // ipStr is a comma-separated list, e.g. "192.168.142.115,192.168.142.116,192.168.142.117"
    conf.set("hbase.zookeeper.quorum", ipStr)
    // Connect to the HBase master
    val admin = new HBaseAdmin(conf)
    admin
  }

  // Release the connection
  def releaseConn(admin: HBaseAdmin): Unit = {
    try {
      if (admin != null) {
        admin.close()
      }
    } catch {
      case ex: Exception => ex.getMessage
    }
  }
}
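A note on the quorum string: HBase conventionally expects bare hostnames in `hbase.zookeeper.quorum`, with the port supplied separately via `hbase.zookeeper.property.clientPort`. If the address list may carry `host:port` entries (as the `IP_STR` used below does), a small helper can split them apart. This helper is hypothetical, not part of the HBase API:

```scala
object QuorumUtil {
  // Split a comma-separated quorum string such as "host1:2181,host2:2181"
  // into a bare hostname list and a single client port.
  // Assumes all entries share one port; entries without an explicit
  // port fall back to the given default.
  def parseQuorum(quorum: String, defaultPort: Int = 2181): (String, Int) = {
    val entries = quorum.split(",").map(_.trim).filter(_.nonEmpty)
    val hosts = entries.map(_.split(":")(0))
    val ports = entries.flatMap { e =>
      val parts = e.split(":")
      if (parts.length > 1) Some(parts(1).toInt) else None
    }
    val port = if (ports.nonEmpty) ports.head else defaultPort
    (hosts.mkString(","), port)
  }
}
```

For example, `QuorumUtil.parseQuorum("host17.slave.cluster.enn.cn:2181")` yields the pair `("host17.slave.cluster.enn.cn", 2181)`, ready to feed into two separate `conf.set` calls.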


(2)HbaseCreate.scala

package controller

import org.apache.hadoop.hbase.{HColumnDescriptor, HTableDescriptor, TableName}
import util.HbaseConnectionUtil

/**
  * Purpose: create an HBase table with the traditional HBase API (test environment)
  * User: yangjf
  * Date: 2016/8/18  18:31
  */
object HbaseCreate {
  // Production environment
  lazy val TABLE_NAME = "gas:test_enn_222"
  // Production environment
  lazy val IP_STR = "host17.slave.cluster.enn.cn:2181"

  def main(args: Array[String]) {
    val admin = HbaseConnectionUtil.getHbaseConn(IP_STR)
    try {
      // 1. Check whether the table already exists
      if (admin.tableExists(TABLE_NAME)) {
        // Disable it first, then delete it
        admin.disableTable(TABLE_NAME)
        admin.deleteTable(TABLE_NAME)
      }
      // 2. Build the table descriptor
      val h_table = new HTableDescriptor(TableName.valueOf(TABLE_NAME))
      val h_column = new HColumnDescriptor("dep_info")
      h_column.setBlocksize(64 * 1024)
      h_column.setBlockCacheEnabled(true)
      h_column.setMaxVersions(2) // maximum number of versions
      // Add the column families to the descriptor
      h_table.addFamily(h_column)
      h_table.addFamily(new HColumnDescriptor("son_id".getBytes()))
      // 3. Create the table
      admin.createTable(h_table)
    } catch {
      case ex: Exception => ex.printStackTrace()
    } finally {
      HbaseConnectionUtil.releaseConn(admin)
    }
  }
}
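The sequence in `main` (check existence, disable, delete, create) is a reusable drop-and-recreate pattern. A minimal sketch against a stubbed interface makes the call order explicit; `SimpleAdmin` is hypothetical, standing in for the handful of `HBaseAdmin` calls used above:

```scala
// Hypothetical stand-in for the HBaseAdmin calls used in HbaseCreate.
trait SimpleAdmin {
  def tableExists(name: String): Boolean
  def disableTable(name: String): Unit
  def deleteTable(name: String): Unit
  def createTable(name: String): Unit
}

object TableRecreate {
  // Drop the table if it is present, then create it fresh -- the same
  // sequence HbaseCreate.main performs against the real HBaseAdmin.
  def recreate(admin: SimpleAdmin, name: String): Unit = {
    if (admin.tableExists(name)) {
      admin.disableTable(name) // a table must be disabled before deletion
      admin.deleteTable(name)
    }
    admin.createTable(name)
  }
}
```

Keeping this sequence in one place avoids the easy mistake of calling `deleteTable` on a table that is still enabled, which the real API rejects.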


4. Run the HbaseCreate.scala class.

5. Go to the cluster and check whether the "gas:test_enn_222" table was created:

(1) Log in to a cluster node (e.g. via Xshell).

(2) On the command line, enter: hbase shell

(3) Enter: list to show all tables.

(4) Show the table description: desc 'gas:test_enn_222'

If the table and its description appear, the table was created successfully!
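The same verification can be scripted: `list` prints one table name per line, so a quick check over captured shell output might look like the following. This is a hypothetical helper; it assumes the shell output has already been captured into a string:

```scala
object ShellOutputUtil {
  // Given captured `hbase shell` output of the `list` command,
  // return true if the given table name appears on its own line.
  def containsTable(listOutput: String, table: String): Boolean =
    listOutput.split("\n").map(_.trim).contains(table)
}
```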


Part 2: Creating an HBase table with the traditional HBase API (Scala class run locally; the cluster does require Kerberos authentication)

1. Obtain the cluster user's authentication files:

(1) e_lvbin.keytab: generate it on the cluster, then copy it to the local Windows machine.

(2) krb5.conf: copy it from the cluster's /etc/ directory to the local machine.


2. Add a new HBase connection method to the HbaseConnectionUtil.scala class shown earlier:

 

// Obtain an HBase connection with Kerberos authentication (test-environment HBase)
def getHbaseConnect(): HBaseAdmin = {
  System.setProperty("java.security.krb5.conf", "F:/krb5.conf")
  val configuration = HBaseConfiguration.create()
  // One or more ZooKeeper addresses may be listed, separated by commas;
  // the client port goes in its own property (Configuration.set takes a key and a value)
  configuration.set("hbase.zookeeper.quorum",
    "slave-29.dev.cluster.enn.cn,slave-30.dev.cluster.enn.cn,slave-31.dev.cluster.enn.cn")
  configuration.set("hbase.zookeeper.property.clientPort", "2181")
  configuration.set("hadoop.security.authentication", "kerberos")
  configuration.set("hbase.security.authentication", "kerberos")
  configuration.set("hbase.security.authorization", "true")
  configuration.set("hbase.master.kerberos.principal", "hbase/_HOST@ENN.CN")
  configuration.set("hbase.thrift.kerberos.principal", "hbase/_HOST@ENN.CN")
  configuration.set("hbase.regionserver.kerberos.principal", "hbase/_HOST@ENN.CN")

  // Log in with the user principal and keytab
  val user = "e_lvbin@ENN.CN"
  val keyPath = "F:/e_lvbin.keytab"
  UserGroupInformation.setConfiguration(configuration)
  UserGroupInformation.loginUserFromKeytab(user, keyPath)
  // Connect to the HBase master
  val admin = new HBaseAdmin(configuration)
  admin
}
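The `_HOST` placeholder in the principals above is expanded to each server's actual hostname at connection time; Hadoop does this via `SecurityUtil.getServerPrincipal`. A simplified stand-in for that substitution, for illustration only (not Hadoop's actual implementation):

```scala
object PrincipalUtil {
  // Replace the _HOST placeholder in a Kerberos principal pattern
  // like "hbase/_HOST@ENN.CN" with a concrete hostname.
  def expandPrincipal(pattern: String, hostname: String): String = {
    val parts = pattern.split("[/@]")
    if (parts.length == 3 && parts(1) == "_HOST")
      s"${parts(0)}/${hostname.toLowerCase}@${parts(2)}"
    else
      pattern // no placeholder: return the pattern unchanged
  }
}
```

This is why a single configuration value works for every region server: each server resolves `_HOST` to its own name.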

 

3. In the main class HbaseCreate.scala, change the connection line

val admin = HbaseConnectionUtil.getHbaseConn(IP_STR);

to

val admin = HbaseConnectionUtil.getHbaseConnect();
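To avoid editing the source whenever the target cluster changes, the choice between the two connection methods can instead be driven by a flag. A hypothetical sketch, with the two factories passed in as plain functions:

```scala
object ConnSelector {
  // Pick one of two connection factories depending on whether the
  // target cluster requires Kerberos. The factories are passed in as
  // functions so the decision lives in exactly one place.
  def select[A](kerberos: Boolean)(plain: () => A, secured: () => A): A =
    if (kerberos) secured() else plain()
}
```

Usage would look like `ConnSelector.select(kerberos = true)(() => HbaseConnectionUtil.getHbaseConn(IP_STR), () => HbaseConnectionUtil.getHbaseConnect())`, with the flag read from a property or command-line argument.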

 

4. Change TABLE_NAME to a new table name:

  lazy val TABLE_NAME = "gas:test_enn_33"

5. Run the main class HbaseCreate.scala.

6. Check in HBase whether the table was created.

 


The steps above have been tested and can be adapted to your own needs. (The environment-specific values, such as paths, hostnames, principals, and table names, are what you need to modify.)

Comments and corrections are welcome!



