Big Data: Hadoop Study Notes 08

Source: Internet  Published by: 美国石油钻井数据  Editor: 程序博客网  Time: 2024/05/18 15:52

27.Hadoop serialization

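【Text type】

Hadoop's Text is the counterpart of Java's java.lang.String. It is a Writable that stores the string as UTF-8 bytes, so its lengths and positions are byte counts, not char counts.

【Basic operations】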

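import org.apache.hadoop.io.Text;

public void test1() throws Exception {
    Text txt = new Text("hello world");
    // charAt takes a byte offset and returns the Unicode code point found there
    int v = txt.charAt(0);
    System.out.println((char) v);              // h
    // find returns the byte offset of the first match at or after offset 7
    int pos = txt.find("o", 7);
    System.out.println(pos);                   // 7
    // getLength() is the number of valid UTF-8 bytes
    System.out.println(txt.getLength());       // 11
    // the backing array may be longer than getLength()
    System.out.println(txt.getBytes().length);
    // decode only the valid range of the backing array
    System.out.println(Text.decode(txt.getBytes(), 0, txt.getLength()));
}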
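Because Text keeps its content as UTF-8, getLength() and the offsets used by charAt and find are byte-based. The sketch below uses only the JDK (no Hadoop classes) to show how the byte count drifts away from String.length() once non-ASCII characters appear. The class name Utf8LengthSketch and the sample strings are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Stdlib-only illustration; no Hadoop classes are used here.
class Utf8LengthSketch {
    // Number of bytes the string occupies when encoded as UTF-8,
    // which is the unit that Text.getLength() reports.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "hello world";
        // Hypothetical sample: "hello " followed by two CJK characters,
        // written as Unicode escapes to stay independent of source encoding.
        String mixed = "hello \u4e16\u754c";
        System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " bytes"); // 11 chars, 11 bytes
        System.out.println(mixed.length() + " chars, " + utf8Length(mixed) + " bytes"); // 8 chars, 12 bytes
    }
}
```

For pure ASCII text the two numbers agree, which is why the test above prints 11 for getLength().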

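【Serialization】

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public void serialize() throws Exception {
    FileOutputStream fos = new FileOutputStream("/data.dat");
    DataOutputStream dos = new DataOutputStream(fos);
    // start serialization: each Writable appends its own binary form
    IntWritable iw = new IntWritable(100);
    iw.write(dos);
    LongWritable lw = new LongWritable(200);
    lw.write(dos);
    Text txt = new Text("hello");
    txt.write(dos);
    // Writable objects are mutable, so the same instance can be reused
    txt.set("world");
    txt.write(dos);
    iw.set(-10);
    iw.write(dos);
    dos.close();
    fos.close();
}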

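【Deserialization】

import java.io.DataInputStream;
import java.io.FileInputStream;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public void deserialize() throws Exception {
    FileInputStream fis = new FileInputStream("/data.dat");
    DataInputStream dis = new DataInputStream(fis);
    // start deserialization: read the values back in the order they were written
    IntWritable iw = new IntWritable();
    iw.readFields(dis);
    System.out.println(iw.toString());
    LongWritable lw = new LongWritable();
    lw.readFields(dis);
    System.out.println(lw.toString());
    Text txt = new Text();
    txt.readFields(dis);
    // decode only the valid bytes; the backing array can be longer than getLength()
    System.out.println(Text.decode(txt.getBytes(), 0, txt.getLength()));
    txt.readFields(dis);
    System.out.println(Text.decode(txt.getBytes(), 0, txt.getLength()));
    IntWritable iw2 = new IntWritable();
    iw2.readFields(dis);
    System.out.println(iw2.toString());
    dis.close();
    fis.close();
}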
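To make the byte layout of that file concrete, here is a JDK-only sketch that mimics what the three Writable types put on the stream: IntWritable as a 4-byte int, LongWritable as an 8-byte long, and Text as a length prefix followed by UTF-8 bytes. It assumes every string is shorter than 128 bytes, in which case Hadoop's variable-length prefix fits in one byte. The class and helper names (WireFormatSketch, writeShortText, readShortText) are hypothetical, and this is an approximation for illustration, not Hadoop's implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class WireFormatSketch {
    // Mimics Text.write for strings shorter than 128 UTF-8 bytes (assumption),
    // where the length prefix is a single byte.
    static void writeShortText(DataOutputStream out, String s) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > 127) {
            throw new IllegalArgumentException("sketch handles short strings only");
        }
        out.writeByte(utf8.length);
        out.write(utf8);
    }

    static String readShortText(DataInputStream in) throws IOException {
        int len = in.readByte();
        byte[] utf8 = new byte[len];
        in.readFully(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Writes the same five values, in the same order, as serialize() above.
    static byte[] serialize() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(100);            // like IntWritable: 4 bytes
            out.writeLong(200L);          // like LongWritable: 8 bytes
            writeShortText(out, "hello"); // 1 length byte + 5 data bytes
            writeShortText(out, "world"); // 1 length byte + 5 data bytes
            out.writeInt(-10);            // 4 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads the values back in write order and joins them with spaces.
    static String deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int first = in.readInt();
            long second = in.readLong();
            String third = readShortText(in);
            String fourth = readShortText(in);
            int fifth = in.readInt();
            return first + " " + second + " " + third + " " + fourth + " " + fifth;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize();
        System.out.println(data.length + " bytes"); // 28 bytes = 4 + 8 + 6 + 6 + 4
        System.out.println(deserialize(data));      // 100 200 hello world -10
    }
}
```

The stream carries no type tags or field names, so the reader has to consume the values in exactly the order they were written, as deserialize() does.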

28.SecondaryNameNode

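1.Checkpoint creation process
    a)The NN rolls its edit log, which starts a new edit log file
    b)The 2NN copies the image + edits from the NN
    c)The secondary namenode merges them into a new image
    d)The 2NN sends the new image back to the NN
    e)The NN renames the new image, replacing the old one
2.2NN checkpoint schedule
    a)A checkpoint runs once an hour (fs.checkpoint.period)
    b)A checkpoint is also triggered when the edits file exceeds 64 MB (fs.checkpoint.size); the 2NN checks every five minutes
    These are the Hadoop 1.x property names. Hadoop 2.x and later use dfs.namenode.checkpoint.period, and replace the size trigger with a transaction-count trigger (dfs.namenode.checkpoint.txns).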
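The two triggers combine as a simple "either one" rule that is evaluated at each poll. A minimal sketch of that decision, using the one-hour and 64 MB values from the notes, could look like this. The class, constants and method names are hypothetical, and the real 2NN code is more involved.

```java
class CheckpointTriggerSketch {
    // Values taken from the notes above; names are illustrative only.
    static final long PERIOD_SECONDS = 3600;          // fs.checkpoint.period: one hour
    static final long SIZE_BYTES = 64L * 1024 * 1024; // fs.checkpoint.size: 64 MB

    // Decision taken at each poll (every five minutes per the notes):
    // checkpoint when the period has elapsed OR the edits file has grown large enough.
    static boolean shouldCheckpoint(long secondsSinceLastCheckpoint, long editsBytes) {
        return secondsSinceLastCheckpoint >= PERIOD_SECONDS || editsBytes >= SIZE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(shouldCheckpoint(300, 1024));       // false: recent and small
        System.out.println(shouldCheckpoint(300, SIZE_BYTES)); // true: size trigger
        System.out.println(shouldCheckpoint(3600, 0));         // true: period trigger
    }
}
```

Because the check only runs at poll time, a size-triggered checkpoint can start up to one poll interval after the edits file crosses the threshold.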

29.hdfs dfsadmin

1.Quota management (for the HDFS file system)

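    【Directory quota】
    $>hdfs dfsadmin -setQuota <quota> <dirname>
    Conditions: the target must be a directory, the value must be a positive integer, and administrator privileges are required.
    The quota limits the number of files and folders in the directory tree, counting the directory itself, so a quota of 1 forces the directory to stay empty.
    $>hdfs dfsadmin -setQuota 1 dirname
    Clear the quota:
    $>hdfs dfsadmin -clrQuota dirname
    【Space quota】
    The value may carry a binary size suffix (for example 384m). Replicas count toward the quota, which limits the total size of all files. Since a full block is charged for every replica while a file is being written, the quota has to be at least 384 MB (128 MB block size × 3 replicas, the defaults) before anything can be written.
    $>hdfs dfsadmin -setSpaceQuota 384m dirname
    Clear the quota:
    $>hdfs dfsadmin -clrSpaceQuota dirname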
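The 384 MB figure is just block size times replication factor. The sketch below spells out that arithmetic and the "quota of 1 means empty" rule. It assumes the default 128 MB block size and replication factor 3; the class and method names are hypothetical.

```java
class QuotaSketch {
    // Smallest space quota that still allows one block to be written:
    // a full block is charged for every replica.
    static long minSpaceQuotaBytes(long blockSizeBytes, int replication) {
        return blockSizeBytes * replication;
    }

    // The name quota counts the directory itself, so a quota of N
    // leaves room for N - 1 files and subdirectories beneath it.
    static long maxChildren(long nameQuota) {
        return nameQuota - 1;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed default block size
        int replication = 3;                 // assumed default replication factor
        long min = minSpaceQuotaBytes(blockSize, replication);
        System.out.println(min / (1024 * 1024) + " MB"); // 384 MB
        System.out.println(maxChildren(1));              // 0
    }
}
```

On a cluster with a different block size or replication factor the minimum workable space quota changes accordingly.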

2.Snapshots

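A snapshot acts like a backup of a directory. Creating one does not copy any files; the snapshot and the live directory point to the same files, and new data is produced only when a write happens.
【Create a snapshot】
$>hdfs dfsadmin -allowSnapshot dirname    (allow snapshots on the directory)
$>hdfs dfs -createSnapshot dirname    (create the snapshot)
【Rename】
$>hdfs dfs -renameSnapshot dirname oldname newname
【Delete】
$>hdfs dfs -deleteSnapshot dirname snapshotname
【List all snapshottable directories】
$>hdfs lsSnapshottableDir
【Compare two snapshots】
$>hdfs snapshotDiff <dirname> <fromSnapshot> <toSnapshot>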
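The "share first, diverge on write" idea can be pictured with a toy in-memory model. This is only an illustration of the behaviour described above, not how HDFS implements snapshots: here a snapshot copies a table of references to immutable content, while HDFS records changes relative to the snapshot. All names (SnapshotSketch and its methods) are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class SnapshotSketch {
    // Live directory: file name -> content. Strings are immutable,
    // so a snapshot can share them instead of copying them.
    private final Map<String, String> live = new HashMap<>();
    private final Map<String, Map<String, String>> snapshots = new HashMap<>();

    void write(String name, String content) {
        live.put(name, content); // replaces the reference in the live view only
    }

    // Copies the name -> reference table, never the contents themselves.
    void createSnapshot(String snapshotName) {
        snapshots.put(snapshotName, new HashMap<>(live));
    }

    String readLive(String name) {
        return live.get(name);
    }

    String readSnapshot(String snapshotName, String name) {
        return snapshots.get(snapshotName).get(name);
    }

    public static void main(String[] args) {
        SnapshotSketch dir = new SnapshotSketch();
        dir.write("a.txt", "v1");
        dir.write("b.txt", "unchanged");
        dir.createSnapshot("s1");
        dir.write("a.txt", "v2"); // only now do live and snapshot diverge
        System.out.println(dir.readSnapshot("s1", "a.txt")); // v1
        System.out.println(dir.readLive("a.txt"));           // v2
        // b.txt was never rewritten, so both views still share one object
        System.out.println(dir.readSnapshot("s1", "b.txt") == dir.readLive("b.txt")); // true
    }
}
```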

3.Trash

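The trash is an ordinary folder. The retention time is measured in minutes, and the check interval (fs.trash.checkpoint.interval) should be less than or equal to the retention time.
fs.trash.interval=0    //minutes; 0 disables the trash
【Enable the trash】
[core-site.xml]
fs.trash.interval=1
【Distribute the file to every node; it takes effect immediately】
Trash location: ./.Trash (an HDFS directory, relative to the user's home directory)
【Empty the trash】
$>hdfs dfs -expunge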
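The relationship between the two intervals can be written down as a small check. This is a simplified sketch of the rule stated above, with hypothetical class and method names; it ignores the way the trash groups deletions into timestamped checkpoints.

```java
class TrashSketch {
    // True once a trashed file has been kept for the whole retention time.
    // intervalMinutes models fs.trash.interval; 0 means the trash is disabled,
    // so nothing is retained at all.
    static boolean isExpired(long minutesInTrash, long intervalMinutes) {
        if (intervalMinutes <= 0) {
            return true;
        }
        return minutesInTrash >= intervalMinutes;
    }

    // The rule from the notes: the check interval should not exceed the retention time.
    static boolean isSensibleConfig(long intervalMinutes, long checkIntervalMinutes) {
        return checkIntervalMinutes <= intervalMinutes;
    }

    public static void main(String[] args) {
        System.out.println(isExpired(0, 1));           // false: just deleted
        System.out.println(isExpired(1, 1));           // true: kept for one minute
        System.out.println(isSensibleConfig(60, 10));  // true
        System.out.println(isSensibleConfig(10, 60));  // false: checks run too rarely
    }
}
```

If the check ran less often than the retention time, files would linger in the trash well past the point at which they were supposed to be removed, which is why the notes ask for the check interval to be the smaller value.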

4.Changing the static user name of the HDFS web UI

[core-site.xml]
hadoop.http.staticuser.user=username

0 0