spatialhadoop2.3 Source Code Reading (9): ShapeLineInputFormat & ShapeLineRecordReader & SpatialRecordReader [FileMBR]

Source: Internet  Editor: 程序博客网  Time: 2024/05/01 17:22


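The only job of ShapeLineInputFormat is to create a ShapeLineRecordReader. Its implementation is essentially the same as that of ShapeInputFormat, described in "spatialhadoop2.1 Source Code Reading (3): Custom InputFormat (SpatialInputFormat & ShapeInputFormat)"; see that article for details. The source of ShapeLineInputFormat is as follows: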

public class ShapeLineInputFormat extends SpatialInputFormat<Rectangle, Text> {

  @Override
  public RecordReader<Rectangle, Text> getRecordReader(InputSplit split,
      JobConf job, Reporter reporter) throws IOException {
    if (reporter != null)
      reporter.setStatus(split.toString());
    this.rrClass = ShapeLineRecordReader.class;
    return super.getRecordReader(split, job, reporter);
  }
}

Next, we focus on the implementation of ShapeLineRecordReader and SpatialRecordReader.



public class ShapeLineRecordReader
    extends SpatialRecordReader<Rectangle, Text> {

  public ShapeLineRecordReader(Configuration job, FileSplit split)
      throws IOException {
    super(job, split);
  }

  public ShapeLineRecordReader(CombineFileSplit split, Configuration conf,
      Reporter reporter, Integer index) throws IOException {
    super(split, conf, reporter, index);
  }

  public ShapeLineRecordReader(InputStream in, long offset, long endOffset)
      throws IOException {
    super(in, offset, endOffset);
  }

  @Override
  public boolean next(Rectangle key, Text shapeLine) throws IOException {
    boolean read_line = nextLine(shapeLine);
    key.set(cellMbr);
    return read_line;
  }

  @Override
  public Rectangle createKey() {
    return new Rectangle();
  }

  @Override
  public Text createValue() {
    return new Text();
  }
}
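The next/createKey/createValue trio above follows the usual reuse pattern of the old mapred API: the caller creates one key and one value object and passes them to next repeatedly. The following is a minimal sketch of that pattern using only the JDK. SketchRect and CellLineReaderSketch are hypothetical stand-ins introduced for illustration and are not SpatialHadoop classes; the only behavior taken from the listing is that every record receives the same key (the cell MBR) and the value receives the next line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for SpatialHadoop's Rectangle; only what this sketch needs.
class SketchRect {
  double x1, y1, x2, y2;

  SketchRect() {}

  SketchRect(double x1, double y1, double x2, double y2) {
    this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
  }

  void set(SketchRect o) {
    x1 = o.x1; y1 = o.y1; x2 = o.x2; y2 = o.y2;
  }
}

// Hypothetical reader with the same next/createKey/createValue shape as the listing.
class CellLineReaderSketch {
  private final SketchRect cellMbr;
  private final Iterator<String> lines;

  CellLineReaderSketch(SketchRect cellMbr, List<String> lines) {
    this.cellMbr = cellMbr;
    this.lines = lines.iterator();
  }

  SketchRect createKey() { return new SketchRect(); }

  StringBuilder createValue() { return new StringBuilder(); }

  // Fills value with the next line and key with the cell MBR, as next() does above.
  boolean next(SketchRect key, StringBuilder value) {
    boolean readLine = lines.hasNext();
    value.setLength(0);
    if (readLine) value.append(lines.next());
    key.set(cellMbr);
    return readLine;
  }

  // Consumer loop: one value object is reused for all records.
  static List<String> drain(CellLineReaderSketch reader, SketchRect key) {
    List<String> out = new ArrayList<>();
    StringBuilder value = reader.createValue();
    while (reader.next(key, value)) {
      out.add(value.toString());
    }
    return out;
  }

  public static void main(String[] args) {
    CellLineReaderSketch reader = new CellLineReaderSketch(
        new SketchRect(0, 0, 10, 10), Arrays.asList("POINT(1 2)", "POINT(3 4)"));
    SketchRect key = reader.createKey();
    System.out.println(drain(reader, key).size() + " lines, key.x2 = " + key.x2);
  }
}
```

Because the key is overwritten on every call, a consumer that wants to keep it across records has to copy it rather than store the reference.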

3. SpatialRecordReader (described following the FileMBR MapReduce job)

3.1 Constructor

public SpatialRecordReader(Configuration job, long s, long l, Path p) throws IOException {
    this.start = s;
    this.end = s + l;
    this.path = p;
    this.fs = this.path.getFileSystem(job);
    this.directIn = fs.open(this.path); // right-hand side missing in the published listing; fs.open(path) is assumed
    this.blockSize = fs.getFileStatus(this.path).getBlockSize();
    this.cellMbr = new Rectangle();

    LOG.info("Open a SpatialRecordReader to file: "+this.path); // call name missing in the published listing; LOG.info is assumed

    codec = new CompressionCodecFactory(job).getCodec(this.path);

    if (isCompressedInput()) {
      decompressor = CodecPool.getDecompressor(codec);
      if (codec instanceof SplittableCompressionCodec) {
        final SplitCompressionInputStream cIn =
            ((SplittableCompressionCodec)codec).createInputStream(
                directIn, decompressor, start, end,
                SplittableCompressionCodec.READ_MODE.BYBLOCK);
        in = cIn;
        start = cIn.getAdjustedStart();
        end = cIn.getAdjustedEnd();
        filePosition = cIn; // take pos from compressed stream
      } else {
        in = codec.createInputStream(directIn, decompressor);
        filePosition = directIn;
      }
    } else {
      directIn.seek(start); // statement missing in the published listing; a seek to the split start is assumed
      in = directIn;
      filePosition = directIn;
    }
    this.pos = start;
    this.maxShapesInOneRead = job.getInt(SpatialSite.MaxShapesInOneRead, 1000000);
    this.maxBytesInOneRead = job.getInt(SpatialSite.MaxBytesInOneRead, 32*1024*1024);

    initializeReader();
  }







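Line 38 (the initializeReader() call at the end of the constructor): it first checks whether the input file has a global index. If it does, cellMbr is set; otherwise cellMbr is invalid. It then checks whether the input file is R-tree indexed and initializes the reader differently depending on the result.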
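The source of initializeReader() is not shown in this article, so the following is only a sketch of how the R-tree check could work, inferred from two details of nextLine: the test pos == 8 and the leftover buffer field. It assumes that the reader probes the first 8 bytes of the block for a signature. The class BlockTypeProbe, the MARKER value, and the method names are all hypothetical, and the global-index part of the step is not covered.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch; not the SpatialHadoop implementation.
class BlockTypeProbe {
  enum BlockType { HEAP, RTREE }

  // Made-up 8-byte signature; the real marker value is not given in the article.
  static final byte[] MARKER = {'R', 'T', 'R', 'E', 'E', 'M', 'R', 'K'};

  BlockType blockType;
  byte[] buffer; // probed bytes that turned out to be data, kept for nextLine
  long pos;      // advanced only when the signature is consumed

  void probe(InputStream in) throws IOException {
    byte[] head = new byte[MARKER.length];
    int n = 0;
    // Read up to 8 bytes; a single read() call may return fewer.
    while (n < head.length) {
      int r = in.read(head, n, head.length - n);
      if (r < 0) break;
      n += r;
    }
    if (n == MARKER.length && Arrays.equals(head, MARKER)) {
      // Signature found: the stream now sits at the R-tree header (pos == 8).
      blockType = BlockType.RTREE;
      buffer = null;
      pos = n;
    } else {
      // No signature: the probed bytes are record data and must not be lost.
      blockType = BlockType.HEAP;
      buffer = (n > 0) ? Arrays.copyOf(head, n) : null;
    }
  }

  // Convenience wrapper for in-memory input.
  static BlockTypeProbe probe(byte[] data) {
    BlockTypeProbe p = new BlockTypeProbe();
    try {
      p.probe(new ByteArrayInputStream(data));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return p;
  }

  public static void main(String[] args) {
    BlockTypeProbe heap = probe("POINT(1 2)\n".getBytes());
    System.out.println(heap.blockType + ", leftover bytes: " + heap.buffer.length);
  }
}
```

Under this assumption, a heap file leaves its first bytes in buffer, which is why nextLine has to drain buffer before reading from the stream.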

3.2 The nextLine function

protected boolean nextLine(Text value) throws IOException {
    if (blockType == BlockType.RTREE && pos == 8) {
      // File is positioned at the RTree header
      // Skip the header and go to first data object in file
      pos += RTree.skipHeader(in);
      LOG.info("Skipped R-tree to position: "+pos); // call name missing in the published listing; LOG.info is assumed
      // Reinitialize record reader at the new position
      lineReader = new LineReader(in);
    }
    while (getFilePosition() <= end) {
      value.clear();
      int b = 0;
      if (buffer != null) {
        // Read the first line encountered in buffer
        int eol = RTree.skipToEOL(buffer, 0);
        b += eol;
        value.append(buffer, 0, eol);
        if (eol < buffer.length) {
          // There are still some bytes remaining in buffer
          byte[] tmp = new byte[buffer.length - eol];
          System.arraycopy(buffer, eol, tmp, 0, tmp.length);
          buffer = tmp;
        } else {
          buffer = null;
        }
        // Check if a complete line has been read from the buffer
        byte last_byte = value.getBytes()[value.getLength()-1];
        if (last_byte == '\n' || last_byte == '\r')
          return true;
      }

      // Read the first line from stream
      Text temp = new Text();
      b += lineReader.readLine(temp);
      if (b == 0) {
        // Indicates an end of stream
        return false;
      }
      pos += b;

      // Append the part read from stream to the part extracted from buffer
      value.append(temp.getBytes(), 0, temp.getLength());

      if (value.getLength() > 1) {
        // Read a non-empty line. Note that end-of-line character is included
        return true;
      }
    }
    // Reached end of file
    return false;
  }

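Lines 2-9: special handling for R-tree indexed files. If the reader is still positioned at the R-tree header, the header is skipped and the LineReader is re-created at the first data object in the file.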
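The rest of nextLine stitches a line together from two places: bytes left over in buffer and bytes still in the stream. A minimal JDK-only sketch of that stitching is given below. BufferedLineAssembler and skipToEol are hypothetical names; skipToEol encodes an assumption about what RTree.skipToEOL returns (the index just past the first line terminator, or the buffer length if there is none), and the stream part is simplified to treat only '\n' as a terminator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the buffer-then-stream logic; not SpatialHadoop code.
class BufferedLineAssembler {
  private byte[] buffer;        // bytes read ahead of the stream, or null
  private final InputStream in; // remainder of the input

  BufferedLineAssembler(byte[] leftover, InputStream in) {
    this.buffer = (leftover == null || leftover.length == 0) ? null : leftover;
    this.in = in;
  }

  // Assumed contract: index just past the first terminator, or bytes.length.
  static int skipToEol(byte[] bytes, int from) {
    int i = from;
    while (i < bytes.length && bytes[i] != '\n' && bytes[i] != '\r') i++;
    if (i < bytes.length && bytes[i] == '\r') i++;
    if (i < bytes.length && bytes[i] == '\n') i++;
    return i;
  }

  // Returns the next line including its terminator, or null at end of input.
  String nextLine() throws IOException {
    ByteArrayOutputStream line = new ByteArrayOutputStream();
    if (buffer != null) {
      // Take the first line (or line fragment) out of the buffer.
      int eol = skipToEol(buffer, 0);
      line.write(buffer, 0, eol);
      byte last = buffer[eol - 1];
      buffer = (eol < buffer.length)
          ? Arrays.copyOfRange(buffer, eol, buffer.length) : null;
      // A terminator means the buffer already held a complete line.
      if (last == '\n' || last == '\r') {
        return new String(line.toByteArray(), StandardCharsets.UTF_8);
      }
    }
    // Otherwise complete the fragment (or read a fresh line) from the stream.
    int b;
    while ((b = in.read()) != -1) {
      line.write(b);
      if (b == '\n') break;
    }
    return line.size() == 0
        ? null : new String(line.toByteArray(), StandardCharsets.UTF_8);
  }

  // Reads every line from a leftover buffer followed by an in-memory stream.
  static List<String> readAll(byte[] leftover, String rest) {
    BufferedLineAssembler a = new BufferedLineAssembler(
        leftover, new ByteArrayInputStream(rest.getBytes(StandardCharsets.UTF_8)));
    List<String> lines = new ArrayList<>();
    try {
      String s;
      while ((s = a.nextLine()) != null) lines.add(s);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return lines;
  }

  public static void main(String[] args) {
    byte[] leftover = "POINT(1 2)\nPOINT(3".getBytes(StandardCharsets.UTF_8);
    System.out.println(readAll(leftover, " 4)\nPOINT(5 6)\n").size());
  }
}
```

In the example, the second record starts in the buffer ("POINT(3") and ends in the stream (" 4)"), which is the case the append step in nextLine handles.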




