Spark 2.0: ManagedBuffer and FileSegmentManagedBuffer
ManagedBuffer provides an immutable view of data in the form of bytes; the data it exposes cannot be modified. Each implementation specifies how the data is provided.
ManagedBuffer has three concrete implementations:
1. FileSegmentManagedBuffer: data backed by a segment of a file.
2. NioManagedBuffer: data backed by a NIO ByteBuffer.
3. NettyManagedBuffer: data backed by a Netty ByteBuf.
A concrete buffer implementation may be managed outside the JVM garbage collector. For example, a NettyManagedBuffer is reference counted; if such a buffer is passed to a different thread, the retain/release methods should be called, as in the sketch below.
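Below is a minimal sketch (not from the original post) of that retain/release handoff. It assumes Spark's network-common classes ManagedBuffer and NettyManagedBuffer plus Netty are on the classpath; the class name RetainReleaseSketch is illustrative.

import io.netty.buffer.Unpooled;
import org.apache.spark.network.buffer.ManagedBuffer;
import org.apache.spark.network.buffer.NettyManagedBuffer;

public class RetainReleaseSketch {
  public static void main(String[] args) {
    // Wrap a byte array in a reference-counted Netty buffer (refCnt == 1).
    ManagedBuffer buf =
        new NettyManagedBuffer(Unpooled.wrappedBuffer("hello".getBytes()));

    buf.retain(); // take a second reference before handing the buffer to another thread
    new Thread(() -> {
      try {
        System.out.println("size = " + buf.size());
      } finally {
        buf.release(); // the receiving thread drops its reference
      }
    }).start();

    buf.release(); // the original owner drops its reference; memory is freed at refCnt == 0
  }
}

Each side of the handoff owns exactly one reference, so the underlying Netty ByteBuf is freed only when both threads have released it.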
package org.apache.spark.network.buffer;

import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/**
 * This interface provides an immutable view for data in the form of bytes. The implementation
 * should specify how the data is provided:
 *
 * - {@link FileSegmentManagedBuffer}: data backed by part of a file
 * - {@link NioManagedBuffer}: data backed by a NIO ByteBuffer
 * - {@link NettyManagedBuffer}: data backed by a Netty ByteBuf
 *
 * The concrete buffer implementation might be managed outside the JVM garbage collector.
 * For example, in the case of {@link NettyManagedBuffer}, the buffers are reference counted.
 * In that case, if the buffer is going to be passed around to a different thread, retain/release
 * should be called.
 */
public abstract class ManagedBuffer {

  /** Number of bytes of the data. */
  public abstract long size();

  /**
   * Exposes this buffer's data as an NIO ByteBuffer. Changing the position and limit of the
   * returned ByteBuffer should not affect the content of this buffer.
   */
  // TODO: Deprecate this, usage may require expensive memory mapping or allocation.
  public abstract ByteBuffer nioByteBuffer() throws IOException;

  /**
   * Exposes this buffer's data as an InputStream. The underlying implementation does not
   * necessarily check for the length of bytes read, so the caller is responsible for making sure
   * it does not go over the limit.
   */
  public abstract InputStream createInputStream() throws IOException;

  /**
   * Increment the reference count by one if applicable.
   */
  public abstract ManagedBuffer retain();

  /**
   * If applicable, decrement the reference count by one and deallocate the buffer if the
   * reference count reaches zero.
   */
  public abstract ManagedBuffer release();

  /**
   * Convert the buffer into a Netty object, used to write the data out. The return value is either
   * a {@link io.netty.buffer.ByteBuf} or a {@link io.netty.channel.FileRegion}.
   *
   * If this method returns a ByteBuf, then that buffer's reference count will be incremented and
   * the caller will be responsible for releasing this new reference.
   */
  public abstract Object convertToNetty() throws IOException;
}
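Before looking at the file-backed implementation, a hypothetical minimal heap-backed implementation of this contract may help. This is only a sketch (the class name SimpleNioBuffer is made up), not the source of Spark's real NioManagedBuffer, which differs in detail:

import java.io.InputStream;
import java.nio.ByteBuffer;

import io.netty.buffer.ByteBufInputStream;
import io.netty.buffer.Unpooled;

import org.apache.spark.network.buffer.ManagedBuffer;

public class SimpleNioBuffer extends ManagedBuffer {
  private final ByteBuffer buf;

  public SimpleNioBuffer(ByteBuffer buf) {
    this.buf = buf;
  }

  @Override
  public long size() {
    return buf.remaining();
  }

  // duplicate() shares the bytes but not the cursors, honoring the contract that
  // callers may move position/limit on the returned buffer freely.
  @Override
  public ByteBuffer nioByteBuffer() {
    return buf.duplicate();
  }

  @Override
  public InputStream createInputStream() {
    return new ByteBufInputStream(Unpooled.wrappedBuffer(buf.duplicate()));
  }

  // The backing memory is an ordinary GC-managed ByteBuffer, so reference
  // counting is a no-op here.
  @Override
  public ManagedBuffer retain() {
    return this;
  }

  @Override
  public ManagedBuffer release() {
    return this;
  }

  @Override
  public Object convertToNetty() {
    return Unpooled.wrappedBuffer(buf.duplicate());
  }
}

Because nothing here lives outside the garbage collector, retain() and release() can simply return this, which is exactly the pattern FileSegmentManagedBuffer follows below.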
package org.apache.spark.network.buffer;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

import com.google.common.base.Objects;
import com.google.common.io.ByteStreams;
import io.netty.channel.DefaultFileRegion;

import org.apache.spark.network.util.JavaUtils;
import org.apache.spark.network.util.LimitedInputStream;
import org.apache.spark.network.util.TransportConf;

/**
 * A {@link ManagedBuffer} backed by a segment in a file.
 */
public final class FileSegmentManagedBuffer extends ManagedBuffer {
  private final TransportConf conf;
  private final File file;
  private final long offset;
  private final long length;

  public FileSegmentManagedBuffer(TransportConf conf, File file, long offset, long length) {
    this.conf = conf;
    this.file = file;
    this.offset = offset;
    this.length = length;
  }

  @Override
  public long size() {
    return length;
  }

  @Override
  public ByteBuffer nioByteBuffer() throws IOException {
    FileChannel channel = null;
    try {
      channel = new RandomAccessFile(file, "r").getChannel();
      // Just copy the buffer if it's sufficiently small, as memory mapping has a high overhead.
      if (length < conf.memoryMapBytes()) {
        ByteBuffer buf = ByteBuffer.allocate((int) length);
        channel.position(offset);
        while (buf.remaining() != 0) {
          if (channel.read(buf) == -1) {
            throw new IOException(String.format("Reached EOF before filling buffer\n" +
              "offset=%s\nfile=%s\nbuf.remaining=%s",
              offset, file.getAbsoluteFile(), buf.remaining()));
          }
        }
        buf.flip();
        return buf;
      } else {
        return channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
      }
    } catch (IOException e) {
      try {
        if (channel != null) {
          long size = channel.size();
          throw new IOException("Error in reading " + this + " (actual file length " + size + ")",
            e);
        }
      } catch (IOException ignored) {
        // ignore
      }
      throw new IOException("Error in opening " + this, e);
    } finally {
      JavaUtils.closeQuietly(channel);
    }
  }

  @Override
  public InputStream createInputStream() throws IOException {
    FileInputStream is = null;
    try {
      is = new FileInputStream(file);
      ByteStreams.skipFully(is, offset);
      return new LimitedInputStream(is, length);
    } catch (IOException e) {
      try {
        if (is != null) {
          long size = file.length();
          throw new IOException("Error in reading " + this + " (actual file length " + size + ")",
            e);
        }
      } catch (IOException ignored) {
        // ignore
      } finally {
        JavaUtils.closeQuietly(is);
      }
      throw new IOException("Error in opening " + this, e);
    } catch (RuntimeException e) {
      JavaUtils.closeQuietly(is);
      throw e;
    }
  }

  @Override
  public ManagedBuffer retain() {
    return this;
  }

  @Override
  public ManagedBuffer release() {
    return this;
  }

  @Override
  public Object convertToNetty() throws IOException {
    if (conf.lazyFileDescriptor()) {
      return new DefaultFileRegion(file, offset, length);
    } else {
      FileChannel fileChannel = new FileInputStream(file).getChannel();
      return new DefaultFileRegion(fileChannel, offset, length);
    }
  }

  public File getFile() { return file; }

  public long getOffset() { return offset; }

  public long getLength() { return length; }

  @Override
  public String toString() {
    return Objects.toStringHelper(this)
      .add("file", file)
      .add("offset", offset)
      .add("length", length)
      .toString();
  }
}
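A hypothetical usage sketch follows (the class FileSegmentSketch and the empty anonymous ConfigProvider are illustrative, not from the post; it assumes Spark 2.0's network-common jar, where ConfigProvider only requires get(String)): wrap five bytes in the middle of a scratch file and read them back through createInputStream().

import java.io.File;
import java.io.InputStream;
import java.nio.file.Files;
import java.util.NoSuchElementException;

import org.apache.spark.network.buffer.FileSegmentManagedBuffer;
import org.apache.spark.network.util.ConfigProvider;
import org.apache.spark.network.util.TransportConf;

public class FileSegmentSketch {
  public static void main(String[] args) throws Exception {
    File f = File.createTempFile("segment", ".bin");
    Files.write(f.toPath(), "0123456789".getBytes());

    // Empty config: every lookup falls back to TransportConf's defaults.
    TransportConf conf = new TransportConf("shuffle", new ConfigProvider() {
      @Override
      public String get(String name) {
        throw new NoSuchElementException(name);
      }
    });

    // View bytes [4, 9) of the file as an immutable buffer.
    FileSegmentManagedBuffer buf = new FileSegmentManagedBuffer(conf, f, 4, 5);
    try (InputStream in = buf.createInputStream()) {
      byte[] out = new byte[(int) buf.size()];
      int n = in.read(out);
      System.out.println(new String(out, 0, n)); // prints "45678"
    }
  }
}

Note that each call to nioByteBuffer() or createInputStream() opens a fresh file handle, which is why retain() and release() have nothing to track for this implementation.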