Spark Q&A : Kryo serialization failed: Buffer overflow
Q1. A Spark job fails with the following error:

    org.apache.spark.SparkException: Kryo serialization failed: Buffer overflow. Available: 2, required: 4. To avoid this, increase spark.kryoserializer.buffer.max value.
        at org.apache.spark.serializer.KryoSerializerInstance.serialize(KryoSerializer.scala:299)...
A1: Work backwards from the error message through the source code. Start with the serialize method in org.apache.spark.serializer.KryoSerializerInstance:

    override def serialize[T: ClassTag](t: T): ByteBuffer = {
      output.clear()
      val kryo = borrowKryo()
      try {
        kryo.writeClassAndObject(output, t)
      } catch {
        case e: KryoException if e.getMessage.startsWith("Buffer overflow") =>
          throw new SparkException(s"Kryo serialization failed: ${e.getMessage}. To avoid this, " +
            "increase spark.kryoserializer.buffer.max value.")
      } finally {
        releaseKryo(kryo)
      }
      ByteBuffer.wrap(output.toBytes)
    }

The exception is raised inside the try-catch block by the call to writeClassAndObject, so follow that method next:

    public void writeClassAndObject (Output output, Object object) {
        if (output == null) throw new IllegalArgumentException("output cannot be null.");
        beginObject();
        try {
            if (object == null) {
                writeClass(output, null);
                return;
            }
            Registration registration = writeClass(output, object.getClass());
            if (references && writeReferenceOrNull(output, object, false)) return;
            if (TRACE || (DEBUG && depth == 1)) log("Write", object);
            registration.getSerializer().write(this, output, object);
        } finally {
            if (--depth == 0 && autoReset) reset();
        }
    }

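There is no detailed log showing which call failed, so I stepped into each method that writeClassAndObject calls. My understanding is that the error comes from writeReferenceOrNull, so follow that method: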

    boolean writeReferenceOrNull (Output output, Object object, boolean mayBeNull) {
        if (object == null) {
            if (TRACE || (DEBUG && depth == 1)) log("Write", null);
            output.writeByte(Kryo.NULL);
            return true;
        }
        if (!referenceResolver.useReferences(object.getClass())) {
            if (mayBeNull) output.writeByte(Kryo.NOT_NULL);
            return false;
        }

        // Determine if this object has already been seen in this object graph.
        int id = referenceResolver.getWrittenId(object);

        // If not the first time encountered, only write reference ID.
        if (id != -1) {
            if (DEBUG) debug("kryo", "Write object reference " + id + ": " + string(object));
            output.writeInt(id + 2, true); // + 2 because 0 and 1 are used for NULL and NOT_NULL. // Q!
            return true;
        }

        // Otherwise write NOT_NULL and then the object bytes.
        id = referenceResolver.addWrittenObject(object);
        output.writeByte(NOT_NULL);
        if (TRACE) trace("kryo", "Write initial object reference " + id + ": " + string(object));
        return false;
    }

As before, follow the writeInt method:

    public int writeInt (int value, boolean optimizePositive) throws KryoException {
        if (!optimizePositive) value = (value << 1) ^ (value >> 31);
        if (value >>> 7 == 0) {
            require(1);
            buffer[position++] = (byte)value;
            return 1;
        }
        if (value >>> 14 == 0) {
            require(2);
            buffer[position++] = (byte)((value & 0x7F) | 0x80);
            buffer[position++] = (byte)(value >>> 7);
            return 2;
        }
        if (value >>> 21 == 0) {
            require(3);
            buffer[position++] = (byte)((value & 0x7F) | 0x80);
            buffer[position++] = (byte)(value >>> 7 | 0x80);
            buffer[position++] = (byte)(value >>> 14);
            return 3;
        }
        if (value >>> 28 == 0) {
            require(4);
            buffer[position++] = (byte)((value & 0x7F) | 0x80);
            buffer[position++] = (byte)(value >>> 7 | 0x80);
            buffer[position++] = (byte)(value >>> 14 | 0x80);
            buffer[position++] = (byte)(value >>> 21);
            return 4;
        }
        require(5);
        buffer[position++] = (byte)((value & 0x7F) | 0x80);
        buffer[position++] = (byte)(value >>> 7 | 0x80);
        buffer[position++] = (byte)(value >>> 14 | 0x80);
        buffer[position++] = (byte)(value >>> 21 | 0x80);
        buffer[position++] = (byte)(value >>> 28);
        return 5;
    }

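To make the branch structure of writeInt easier to reason about, here is a minimal sketch that reproduces only its length decision and writes nothing. The class VarIntLength and the method varIntLength are hypothetical names introduced for illustration; they are not part of Kryo. If the failing call really was this writeInt, a request for 4 bytes would correspond to a value of at least 2^21 (2,097,152).

```java
// Hypothetical helper, not part of Kryo: mirrors the branch structure of
// writeInt(value, optimizePositive) shown above without touching a buffer.
final class VarIntLength {

    /** Returns the number of bytes writeInt would pass to require() for this value. */
    static int varIntLength(int value, boolean optimizePositive) {
        // Zig-zag encode signed values so that small negative numbers stay short.
        if (!optimizePositive) value = (value << 1) ^ (value >> 31);
        if (value >>> 7 == 0) return 1;   // fits in 7 bits
        if (value >>> 14 == 0) return 2;  // fits in 14 bits
        if (value >>> 21 == 0) return 3;  // fits in 21 bits
        if (value >>> 28 == 0) return 4;  // fits in 28 bits
        return 5;                         // needs all 32 bits
    }

    public static void main(String[] args) {
        int[] samples = {0, 127, 128, 16_383, 16_384, 2_097_151, 2_097_152, 268_435_455, 268_435_456};
        for (int v : samples) {
            System.out.println(v + " -> " + varIntLength(v, true) + " byte(s)");
        }
    }
}
```

Each boundary sits at a power of 2^7 because every output byte carries 7 payload bits plus a continuation bit.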
This ultimately calls the require method of the Output class in com.esotericsoftware.kryo.io:

    private boolean require(int required) throws KryoException {
        if(this.capacity - this.position >= required) {
            return false;
        } else if(required > this.maxCapacity) {
            throw new KryoException("Buffer overflow. Max capacity: " + this.maxCapacity + ", required: " + required);
        } else {
            this.flush();
            while(this.capacity - this.position < required) {
                if(this.capacity == this.maxCapacity) {
                    throw new KryoException("Buffer overflow. Available: " + (this.capacity - this.position) + ", required: " + required);
                }
                this.capacity = Math.min(this.capacity * 2, this.maxCapacity);
                if(this.capacity < 0) {
                    this.capacity = this.maxCapacity;
                }
                byte[] newBuffer = new byte[this.capacity];
                System.arraycopy(this.buffer, 0, newBuffer, 0, this.position);
                this.buffer = newBuffer;
            }
            return true;
        }
    }

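The growth logic in require can be tried out in isolation. The following is a simplified model under stated assumptions: GrowableBufferSketch and its methods are hypothetical names, the flush() call is left out (the model has no underlying stream), and IllegalStateException stands in for KryoException so the sketch needs no Kryo dependency. With an initial size of 4 bytes and a maximum of 16, it reproduces the same "Available: 2, required: 4" message as the error in Q1.

```java
// Simplified model of the doubling logic in require(); names are illustrative,
// and IllegalStateException stands in for KryoException.
final class GrowableBufferSketch {
    private byte[] buffer;
    private int capacity;
    private final int maxCapacity;
    private int position;

    GrowableBufferSketch(int bufferSize, int maxBufferSize) {
        this.capacity = bufferSize;
        this.maxCapacity = Math.max(bufferSize, maxBufferSize);
        this.buffer = new byte[bufferSize];
    }

    /** Reserves room for `required` more bytes, doubling the buffer up to maxCapacity. */
    void require(int required) {
        if (capacity - position >= required) return;
        if (required > maxCapacity) {
            throw new IllegalStateException(
                "Buffer overflow. Max capacity: " + maxCapacity + ", required: " + required);
        }
        while (capacity - position < required) {
            if (capacity == maxCapacity) {
                // The buffer cannot grow any further: this is the failure seen in Q1.
                throw new IllegalStateException(
                    "Buffer overflow. Available: " + (capacity - position) + ", required: " + required);
            }
            capacity = Math.min(capacity * 2, maxCapacity);
            if (capacity < 0) capacity = maxCapacity; // doubling overflowed int
            byte[] newBuffer = new byte[capacity];
            System.arraycopy(buffer, 0, newBuffer, 0, position);
            buffer = newBuffer;
        }
    }

    /** Pretends to write `count` bytes: reserves space, then advances the position. */
    void write(int count) {
        require(count);
        position += count;
    }

    int capacity() { return capacity; }
    int position() { return position; }

    public static void main(String[] args) {
        GrowableBufferSketch out = new GrowableBufferSketch(4, 16);
        out.write(14); // grows 4 -> 8 -> 16, leaving 2 bytes free
        try {
            out.write(4);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Buffer overflow. Available: 2, required: 4
        }
    }
}
```

The point of the model is that the exception depends on how full the buffer already is, not on the size of the single value being written.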
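The key point is that when output writes the id, the value of id + 2 is large enough to reach the value >>> 28 == 0 branch, so writeInt requests 4 bytes through require(4). As the code shows, maxCapacity is the upper limit on the size of the buffer: once capacity has grown to maxCapacity and fewer bytes remain than were requested (here 2 available, 4 required), require throws the "Buffer overflow" exception. maxCapacity is in turn set by the following logic (listed from the bottom of the call chain upward):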

    private lazy val output = ks.newKryoOutput()  // create the KryoOutput
    def newKryoOutput(): KryoOutput =
      new KryoOutput(bufferSize, math.max(bufferSize, maxBufferSize))  // sets the maximum buffer size
    private val bufferSize = ByteUnit.KiB.toBytes(bufferSizeKb).toInt  // initial buffer size
    private val bufferSizeKb = conf.getSizeAsKb("spark.kryoserializer.buffer", "64k")
    private val maxBufferSize = ByteUnit.MiB.toBytes(maxBufferSizeMb).toInt  // maximum buffer size
    val maxBufferSizeMb = conf.getSizeAsMb("spark.kryoserializer.buffer.max", "64m").toInt

Conclusion: spark.kryoserializer.buffer.max is set too low for the data being serialized (the default is 64m). Increase it; the value must be less than 2048m.
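One likely reason the limit sits just below 2048m is that the configured size is converted to an Int byte count (the .toInt calls in the snippet above), and 2048 MiB does not fit in a 32-bit signed integer. The sketch below only illustrates that arithmetic and the math.max guard from newKryoOutput; KryoBufferLimit and its methods are hypothetical names and do not call Spark.

```java
// Illustrative arithmetic only; these helpers are hypothetical and do not call Spark.
final class KryoBufferLimit {
    static final long BYTES_PER_MIB = 1024L * 1024L;

    /** True if a buffer of `mib` MiB can be expressed as a positive Java int byte count. */
    static boolean fitsInInt(long mib) {
        long bytes = mib * BYTES_PER_MIB;
        return bytes > 0 && bytes <= Integer.MAX_VALUE;
    }

    /** Mirrors math.max(bufferSize, maxBufferSize): the maximum is never below the initial size. */
    static int effectiveMaxBytes(int bufferSizeKib, int maxBufferSizeMib) {
        int bufferSize = Math.toIntExact(bufferSizeKib * 1024L);
        int maxBufferSize = Math.toIntExact(maxBufferSizeMib * BYTES_PER_MIB);
        return Math.max(bufferSize, maxBufferSize);
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt(2047));            // true
        System.out.println(fitsInInt(2048));            // false
        System.out.println(effectiveMaxBytes(64, 64));  // 67108864
    }
}
```

In practice the setting is usually raised when submitting the job, for example with --conf spark.kryoserializer.buffer.max=512m on spark-submit. The 512m figure is only an example; it needs to be larger than the largest object being serialized.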