Android Binder: An Analysis of TransactionTooLargeException


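While reading my company's code recently, I noticed that calls to the PackageManager interfaces were deliberately wrapped in a synchronization lock.
When I asked why, I was told that calling the PackageManager interfaces concurrently from multiple threads can throw an exception.
The exception involved is a TransactionTooLargeException.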

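Looking at the source, many of the PackageManager interfaces can indeed throw this kind of exception.
Taking the Android 8.0 code as an example, the whole flow is as follows (if downloading the full source tree is inconvenient, you can browse it online at http://androidxref.com/):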

    // How ContextImpl obtains the PackageManager
    @Override
    public PackageManager getPackageManager() {
        if (mPackageManager != null) {
            return mPackageManager;
        }

        // Obtain the Binder client-side interface
        IPackageManager pm = ActivityThread.getPackageManager();
        if (pm != null) {
            // Doesn't matter if we make more than one instance.
            // Build an ApplicationPackageManager on top of the Binder client
            return (mPackageManager = new ApplicationPackageManager(this, pm));
        }

        return null;
    }

    // getPackageInfo in ApplicationPackageManager
    @Override
    public PackageInfo getPackageInfo(String packageName, int flags)
            throws NameNotFoundException {
        return getPackageInfoAsUser(packageName, flags, mContext.getUserId());
    }

    @Override
    public PackageInfo getPackageInfoAsUser(String packageName, int flags, int userId)
            throws NameNotFoundException {
        try {
            // The call goes through the Binder client
            PackageInfo pi = mPM.getPackageInfo(packageName, flags, userId);
            if (pi != null) {
                return pi;
            }
        } catch (RemoteException e) {
            // Binder communication can throw RemoteException
            throw e.rethrowFromSystemServer();
        }

        throw new NameNotFoundException(packageName);
    }

    // RemoteException is rethrown as a RuntimeException;
    // a DeadObjectException is reported as DeadSystemException
    public RuntimeException rethrowFromSystemServer() {
        if (this instanceof DeadObjectException) {
            throw new RuntimeException(new DeadSystemException());
        } else {
            throw new RuntimeException(this);
        }
    }

As the code above shows, when a PackageManager interface is called and the underlying Binder communication fails,
a RuntimeException may be thrown.
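Because rethrowFromSystemServer wraps the original RemoteException in a RuntimeException, a caller that wants to tell a TransactionTooLargeException apart from other failures has to look at the cause chain. The following is a minimal sketch of that check in plain Java. The class BinderFailures and the method hasCauseNamed are hypothetical names introduced here for illustration; they are not part of the Android framework.

```java
// Hypothetical helper for illustration only; not an Android API.
final class BinderFailures {
    private BinderFailures() {}

    /**
     * Walks the cause chain of t and reports whether any link has the given
     * simple class name, for example "TransactionTooLargeException".
     */
    static boolean hasCauseNamed(Throwable t, String simpleName) {
        int depth = 0;
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        while (t != null && depth++ < 32) {
            if (t.getClass().getSimpleName().equals(simpleName)) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }
}
```

In application code this could be used inside a `catch (RuntimeException e)` block around the PackageManager call. On a device one would more likely test `e.getCause() instanceof TransactionTooLargeException` directly; matching by name only keeps the sketch runnable without the Android SDK.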

Following the exception message into android_util_Binder.cpp,
we find the following code in the signalExceptionForError function:

    void signalExceptionForError(JNIEnv* env, jobject obj, status_t err,
            bool canThrowRemoteException, int parcelSize)
    {
        switch (err) {
            .............
            case FAILED_TRANSACTION: {
                ALOGE("!!! FAILED BINDER TRANSACTION !!!  (parcel size = %d)", parcelSize);
                const char* exceptionToThrow;
                char msg[128];
                // TransactionTooLargeException is a checked exception, only throw from certain methods.
                // FIXME: Transaction too large is the most common reason for FAILED_TRANSACTION
                //        but it is not the only one.  The Binder driver can return BR_FAILED_REPLY
                //        for other reasons also, such as if the transaction is malformed or
                //        refers to an FD that has been closed.  We should change the driver
                //        to enable us to distinguish these cases in the future.

                // Judging from the comment above and the exception message, in most cases
                // this exception is thrown because too much data was being transferred,
                // which made the Binder transaction fail.
                if (canThrowRemoteException && parcelSize > 200*1024) {
                    // bona fide large payload
                    exceptionToThrow = "android/os/TransactionTooLargeException";
                    snprintf(msg, sizeof(msg)-1, "data parcel size %d bytes", parcelSize);
                } else {
                    // Heuristic: a payload smaller than this threshold "shouldn't" be too
                    // big, so it's probably some other, more subtle problem.  In practice
                    // it seems to always mean that the remote process died while the binder
                    // transaction was already in flight.
                    exceptionToThrow = (canThrowRemoteException)
                            ? "android/os/DeadObjectException"
                            : "java/lang/RuntimeException";
                    snprintf(msg, sizeof(msg)-1,
                            "Transaction failed on small parcel; remote process probably died");
                }
                jniThrowException(env, exceptionToThrow, msg);
            } break;
            .............
        }
    }
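The branch logic in the FAILED_TRANSACTION case can be restated as a small pure function, which makes the 200 KB threshold easier to see. This is only a Java paraphrase of the native code quoted above; the class name FailedTransactionHeuristic and the method exceptionFor are hypothetical and exist nowhere in the framework.

```java
// Hypothetical restatement of the native heuristic, for illustration only.
final class FailedTransactionHeuristic {
    // Same threshold as the quoted native code: 200 * 1024 bytes.
    static final int LARGE_PARCEL_THRESHOLD = 200 * 1024;

    private FailedTransactionHeuristic() {}

    /** Returns the JNI class name the quoted native code would choose to throw. */
    static String exceptionFor(boolean canThrowRemoteException, int parcelSize) {
        if (canThrowRemoteException && parcelSize > LARGE_PARCEL_THRESHOLD) {
            // Large payload: reported as "transaction too large".
            return "android/os/TransactionTooLargeException";
        }
        // Small payload: assumed to mean the remote process died.
        return canThrowRemoteException
                ? "android/os/DeadObjectException"
                : "java/lang/RuntimeException";
    }
}
```

Note that the comparison is strict, so a parcel of exactly 200 KB falls into the "small parcel" branch, and that the choice depends only on the size of the failing parcel, not on how full the receiving buffer already was.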

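At this point we know why calling a PackageManager interface throws the exception.
But what causes the Binder communication itself to fail?
To answer that, we need to look further into the Binder code.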

In ProcessState.cpp, ProcessState calls open_driver during initialization:

    static int open_driver()
    {
        // Open the binder device
        int fd = open("/dev/binder", O_RDWR | O_CLOEXEC);
        if (fd >= 0) {
            int vers = 0;
            status_t result = ioctl(fd, BINDER_VERSION, &vers);
            if (result == -1) {
                ALOGE("Binder ioctl to obtain version failed: %s", strerror(errno));
                close(fd);
                fd = -1;
            }
            if (result != 0 || vers != BINDER_CURRENT_PROTOCOL_VERSION) {
                ALOGE("Binder driver protocol does not match user space protocol!");
                close(fd);
                fd = -1;
            }

            // The default maximum is 15
            size_t maxThreads = DEFAULT_MAX_BINDER_THREADS;
            // i.e. at most 15 threads are bound to one Binder FD
            result = ioctl(fd, BINDER_SET_MAX_THREADS, &maxThreads);
            if (result == -1) {
                ALOGE("Binder ioctl to set max threads failed: %s", strerror(errno));
            }
        } else {
            ALOGW("Opening '/dev/binder' failed: %s\n", strerror(errno));
        }
        return fd;
    }

    // After opening the binder driver, ProcessState maps the memory available to the FD
    ProcessState::ProcessState()
        ..........
    {
        if (mDriverFD >= 0) {
            // mmap the binder, providing a chunk of virtual address space to receive transactions.
            // #define BINDER_VM_SIZE ((1 * 1024 * 1024) - sysconf(_SC_PAGE_SIZE) * 2)
            // The default is just under 1 MB
            mVMStart = mmap(0, BINDER_VM_SIZE, PROT_READ, MAP_PRIVATE | MAP_NORESERVE, mDriverFD, 0);
            if (mVMStart == MAP_FAILED) {
                // *sigh*
                ALOGE("Using /dev/binder failed: unable to mmap transaction memory.\n");
                close(mDriverFD);
                mDriverFD = -1;
            }
        }
        ..............
    }

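This code shows that once a process has opened its Binder driver, the Binder thread pool for that FD is capped at 15 threads,
and the buffer mapped for receiving transactions is just under 1 MB (1 MB minus two pages). This is the mmap'd Binder transaction buffer shared by the whole process, not JVM heap memory.
In addition, for a service process, earlier analysis of the Binder logic shows that
after startup there are two IPCThreadState instances that read the data received from Binder and send data to Binder.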

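It is therefore easy to see that when multiple threads use the same Binder FD at the same time and the data in flight exceeds this buffer limit,
the TransactionTooLargeException mentioned above may be thrown.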
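To make the size limit concrete, here is a toy model of a shared transaction buffer in Java. It is a deliberately simplified sketch: the class BinderBufferModel is hypothetical, the page size of 4096 bytes is an assumption (the native code reads it from sysconf(_SC_PAGE_SIZE)), and the real driver's allocator is more involved than a single byte counter.

```java
// Toy model for illustration only; the real Binder driver allocator is more complex.
final class BinderBufferModel {
    // Assumption: 4096-byte pages. The native code queries sysconf(_SC_PAGE_SIZE).
    static final long ASSUMED_PAGE_SIZE = 4096;
    // Mirrors BINDER_VM_SIZE from ProcessState.cpp: 1 MiB minus two pages.
    static final long BINDER_VM_SIZE = 1L * 1024 * 1024 - ASSUMED_PAGE_SIZE * 2;

    private long used;

    /** Reserves space for one in-flight transaction; returns false if it does not fit. */
    synchronized boolean tryReserve(long bytes) {
        if (bytes < 0 || used + bytes > BINDER_VM_SIZE) {
            return false;
        }
        used += bytes;
        return true;
    }

    /** Frees the space of a completed transaction. */
    synchronized void release(long bytes) {
        used = Math.max(0, used - bytes);
    }
}
```

Under the 4 KiB page assumption the buffer is 1,040,384 bytes. In this model three concurrent 300 KiB transactions fit (921,600 bytes), while a fourth would need 1,228,800 bytes and is rejected, even though each one is small enough on its own. That is the situation the paragraph above describes.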

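I ran a quick test myself: when multiple threads concurrently call PackageManager's getInstalledPackages (which returns a List of PackageInfo),
this exception shows up very easily.
So when these interfaces are accessed from multiple threads, it is well worth adding a lock or some caching to reduce the pressure on Binder.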
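One way to combine the two suggestions, locking and caching, is a small wrapper that serializes the expensive call and reuses its result for a while. The sketch below is plain Java and uses a Supplier as a stand-in for the real Binder call; the class SerializedCache and its methods are hypothetical names, and the time-to-live policy is an assumption rather than anything taken from the code discussed above.

```java
import java.util.function.Supplier;

// Hypothetical wrapper for illustration only; not an Android API.
final class SerializedCache<T> {
    private final Supplier<T> loader;
    private final long ttlNanos;
    private final Object lock = new Object();
    private T value;
    private long loadedAt;
    private boolean loaded;

    SerializedCache(Supplier<T> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlNanos = ttlMillis * 1_000_000L;
    }

    /** Returns the cached value, reloading it under the lock once it has expired. */
    T get() {
        synchronized (lock) {
            long now = System.nanoTime();
            if (!loaded || now - loadedAt >= ttlNanos) {
                // Only one thread at a time reaches the loader, so at most one
                // large reply is in flight through this wrapper.
                value = loader.get();
                loadedAt = now;
                loaded = true;
            }
            return value;
        }
    }

    /** Drops the cached value so that the next get() reloads it. */
    void invalidate() {
        synchronized (lock) {
            loaded = false;
            value = null;
        }
    }
}
```

On Android the loader could be something like `() -> context.getPackageManager().getInstalledPackages(0)`, with invalidate() called when a package is added or removed. Holding the lock during the load is intentional here: it trades some latency for the guarantee that the calls never overlap.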