Android 6.0: changing vold's SD card check from a synchronous to an asynchronous mechanism

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 15:37

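In stock Android, when vold detects an event reported by the kernel (for an SD card), it forwards it to MountService. MountService then sends a command back to vold over a local socket, telling vold to mount the card. Inside vold this communication runs in an infinite loop around select(), which means vold handles the messages from MountService strictly in order. If the thread gets stuck while handling one message, select() is never reached again, and every further message from MountService to vold is left waiting.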
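To make the "one command at a time" behaviour concrete, here is a minimal sketch of a select()-driven dispatch loop. It is not vold's actual listener code: `DispatchLoop`, the one-byte "commands" and the `maxCommands` limit are hypothetical and exist only to keep the example small.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <functional>

// Hypothetical helper: waits on fd with select() and handles one-byte
// "commands" strictly one after another. Returns how many were handled.
int DispatchLoop(int fd, const std::function<void(char)>& handler, int maxCommands) {
    assert(fd >= 0 && maxCommands >= 0);
    int handled = 0;
    while (handled < maxCommands) {
        fd_set readFds;
        FD_ZERO(&readFds);
        FD_SET(fd, &readFds);
        // Block until the peer has sent something.
        if (select(fd + 1, &readFds, nullptr, nullptr, nullptr) <= 0) break;
        char cmd;
        if (read(fd, &cmd, 1) != 1) break;  // EOF or read error
        // If handler() blocks, the loop never returns to select(), and
        // every later command stays queued in the socket or pipe.
        handler(cmd);
        ++handled;
    }
    return handled;
}
```

The handler is called inline on the loop's own thread, which is exactly why a slow handler delays everything queued behind it.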


1. Cause and symptoms

So where does it go wrong? Before vold mounts the SD card, it first has to check it:

status_t PublicVolume::doMount() {
    // TODO: expand to support mounting other filesystems
    readMetadata();

    if (mFsType != "vfat") {
        LOG(ERROR) << getId() << " unsupported filesystem " << mFsType;
        return -EIO;
    }

    if (vfat::Check(mDevPath)) {
        LOG(ERROR) << getId() << " failed filesystem check";
        return -EIO;
    }
    // (excerpt ends here; the mount itself follows)

The mount only goes ahead once the check has passed.

The check itself spawns a process that runs a native Linux tool, and we have to wait for that process to report its result. Here is the code:

status_t Check(const std::string& source) {
    // If the fsck binary is not executable, skip the check entirely.
    if (access(kFsckPath, X_OK)) {
        SLOGW("Skipping fs checks\n");
        return 0;
    }

    int pass = 1;
    int rc = 0;
    do {
        std::vector<std::string> cmd;
        cmd.push_back(kFsckPath);
        cmd.push_back("-p");
        cmd.push_back("-f");
        cmd.push_back(source);

        // Fat devices are currently always untrusted
        // This call blocks until the child process returns a result.
        rc = ForkExecvp(cmd, sFsckUntrustedContext);

        if (rc < 0) {
            SLOGE("Filesystem check failed due to logwrap error");
            errno = EIO;
            return -1;
        }

        switch(rc) {
        case 0:
            SLOGI("Filesystem check completed OK");
            return 0;

        case 2:
            SLOGE("Filesystem check failed (not a FAT filesystem)");
            errno = ENODATA;
            return -1;

        case 4:
            if (pass++ <= 3) {
                SLOGE("Filesystem modified - rechecking (pass %d)",
                        pass);
                continue;
            }
            SLOGE("Failing check after too many rechecks");
            errno = EIO;
            return -1;

        default:
            SLOGE("Filesystem check failed (unknown exit code %d)", rc);
            errno = EIO;
            return -1;
        }
    } while (0);

    return 0;
}

The Check function forks a child process and then blocks until that process returns. The binary it runs is:

static const char* kFsckPath = "/system/bin/fsck_msdos";

With some damaged cards the check never finishes. The main thread then waits forever, and communication between vold and MountService breaks down.
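The blocking behaviour described above comes down to the usual fork, exec, wait pattern. The following is a rough sketch of that pattern only: `RunAndWait` is a hypothetical stand-in, not vold's real ForkExecvp, and it ignores the SELinux context argument.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a fork-and-wait helper: runs the command and
// returns its exit code, or -1 if it could not be run or died abnormally.
int RunAndWait(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;
    for (const auto& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if execvp itself failed
    }

    int status = 0;
    // waitpid() does not return until the child exits. If the child
    // hangs on a damaged card, the calling thread hangs with it.
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `RunAndWait({"sh", "-c", "exit 4"})` returns 4, which corresponds to the "filesystem modified" case in the switch above.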


2. Solution

We therefore need to turn this fork-and-wait step into an asynchronous one:

// Additional headers the patch relies on:
#include <string.h>        // memset
#include <sys/epoll.h>     // epoll_create, epoll_ctl, epoll_wait
#include <unistd.h>        // pipe, read, write, close
#include <utils/Thread.h>  // Thread

// Worker thread that runs the filesystem check.
class CheckThread : public Thread {
public:
    CheckThread(std::vector<std::string> cmd, int fd) :
            mCmd(cmd), mFd(fd) {
    }
    virtual ~CheckThread() {
    }
    virtual bool threadLoop() {
        // Fork the fsck process and wait here for its result.
        int rc = ForkExecvp(mCmd, sFsckUntrustedContext);
        // Write the result into the pipe.
        write(mFd, &rc, sizeof(int));
        // Returning false means the thread does not loop again.
        return false;
    }
private:
    std::vector<std::string> mCmd;
    int mFd;
};

status_t Check(const std::string& source) {
    if (access(kFsckPath, X_OK)) {
        SLOGW("Skipping fs checks\n");
        return 0;
    }

    int pass = 1;
    int rc = 3;  // Note: was 0 originally; 3 now stands for "timed out".
    do {
        std::vector<std::string> cmd;
        cmd.push_back(kFsckPath);
        cmd.push_back("-p");
        cmd.push_back("-f");
        cmd.push_back(source);

        int pipefd[2];
        if (pipe(pipefd) < 0) {  // pipe used to carry the result back
            SLOGE("pipe failed");
            return -1;
        }

        // Start the thread that forks the process and waits for it,
        // and hand it the write end of the pipe.
        CheckThread* thread = new CheckThread(cmd, pipefd[1]);
        thread->run();

        int mEpollFd = epoll_create(2);
        struct epoll_event eventItem;
        memset(& eventItem, 0, sizeof(epoll_event));  // zero out unused members of data field union
        eventItem.events = EPOLLIN;
        eventItem.data.fd = pipefd[0];
        epoll_ctl(mEpollFd, EPOLL_CTL_ADD, pipefd[0], & eventItem);

        struct epoll_event eventItems[2];
        // Wait on the pipe with epoll, using a 10 second timeout.
        int eventCount = epoll_wait(mEpollFd, eventItems, 2, 10000);
        for (int i = 0; i < eventCount; i++) {
            // Data in the pipe means the check has finished and
            // its result can be read from the pipe.
            int fd = eventItems[i].data.fd;
            uint32_t epollEvents = eventItems[i].events;
            if (fd == pipefd[0]) {
                if (epollEvents & EPOLLIN) {
                    read(fd, &rc, sizeof(int));
                }
            }
        }
        // If we get here without an event, epoll_wait timed out: the
        // check did not finish within 10 seconds and rc is still 3.

        // Caveat: on timeout the CheckThread is still blocked in
        // ForkExecvp (nothing kills the fsck process) and will later
        // write() to pipefd[1], which has already been closed here.
        close(mEpollFd);
        close(pipefd[0]);  // close the fds
        close(pipefd[1]);

        // Fat devices are currently always untrusted
        //rc = ForkExecvp(cmd, sFsckUntrustedContext);

        if (rc < 0) {
            SLOGE("Filesystem check failed due to logwrap error");
            errno = EIO;
            return -1;
        }

        switch(rc) {
        case 0:
            SLOGI("Filesystem check completed OK");
            return 0;

        case 2:
            SLOGE("Filesystem check failed (not a FAT filesystem)");
            errno = ENODATA;
            return -1;

        case 3:  // 3 means the check timed out
            SLOGE("Filesystem check timed out");
            errno = ENODATA;
            return -1;

        case 4:
            if (pass++ <= 3) {
                SLOGE("Filesystem modified - rechecking (pass %d)",
                        pass);
                continue;
            }
            SLOGE("Failing check after too many rechecks");
            errno = EIO;
            return -1;

        default:
            SLOGE("Filesystem check failed (unknown exit code %d)", rc);
            errno = EIO;
            return -1;
        }
    } while (0);

    return 0;
}

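We used a pipe together with epoll and relied on the epoll timeout to get the asynchronous behaviour: the check runs on its own thread, and Check gives up waiting after at most 10 seconds.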
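The same pipe-plus-epoll idea can be tried outside vold with plain Linux calls. The sketch below is an illustration under stated assumptions, not the patch itself: it uses std::thread instead of Android's Thread, and `RunWithTimeout`, `PipeFds` and `kTimedOut` are hypothetical names. Unlike the patch, both pipe ends stay open until the worker has finished, because the worker shares ownership of them.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>

// Owns both pipe ends; they are closed when the last owner lets go.
struct PipeFds {
    int rd = -1;
    int wr = -1;
    ~PipeFds() {
        if (rd >= 0) close(rd);
        if (wr >= 0) close(wr);
    }
};

constexpr int kTimedOut = 3;  // sentinel, mirrors "rc = 3" in the patch

// Runs job() on a worker thread and waits up to timeoutMs for its result.
// Returns the job's result, kTimedOut on timeout, or -1 on setup errors.
int RunWithTimeout(std::function<int()> job, int timeoutMs) {
    assert(timeoutMs >= 0);
    int raw[2];
    if (pipe(raw) < 0) return -1;
    auto fds = std::make_shared<PipeFds>();
    fds->rd = raw[0];
    fds->wr = raw[1];

    // The worker holds its own reference to the pipe, so a late write
    // after a timeout still lands on a valid, open descriptor.
    std::thread([fds, job]() {
        int rc = job();
        ssize_t n = write(fds->wr, &rc, sizeof(rc));
        (void)n;
    }).detach();

    int ep = epoll_create1(0);
    if (ep < 0) return -1;
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fds->rd;
    epoll_ctl(ep, EPOLL_CTL_ADD, fds->rd, &ev);

    int rc = kTimedOut;
    epoll_event out{};
    // One event means the worker has written its result into the pipe.
    if (epoll_wait(ep, &out, 1, timeoutMs) == 1 && (out.events & EPOLLIN)) {
        if (read(fds->rd, &rc, sizeof(rc)) != (ssize_t)sizeof(rc)) rc = -1;
    }
    close(ep);
    return rc;
}
```

Note that a timeout here only stops the waiting. The job itself keeps running in the background, just as the fsck process does in the patch.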


3. Summary

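Our original idea was to run the check in a worker thread and have the main thread call waitRelative with a 10 second limit. When the worker finished the check, it would call broadcast to wake the main thread. For reasons we have not worked out, this did not work with Android's native Condition, which is why we fell back on epoll.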


This is worth investigating further, or trying again with Linux's pthread_cond_timedwait directly, since Android's Condition is a wrapper around it.
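For anyone who wants to retry the condition-variable route, here is a minimal sketch using pthread_cond_timedwait directly. It is an assumption-laden illustration, not a tested vold change: `WaitForJob`, `WaitState` and `kWaitTimedOut` are hypothetical names, and std::thread stands in for Android's Thread. The waiter re-checks a `done` flag under the mutex, so a broadcast that fires before the wait starts is not lost. Whether a lost wakeup was the cause of the failure described above is not known.

```cpp
#include <pthread.h>
#include <time.h>
#include <cassert>
#include <cerrno>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>

// State shared between the waiter and the worker thread.
struct WaitState {
    pthread_mutex_t mu = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
    bool done = false;
    int rc = 0;
};

constexpr int kWaitTimedOut = 3;  // sentinel for "no result in time"

// Runs job() on a worker thread; returns its result, or kWaitTimedOut
// if it has not finished within timeoutMs milliseconds.
int WaitForJob(std::function<int()> job, int timeoutMs) {
    assert(timeoutMs >= 0);
    auto st = std::make_shared<WaitState>();
    std::thread([st, job]() {
        int rc = job();
        pthread_mutex_lock(&st->mu);
        st->rc = rc;
        st->done = true;  // set the flag before waking the waiter
        pthread_cond_broadcast(&st->cv);
        pthread_mutex_unlock(&st->mu);
    }).detach();

    // pthread_cond_timedwait takes an absolute deadline (CLOCK_REALTIME
    // for a default-initialized condition variable).
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeoutMs / 1000;
    deadline.tv_nsec += (timeoutMs % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&st->mu);
    int err = 0;
    // Loop on the flag: handles spurious wakeups and an early broadcast.
    while (!st->done && err != ETIMEDOUT) {
        err = pthread_cond_timedwait(&st->cv, &st->mu, &deadline);
    }
    int rc = st->done ? st->rc : kWaitTimedOut;
    pthread_mutex_unlock(&st->mu);
    return rc;
}
```

As with the epoll version, a timeout only ends the wait. The job continues on its own thread until it returns.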



