Android源码解析之repo仓库
来源:互联网 发布:js onclick事件传参 编辑:程序博客网 时间:2024/06/08 03:45
- Summary
- About Repo
- Introduction
- Repo仓库
- Manifest仓库
- Projects仓库集
- 创建分支
- Reference
1.Summary
首先说一下为什么会想分享这篇博客。出发点很简单,只是想学习一下Python在AOSP中的应用。repo应用就是一个研究的切入点。其次Python在深度学习、大数据都有一定的支持,后续会研究一下这方面的技术。最后就是个人喜好,无他。
2.About Repo
repo就是通过Python封装git命令的应用。什么是repo?简单来说就是对AOSP含有git仓库的各个项目的批处理。repo应用包括repo仓库(仓库也可以叫做项目)、manifest仓库、projectsc仓库集这三个核心。repo仓库都是一些Python文件,manifest仓库只有一个存放AOSP各个子项目元数据的xml文件。projects仓库集是AOSP各个子项目对应的git仓库。
下面用一张图片表示一下。
补充一点,git是允许repository和working directory分布在不同的目录下的。所以就会看到AOSP的working directory在项目根目录而.git目录在.repo/projects目录
3.Introduction
先来草率的分析一下,拉取一套AOSP代码应该按照如下流程:
mkdir testsource #创建AOSP目录。用于存放.repo应用和源码cd testsourcerepo init -u https://android.googlesource.com/platform/manifest -b android-4.0.1_r1 cmd #初始化repo仓库和manifest仓库 repo sync -j 8 cmd #同步projects仓库集repo start master --all cmd #创建并且切换到新分支上repo仓库初始化--->manifest仓库初始化--->project仓库集初始化--->创建并切换到新分支上
从数据流自上而下看:
repo command line --->optparse--->git command line
在Python中使用的是optparse模块(后续将被argparse模块取代)解析命令行,所以optparse模块相当于数据转换中心将repo命名行转成git命令行
Repo仓库
接下来就看看具体的细节处理,从repo模块的入口函数main开始,执行的命令行如下:
repo init -u https://android.googlesource.com/platform/manifest -b android-4.0.1_r1
def main(orig_args): cmd, opt, args = _ParseArguments(orig_args) repo_main, rel_repo_dir = None, None # Don't use the local repo copy, make sure to switch to the gitc client first. if cmd != 'gitc-init': repo_main, rel_repo_dir = _FindRepo() wrapper_path = os.path.abspath(__file__) my_main, my_git = _RunSelf(wrapper_path) cwd = os.getcwd() ... if not repo_main: if opt.help: _Usage() if cmd == 'help': _Help(args) if not cmd: _NotInstalled() if cmd == 'init' or cmd == 'gitc-init': if my_git: _SetDefaultsTo(my_git) try: _Init(args, gitc_init=(cmd == 'gitc-init')) except CloneFailure: ... sys.exit(1) repo_main, rel_repo_dir = _FindRepo() else: _NoCommands(cmd) if my_main: repo_main = my_main ver_str = '.'.join(map(str, VERSION)) me = [sys.executable, repo_main, '--repo-dir=%s' % rel_repo_dir, '--wrapper-version=%s' % ver_str, '--wrapper-path=%s' % wrapper_path, '--'] me.extend(orig_args) me.extend(extra_args) try: os.execv(sys.executable, me) except OSError as e: ... sys.exit(148)
repo模块函数main(sys.argv[1:]) 参数sys.argv[1:]就是由command、options组成,然后由_ParseArguments函数解析。由于main函数流程复杂,我们考虑的是初次初始化。main函数在调用_Init函数之前对环境进行了检查:repo模块的版本号和路径、.repo/repo/路径下的main模块和git仓库。在_Init函数之后就是执行main模块中的入口函数_Main
_ParseArguments函数的代码如下:
def _ParseArguments(args): cmd = None opt = _Options() arg = [] for i in range(len(args)): a = args[i] if a == '-h' or a == '--help': opt.help = True elif not a.startswith('-'): cmd = a arg = args[i + 1:] break return cmd, opt, arg
_ParseArguments函数解析出cmd、opt、args,其中,cmd是init,args是command(init)后面的参数(-u https://android.googlesource.com/platform/manifest -b android-4.0.1_r1),而opt特指-h(–help)这样的用意在于当你输入repo -h,–help时就可以弹出一些帮助文档。
_FindRepo函数的代码如下:
def _FindRepo(): """Look for a repo installation, starting at the current directory. """ curdir = os.getcwd() repo = None olddir = None while curdir != '/' \ and curdir != olddir \ and not repo: repo = os.path.join(curdir, repodir, REPO_MAIN) if not os.path.isfile(repo): repo = None olddir = curdir curdir = os.path.dirname(curdir) return (repo, os.path.join(curdir, repodir))
_FindRepo函数查找当前执行repo命令的目录下.repo/repo/main.py和.repo目录两者是否都存在。
_RunSelf函数的代码如下:
def _RunSelf(wrapper_path): my_dir = os.path.dirname(wrapper_path) my_main = os.path.join(my_dir, 'main.py') my_git = os.path.join(my_dir, '.git') if os.path.isfile(my_main) and os.path.isdir(my_git): for name in ['git_config.py', 'project.py', 'subcmds']: if not os.path.exists(os.path.join(my_dir, name)): return None, None return my_main, my_git return None, None
_RunSelf函数检查repo模块的同级目录里是否有三个文件main.py 、git_config.py、project.py 和两个目录subcmds、.git。这次是查找运行中模块repo的同级目录,是否具备三个文件两个目录,如有具备这些,则.repo仓库之前就已经被初始化过了。反之,接下去就会初始化仓库。
接下来的各种控制流判断,取其中两个关键函数_SetDefaultsTo和_Init来详细讲解
... if not repo_main: ... if cmd == 'init' or cmd == 'gitc-init': if my_git: _SetDefaultsTo(my_git) try: _Init(args, gitc_init=(cmd == 'gitc-init')) ...
repo_main,cmd,my_git这三个变量我们前面已经说过了它们的由来,其中的my_git如果存在,调用_SetDefaultsTo函数会设置数据源,反之,就是初次初始化,使用默认的数据源(REPO_URL = ‘https://gerrit.googlesource.com/git-repo’ ),那么就会克隆一个.repo/repo/仓库
_SetDefaultsTo函数
def _SetDefaultsTo(gitdir): global REPO_URL global REPO_REV REPO_URL = gitdir proc = subprocess.Popen([GIT, '--git-dir=%s' % gitdir, 'symbolic-ref', 'HEAD'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) REPO_REV = proc.stdout.read().strip() proc.stdout.close() proc.stderr.read() proc.stderr.close() if proc.wait() != 0: _print('fatal: %s has no current branch' % gitdir, file=sys.stderr) sys.exit(1)
–git-dir 指定git仓库的位置,symbolic-ref 指定当前分支为克隆分支这两者的值通过关键词global变成全局变量。
接下来就是核心函数_Init如下:
def _Init(args, gitc_init=False): """Installs repo by cloning it over the network. """ ... opt, args = init_optparse.parse_args(args) if args: init_optparse.print_usage() sys.exit(1) url = opt.repo_url if not url: url = REPO_URL extra_args.append('--repo-url=%s' % url) branch = opt.repo_branch if not branch: branch = REPO_REV extra_args.append('--repo-branch=%s' % branch) if branch.startswith('refs/heads/'): branch = branch[len('refs/heads/'):] ... try: ... os.mkdir(repodir) except OSError as e: if e.errno != errno.EEXIST: ... sys.exit(1) _CheckGitVersion() try: if NeedSetupGnuPG(): can_verify = SetupGnuPG(opt.quiet) else: can_verify = True dst = os.path.abspath(os.path.join(repodir, S_repo)) _Clone(url, dst, opt.quiet, not opt.no_clone_bundle) if not os.path.isfile('%s/repo' % dst): _print("warning: '%s' does not look like a git-repo repository, is " "REPO_URL set correctly?" % url, file=sys.stderr) if can_verify and not opt.no_repo_verify: rev = _Verify(dst, branch, opt.quiet) else: rev = 'refs/remotes/origin/%s^0' % branch _Checkout(dst, branch, rev, opt.quiet) except CloneFailure: ...
_Init函数的参数 args=[-u,https://android.googlesource.com/platform/manifest,-b,android-4.0.1_r1]
,使用OptionParse类的成员函数parse_args解析得到opt对象(存储有url地址和分支号)和args列表(其值不为空时,便会停止创建仓库的进程)。既然得到了opt对象,那么接下来就要通过指定url地址和分支号去获取repo仓库、manifest仓库,由于命令行只有manifest仓库的地址,那么是不是就没有办法获取repo仓库了吗?google提供了自家repo仓库的url地址(https://gerrit.googlesource.com/git-repo
)供开发者使用,也可以使用–repo-url选项指定自家公司repo仓库的url地址。接下来检查和配置环境
- 1.需要支持1.7.2以上的git版本
- 2.没有配置GnuPG的环境下,自动生成GnuPG文件
当GnuPG的环境配置好了,就会返回一个值can_verify,用于判断克隆完repo仓库后验证最新的tag是否被GunPG签过名,然后将克隆下来的repo仓库使用_Checkout函数切换到这个最新的tag,以便于使用最新的release版本(一般发布一个release版本都会打上一个tag),这就是这个tag的用意。
接下来我们就来看看函数_Init调用的两个核心函数_Clone和_Checkout
在这之前我们来看看git对远程仓库的操作图
_Clone函数的代码如下:
def _Clone(url, local, quiet, clone_bundle): """Clones a git repository to a new subdirectory of repodir """ try: os.mkdir(local) except OSError as e: ... raise CloneFailure() cmd = [GIT, 'init', '--quiet'] try: proc = subprocess.Popen(cmd, cwd=local) except OSError as e: ... raise CloneFailure() if proc.wait() != 0: ... raise CloneFailure() _InitHttp() _SetConfig(local, 'remote.origin.url', url) _SetConfig(local, 'remote.origin.fetch', '+refs/heads/*:refs/remotes/origin/*') if clone_bundle and _DownloadBundle(url, local, quiet): _ImportBundle(local) _Fetch(url, local, 'origin', quiet)
这里简单说一下_Clone函数的流程图。
创建git仓库(git init)---> 初始化http网络 ----> 配置远程仓库url地址、分支名(git config) ---> fetch记录从remote repository到local repository(git fetch)
在有网络的条件下可以从远程仓库克隆代码,但是如果离线了怎么办?git给我们提供了一种bundle机制。
_DownloadBundle函数的代码如下:
def _DownloadBundle(url, local, quiet): if not url.endswith('/'): url += '/' url += 'clone.bundle' proc = subprocess.Popen( [GIT, 'config', '--get-regexp', 'url.*.insteadof'], cwd=local, stdout=subprocess.PIPE) for line in proc.stdout: m = re.compile(r'^url\.(.*)\.insteadof (.*)$').match(line) if m: new_url = m.group(1) old_url = m.group(2) if url.startswith(old_url): url = new_url + url[len(old_url):] break proc.stdout.close() proc.wait() if not url.startswith('http:') and not url.startswith('https:'): return False dest = open(os.path.join(local, '.git', 'clone.bundle'), 'w+b') try: try: r = urllib.request.urlopen(url) except urllib.error.HTTPError as e: if e.code in [401, 403, 404, 501]: return False ... raise CloneFailure() except urllib.error.URLError as e: ... raise CloneFailure() try: if not quiet: _print('Get %s' % url, file=sys.stderr) while True: buf = r.read(8192) if buf == '': return True dest.write(buf) finally: r.close() finally: dest.close()
在使用git fetch获取remote repository记录到local repository之前,其实代码的来源还可以从bundle获取。生成bundle之前需要在有网络的条件下,将远程仓库的记录存储在bundle中。
最后会调用_ImportBundle函数导入数据。这种导入方式的应用场景在于环境处于脱机状态,便可以从其他的机器拷贝一份bundle导入到自己的仓库中。_ImportBundle函数是对_Fetch函数进行包装,其中最为重要的就是第三个参数,指定了要导入到local repository的数据来源路径,可以是网络的 url 的仓库名,也可以是本地的bundle路径
_Checkout函数
def _Checkout(cwd, branch, rev, quiet): """Checkout an upstream branch into the repository and track it. """ cmd = [GIT, 'update-ref', 'refs/heads/default', rev] if subprocess.Popen(cmd, cwd=cwd).wait() != 0: raise CloneFailure() _SetConfig(cwd, 'branch.default.remote', 'origin') _SetConfig(cwd, 'branch.default.merge', 'refs/heads/%s' % branch) cmd = [GIT, 'symbolic-ref', 'HEAD', 'refs/heads/default'] if subprocess.Popen(cmd, cwd=cwd).wait() != 0: raise CloneFailure() cmd = [GIT, 'read-tree', '--reset', '-u'] if not quiet: cmd.append('-v') cmd.append('HEAD') if subprocess.Popen(cmd, cwd=cwd).wait() != 0: raise CloneFailure()
该函数对git chechout的底层函数进行封装,功能和git checkout切分支是一样的,至此我们的_Init函数就执行完了,并且得到了repo仓库了那么接下来就是要得到manifest仓库了
... ver_str = '.'.join(map(str, VERSION)) me = [sys.executable, repo_main, '--repo-dir=%s' % rel_repo_dir, '--wrapper-version=%s' % ver_str, '--wrapper-path=%s' % wrapper_path, '--'] me.extend(orig_args) me.extend(extra_args) try: os.execv(sys.executable, me) ...
Manifest仓库
接下来就是执行main模块函数_Main,执行时命令行如下:
/home/.../.repo/repo/main.py --repo-dir=/home/.../.repo --wrapper-version=1.0 --wrapper-path=/usr/bin/repo -- init -u xxxx -b xxx
其参数argv经过repo模块的扩展,添加了三个信息
- .repo目录的绝对路径
- repo模块内部定义的版本号
- repo模块的绝对路径
经过repo模块添加的信息用来检查是否有可用的repo和执行main模块,这是命令行的前部分,而后半部分(init -u xxxx -b xxx)供直接或间接以Command为基类的衍生类的成员函数Execute调用放置在.repo/repo/subcmds/目录下的*.py模块。repo脚本能执行的命令都是放在该目录下的,一个Python文件对应一个repo命令。比如:”repo init”表示要执行的模块在.repo/repo/subcmds/init.py。
_Main函数的代码如下:
def _Main(argv): result = 0 opt = optparse.OptionParser(usage="repo wrapperinfo -- ...") opt.add_option("--repo-dir", dest="repodir", help="path to .repo/") opt.add_option("--wrapper-version", dest="wrapper_version", help="version of the wrapper script") opt.add_option("--wrapper-path", dest="wrapper_path", help="location of the wrapper script") _PruneOptions(argv, opt) opt, argv = opt.parse_args(argv) _CheckWrapperVersion(opt.wrapper_version, opt.wrapper_path) _CheckRepoDir(opt.repodir) Version.wrapper_version = opt.wrapper_version Version.wrapper_path = opt.wrapper_path repo = _Repo(opt.repodir) try: try: init_ssh() init_http() result = repo._Run(argv) or 0 finally: close_ssh() except KeyboardInterrupt: ... result = 1 except ManifestParseError as mpe: ... result = 1 except RepoChangedException as rce: # If repo changed, re-exec ourselves. # argv = list(sys.argv) argv.extend(rce.extra_args) try: os.execv(__file__, argv) except OSError as e: ... result = 128 sys.exit(result)if __name__ == '__main__': _Main(sys.argv[1:])
_Main函数的重点部分在于repo调用_Repo类中的成员函数_Run,而前期也如repo和main两个模块一样做一些必要的检查。修剪命令行的_PruneOptions函数、解析命令的parse_args函数(opt为”–”之前的内容,argv”为–”之后的内容)、检查repo模块版本的_CheckWrapperVersion函数、检查 .repo目录是否存在的_CheckRepoDir函数。
_Repo类的代码如下:
from subcmds import all_commandsclass _Repo(object): def __init__(self, repodir): self.repodir = repodir self.commands = all_commands # add 'branch' as an alias for 'branches' all_commands['branch'] = all_commands['branches'] def _Run(self, argv): result = 0 name = None glob = [] for i in range(len(argv)): if not argv[i].startswith('-'): name = argv[i] if i > 0: glob = argv[:i] argv = argv[i + 1:] break if not name: glob = argv name = 'help' argv = [] gopts, _gargs = global_options.parse_args(glob) ... try: cmd = self.commands[name] except KeyError: ... return 1 cmd.repodir = self.repodir cmd.manifest = XmlManifest(cmd.repodir) ... Editor.globalConfig = cmd.manifest.globalConfig ... try: copts, cargs = cmd.OptionParser.parse_args(argv) copts = cmd.ReadEnvironmentOptions(copts) except NoManifestException as e: ... return 1 ... start = time.time() try: result = cmd.Execute(copts, cargs) except (DownloadError, ManifestInvalidRevisionError, NoManifestException) as e: ... result = 1 except NoSuchProjectError as e: ... result = 1 except InvalidProjectGroupsError as e: ... result = 1 finally: elapsed = time.time() - start hours, remainder = divmod(elapsed, 3600) minutes, seconds = divmod(remainder, 60) if gopts.time: if hours == 0: print('real\t%dm%.3fs' % (minutes, seconds), file=sys.stderr) else: print('real\t%dh%dm%.3fs' % (hours, minutes, seconds), file=sys.stderr) return result
_Repo类有两个成员变量repodir、commands和一个类变量all_commands,其中all_commands字典的值是一些repo脚本能够执行命令的类名。那这些值是怎么来的呢 ? 在 from subcmds import all_commands
时,就会初始化subcmds包,将subcmds目录下所有模块名的首字母转化为大写其余字母不变,就成了命令的类名。再结合成员函数_Run,可以知道,该类的作用在于,将解析后的cmd分发到包subcmds下所对应的模块里面的类(比如:init指令—>subcmds/init.py里面的Init类)。
_Repo类的成员函数_Run主要是初始化XmlManifest,获取某个指令独有OptionParse并解析指令,调用Command类的成员函数Execute。
其中XmlManifest类用于管理 .repo,XmlManifest类的代码如下:
class XmlManifest(object): """manages the repo configuration file""" def __init__(self, repodir): self.repodir = os.path.abspath(repodir) self.topdir = os.path.dirname(self.repodir) self.manifestFile = os.path.join(self.repodir, MANIFEST_FILE_NAME) self.globalConfig = GitConfig.ForUser() self.localManifestWarning = False self.isGitcClient = False self.repoProject = MetaProject(self, 'repo', gitdir = os.path.join(repodir, 'repo/.git'), worktree = os.path.join(repodir, 'repo')) self.manifestProject = MetaProject(self, 'manifests', gitdir = os.path.join(repodir, 'manifests.git'), worktree = os.path.join(repodir, 'manifests'))
XmlManifest类在manifest_xml模块里面,XmlManifest类的主要成员变量有:
- repodir:.repo目录的绝对路径
- topdir:AOSP项目的绝对路径(testsource目录绝对路径)
- manifestFile:.repo目录下的链接文件manifest.xml
- repoProject: .repo目录下的repo仓库
- manifestProject:.repo目录下的manifest仓库
类中还提供了对.repo的属性值和对属性值操作的成员函数,比如加载数据到XmlManifest对象(_Load成员函数)和重置数据(_Unload成员函数),创建manifest.xml链接文件(Link成员函数),获取projects目录下的仓库对象(GetProjectsWithName,GetProjectPaths成员函数)。所以不难看出该类就是对.repo目录的管理工具。我们在继续看一下该类中重要的成员变量repoProject、manifestProject,都是MetaProject类的对象.
MetaProject类的代码如下
class MetaProject(Project): """A special project housed under .repo. """ def __init__(self, manifest, name, gitdir, worktree): Project.__init__(self, manifest=manifest, name=name, gitdir=gitdir, objdir=gitdir, worktree=worktree, remote=RemoteSpec('origin'), relpath='.repo/%s' % name, revisionExpr='refs/heads/master', revisionId=None, groups=None)
成员变量如下:
- manifest:是XmlManifest类的对象
- name:创建新仓库的名字
- gitdir: .git仓库的绝对路径
- worktree:工作目录
- remote:远程仓库
- relpath:创建新仓库的相对于.repo目录的路径
- revisionExpr: 分支
MetaProject和Project对于仓库的操作逻辑差不多一样,不过为了体现这两个仓库(repo仓库和manifest仓库)在AOSP项目整个仓库集的重要性,才会有这样的命名。
Project类的代码如下:
class Project(object): # These objects can be shared between several working trees. shareable_files = ['description', 'info'] shareable_dirs = ['hooks', 'objects', 'rr-cache', 'svn'] # These objects can only be used by a single working tree. working_tree_files = ['config', 'packed-refs', 'shallow'] working_tree_dirs = ['logs', 'refs'] def __init__(self, manifest, name, remote, gitdir, objdir, worktree, relpath, revisionExpr, revisionId, rebase=True, groups=None, sync_c=False, sync_s=False, clone_depth=None, upstream=None, parent=None, is_derived=False, dest_branch=None, optimized_fetch=False, old_revision=None): """Init a Project object. Args: manifest: The XmlManifest object. name: The `name` attribute of manifest.xml's project element. remote: RemoteSpec object specifying its remote's properties. gitdir: Absolute path of git directory. objdir: Absolute path of directory to store git objects. worktree: Absolute path of git working tree. relpath: Relative path of git working tree to repo's top directory. revisionExpr: The `revision` attribute of manifest.xml's project element. revisionId: git commit id for checking out. rebase: The `rebase` attribute of manifest.xml's project element. groups: The `groups` attribute of manifest.xml's project element. sync_c: The `sync-c` attribute of manifest.xml's project element. sync_s: The `sync-s` attribute of manifest.xml's project element. upstream: The `upstream` attribute of manifest.xml's project element. parent: The parent Project object. is_derived: False if the project was explicitly defined in the manifest; True if the project is a discovered submodule. dest_branch: The branch to which to push changes for review by default. optimized_fetch: If True, when a project is set to a sha1 revision, only fetch from the remote if the sha1 is not present locally. old_revision: saved git commit id for open GITC projects. """ self.manifest = manifest self.name = name self.remote = remote self.gitdir = gitdir.replace('\\', '/') self.objdir = objdir.replace('\\', '/') if worktree: self.worktree = worktree.replace('\\', '/') else: self.worktree = None self.relpath = relpath self.revisionExpr = revisionExpr if revisionId is None \ and revisionExpr \ and IsId(revisionExpr): self.revisionId = revisionExpr else: self.revisionId = revisionId self.rebase = rebase self.groups = groups self.sync_c = sync_c self.sync_s = sync_s self.clone_depth = clone_depth self.upstream = upstream self.parent = parent self.is_derived = is_derived self.optimized_fetch = optimized_fetch self.subprojects = [] self.snapshots = {} self.copyfiles = [] self.linkfiles = [] self.annotations = [] self.config = GitConfig.ForRepository( gitdir=self.gitdir, defaults=self.manifest.globalConfig) if self.worktree: self.work_git = self._GitGetByExec(self, bare=False, gitdir=gitdir) else: self.work_git = None self.bare_git = self._GitGetByExec(self, bare=True, gitdir=gitdir) self.bare_ref = GitRefs(gitdir) self.bare_objdir = self._GitGetByExec(self, bare=True, gitdir=objdir) self.dest_branch = dest_branch self.old_revision = old_revision # This will be filled in if a project is later identified to be the # project containing repo hooks. self.enabled_repo_hooks = []
Project是用来描述AOSP项目某一个仓库(或者说项目),其中有几个重要的值是来源于manifest.xml, name,revisionExpr,rebase,groups,sync_c,sync_s,upstream
这几个值对应到manifest.xml中某个标签的属性值,后续我们在克隆projects仓库集会讲解manifest标签和属性的用途。所以AOSP项目的仓库信息都在manifest.xml,除了repo仓库和manifest仓库,这些信息是在我们使用”repo sync”时会用到。
现在我们回到成员函数_Run的流程中,XmlManifest类已经构造完了。cmd.OptionParser.parse_args(argv)
,再去获取每个指令独有的OptionParser并且解析指令 init -u xxxx -b xxx
OptionParser属性函数的代码如下:
class Command(object): """Base class for any command line action in repo. """ ... @property def OptionParser(self): if self._optparse is None: try: me = 'repo %s' % self.NAME usage = self.helpUsage.strip().replace('%prog', me) except AttributeError: usage = 'repo %s' % self.NAME self._optparse = optparse.OptionParser(usage=usage) self._Options(self._optparse) return self._optparse ... def _Options(self, p): """Initialize the option parser. """
Command的衍生类重写了基类的_Options,定义了属于自己的options,先留个坑后面讲到”repo sync”的时候再分析。
创建完XmlManifest类,解析命令行后,接下来就是调用Execute。
Command类是所有命令(init、sync、start)的基类,其成员函数Execute被其衍生类重写,故调用成员函数Execute就可以执行某个命令对应的成员函数Execute。所以,执行到这一行 result = cmd.Execute(copts, cargs)
的时候,就是整个架构的分水岭了。下面的图片是对前面的总结。
接下来就是执行init模块中Init类的成员函数Execute:
class Init(InteractiveCommand, MirrorSafeCommand): ... def Execute(self, opt, args): git_require(MIN_GIT_VERSION, fail=True) if opt.reference: opt.reference = os.path.expanduser(opt.reference) # Check this here, else manifest will be tagged "not new" and init won't be # possible anymore without removing the .repo/manifests directory. if opt.archive and opt.mirror: ... sys.exit(1) self._SyncManifest(opt) self._LinkManifest(opt.manifest_name) if os.isatty(0) and os.isatty(1) and not self.manifest.IsMirror: if opt.config_name or self._ShouldConfigureUser(): self._ConfigureUser() self._ConfigureColor() self._ConfigureDepth(opt) self._DisplayResult()
Init类的成员函数Execute的重点在于两个成员函数_SyncManifest和_LinkManifest,前者会克隆出manifest仓库并且切换到可用的分支上,后者会通过os模块symlink函数生成链接文件manifest.xml。
_SyncManifest函数的代码如下:
class Init(InteractiveCommand, MirrorSafeCommand): ... def _SyncManifest(self, opt): m = self.manifest.manifestProject is_new = not m.Exists if is_new: ... m._InitGitDir(mirror_git=mirrored_manifest_git) if opt.manifest_branch: m.revisionExpr = opt.manifest_branch else: m.revisionExpr = 'refs/heads/master' else: if opt.manifest_branch: m.revisionExpr = opt.manifest_branch else: m.PreSync() ... if not m.Sync_NetworkHalf(is_new=is_new, quiet=opt.quiet, clone_bundle=not opt.no_clone_bundle, current_branch_only=opt.current_branch_only, no_tags=opt.no_tags): r = m.GetRemote(m.remote.name) print('fatal: cannot obtain manifest %s' % r.url, file=sys.stderr) # Better delete the manifest git dir if we created it; otherwise next # time (when user fixes problems) we won't go through the "is_new" logic. if is_new: shutil.rmtree(m.gitdir) sys.exit(1) if opt.manifest_branch: m.MetaBranchSwitch() syncbuf = SyncBuffer(m.config) m.Sync_LocalHalf(syncbuf) syncbuf.Finish() if is_new or m.CurrentBranch is None: if not m.StartBranch('default'): print('fatal: cannot create default in manifest', file=sys.stderr) sys.exit(1)
Init类的成员函数_SyncManifest会克隆一个仓库,流程一般如下: git init--->git fetch--->git checkout branch_name
。对应的Project类成员函数就是_InitGitDir,Sync_NetworkHalf,Sync_LocalHalf,是不是很熟悉,跟克隆repo仓库的流程是一样的,其实repo仓库、manifest仓库、projects仓库集这些仓库克隆出来的方式是一样的。
class Project(object): ... def _InitGitDir(self, mirror_git=None, force_sync=False): init_git_dir = not os.path.exists(self.gitdir) init_obj_dir = not os.path.exists(self.objdir) try: # Initialize the bare repository, which contains all of the objects. if init_obj_dir: os.makedirs(self.objdir) self.bare_objdir.init() ...
InitGitDir,初始化的仓库为manifest.git,manifest目录下的.git仓库是manifest的复制品,通过Project类的成员函数_InitWorkTree创建。接着再说类_GitGetByExec,_GitGetByExec的对象bare_objdir封装了操作仓库的命令。比如git init。但是却找不到成员函数init,原来成员函数init是动态定义的。关键的地方就在于_GitGetByExec类的成员函数_getattr。
_GitGetByExec类的成员函数getattr代码如下:
class Project(object): ... class _GitGetByExec(object): ... def __getattr__(self, name): ... name = name.replace('_', '-') def runner(*args, **kwargs): cmdv = [] config = kwargs.pop('config', None) ... if config is not None: if not git_require((1, 7, 2)): ... for k, v in config.items(): cmdv.append('-c') cmdv.append('%s=%s' % (k, v)) cmdv.append(name) cmdv.extend(args) p = GitCommand(self._project, cmdv, bare=self._bare, gitdir=self._gitdir, capture_stdout=True, capture_stderr=True) if p.Wait() != 0: ... r = p.stdout try: r = r.decode('utf-8') except AttributeError: pass if r.endswith('\n') and r.index('\n') == len(r) - 1: return r[:-1] return r return runner
runner闭包用来处理调用者提供的参数,比如bare_git.describe(project.GetRevisionId())中的”project.GetRevisionId()”,对应的git命令就是 git describe args
所以_GitGetByExec类通过成员函数getattr可以向工厂一样生产一些执行git命令的成员函数。既然仓库已经初始化好了,那么接下来就是fetch仓库了。
Sync_NetworkHalf成员函数的代码如下:
class Project(object): ... def Sync_NetworkHalf(self, quiet=False, is_new=None, current_branch_only=False, force_sync=False, clone_bundle=True, no_tags=False, archive=False, optimized_fetch=False, prune=False): ... if (need_to_fetch and not self._RemoteFetch(initial=is_new, quiet=quiet, alt_dir=alt_dir, current_branch_only=current_branch_only, no_tags=no_tags, prune=prune, depth=depth)): return False if self.worktree: self._InitMRef() else: self._InitMirrorHead() try: os.remove(os.path.join(self.gitdir, 'FETCH_HEAD')) except OSError: pass return True
Project类Sync_NetworkHalf方法调用_RemoteFetch方法实现了从远程仓库fetch记录到本地仓库,_RemoteFetch函数其实是”git fetch”命令的封装。
Sync_LocalHalf成员函数的代码如下:
class Project(object): ... def Sync_LocalHalf(self, syncbuf, force_sync=False): """Perform only the local IO portion of the sync process. Network access is not required. """ self._InitWorkTree(force_sync=force_sync) ... revid = self.GetRevisionId(all_refs) def _doff(): self._FastForward(revid) self._CopyAndLinkFiles() head = self.work_git.GetHead() ... if branch is None or syncbuf.detach_head: # Currently on a detached HEAD. The user is assumed to # not have any local modifications worth worrying about. # ... if head == revid: # No changes; don't do anything further. # Except if the head needs to be detached # if not syncbuf.detach_head: # The copy/linkfile config may have changed. self._CopyAndLinkFiles() return else: lost = self._revlist(not_rev(revid), HEAD) if lost: syncbuf.info(self, "discarding %d commits", len(lost)) try: self._Checkout(revid, quiet=True) except GitError as e: syncbuf.fail(self, e) return self._CopyAndLinkFiles() return if head == revid: # No changes; don't do anything further. # # The copy/linkfile config may have changed. self._CopyAndLinkFiles() return branch = self.GetBranch(branch) if not branch.LocalMerge: # The current branch has no tracking configuration. # Jump off it to a detached HEAD. # ... try: self._Checkout(revid, quiet=True) except GitError as e: syncbuf.fail(self, e) return self._CopyAndLinkFiles() return upstream_gain = self._revlist(not_rev(HEAD), revid) pub = self.WasPublished(branch.name, all_refs) if pub: not_merged = self._revlist(not_rev(revid), pub) if not_merged: ... return elif pub == head: # All published commits are merged, and thus we are a # strict subset. We can fast-forward safely. # syncbuf.later1(self, _doff) return # Examine the local commits not in the remote. Find the # last one attributed to this user, if any. # local_changes = self._revlist(not_rev(revid), HEAD, format='%H %ce') last_mine = None cnt_mine = 0 for commit in local_changes: commit_id, committer_email = commit.decode('utf-8').split(' ', 1) if committer_email == self.UserEmail: last_mine = commit_id cnt_mine += 1 if not upstream_gain and cnt_mine == len(local_changes): return ... branch.remote = self.GetRemote(self.remote.name) if not ID_RE.match(self.revisionExpr): # in case of manifest sync the revisionExpr might be a SHA1 branch.merge = self.revisionExpr if not branch.merge.startswith('refs/'): branch.merge = R_HEADS + branch.merge branch.Save() if cnt_mine > 0 and self.rebase: def _dorebase(): self._Rebase(upstream='%s^1' % last_mine, onto=revid) self._CopyAndLinkFiles() syncbuf.later2(self, _dorebase) elif local_changes: try: self._ResetHard(revid) self._CopyAndLinkFiles() except GitError as e: syncbuf.fail(self, e) return else: syncbuf.later1(self, _doff)
Project类的成员函数Sync_LocalHalf内部流程较为复杂,这里我们只讲checkout到一个干净的分支。
- _InitWorkTree成员函数:初始化manifest工作目录下的.git仓库
- _Checkout成员函数:通过”git checkout”切换分支。
_LinkManifest成员函数的代码如下:
class Init(InteractiveCommand, MirrorSafeCommand): ... def _LinkManifest(self, name): if not name: print('fatal: manifest name (-m) is required.', file=sys.stderr) sys.exit(1) try: self.manifest.Link(name) except ManifestParseError as e: print("fatal: manifest '%s' not available" % name, file=sys.stderr) print('fatal: %s' % str(e), file=sys.stderr) sys.exit(1)
成员函数_LinkManifest最终会调用os.symlink,创建manifest工作目录下default.xml的链接文件manifest.xml到 .repo目录下,这样方便访问manifest.xml文件
Projects仓库集
执行完repo init就获取到了repo仓库和manifest仓库了,接下来就要通过manifest.xml链接文件中的AOSP各个项目的元数据,获取projects仓库集。先来看看其内容:
<?xml version="1.0" encoding="UTF-8"?> <manifest> <remote name="aosp" fetch=".." review="https://android-review.googlesource.com/" /> <default revision="refs/tags/android-4.2_r1" remote="aosp" sync-j="4" /> <project path="build" name="platform/build" > <copyfile src="core/root.mk" dest="Makefile" /> </project> <project path="abi/cpp" name="platform/abi/cpp" /> <project path="bionic" name="platform/bionic" /> ...... </manifest>
想了解更多的manifest.xml可以查看.repo/repo/docs/manifest-format.txt。这里我们只做简单了解
manifest.xml定义了四种标签:
- remote:该标签描述的是远程仓库信息。其中的fetch值相当于url路径的前缀。就好比
git@github.com:HawksJamesf/blog.git
中的git@github.com:
,标明了服务器。属性review用于code review服务器地址。 - default:属性revision表明了AOSP项目使用的开发分支,属性sync-j表明了拉取代码时使用的cpu核数
- project: 描述了相对于远程仓库的位置和相对于AOSP根目录的目录名字。比如想要获取build仓库,远程仓库的url为
https://android.googlesource.com/platform
,那么该仓库的对应的远程仓库的url就是https://android.googlesource.com/platform/build
- copyfile: 属性src为某个文件在远程仓库的位置,属性dest为本地仓库的位置。
执行 repo sync -j 8
命令行,流程就跟执行repo init是一样的,到了main模块里面的_Repo类的成员函数_Run调用cmd.Execute(copts, cargs)这个风水岭,才会执行属于sync模块的代码。但是关于copts,cargs参数的如何获取,我们还得先看cmd.OptionParser.parse_args(argv)。
_Options成员函数的代码如下:
class Sync(Command, MirrorSafeCommand): ... def _Options(self, p, show_smart=True): try: self.jobs = self.manifest.default.sync_j except ManifestParseError: self.jobs = 1 ... p.add_option('-l', '--local-only', dest='local_only', action='store_true', help="only update working tree, don't fetch") p.add_option('-n', '--network-only', dest='network_only', action='store_true', help="fetch only, don't update working tree") ... p.add_option('-m', '--manifest-name', dest='manifest_name', help='temporary manifest to use for this sync', metavar='NAME.xml') ... p.add_option('-u', '--manifest-server-username', action='store', dest='manifest_server_username', help='username to authenticate with the manifest server') p.add_option('-p', '--manifest-server-password', action='store', dest='manifest_server_password', help='password to authenticate with the manifest server') p.add_option('--fetch-submodules', dest='fetch_submodules', action='store_true', help='fetch submodules from server') ... if show_smart: p.add_option('-s', '--smart-sync', dest='smart_sync', action='store_true', help='smart sync using manifest from the latest known good build') p.add_option('-t', '--smart-tag', dest='smart_tag', action='store', help='smart sync using manifest from a known tag') g = p.add_option_group('repo Version options') g.add_option('--no-repo-verify', dest='no_repo_verify', action='store_true', help='do not verify repo source code') g.add_option('--repo-upgraded', dest='repo_upgraded', action='store_true', help=SUPPRESS_HELP)
在分析repo init时已经说用,Command的衍生类override成员函数_Options,才能得到独有的OptionParser。还记得 repo command line --->optparse--->git command line
这个流程吗 ? 每种命令都有其对应的OptionParser,这样才能做到各个命令模块有自己处理repo command line的逻辑。紧接着把解析后的值传给成员函数Execute。
Execute成员函数的代码如下:
class Sync(Command, MirrorSafeCommand): ... def Execute(self, opt, args): ... manifest_name = opt.manifest_name ... rp = self.manifest.repoProject rp.PreSync() mp = self.manifest.manifestProject mp.PreSync() ... if not opt.local_only: mp.Sync_NetworkHalf(quiet=opt.quiet, current_branch_only=opt.current_branch_only, no_tags=opt.no_tags, optimized_fetch=opt.optimized_fetch) if mp.HasChanges: syncbuf = SyncBuffer(mp.config) mp.Sync_LocalHalf(syncbuf) if not syncbuf.Finish(): sys.exit(1) self._ReloadManifest(manifest_name) if opt.jobs is None: self.jobs = self.manifest.default.sync_j ... all_projects = self.GetProjects(args, missing_ok=True, submodules_ok=opt.fetch_submodules) self._fetch_times = _FetchTimes(self.manifest) if not opt.local_only: to_fetch = [] now = time.time() if _ONE_DAY_S <= (now - rp.LastFetch): to_fetch.append(rp) to_fetch.extend(all_projects) to_fetch.sort(key=self._fetch_times.Get, reverse=True) fetched = self._Fetch(to_fetch, opt) _PostRepoFetch(rp, opt.no_repo_verify) if opt.network_only: # bail out now; the rest touches the working tree return # Iteratively fetch missing and/or nested unregistered submodules previously_missing_set = set() while True: ... all_projects = self.GetProjects(args, missing_ok=True, submodules_ok=opt.fetch_submodules) missing = [] for project in all_projects: if project.gitdir not in fetched: missing.append(project) if not missing: break # Stop us from non-stopped fetching actually-missing repos: If set of # missing repos has not been changed from last fetch, we break. missing_set = set(p.name for p in missing) if previously_missing_set == missing_set: break previously_missing_set = missing_set fetched.update(self._Fetch(missing, opt)) ... if self.UpdateProjectList(): sys.exit(1) ... for project in all_projects: ... if project.worktree: project.Sync_LocalHalf(syncbuf, force_sync=opt.force_sync) ... ...
前期会有一些更新检查repo仓库和manifest仓库的工作,后期就会拉去projects仓库集,那么下面我们就来粗糙的理解执行成员函数Execute的流程:
- 一开始获取manifest仓库和repo仓库的对象,然后都会调用之前分析过的PreSync,如果opt.local_only不存在,就会调用Sync_NetworkHalf成员函数更新manifest仓库。紧接着如果manifest本地仓库相对于远程仓库有变化,就会调用Sync_LocalHalf做一些merge或者rebase操作,然后调用ReloadManifest成员函数重新从manifest.xml载入数据到对象。
- 接下来就是_Fetch成员函数,其中除了manifest仓库,其他的仓库都会更新。如果开启的合数大于1的话就会创建新的线程,防止主线程阻塞,其中关于线程同步机制可以查看这里。而获取projects仓库集,主要使用的是Project类的成员函数Sync_NetworkHalf,接着调用_PostRepoFetch函数判断repo仓库是否有变化,如果是,则调用Sync_LocalHalf成员函数做一些merge或者rebase操作。
- 紧接着通过GetProjects成员函数从 .repo/projects目录下得到AOSP所有仓库对象的列表,不包括repo仓库和manifest仓库,这样就可以调用Sync_LocalHalf成员函数。如果参数args指定了具体的仓库(即repo sync project_name),那么GetProjects成员函数就只能得到指定的仓库。获取仓库的方式有两种:一种_GetProjectByPath,另一种是GetProjectsWithName。
至此,repo仓库、manifest仓库、projects仓库集连同其对应的工作目录都已经初始化完成。那么是不是就可以开始开发呢 ? 其实接下来还要为AOSP项目的仓库创建一个新的开发分支,只有这样我们才能够在分支上面提交、上传自己的代码。也只有这样当我们发布了一个版本就可以给这个版本打上一个tag。当项目可以量产时,就可以在这个分支基础上创建出一个量产分支,并上传到服务器。
创建分支
repo start master --all
其实挺好理解这个命令行的,就是 git checkout -b name
命令行的批量操作。留个坑,以后填吧。
class Start(Command): ... def _Options(self, p): p.add_option('--all', dest='all', action='store_true', help='begin branch in all projects') def Execute(self, opt, args): if not args: self.Usage() nb = args[0] if not git.check_ref_format('heads/%s' % nb): print("error: '%s' is not a valid name" % nb, file=sys.stderr) sys.exit(1) err = [] projects = [] if not opt.all: projects = args[1:] if len(projects) < 1: projects = ['.',] # start it in the local project by default all_projects = self.GetProjects(projects, missing_ok=bool(self.gitc_manifest)) # This must happen after we find all_projects, since GetProjects may need # the local directory, which will disappear once we save the GITC manifest. if self.gitc_manifest: gitc_projects = self.GetProjects(projects, manifest=self.gitc_manifest, missing_ok=True) for project in gitc_projects: if project.old_revision: project.already_synced = True else: project.already_synced = False project.old_revision = project.revisionExpr project.revisionExpr = None # Save the GITC manifest. gitc_utils.save_manifest(self.gitc_manifest) # Make sure we have a valid CWD if not os.path.exists(os.getcwd()): os.chdir(self.manifest.topdir) pm = Progress('Starting %s' % nb, len(all_projects)) for project in all_projects: pm.update() if self.gitc_manifest: gitc_project = self.gitc_manifest.paths[project.relpath] # Sync projects that have not been opened. if not gitc_project.already_synced: proj_localdir = os.path.join(self.gitc_manifest.gitc_client_dir, project.relpath) project.worktree = proj_localdir if not os.path.exists(proj_localdir): os.makedirs(proj_localdir) project.Sync_NetworkHalf() sync_buf = SyncBuffer(self.manifest.manifestProject.config) project.Sync_LocalHalf(sync_buf) project.revisionId = gitc_project.old_revision # If the current revision is a specific SHA1 then we can't push back # to it; so substitute with dest_branch if defined, or with manifest # default revision instead. branch_merge = '' if IsId(project.revisionExpr): if project.dest_branch: branch_merge = project.dest_branch else: branch_merge = self.manifest.default.revisionExpr if not project.StartBranch(nb, branch_merge=branch_merge): err.append(project) pm.end() if err: for p in err: print("error: %s/: cannot start %s" % (p.relpath, nb), file=sys.stderr) sys.exit(1)
4.Reference
Android Open Source Project
repo仓库源码
清华大学提供的AOSP
Android源代码仓库及其管理工具Repo分析
- Android源码解析之repo仓库
- Android源码仓库和Repo工具使用
- 自建 Android 源码 git/repo 仓库
- Android源码仓库和Repo工具使用
- repo下载android源码
- repo 获取Android源码
- repo 获取Android源码
- repo同步android源码
- Repo全解之自己搭建repo仓库
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Android源代码仓库及其管理工具Repo分析
- Golang 使用 C语言 简单操作
- Spring Boot之基于注解的数据格式化
- 连通图计数模板【c++&Java】
- 工作学习笔记2017-04-19
- SharedPreferences封装
- Android源码解析之repo仓库
- shell编程 部分讲解
- 数据库事务相关知识总结
- 关于AFNetworking请求封装的思考与实践
- 安卓开源项目周报0419
- 简单实现的水波纹效果
- AbcABC
- s3c2440之uda1341声卡驱动以及madplay播放器移植
- poj 1157 LITTLE SHOP OF FLOWERS