CloudKitty Data Flow (Newton release)


I. Introduction

Architecture diagram

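CloudKitty consists of two services (processes): cloudkitty-api and cloudkitty-processor.

The cloudkitty-api service is the single entry point through which the outside world reaches CloudKitty. Its endpoints fall into two groups, documented in the cloudkitty API reference, and it communicates with the cloudkitty-processor service over RPC.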

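The cloudkitty-processor process is mainly responsible for:

  • Receiving RPC messages sent by the API, for example to update a module's state or priority
  • Starting a periodic task that, in every billing period, queries and rates the resource usage of each billed tenant. This breaks down into:
    • Collecting the billed tenants (TenantFetcher)
    • Retrieving resource usage data (Collector)
    • Formatting the resource usage data (Transformers)
    • Rating the usage according to the pricing policy (Rating)
    • Storing the rated data (Storage)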
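
The five steps above can be pictured as a small pipeline. The following is only a sketch under stated assumptions: every function name, the fake usage numbers, and the flat price are hypothetical stand-ins and not CloudKitty code.

```python
from decimal import Decimal

# All names below are hypothetical stand-ins, not CloudKitty classes.

def fetch_tenants():
    """TenantFetcher step: return the IDs of the tenants to rate."""
    return ["tenant-a", "tenant-b"]

def collect(tenant_id):
    """Collector step: return raw usage for one tenant (fake numbers)."""
    return [{"resource": "image", "size_bytes": 2 * 1024 * 1024}]

def transform(raw):
    """Transformer step: normalise raw usage into a common shape (MiB)."""
    return [{"service": r["resource"],
             "qty": Decimal(r["size_bytes"]) / (1024 * 1024)} for r in raw]

def rate(items, price_per_unit=Decimal("0.5")):
    """Rating step: attach a price to every item (flat rate, assumed)."""
    return [dict(item, price=item["qty"] * price_per_unit) for item in items]

def run_period(storage):
    """One billing period: process every tenant and store the result."""
    for tenant_id in fetch_tenants():
        storage[tenant_id] = rate(transform(collect(tenant_id)))
    return storage

result = run_period({})
```

Here a plain dict plays the role of the storage backend; in the real service each step is a pluggable component.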

II. Inside cloudkitty-api

1. WSGI service initialization

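from cloudkitty.api import app
from cloudkitty import service

def main():
    service.prepare_service()
    server = app.build_server()
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
  • service.prepare_service(): initializes logging and overrides the default values of configuration options;
  • app.build_server(): creates and starts the WSGI server; load_app loads each application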

    #/etc/cloudkitty/api_paste.ini
    [pipeline:main]
    pipeline = cors http_proxy_to_wsgi request_id authtoken ck_api_v1
    [app:ck_api_v1]
    paste.app_factory = cloudkitty.api.app:app_factory
    [filter:authtoken]
    acl_public_routes = /, /v1
    paste.filter_factory = cloudkitty.api.middleware:AuthTokenMiddleware.factory
    [filter:request_id]
    paste.filter_factory = oslo_middleware:RequestId.factory
    [filter:cors]
    paste.filter_factory = oslo_middleware.cors:filter_factory
    oslo_config_project = cloudkitty
    [filter:http_proxy_to_wsgi]
    paste.filter_factory = oslo_middleware.http_proxy_to_wsgi:HTTPProxyToWSGI.factory
    oslo_config_project = cloudkitty

    As the pipeline shows, a client request is eventually routed to cloudkitty.api.app:app_factory

    def app_factory(global_config, **local_conf):
        return setup_app()
     
    def setup_app(pecan_config=None, extra_hooks=None):
        app_conf = get_pecan_config()
        storage_backend = storage.get_storage()
        app_hooks = [
            hooks.RPCHook(),
            hooks.StorageHook(storage_backend),
            hooks.ContextHook(),
        ]
        app = pecan.make_app(
            app_conf.app.root,
            static_root=app_conf.app.static_root,
            template_path=app_conf.app.template_path,
            debug=CONF.api.pecan_debug,
            force_canonical=getattr(app_conf.app, 'force_canonical', True),
            hooks=app_hooks,
            guess_content_type_from_ext=False
        )
        return app

    The application root is defined by app_conf.app.root, and the root controller contains v1 = v1_api.V1Controller():

    #cloudkitty/api/v1/__init__()
    class V1Controller(rest.RestController):
        billing = rating_api.RatingController()
        collector = collector_api.CollectorController()
        rating = rating_api.RatingController()
        report = report_api.ReportController()
        storage = storage_api.StorageController()
        info = info_api.InfoController()

    The notable class here is rating_api.RatingController(): its reload_modules method dynamically loads the APIs of the rating modules:

    class RatingController(rest.RestController):
        @wsme_pecan.wsexpose(None)
        def reload_modules(self):
            policy.enforce(pecan.request.context, 'rating:module_config', {})
            self.modules.reload_extensions()
            self.module_config.reload_extensions()
            self.module_config.expose_modules()

    Here self.modules.reload_extensions() is a method of the parent class RatingModulesMixin():

    PROCESSORS_NAMESPACE = 'cloudkitty.rating.processors'
     
        def reload_extensions(self):
            lock = lockutils.lock('rating-modules')
            with lock:
                ck_utils.refresh_stevedore(PROCESSORS_NAMESPACE)
                self.extensions = extension.ExtensionManager(
                    PROCESSORS_NAMESPACE,
                    invoke_on_load=True)
                if not self._first_call:
                    self.notify_reload()
                else:
                    self._first_call = False

    According to setup.cfg, the 'cloudkitty.rating.processors' namespace maps to the following rating modules:

    cloudkitty.rating.processors =
        noop = cloudkitty.rating.noop:Noop
        hashmap = cloudkitty.rating.hash:HashMap
        pyscripts = cloudkitty.rating.pyscripts:PyScripts


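    At this point both groups of API entry points in the cloudkitty-api service are clear. With pyscripts (i.e. third-party rating scripts) the flow is similar; the third-party content only needs to be added in the corresponding file.

  • server.serve_forever(): loops accepting client requests. When a request arrives, finish_request hands it to RequestHandlerClass, which calls its handle() method to process it; WSGIRequestHandler's handle() passes the request on to ServerHandler, and ServerHandler calls run() to execute the application.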
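
To make the request path more concrete, here is a minimal sketch of how a paste-style pipeline nests filters around the final WSGI application. It is an illustration only: make_filter, build_pipeline, and the "trace" key are hypothetical, the filter names are simply borrowed from the pipeline line in api_paste.ini, and the real filters do far more than record their names.

```python
def app(environ, start_response):
    """Innermost WSGI app, standing in for ck_api_v1."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"trace=" + ",".join(environ["trace"]).encode()]

def make_filter(name):
    """Build a filter factory whose middleware records its own name."""
    def filter_factory(wrapped):
        def middleware(environ, start_response):
            environ.setdefault("trace", []).append(name)
            return wrapped(environ, start_response)
        return middleware
    return filter_factory

def build_pipeline(names, application):
    """Wrap right to left so the first listed filter sees the request first."""
    for name in reversed(names):
        application = make_filter(name)(application)
    return application

pipeline = build_pipeline(
    ["cors", "http_proxy_to_wsgi", "request_id", "authtoken"], app)

# Call the composed app directly, the way a WSGI server handler would.
body = pipeline({}, lambda status, headers: None)
```

The returned body lists the filters in the order they handled the request, which matches the left-to-right order of the pipeline line.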

2. API overview

COMMON REST API (v1)

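Method  Path                                Description
GET     /v1/collector/mappings              List the services mapped to a collector
GET     /v1/collector/mappings/(service)    Get the mapping for a service
POST    /v1/collector/mappings              Create a service-to-collector mapping
DELETE  /v1/collector/mappings              Delete a service-to-collector mapping
GET     /v1/collector/states                Get the enabled state of a collector
PUT     /v1/collector/states                Set the enabled state of a collector
GET     /v1/info/config                     Get the current configuration
GET     /v1/info/services                   List services
GET     /v1/info/services/(service_name)    Get a single service
GET     /v1/rating/modules                  List modules
GET     /v1/rating/modules/(module_id)      Get a single module
PUT     /v1/rating/modules                  Change a module's state and priority
POST    /v1/rating/quote                    Get an instant quote from several resource descriptions
GET     /v1/rating/reload_modules           Trigger a reload of the rating module list
GET     /v1/report/summary                  Get the summary for a given period
GET     /v1/report/tenants                  Get the list of rated tenants
GET     /v1/report/total                    Get the amount to pay for a given period
GET     /v1/storage/dataframes              Get the rated resources for a time period and tenant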

HashMap Module REST API

Method  Path                                                    Description
GET     /v1/rating/module_config/hashmap/types                  List all available mapping types
GET     /v1/rating/module_config/hashmap/services               List services
GET     /v1/rating/module_config/hashmap/services/(service_id)  Get a single service
POST    /v1/rating/module_config/hashmap/services               Create a hashmap service
DELETE  /v1/rating/module_config/hashmap/services               Delete a service and its children
GET     /v1/rating/module_config/hashmap/fields                 List fields
GET     /v1/rating/module_config/hashmap/fields/(field_id)      Get a single field
POST    /v1/rating/module_config/hashmap/fields                 Create a field
DELETE  /v1/rating/module_config/hashmap/fields                 Delete a field and its children
GET     /v1/rating/module_config/hashmap/mappings               List mappings
GET     /v1/rating/module_config/hashmap/mappings/(mapping_id)  Get a single mapping
POST    /v1/rating/module_config/hashmap/mappings               Create a mapping
PUT     /v1/rating/module_config/hashmap/mappings               Update a mapping
DELETE  /v1/rating/module_config/hashmap/mappings               Delete a mapping
GET     /v1/rating/module_config/hashmap/mappings/group         Get the group attached to a mapping
GET     /v1/rating/module_config/hashmap/groups                 List groups
GET     /v1/rating/module_config/hashmap/groups/(group_id)      Get a single group
POST    /v1/rating/module_config/hashmap/groups                 Create a group
DELETE  /v1/rating/module_config/hashmap/groups                 Delete a group
GET     /v1/rating/module_config/hashmap/groups/mappings        Get the mappings attached to a group
GET     /v1/rating/module_config/hashmap/groups/thresholds      Get the thresholds attached to a group
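
As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) two authenticated requests with the standard library. The endpoint URL, port, and token are placeholders chosen for the example, and build_request is a hypothetical helper; a real deployment would obtain both the endpoint and the token from Keystone.

```python
import urllib.request

# Assumptions: ENDPOINT and TOKEN are placeholders, not real values.
ENDPOINT = "http://cloudkitty.example.com:8889"
TOKEN = "<keystone-token>"

def build_request(method, path, endpoint=ENDPOINT, token=TOKEN):
    """Build (without sending) an authenticated request for a v1 path."""
    return urllib.request.Request(
        endpoint + path,
        method=method,
        headers={"X-Auth-Token": token, "Accept": "application/json"},
    )

# One call from the common API, one from the HashMap module API.
list_modules = build_request("GET", "/v1/rating/modules")
list_services = build_request(
    "GET", "/v1/rating/module_config/hashmap/services")

# urllib.request.urlopen(list_modules) would actually send the request.
```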

III. Inside cloudkitty-processor

1. Service initialization

The startup code for cloudkitty-processor lives in cli/processor.py:

from cloudkitty import orchestrator
from cloudkitty import service

def main():
    service.prepare_service()
    processor = orchestrator.Orchestrator()
    try:
        processor.process()
    except KeyboardInterrupt:
        processor.terminate()

Most of this process's initialization happens in the orchestrator module, located in CloudKitty's root package. First, here is how the orchestrator.Orchestrator() class is initialized:

class Orchestrator(object):
    def __init__(self):
        # Tenant fetcher
        self.fetcher = driver.DriverManager(
            FETCHERS_NAMESPACE,
            CONF.tenant_fetcher.backend,
            invoke_on_load=True).driver
        self.transformers = transformer.get_transformers()
        self.collector = collector.get_collector(self.transformers)
        self.storage = storage.get_storage(self.collector)
        # RPC
        self.server = None
        self._rating_endpoint = RatingEndpoint(self)
        self._init_messaging()
        # DLM
        self.coord = coordination.get_coordinator(
            CONF.orchestrator.coordination_url,
            uuidutils.generate_uuid().encode('ascii'))
        self.coord.start()
        self._period = CONF.collect.period
        self._wait_time = CONF.collect.wait_periods * self._period

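As shown, this initializes the fetcher, transformers, collector, and storage in turn. It then initializes a coordinator from the tooz library, whose distributed locks are used later: the fetcher returns the IDs of the tenants to rate (i.e. project_id), and while the rating loop walks through the projects, the lock ensures that the processing of each project is not interrupted.

First, the fetcher. It is loaded dynamically as a driver through the stevedore library, and the available backends are: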
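
The per-tenant locking idea can be sketched with ordinary in-process locks. This is an analogy under stated assumptions, not the tooz API: tenant_lock and try_process are hypothetical helpers, and threading.Lock stands in for a distributed lock that several processor instances would share.

```python
import threading

# Hypothetical stand-in: one in-process lock per tenant. The real service
# obtains distributed locks from tooz so several processors can cooperate.
_locks = {}

def tenant_lock(tenant_id):
    """Return the lock guarding one tenant, creating it on first use."""
    return _locks.setdefault(tenant_id, threading.Lock())

def try_process(tenant_id, work):
    """Run work(tenant_id) only if nobody else holds the tenant's lock."""
    lock = tenant_lock(tenant_id)
    if not lock.acquire(blocking=False):
        return False  # another worker owns this tenant, so skip it
    try:
        work(tenant_id)
        return True
    finally:
        lock.release()

processed = []
first = try_process("tenant-a", processed.append)

# Simulate a competing worker that already holds the lock for tenant-b.
tenant_lock("tenant-b").acquire()
second = try_process("tenant-b", processed.append)
```

The non-blocking acquire is the important part: a worker that cannot get the lock moves on to the next tenant instead of waiting.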

cloudkitty.tenant.fetchers =
    fake = cloudkitty.tenant_fetcher.fake:FakeFetcher
    keystone = cloudkitty.tenant_fetcher.keystone:KeystoneFetcher

The transformers are initialized in a similar way. Their backend namespace is TRANSFORMERS_NAMESPACE = 'cloudkitty.transformers', which setup.cfg defines as:

cloudkitty.transformers =
    CloudKittyFormatTransformer = cloudkitty.transformer.format:CloudKittyFormatTransformer
    CeilometerTransformer = cloudkitty.transformer.ceilometer:CeilometerTransformer
    GnocchiTransformer = cloudkitty.transformer.gnocchi:GnocchiTransformer

The collector and storage follow the same pattern:

cloudkitty.collector.backends =
    fake = cloudkitty.collector.fake:CSVCollector
    ceilometer = cloudkitty.collector.ceilometer:CeilometerCollector
    gnocchi = cloudkitty.collector.gnocchi:GnocchiCollector
    meta = cloudkitty.collector.meta:MetaCollector
 
cloudkitty.storage.backends =
    sqlalchemy = cloudkitty.storage.sqlalchemy:SQLAlchemyStorage
    gnocchihybrid = cloudkitty.storage.gnocchi_hybrid:GnocchiHybridStorage
    gnocchi = cloudkitty.storage.gnocchi:GnocchiStorage

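As the listings above show, every major component accepts pluggable backends. The processor's main logic comes next.

2. Main logic of cloudkitty-processor

The processor's main logic is as follows:

Here is a simple diagram of the processor logic:

Now for the code: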
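
The "name = module:attr" lines in those namespaces are entry points. As a rough sketch of what resolving one involves, the code below parses such lines and imports the target. It only mimics the idea: stevedore itself reads entry points from installed package metadata, parse_entry_points and load are hypothetical helpers, and standard-library targets replace the CloudKitty classes so the example runs without CloudKitty installed.

```python
import importlib

def parse_entry_points(text):
    """Parse 'name = module:attr' lines into a {name: target} dict."""
    entries = {}
    for line in text.strip().splitlines():
        name, _, target = line.partition("=")
        entries[name.strip()] = target.strip()
    return entries

def load(target):
    """Import 'module:attr' and return the attribute it names."""
    module_name, _, attr = target.partition(":")
    return getattr(importlib.import_module(module_name), attr)

# Stand-in namespace: stdlib classes instead of CloudKitty backends.
DEMO_NAMESPACE = """
    ordered = collections:OrderedDict
    counter = collections:Counter
"""

plugins = {name: load(target)
           for name, target in parse_entry_points(DEMO_NAMESPACE).items()}

# Selecting a backend by its configured name, as a driver manager would.
backend_cls = plugins["counter"]
```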

def process(self):
    while True:
        self.process_messages()
        self._load_tenant_list()
        while len(self._tenants):
            for tenant in self._tenants[:]:
                lock = self._lock(tenant)
                if lock.acquire(blocking=False):
                    if not self._check_state(tenant):
                        self._tenants.remove(tenant)
                    else:
                        worker = Worker(self.collector, self.storage, tenant)
                        worker.run()
                    lock.release()
                self.coord.heartbeat()
            eventlet.sleep(1)
        eventlet.sleep(self._period)

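First, _load_tenant_list() loads all projects. Inside the for loop, a lock is created for each project and the project's state is checked (that is, whether it has reached its next processing period). If the project is due for processing, worker.run is called: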
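
The state check can be pictured as simple timestamp arithmetic built from the period and wait_periods settings seen in the constructor. The function below is one plausible form of that check, assuming the stored state is the start of the last processed period in epoch seconds; next_period_start and the sample values are hypothetical, and the exact rule in CloudKitty may differ.

```python
def next_period_start(last_ts, now, period, wait_periods):
    """Return the start of the next period to process, or 0 if not due.

    last_ts is assumed to be the start of the last processed period.
    """
    next_ts = last_ts + period
    wait_time = wait_periods * period
    # Only process a period once it lies far enough in the past.
    if next_ts + wait_time < now:
        return next_ts
    return 0

PERIOD = 3600      # one-hour billing period (assumed value)
WAIT_PERIODS = 2   # periods to wait before processing (assumed value)

ready = next_period_start(0, now=4 * PERIOD,
                          period=PERIOD, wait_periods=WAIT_PERIODS)
not_ready = next_period_start(0, now=2 * PERIOD,
                              period=PERIOD, wait_periods=WAIT_PERIODS)
```

With these values, a tenant last processed at time 0 becomes due for the period starting at 3600 only after more than three periods have elapsed.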

def run(self):
    while True:
        timestamp = self.check_state()
        if not timestamp:
            break
        for service in CONF.collect.services:
            try:
                try:
                    data = self._collect(service, timestamp)
                except collector.NoDataCollected:
                    raise
                except Exception as e:
                    LOG.warning(
                        _LW('Error while collecting service '
                            '%(service)s: %(error)s'),
                        {'service': service, 'error': e})
                    raise collector.NoDataCollected('', service)
            except collector.NoDataCollected:
                begin = timestamp
                end = begin + self._period
                for processor in self._processors:
                    processor.obj.nodata(begin, end)
                self._storage.nodata(begin, end, self._tenant_id)
            else:
                # Rating
                for processor in self._processors:
                    processor.obj.process(data)
                # Writing
                self._storage.append(data, self._tenant_id)
        # We're getting a full period so we directly commit
        self._storage.commit(self._tenant_id)

2.1 Collect

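To be clear, the run method executes within one project's billing period. It iterates over the configured collect services and calls self._collect() for each. The self._collector used in that method is passed in when the Worker class is initialized and is the collector created during processor initialization; the concrete backend classes are declared in setup.cfg, as listed earlier. Here is the self._collect() method: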

def _collect(self, service, start_timestamp):
    next_timestamp = start_timestamp + self._period
    raw_data = self._collector.retrieve(service,
                                        start_timestamp,
                                        next_timestamp,
                                        self._tenant_id)
    if raw_data:
        return [{'period': {'begin': start_timestamp,
                            'end': next_timestamp},
                 'usage': raw_data}]

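The self._collector.retrieve() method is implemented differently by each backend. With the default Ceilometer collector backend, retrieve() lives in the BaseCollector base class in collector.__init__; the other backends have their own implementations (for example, with gnocchi as the backend, retrieve() is in collector.gnocchi.GnocchiCollector).

Taking Ceilometer as the backend: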

#/collector/__init__.py
 
    def retrieve(self, resource, start, end=None, project_id=None, q_filter=None):
        trans_resource = self._res_to_func(resource)
        if not hasattr(self, trans_resource):
            raise NotImplementedError(
                "No method found in collector '%s' for resource '%s'."
                % (self.collector_name, resource))
        func = getattr(self, trans_resource)
        return func(start, end, project_id, q_filter)

It checks whether self has a get_[resource] method and calls it if so. For example, when resource is image:

#/collector/ceilometer.py
 
    def get_image(self, start, end=None, project_id=None, q_filter=None):
        active_image_stats = self.resources_stats('image.size', start, end, project_id, q_filter)
        image_data = []
        for image_stats in active_image_stats:
            image_id = image_stats.groupby['resource_id']
            if not self._cacher.has_resource_detail('image', image_id):
                raw_resource = self._conn.resources.get(image_id)
                image = self.t_ceilometer.strip_resource_data('image',
                                                              raw_resource)
                self._cacher.add_resource_detail('image',
                                                 image_id,
                                                 image)
            image = self._cacher.get_resource_detail('image',
                                                     image_id)
            image_size_mb = decimal.Decimal(image_stats.max) / units.Mi
            image_data.append(
                self.t_cloudkitty.format_item(image, self.units_mappings[
                    "image"], image_size_mb))
        if not image_data:
            raise collector.NoDataCollected(self.collector_name, 'image')
        return self.t_cloudkitty.format_service('image', image_data)

As shown, the Ceilometer client is ultimately called to fetch the image information.
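
The name-based dispatch in retrieve() can be tried in isolation with a toy collector. This is a sketch, not the CloudKitty class: DemoCollector and its fake return value are hypothetical, and the body of _res_to_func (prefixing "get_" and replacing dots with underscores) is an assumption consistent with the get_[resource] naming described above.

```python
class DemoCollector:
    """Toy collector that dispatches on the resource name."""

    collector_name = "demo"

    @staticmethod
    def _res_to_func(resource):
        """Map a resource name to a method name (assumed convention)."""
        return "get_" + resource.replace(".", "_")

    def retrieve(self, resource, start, end=None, project_id=None):
        func_name = self._res_to_func(resource)
        if not hasattr(self, func_name):
            raise NotImplementedError(
                "No method found in collector '%s' for resource '%s'."
                % (self.collector_name, resource))
        return getattr(self, func_name)(start, end, project_id)

    def get_image(self, start, end=None, project_id=None):
        # Fake data; a real collector would query its metering backend.
        return {"image": [{"project_id": project_id, "qty": 1}]}

collector = DemoCollector()
usage = collector.retrieve("image", 0, 3600, "tenant-a")

try:
    collector.retrieve("volume", 0, 3600, "tenant-a")
    unsupported = False
except NotImplementedError:
    unsupported = True
```

Supporting a new resource type then only requires adding another get_ method to the collector.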

2.2 Rating

Back in the Worker.run method:

#/orchestrator.py

# Rating
for processor in self._processors:
    processor.obj.process(data)

self._processors is initialized in BaseWorker, the parent class of Worker, and its namespace is PROCESSORS_NAMESPACE = 'cloudkitty.rating.processors'. The contents of that namespace in setup.cfg are:

cloudkitty.rating.processors =
    noop = cloudkitty.rating.noop:Noop
    hashmap = cloudkitty.rating.hash:HashMap
    pyscripts = cloudkitty.rating.pyscripts:PyScripts

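So processor.obj.process() is called here to handle the resource information gathered by the collector. The default backend uses the logic of the hashmap rating module; the method itself is not analyzed in detail here.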
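
To give a feel for what a rating module's process() does, here is a deliberately simple flat-rate processor. It is not the hashmap logic: FlatRateProcessor is hypothetical, the period/usage frame layout follows the _collect() method shown earlier, and the per-item keys ("desc", "vol", "rating") are assumptions made for the example.

```python
from decimal import Decimal

class FlatRateProcessor:
    """Hypothetical rating module: one fixed unit price per service."""

    def __init__(self, prices):
        self._prices = prices  # service name -> Decimal unit price

    def process(self, data):
        """Add a 'rating' entry to every usage item, in place."""
        for frame in data:
            for service, items in frame["usage"].items():
                unit_price = self._prices.get(service, Decimal(0))
                for item in items:
                    qty = Decimal(str(item["vol"]["qty"]))
                    item["rating"] = {"price": qty * unit_price}
        return data

data = [{
    "period": {"begin": 0, "end": 3600},
    "usage": {"image": [{"desc": {"name": "cirros"},
                         "vol": {"qty": 4, "unit": "MB"}}]},
}]

FlatRateProcessor({"image": Decimal("0.25")}).process(data)
price = data[0]["usage"]["image"][0]["rating"]["price"]
```

Because the processor mutates the frames in place, several processors can be applied one after another to the same data, which is how the loop over self._processors behaves.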

2.3 Writing

As with the previous two steps, this is done by calling the plugin's driver, so it is not expanded on here.
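
For completeness, the append-then-commit pattern used in Worker.run can be sketched with an in-memory backend. MemoryStorage is hypothetical and keeps everything in dicts; only the method names append and commit come from the code above, and real backends persist the frames in a database or in Gnocchi.

```python
class MemoryStorage:
    """Hypothetical in-memory backend with a two-step write."""

    def __init__(self):
        self._pending = {}    # tenant_id -> frames not yet committed
        self.committed = {}   # tenant_id -> frames made durable

    def append(self, frames, tenant_id):
        """Buffer rated frames for one tenant."""
        self._pending.setdefault(tenant_id, []).extend(frames)

    def commit(self, tenant_id):
        """Move the tenant's buffered frames to the committed store."""
        frames = self._pending.pop(tenant_id, [])
        self.committed.setdefault(tenant_id, []).extend(frames)

storage = MemoryStorage()
storage.append([{"service": "image", "price": 1}], "tenant-a")
before_commit = dict(storage.committed)
storage.commit("tenant-a")
```

Buffering until commit means a period is written as a whole once all of its services have been collected and rated.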