The Difference Between an ODS and a Data Warehouse

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 10:53

Both the ODS and the DWH (data warehouse) are parts of a data warehousing (DW) architecture.
Gartner's definition:

An operational data store (ODS) is an alternative to having operational decision support system (DSS) applications access data directly from the database that supports transaction processing (TP). While both require a significant amount of planning, the ODS tends to focus on the operational requirements of a particular business process (for example, customer service), and on the need to allow updates and propagate those updates back to the source operational system from which the data elements were obtained. The data warehouse, on the other hand, provides an architecture for decision makers to access data to perform strategic analysis, which often involves historical and cross-functional data and the need to support many applications.

TechTarget's definition:

An operational data store (ODS) is a type of database that’s often used as an interim logical area for a data warehouse.

While in the ODS, data can be scrubbed, resolved for redundancy and checked for compliance with the corresponding business rules. An ODS can be used for integrating disparate data from multiple sources so that business operations, analysis and reporting can be carried out while business operations are occurring. This is the place where most of the data used in current operation is housed before it’s transferred to the data warehouse for longer term storage or archiving.
An ODS is designed for relatively simple queries on small amounts of data (such as finding the status of a customer order), rather than the complex queries on large amounts of data typical of the data warehouse. An ODS is similar to your short term memory in that it stores only very recent information; in comparison, the data warehouse is more like long term memory in that it stores relatively permanent information.
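
The contrast between the two query styles can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (ods_orders, dw_order_facts), the columns, and the sample rows are all made up for this example and do not come from any of the quoted sources.

```python
import sqlite3

# In-memory database standing in for both stores (an assumption for brevity;
# in practice the ODS and the warehouse are separate systems).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ods_orders (order_id INTEGER PRIMARY KEY, customer TEXT, status TEXT);
CREATE TABLE dw_order_facts (order_id INTEGER, order_month TEXT, amount REAL);
""")
conn.executemany(
    "INSERT INTO ods_orders VALUES (?, ?, ?)",
    [(1001, "alice", "SHIPPED"), (1002, "bob", "PENDING")],
)
conn.executemany(
    "INSERT INTO dw_order_facts VALUES (?, ?, ?)",
    [(1, "2023-01", 120.0), (2, "2023-01", 80.0), (3, "2023-02", 50.0)],
)

def order_status(order_id):
    """ODS-style query: a keyed lookup that touches a single current row."""
    row = conn.execute(
        "SELECT status FROM ods_orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0] if row else None

def monthly_revenue():
    """Warehouse-style query: an aggregate over the accumulated history."""
    return conn.execute(
        "SELECT order_month, SUM(amount) FROM dw_order_facts "
        "GROUP BY order_month ORDER BY order_month"
    ).fetchall()

print(order_status(1002))
print(monthly_revenue())
```

The first function reads one row by key, which is the "status of a customer order" case; the second scans and groups every fact row, which is the kind of work the warehouse is built for.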

From the Oracle Docs, covering the basic concepts:
Introduction to Data Warehousing Concepts

Operational data stores exist to support daily operations. The ODS data is cleaned and validated, but it is not historically deep: it may be just the data for the current day. Rather than support the historically rich queries that a data warehouse can handle, the ODS gives data warehouses a place to get access to the most current data, which has not yet been loaded into the data warehouse. The ODS may also be used as a source to load the data warehouse. As data warehousing loading techniques have become more advanced, data warehouses may have less need for ODS as a source for loading data. Instead, constant trickle-feed systems can load the data warehouse in near real time.

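This article gives many examples that make the idea easier to grasp. For instance, a single source system may hold incomplete data, whereas an ODS can consolidate data from multiple sources; or the source system may already be under performance pressure: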
What is Operational Data Store (ODS)

The discussion in this thread is worth reading closely. It covers, for example, differences in schema, timeliness, and purpose:
Difference between ODS and Datawarehouse

A reasonably good summary:
Operational Data Store (ODS) Defined | James Serra’s Blog

To summarize the differences between an ODS and a data warehouse:

- An ODS is targeted for the lowest granular queries whereas a data warehouse is usually used for complex queries against summary-level or on aggregated data
- An ODS is meant for operational reporting and supports current or near real-time reporting requirements whereas a data warehouse is meant for historical and trend analysis reporting usually on a large volume of data
- An ODS contains only a short window of data, while a data warehouse contains the entire history of data
- An ODS provides information for operational and tactical decisions on current or near real-time data while a data warehouse delivers feedback for strategic decisions leading to overall system improvements
- In an ODS the frequency of data load could be every few minutes or hourly whereas in a data warehouse the frequency of data loads could be daily, weekly, monthly or quarterly
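
The "short window" versus "entire history" distinction can be sketched as follows. This is a deliberately simplified illustration: the one-day ODS_WINDOW, the row layout, and loading both stores from the same batch are assumptions made for the example. In a real deployment the two stores are loaded on different schedules, and the ODS often feeds the warehouse.

```python
from datetime import datetime, timedelta

ODS_WINDOW = timedelta(days=1)  # hypothetical retention window for the ODS

def load_batch(ods, warehouse, batch, now):
    """Append a batch to both stores, then trim the ODS to its window."""
    warehouse.extend(batch)  # the warehouse keeps the full history
    ods.extend(batch)
    cutoff = now - ODS_WINDOW
    # The ODS keeps only rows that fall inside the recent window.
    ods[:] = [row for row in ods if row["ts"] >= cutoff]

ods, warehouse = [], []
day1 = datetime(2024, 1, 1, 12, 0)
day3 = datetime(2024, 1, 3, 12, 0)
load_batch(ods, warehouse, [{"order_id": 1, "ts": day1}], now=day1)
load_batch(ods, warehouse, [{"order_id": 2, "ts": day3}], now=day3)

print([row["order_id"] for row in ods])        # only the recent order remains
print([row["order_id"] for row in warehouse])  # both orders are retained
```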

Major reasons for implementing an ODS include:

- The limited reporting in the source systems
- The desire to use a better and more powerful reporting tool than what the source systems offer
- Only a few people have the security to access the source systems and you want to allow others to generate reports
- A company owns many retail stores each of which track orders in its own database and you want to consolidate the databases to get real-time inventory levels throughout the day
- You need to gather data from various source systems to get a true picture of a customer so you have the latest info if the customer calls customer service. Custom data such as customer info, support history, call logs, and order info. Or medical data to get a true picture of a patient so the doctor has the latest info throughout the day: outpatient department records, hospitalization records, diagnostic records, and pharmaceutical purchase records
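
The retail-store reason above (many per-store databases consolidated into one inventory view) might look roughly like this. It is a minimal sketch: the function name consolidate_inventory, the store identifiers, and the SKU data are hypothetical, and each store's database is represented by a plain dictionary.

```python
from collections import defaultdict

def consolidate_inventory(store_snapshots):
    """Sum per-store stock levels into one company-wide view per SKU."""
    totals = defaultdict(int)
    for stock in store_snapshots.values():
        for sku, quantity in stock.items():
            totals[sku] += quantity
    return dict(totals)

# Hypothetical snapshots, one per store database.
snapshots = {
    "store_a": {"sku-1": 5, "sku-2": 0},
    "store_b": {"sku-1": 3, "sku-3": 7},
}
inventory = consolidate_inventory(snapshots)
print(inventory)
```

Re-running the consolidation every few minutes or every hour, instead of querying every store at report time, is what keeps the reporting load off the individual source systems.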

A few examples:
The first:
In a bank, for example, an ODS (by this definition) has, at any given time, one account balance for each checking account, courtesy of the checking account system, and one balance for each savings account, as provided by the savings account system.

The various systems send the account balances periodically (such as at the end of each day), and an ODS user can then look in one place to see each bank customer’s complete profile (such as the customer’s basic information and balance information for each type of account).
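
The bank example can be sketched as a small in-memory store. The class BankOds, its method names, and the sample customer are all hypothetical; the only behavior taken from the description is that each source system periodically sends balances, the ODS holds one current balance per account, and a user reads the complete profile from one place.

```python
class BankOds:
    """Toy ODS holding one current balance per (customer, account type)."""

    def __init__(self):
        self._customers = {}  # customer_id -> basic information
        self._balances = {}   # (customer_id, account_type) -> latest balance

    def load_customer(self, customer_id, name):
        self._customers[customer_id] = {"name": name}

    def load_feed(self, account_type, feed):
        """Apply a periodic feed from one source system.

        Each new balance overwrites the previous one, so no history is kept.
        """
        for customer_id, balance in feed.items():
            self._balances[(customer_id, account_type)] = balance

    def profile(self, customer_id):
        """Return basic information plus the current balance of each account."""
        balances = {
            account_type: balance
            for (cid, account_type), balance in self._balances.items()
            if cid == customer_id
        }
        return {**self._customers[customer_id], "balances": balances}

ods = BankOds()
ods.load_customer("c1", "Alice")
ods.load_feed("checking", {"c1": 100})  # end-of-day feed, day 1
ods.load_feed("savings", {"c1": 2500})
ods.load_feed("checking", {"c1": 40})   # day 2 feed replaces the day 1 value
print(ods.profile("c1"))
```

Because each feed overwrites the stored value, yesterday's checking balance is gone after the second load; keeping that history would be the data warehouse's job.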
