Notes on Big Data For Dummies: Understanding the waves of managing data

Source: Internet | Editor: 程序博客网 | Time: 2024/05/17 01:37

To get you started, big data is defined as any kind of data source that has at least three shared characteristics:

  • Extremely large Volumes of data
  • Extremely high Velocity of data
  • Extremely wide Variety of data

wave 1: Creating manageable data structures - structured relational database

  • Later in the 1970s, things changed with the invention of the relational data model and the relational database management system (RDBMS) that imposed structure and a method for improving performance. Most importantly, the relational model added a level of abstraction (the structured query language [SQL], report generators, and data management tools) so that it was easier for programmers to satisfy the growing business demands to extract value from data.
  • The Entity-Relationship (ER) model emerged, which added additional abstraction to increase the usability of the data. In this model, each item was defined independently of its use. Therefore, developers could create new relationships between data sources without complex programming.
  • Data warehouses were commercialized in the 1990s.
  • As companies began to store unstructured data, vendors began to add capabilities such as BLOBs (binary large objects).
  • Enter the object database management system (ODBMS). The object database stored the BLOB as an addressable set of pieces so that we could see what was in there. Unlike the BLOB, which was an independent unit appended to a traditional relational database, the object database provided a unified approach for dealing with unstructured data.
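
To make the abstraction in wave 1 concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customer` and `purchase` tables, their columns, and the sample rows are hypothetical and exist only for illustration. The sketch shows two points from the notes above: a relationship between two data sets is expressed declaratively with a SQL join rather than with custom navigation code, and a BLOB column holds unstructured bytes that the database stores but does not interpret.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE purchase (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            amount      REAL NOT NULL,
            receipt     BLOB  -- opaque bytes, e.g. a scanned image
        );
        """
    )
    conn.executemany(
        "INSERT INTO customer (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO purchase (customer_id, amount, receipt) VALUES (?, ?, ?)",
        [(1, 20.0, b"\x89PNG-fake-bytes"), (1, 5.5, None), (2, 12.0, None)],
    )
    conn.commit()
    return conn


def total_per_customer(conn):
    """Relate the two tables with a join; no record-navigation code is needed."""
    return conn.execute(
        """
        SELECT c.name, SUM(p.amount)
        FROM customer AS c
        JOIN purchase AS p ON p.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    print(total_per_customer(conn))
```

The BLOB comes back from a query as a plain `bytes` value. The relational engine cannot look inside it, which is the limitation the notes say object databases tried to address.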


wave 2: Web and content management - getting started with Big Data

  • The market evolved from a set of disconnected solutions to a more unified model that brought together these elements into a platform that incorporated business process management, version control, information recognition, text management, and collaboration. This new generation of systems added meta-data (information about the organization and characteristics of the stored information).
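
As a rough illustration of what "metadata" and "version control" mean for a piece of stored content, the sketch below models one content item in Python. The `ContentItem` class, its fields, and the `find_by_metadata` helper are hypothetical names invented for this example and do not correspond to any particular content management product.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """One stored document plus information about it (hypothetical model)."""
    body: str
    metadata: dict = field(default_factory=dict)   # e.g. author, type, language
    history: list = field(default_factory=list)    # earlier bodies, oldest first

    def revise(self, new_body):
        """Keep the previous body so older versions remain retrievable."""
        self.history.append(self.body)
        self.body = new_body

    @property
    def version(self):
        # Version 1 is the original; each revision adds one.
        return len(self.history) + 1


def find_by_metadata(items, key, value):
    """Select items by a metadata field without inspecting their bodies."""
    return [item for item in items if item.metadata.get(key) == value]


if __name__ == "__main__":
    memo = ContentItem("draft text", {"author": "ana", "type": "memo"})
    memo.revise("final text")
    print(memo.version, find_by_metadata([memo], "type", "memo"))
```

The point of the design is that searches and organization run against the metadata, so the unstructured body never has to be parsed to locate or classify an item.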


wave 3: Managing big data

  • With big data, it is now possible to virtualize data so that it can be stored efficiently and, utilizing cloud-based storage, more cost-effectively as well. In addition, improvements in network speed and reliability have removed other physical limitations of being able to manage massive amounts of data at an acceptable pace. Add to this the impact of changes in the price and sophistication of computer memory.
  • Many of the technologies at the heart of big data, such as virtualization, parallel processing, distributed file systems, and in-memory databases, have been around for decades. Advanced analytics have also been around for decades, although they have not always been practical. Other technologies such as Hadoop and MapReduce have been on the scene for only a few years. This combination of technology advances can now address significant business problems.
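
The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is not Hadoop's API: the function names are hypothetical, and a local thread pool stands in for the cluster nodes that would run the map step in a real deployment. The sketch only shows the shape of the technique: map each chunk independently, group the intermediate pairs by key, then reduce each group.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all intermediate values by their key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks, workers=2):
    """Run the map step on each chunk concurrently, then shuffle and reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data big", "data waves"]))
```

Because each chunk is mapped without reference to any other chunk, the map step can be spread across as many machines as there are chunks, which is what makes the pattern suit very large volumes of data.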
