Structured data and unstructured data

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 03:22

Structured data refers to any data that resides in a fixed field within a record or file. This includes data contained in relational databases and spreadsheets. Structured data first depends on creating a data model - a model of the types of business data that will be recorded and how they will be stored, processed and accessed. This includes defining which fields of data will be stored and how that data will be stored: the data type (numeric, currency, alphabetic, name, date, address) and any restrictions on the data input (number of characters; restricted to certain terms such as Mr., Ms. or Dr.; M or F).

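In other words, structured data is data that exists in a fixed format inside a record or file. It typically includes data in relational databases and spreadsheets. Structured data depends first on building a data model, which describes how the data is stored, processed and accessed: which fields are stored, what format each field takes, and what other restrictions apply.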
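
To make the idea of a data model concrete, here is a minimal sketch in Python. The record type `CustomerRecord`, its field names, and the 40-character name limit are hypothetical choices made for illustration; only the kinds of restriction (a fixed set of titles, M or F, a character limit, typed fields) come from the definition above.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Hypothetical restrictions on the data input, in the spirit of the definition.
ALLOWED_TITLES = {"Mr.", "Ms.", "Dr."}
ALLOWED_GENDERS = {"M", "F"}
MAX_NAME_LENGTH = 40  # assumed limit on the number of characters


@dataclass(frozen=True)
class CustomerRecord:
    """One record with fixed fields, each with a declared data type."""

    title: str          # restricted to certain terms
    name: str           # alphabetic, limited length
    gender: str         # M or F
    signup_date: date   # date
    balance: Decimal    # currency

    def __post_init__(self) -> None:
        # Reject any value that breaks the model before it is stored.
        if self.title not in ALLOWED_TITLES:
            raise ValueError(f"title must be one of {sorted(ALLOWED_TITLES)}")
        if not self.name or len(self.name) > MAX_NAME_LENGTH:
            raise ValueError(f"name must be 1 to {MAX_NAME_LENGTH} characters")
        if self.gender not in ALLOWED_GENDERS:
            raise ValueError("gender must be 'M' or 'F'")
        if not isinstance(self.signup_date, date):
            raise TypeError("signup_date must be a datetime.date")
        if not isinstance(self.balance, Decimal):
            raise TypeError("balance must be a decimal.Decimal")


record = CustomerRecord("Dr.", "Ada Lovelace", "F", date(2024, 6, 6), Decimal("120.50"))
```

Because every field has a known name, type and restriction, a program can read `record.balance` or filter on `record.gender` without having to interpret free text. A relational table or a spreadsheet column plays the same role.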


Unstructured data (or unstructured information) refers to information that either does not have a pre-defined data model or is not organized in a pre-defined manner. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well. This results in irregularities and ambiguities that make it difficult to understand using traditional programs as compared to data stored in fielded form in databases or annotated (semantically tagged) in documents.


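In other words, unstructured data is information that either has no predefined data model or is not organized in a predefined way. It is generally text-heavy, but the text also carries plenty of information such as dates and numbers. Because of its irregularities and ambiguities, it is harder to understand than data held in database fields or in tagged documents.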
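
The difficulty can be illustrated with a short Python sketch. The sample sentence `note` and the two regular expressions are invented for this example. They show one naive way a traditional program might pull dates and numbers out of free text, and are not a recommended extraction method.

```python
import re
from datetime import datetime

# A hypothetical piece of unstructured, text-heavy information.
note = "Met Dr. Smith on 03/04/2024 about invoice 1024; she will pay 250 by 2024-06-06."

# Naive patterns: ISO dates (YYYY-MM-DD) or slash dates (NN/NN/YYYY), and bare integers.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b\d{2}/\d{2}/\d{4}\b")
NUMBER_PATTERN = re.compile(r"\b\d+\b")

dates = DATE_PATTERN.findall(note)

# Blank out the dates first so their digits are not counted as numbers.
text_without_dates = DATE_PATTERN.sub(" ", note)
numbers = NUMBER_PATTERN.findall(text_without_dates)

# Ambiguity: the text does not say whether 03/04/2024 is month/day or day/month.
as_month_first = datetime.strptime(dates[0], "%m/%d/%Y").date()
as_day_first = datetime.strptime(dates[0], "%d/%m/%Y").date()

print(dates)
print(numbers)
print(as_month_first, as_day_first)
```

The program finds two dates and two numbers, but nothing in the text tells it that 1024 is an invoice number while 250 is an amount, or which reading of the first date is intended. In fielded data, the field name and data type would settle both questions.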
