VLDB and Partitioning Guide -- Oracle 11g


Here are some suggestions for when to partition a table:

Tables greater than 2 GB should always be considered as candidates for partitioning.

Tables containing historical data, in which new data is added into the newest partition. A typical example is a historical table where only the current month's data is updatable and the other 11 months are read only (see the sketch after this list).

When the contents of a table must be distributed across different types of storage devices.
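As a hedged sketch of the historical-data and storage-tiering cases above, the statement below range-partitions an illustrative sales_history table by month; all table, column, and tablespace names (ts_2011, ts_current) are invented for this example and are reused in the later sketches.

    -- Hypothetical tablespaces stand in for different storage tiers.
    CREATE TABLE sales_history (
        sale_id    NUMBER,
        cust_id    NUMBER,
        sale_date  DATE,
        amount     NUMBER(10,2)
    )
    PARTITION BY RANGE (sale_date) (
        PARTITION p_2011_01 VALUES LESS THAN (TO_DATE('01-02-2011','DD-MM-YYYY')) TABLESPACE ts_2011,
        PARTITION p_2011_02 VALUES LESS THAN (TO_DATE('01-03-2011','DD-MM-YYYY')) TABLESPACE ts_2011,
        PARTITION p_max     VALUES LESS THAN (MAXVALUE) TABLESPACE ts_current
    );

Older partitions can be made effectively read only by placing them in tablespaces that are later set to READ ONLY, while new rows continue to land in the newest partition.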

 

 

Just like partitioned tables, partitioned indexes improve manageability, availability, performance, and scalability. They can either be partitioned independently (global indexes) or automatically linked to a table's partitioning method (local indexes). In general, you should use global indexes for OLTP applications and local indexes for data warehousing or decision support systems (DSS) applications. Also, whenever possible, try to use local indexes because they are easier to manage.

 

 

When deciding what kind of partitioned index to use, you should consider the

following guidelines in this order:

1. If the table partitioning column is a subset of the index keys, then use a local index. If this is the case, then you are finished. If this is not the case, then continue to guideline 2.

2. If the index is unique and does not include the partitioning key columns, then use a global index. If this is the case, then you are finished. Otherwise, continue to guideline 3.

3. If your priority is manageability, then use a local index. If this is the case, then you are finished. If this is not the case, continue to guideline 4.

4. If the application is an OLTP type and users need quick response times, then use a global index. If the application is a DSS type and users are more interested in throughput, then use a local index.
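As hedged sketches only, using the illustrative sales_history table from the earlier example: guideline 1 leads to a local index because the partitioning column sale_date is among the index keys, and guideline 2 leads to a global index because the unique key sale_id does not include the partitioning column.

    -- Guideline 1: the partitioning column (sale_date) is a subset of the index keys, so a local index works.
    CREATE INDEX sales_hist_date_id_lidx ON sales_history (sale_date, sale_id) LOCAL;

    -- Guideline 2: a unique index that does not include the partitioning key must be global.
    CREATE UNIQUE INDEX sales_hist_id_guk ON sales_history (sale_id)
        GLOBAL PARTITION BY HASH (sale_id) PARTITIONS 4;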

 

 

There are three basic types of indexes to consider for a partitioned table:

Local Partitioned Indexes

Global Partitioned Indexes (Global Range Partitioned Indexes and Global Hash Partitioned Indexes)

Global Nonpartitioned Indexes

Local partitioned indexes are easier to manage than other types of partitioned indexes. They also offer greater availability and are common in DSS environments.

Local Partitioned Indexes

Each partition of a local index corresponds to one and only one partition of the underlying table. Oracle also maintains the index partitioning automatically when partitions in the underlying table are added, dropped, merged, or split, or when hash partitions or subpartitions are added or coalesced. This ensures that the index remains equipartitioned with the table.

A local index can be created UNIQUE if the partitioning columns form a subset of the index columns. This restriction guarantees that rows with identical index keys always map into the same partition, where uniqueness violations can be detected.
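A hedged sketch against the illustrative sales_history table: because the partitioning column sale_date is part of the index key, the local index below can be declared UNIQUE.

    -- Valid: the partitioning column sale_date is a subset of the index columns.
    CREATE UNIQUE INDEX sales_hist_id_date_luk ON sales_history (sale_id, sale_date) LOCAL;

    -- Not allowed (raises ORA-14039): a UNIQUE LOCAL index that omits the partitioning column.
    -- CREATE UNIQUE INDEX sales_hist_id_luk ON sales_history (sale_id) LOCAL;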

Local Prefixed Indexes

A local index is prefixed if it is partitioned on a left prefix of the index columns and the subpartitioning key is included in the index key. Local prefixed indexes can be unique or nonunique.

Local Nonprefixed Indexes

A local index is nonprefixed if it is not partitioned on a left prefix of the index columns or if the index key does not include the subpartitioning key. You cannot have a unique local nonprefixed index unless the partitioning key is a subset of the index key.
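Hedged sketches on the same illustrative table (range partitioned on sale_date, with no subpartitioning): the first index is prefixed because sale_date is the leading index column; the second is nonprefixed because the partitioning column does not form a left prefix of the index columns.

    -- Local prefixed index: partitioned on a left prefix (sale_date) of its columns.
    CREATE INDEX sales_hist_date_amt_lidx ON sales_history (sale_date, amount) LOCAL;

    -- Local nonprefixed index: the partitioning key sale_date is not a left prefix of the index key.
    CREATE INDEX sales_hist_cust_lidx ON sales_history (cust_id) LOCAL;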

 

 

Global Partitioned Indexes

Normally, a global index is not equipartitioned with the underlying table.

A global partitioned index contains a single B-tree with entries for all rows in all partitions. Each index partition may contain keys that refer to many different partitions or subpartitions in the table.

The highest partition of a global index must have a partition bound, all of whose values are MAXVALUE. This ensures that all rows in the underlying table can be represented in the index.

A global partitioned index is prefixed if it is partitioned on a left prefix of the index columns. A global partitioned index is nonprefixed if it is not partitioned on a left prefix of the index columns. Oracle does not support global nonprefixed partitioned indexes.
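A hedged sketch of a global range-partitioned (and therefore prefixed) index on the illustrative table; note the MAXVALUE bound required on the highest index partition.

    CREATE INDEX sales_hist_amt_gidx ON sales_history (amount)
        GLOBAL PARTITION BY RANGE (amount) (
            PARTITION p_small VALUES LESS THAN (1000),
            PARTITION p_rest  VALUES LESS THAN (MAXVALUE)
        );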

 

If the necessary parallel server processes are not available for parallel execution, a SQL statement is queued when the parallel degree policy (PARALLEL_DEGREE_POLICY) is set to AUTO. After the necessary resources become available, the SQL statement is dequeued and allowed to execute. The parallel statement queue operates as a first-in, first-out queue by default. If the query at the head of the queue cannot be scheduled, then none of the queries in the queue can be scheduled, even if resources are available elsewhere in the system; this ensures that the query at the head of the queue has adequate resources when it does execute. However, if you configure and set up a resource plan, then you can control the order in which parallel statements are dequeued and the number of parallel servers used by each workload or consumer group.
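A minimal sketch, assuming an Oracle Database 11g Release 2 instance, of switching on automatic DOP and, with it, parallel statement queuing; the resource-plan configuration mentioned above is omitted here.

    -- Enable automatic degree of parallelism; statement queuing is active under this policy.
    ALTER SYSTEM SET parallel_degree_policy = AUTO;

Queued parallel statements can then be observed, for example, in V$SQL_MONITOR.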

 

 

Parallel execution benefits systems with all of the following characteristics:

Symmetric multiprocessors (SMPs), clusters, or massively parallel systems

Sufficient I/O bandwidth

Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30%)

Sufficient memory to support additional memory-intensive processes, such as sorting, hashing, and I/O buffers

OLTP systems can also benefit from parallel execution during batch processing and during schema maintenance operations such as creation of indexes.

 

 

Each SQL statement undergoes an optimization and parallelization process when it is parsed. If parallel execution is chosen, then the following steps occur:

1. The user session or shadow process takes on the role of a coordinator, often called the query coordinator.

2. The query coordinator obtains the necessary number of parallel servers.

3. The SQL statement is executed as a sequence of operations (a full table scan to perform a join on a nonindexed column, an ORDER BY clause, and so on). The parallel execution servers perform each operation in parallel if possible.

4. When the parallel servers are finished executing the statement, the query coordinator performs any portion of the work that cannot be executed in parallel. For example, a parallel query with a SUM() operation requires adding the individual subtotals calculated by each parallel server.

5. Finally, the query coordinator returns any results to the user.

After the optimizer determines the execution plan of a statement, the parallel execution coordinator determines the parallel execution method for each operation in the plan.
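A hedged sketch of step 4 above: each parallel execution server computes a partial SUM over the rows it scans, and the query coordinator adds the subtotals and returns the result. The hinted DOP of 4 and the table name are taken from the earlier illustrative example.

    SELECT /*+ PARALLEL(s, 4) */ SUM(s.amount) AS total_amount
    FROM   sales_history s;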

 

 

To execute a query in parallel, Oracle Database generally creates a set of producer parallel execution servers and a set of consumer parallel execution servers. The producer server retrieves rows from tables and the consumer server performs operations such as join, sort, DML, and DDL on these rows. Each server in the producer set has a connection to each server in the consumer set. The number of virtual connections between parallel execution servers increases as the square of the degree of parallelism. Each communication channel has at least one, and sometimes up to four memory buffers, which are allocated from the shared pool. Multiple memory buffers facilitate asynchronous communication among the parallel execution servers.

 

The number of parallel execution servers associated with a single operation is known as the degree of parallelism (DOP). Parallel execution is designed to use multiple CPUs effectively. The Oracle Database parallel execution framework enables you either to explicitly choose a specific degree of parallelism or to rely on Oracle Database to control it automatically.
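Two hedged sketches of that choice, using the illustrative table: fixing a DOP explicitly on the object, or asking Oracle Database to compute the DOP for a single statement.

    -- Explicit DOP stored on the object and used by statements that reference it.
    ALTER TABLE sales_history PARALLEL 8;

    -- Statement-level request for an automatically computed DOP (11g Release 2 hint syntax).
    SELECT /*+ PARALLEL(AUTO) */ COUNT(*) FROM sales_history;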

 

 

PARALLEL_MIN_TIME_THRESHOLD is the second initialization parameter that controls automatic DOP (the first being PARALLEL_DEGREE_POLICY). It specifies the minimum execution time a statement should have before the statement is considered for automatic DOP. By default, this is 10 seconds. The optimizer first calculates a serial execution plan for the SQL statement; if the estimated execution elapsed time is greater than PARALLEL_MIN_TIME_THRESHOLD (10 seconds), the statement becomes a candidate for automatic DOP.

By default (PARALLEL_DEGREE_POLICY = MANUAL), the system uses parallel execution only when a parallel degree has been explicitly set on an object or when a parallel hint is specified in the SQL statement. The degree of parallelism used is exactly what was specified.
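A minimal sketch of adjusting the threshold; the value of 30 seconds is arbitrary, and only statements whose estimated serial elapsed time exceeds it would then be considered for automatic DOP.

    ALTER SYSTEM SET parallel_min_time_threshold = 30;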

 

 

When an instance starts, Oracle Database creates a pool of parallel execution servers, which are available for any parallel operation. The initialization parameter PARALLEL_MIN_SERVERS specifies the number of parallel execution servers that Oracle Database creates at instance startup.

When executing a parallel operation, the parallel execution coordinator obtains parallel execution servers from the pool and assigns them to the operation. If necessary, Oracle Database can create additional parallel execution servers for the operation. These parallel execution servers remain with the operation throughout execution. After the statement has been processed completely, the parallel execution servers return to the pool.

If the number of parallel operations increases, Oracle Database creates additional parallel execution servers to handle incoming requests. However, Oracle Database never creates more parallel execution servers for an instance than the value specified by the initialization parameter PARALLEL_MAX_SERVERS.

Oracle Database can process a parallel operation with fewer than the requested number of processes. If all parallel execution servers in the pool are occupied and the maximum number of parallel execution servers has been started, the parallel execution coordinator switches to serial processing.
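A hedged sketch of inspecting the server pool described above, using standard dynamic performance views.

    -- Pool limits configured for the instance.
    SELECT name, value
    FROM   v$parameter
    WHERE  name IN ('parallel_min_servers', 'parallel_max_servers');

    -- Parallel execution servers currently started, and whether each is in use.
    SELECT server_name, status
    FROM   v$px_process;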

 

 

The basic unit of work in parallelism is called a granule. Oracle Database divides the operation executed in parallel (for example, a table scan, table update, or index creation) into granules. Parallel execution processes execute the operation one granule at a time. The number of granules and their sizes correlate with the degree of parallelism (DOP). The number of granules also affects how well the work is balanced across query server processes.

Block Range Granules

Block range granules are ranges of physical blocks from a table.

Partition Granules

When partition granules are used, a parallel server process works on an entire partition or subpartition of a table or index.

 

 

Parallel index creation works in much the same way as a table scan with an ORDER BY clause. The table is randomly sampled and a set of index keys is found that equally divides the index into the same number of pieces as the DOP. A first set of query processes scans the table, extracts key-rowid pairs, and sends each pair to a process in a second set of query processes based on a key. Each process in the second set sorts the keys and builds an index in the usual fashion. After all index pieces are built, the parallel execution coordinator simply concatenates the pieces (which are ordered) to form the final index.

Parallel local index creation uses a single server set. Each server process in the set is assigned a table partition to scan and for which to build an index partition.
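Hedged sketches of both cases, reusing the illustrative table and an arbitrary DOP of 4.

    -- Nonpartitioned index built in parallel: the table is sampled and scanned by one server set
    -- while a second set sorts the keys and builds the index pieces.
    CREATE INDEX sales_hist_cust_date_idx ON sales_history (cust_id, sale_date) PARALLEL 4;

    -- Local index built in parallel: a single server set, one table partition per server process.
    CREATE INDEX sales_hist_amt_cust_lidx ON sales_history (amount, cust_id) LOCAL PARALLEL 4;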

 

 

When creating a table or index in parallel, each parallel execution server uses the values in the STORAGE clause of the CREATE statement to create temporary segments to store the rows.

When you create indexes and tables in parallel, each parallel execution server allocates a new extent and fills the extent with the table or index data.
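A hedged sketch of a parallel CREATE TABLE ... AS SELECT on the illustrative table; the STORAGE values are arbitrary and, in locally managed tablespaces, are used only as a guideline.

    CREATE TABLE sales_summary
        STORAGE (INITIAL 1M NEXT 1M)
        PARALLEL 4
        AS SELECT cust_id, SUM(amount) AS total_amount
           FROM   sales_history
           GROUP  BY cust_id;

Each parallel execution server loads its share of the rows into its own temporary segment, allocating extents as described above; the segments are merged into sales_summary when the statement completes.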

 
