Pankaj Motewar
1. Definition of Collection




“A collection is an ordered group of elements, all of the same type. It is a general concept that encompasses lists, arrays, and other familiar datatypes. Each element has a unique subscript that determines its position in the collection.



Collection types:



• Index-by tables, also known as associative arrays, let you look up elements using arbitrary numbers and strings for subscript values. (They are similar to hash tables in other programming languages.)



• Nested tables hold an arbitrary number of elements. They use sequential numbers as subscripts. You can define equivalent SQL types, allowing nested tables to be stored in database tables and manipulated through SQL.



• Varrays (short for variable-size arrays) hold a fixed number of elements (although you can change the number of elements at runtime). They use sequential numbers as subscripts. You can define equivalent SQL types, allowing varrays to be stored in database tables. They can be stored and retrieved through SQL, but with less flexibility than nested tables.



Although collections can have only one dimension, you can model multi-dimensional arrays by creating collections whose elements are also collections.
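For example (a minimal sketch; the type names are invented, and multilevel collections require Oracle 9i or later), a two-dimensional structure can be modeled as a nested table of nested tables:

Declare
TYPE TYP_ROW IS TABLE OF NUMBER ;       -- inner collection: one row of values
TYPE TYP_MATRIX IS TABLE OF TYP_ROW ;   -- outer collection: a table of rows
mat TYP_MATRIX := TYP_MATRIX( TYP_ROW(1,2), TYP_ROW(3,4) ) ;
Begin
-- mat(2)(1) addresses row 2, column 1, i.e. the value 3
dbms_output.put_line( to_char( mat(2)(1) ) ) ;
End ;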



To use collections in an application, you define one or more PL/SQL types, then define variables of those types. You can define collection types in a procedure, function, or package. You can pass collection variables as parameters, to move data between client-side applications and stored subprograms.



To look up data that is more complex than single values, you can store PL/SQL records or SQL object types in collections. Nested tables and varrays can also be attributes of object types.”



2. Persistent and non-persistent collections



Index-by tables cannot be stored in database tables, so they are non-persistent. You cannot use them in SQL statements; they are available only in PL/SQL blocks.
Nested tables and varrays are persistent. You can use the CREATE TYPE statement to create them in the database, and you can read and write them from and to a database column.


Nested tables and varrays must be initialized before you can use them.
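For example (a minimal sketch; the table and storage-table names are invented), a nested table type created in SQL can serve as a column datatype, with its rows held in a companion storage table:

CREATE TYPE TYP_NT_NUM AS TABLE OF NUMBER ;
/
CREATE TABLE orders (
order_id   NUMBER,
quantities TYP_NT_NUM )
NESTED TABLE quantities STORE AS quantities_tab ;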



3. Declarations
3.1 Nested tables


TYPE type_name IS TABLE OF element_type [NOT NULL];


With nested tables declared within PL/SQL, element_type can be any PL/SQL datatype except REF CURSOR.
Nested tables declared in SQL (CREATE TYPE) have additional restrictions. They cannot use the following element types:



• BINARY_INTEGER, PLS_INTEGER
• BOOLEAN
• LONG, LONG RAW
• NATURAL, NATURALN
• POSITIVE, POSITIVEN
• REF CURSOR
• SIGNTYPE
• STRING


PL/SQL



Declare
TYPE TYP_NT_NUM IS TABLE OF NUMBER ;
SQL
CREATE [OR REPLACE] TYPE TYP_NT_NUM IS TABLE OF NUMBER ;


3.2 Varrays


TYPE type_name IS {VARRAY | VARYING ARRAY} (size_limit)
OF element_type [NOT NULL];


size_limit is a positive integer literal representing the maximum number of elements in the array.


PL/SQL


Declare



TYPE TYP_V_CHAR IS VARRAY(10) OF VARCHAR2(20) ;
SQL
CREATE [OR REPLACE] TYPE TYP_V_CHAR IS VARRAY(10) OF VARCHAR2(20) ;


3.3 Index-by tables

TYPE type_name IS TABLE OF element_type [NOT NULL]
INDEX BY [BINARY_INTEGER | PLS_INTEGER | VARCHAR2(size_limit)];

or, more generally:

TYPE type_name IS TABLE OF element_type [NOT NULL]
INDEX BY key_type;


The key_type can be numeric: either BINARY_INTEGER or PLS_INTEGER (the latter from Oracle 9i onward).



It can also be VARCHAR2 or one of its subtypes VARCHAR, STRING, or LONG. You must specify the length of a VARCHAR2-based key, except for LONG which is equivalent to declaring a key type of VARCHAR2(32760).



The types RAW, LONG RAW, ROWID, CHAR, and CHARACTER are not allowed as keys for an associative array.


Declare



TYPE TYP_TAB_VAR IS TABLE OF VARCHAR2(50) INDEX BY BINARY_INTEGER ;


4. Initialization


Only nested tables and varrays need initialization.



To initialize a collection, you use the collection's "constructor," a function whose name is the same as the collection type.


4.1 Nested tables
Declare
TYPE TYP_NT_NUM IS TABLE OF NUMBER ;
Nt_tab TYP_NT_NUM ;
Begin
      Nt_tab := TYP_NT_NUM( 5, 10, 15, 20 ) ;
End ;


4.2 Varrays


Declare
TYPE TYP_V_DAY IS VARRAY(7) OF VARCHAR2(15) ;
v_tab TYP_V_DAY ;
Begin
  v_tab := TYP_V_DAY( 'Sunday','Monday','Tuesday','Wednesday','Thursday','Friday','Saturday' ) ;
End ;



You are not required to initialize all the elements of a collection. You can even initialize no elements at all; in that case, use an empty constructor:
v_tab := TYP_V_DAY() ;
This collection is empty, which is different from a NULL collection (one that has not been initialized).


4.3 Index-by tables
Declare
TYPE TYP_TAB IS TABLE OF NUMBER INDEX BY PLS_INTEGER ;
my_tab TYP_TAB ;
Begin
       my_tab(1) := 5 ;
       my_tab(2) := 10 ;
       my_tab(3) := 15 ;
End ;


5. Handle the collection


A nested table or varray cannot be manipulated until it has been initialized.
You can test whether a collection is initialized:


Declare
TYPE TYP_VAR_TAB is VARRAY(30) of varchar2(100) ;
tab1 TYP_VAR_TAB ; -- declared but not initialized
Begin
If Tab1 IS NULL Then
-- NULL collection, have to initialize it --
Tab1 := TYP_VAR_TAB('','','','','','','','','','');
End if ;
-- Now, we can handle the collection --
End ;


To access an element of a collection, you use a subscript value that identifies a unique element of the collection.
The subscript is of type integer or VARCHAR2.
Declare
Type TYPE_TAB_EMP IS TABLE OF Varchar2(60) INDEX BY BINARY_INTEGER ;
emp_tab TYPE_TAB_EMP ;
Begin
For i in 0..10 Loop
emp_tab( i+1 ) := 'Emp ' || ltrim( to_char( i ) ) ;
End loop ;
End ;
Declare
Type TYPE_TAB_DAYS IS TABLE OF PLS_INTEGER INDEX BY VARCHAR2(20) ;
day_tab TYPE_TAB_DAYS ;
Begin
day_tab( 'Monday' ) := 10 ;
day_tab( 'Tuesday' ) := 20 ;
day_tab( 'Wednesday' ) := 30 ;
End ;


It is possible to assign values of a collection to another collection if they are of the same type.
Declare
Type TYPE_TAB_EMP IS TABLE OF EMP%ROWTYPE INDEX BY BINARY_INTEGER ;
Type TYPE_TAB_EMP2 IS TABLE OF EMP%ROWTYPE INDEX BY BINARY_INTEGER ;
tab1 TYPE_TAB_EMP ;
tab2 TYPE_TAB_EMP ;
tab3 TYPE_TAB_EMP2 ;
Begin
tab2 := tab1 ; -- OK
tab3 := tab1 ; -- Error: the types are not the same
...
End ;
Comparing collections
Prior to the 10g release, collections could not be directly compared for equality or inequality.
The 10g release allows some comparisons between collections:
You can compare collections of the same type to verify whether they are equal or not equal.
DECLARE
TYPE Colors IS TABLE OF VARCHAR2(64);
primaries Colors := Colors('Blue','Green','Red');
rgb Colors := Colors('Red','Green','Blue');
traffic_light Colors := Colors('Red','Green','Amber');
BEGIN
-- We can use = or !=, but not < or >.
-- 2 collections are equal even if the members are not in the same order.
IF primaries = rgb THEN
dbms_output.put_line('OK, PRIMARIES & RGB have same members.');
END IF;
IF rgb != traffic_light THEN
dbms_output.put_line('RGB & TRAFFIC_LIGHT have different members');
END IF;
END;


You can also apply some operators on the collections:
DECLARE
TYPE nested_typ IS TABLE OF NUMBER;
nt1 nested_typ := nested_typ(1,2,3);
nt2 nested_typ := nested_typ(3,2,1);
nt3 nested_typ := nested_typ(2,3,1,3);
nt4 nested_typ := nested_typ(1,2,4);
reponse BOOLEAN;
combien NUMBER;
PROCEDURE verif(test BOOLEAN DEFAULT NULL, label IN VARCHAR2 DEFAULT NULL, quantity NUMBER DEFAULT NULL) IS



BEGIN
IF test IS NOT NULL THEN
dbms_output.put_line(label || ' -> ' ||
CASE test WHEN TRUE THEN 'True' WHEN FALSE THEN 'False' END);
END IF;



IF quantity IS NOT NULL THEN
dbms_output.put_line(quantity);
END IF;
END;



BEGIN
reponse := nt1 IN (nt2,nt3,nt4); -- true, nt1 matches nt2
verif(test => reponse, label => 'nt1 IN (nt2,nt3,nt4)');
reponse := nt1 SUBMULTISET OF nt3; -- true, all elements of nt1 appear in nt3
verif(test => reponse, label => 'nt1 SUBMULTISET OF nt3');
reponse := nt1 NOT SUBMULTISET OF nt4; -- true
verif(test => reponse, label => 'nt1 NOT SUBMULTISET OF nt4');
combien := CARDINALITY(nt3); -- number of elements of nt3
verif(quantity => combien);
combien := CARDINALITY(SET(nt3)); -- number of distinct elements
verif(quantity => combien);
reponse := 4 MEMBER OF nt1; -- false, no matching element
verif(test => reponse, label => '4 MEMBER OF nt1');
reponse := nt3 IS A SET; -- false, nt3 has duplicate elements
verif(test => reponse, label => 'nt3 IS A SET' );
reponse := nt3 IS NOT A SET; -- true, nt3 has duplicate elements
verif(test => reponse, label => 'nt3 IS NOT A SET' );
reponse := nt1 IS EMPTY; -- false, nt1 has elements
verif(test => reponse, label => 'nt1 IS EMPTY' );



END;


nt1 IN (nt2,nt3,nt4) -> True
nt1 SUBMULTISET OF nt3 -> True
nt1 NOT SUBMULTISET OF nt4 -> True
4
3
4 MEMBER OF nt1 -> False
nt3 IS A SET -> False
nt3 IS NOT A SET -> True
nt1 IS EMPTY -> False


6. Methods
We can use the following methods on a collection:
• EXISTS
• COUNT
• LIMIT
• FIRST and LAST
• PRIOR and NEXT
• EXTEND
• TRIM
• DELETE


A collection method is a built-in function or procedure that operates on collections and is called using dot notation.
collection_name.method_name[(parameters)]
Collection methods cannot be called from SQL statements.
Only the EXISTS method can be used on a NULL collection; all other methods applied to a NULL collection raise the COLLECTION_IS_NULL exception.
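A brief sketch of these methods applied to a nested table (names invented for the example):

Declare
TYPE TYP_NT_NUM IS TABLE OF NUMBER ;
nt TYP_NT_NUM := TYP_NT_NUM( 10, 20, 30 ) ;
Begin
dbms_output.put_line( nt.COUNT ) ;  -- 3 elements
dbms_output.put_line( nt.FIRST ) ;  -- first subscript: 1
dbms_output.put_line( nt.LAST ) ;   -- last subscript: 3
nt.EXTEND ;                         -- append one NULL element
nt( nt.LAST ) := 40 ;
nt.TRIM(1) ;                        -- remove the last element
nt.DELETE(2) ;                      -- delete element 2
If NOT nt.EXISTS(2) Then            -- EXISTS is safe even after DELETE
dbms_output.put_line( 'element 2 was deleted' ) ;
End if ;
End ;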
Pankaj Motewar
Tuning individual Oracle SQL statements


The acronym SQL stands for Structured Query Language. SQL is an industry standard database query language that was adopted in the mid-1980s. It should not be confused with commercial products such as Microsoft SQL Server or open source products such as MySQL, both of which use the acronym as part of the title of their products.

Do this before you start individual SQL statement tuning

This broad-brush approach can save thousands of hours of tedious SQL tuning because you can tune hundreds of queries at once. Remember, you MUST do this first, or later changes to the optimizer parameters or statistics may un-tune your SQL.

Remember, you must ALWAYS start with system-level SQL tuning, else later changes might undo your tuned execution plans:

• Optimize the server kernel - You must always tune your disk and network I/O subsystem (RAID, DASD bandwidth, network) to optimize the I/O time, network packet size and dispatching frequency.


• Adjust your optimizer statistics - You must always collect and store optimizer statistics so the optimizer can learn more about the distribution of your data and choose more intelligent execution plans. Also, histograms can hypercharge SQL when determining optimal table join order, and when making access decisions on skewed WHERE clause predicates.


• Adjust optimizer parameters - The key parameters are optimizer_mode, optimizer_index_caching, and optimizer_index_cost_adj.

• Optimize your instance - Your choice of db_block_size, db_cache_size, and OS parameters (db_file_multiblock_read_count, cpu_count, etc.) can influence SQL performance.



• Tune your SQL Access workload with physical indexes and materialized views - Just as the 10g SQLAccess advisor recommends missing indexes and missing materialized views, you should always optimize your SQL workload with indexes, especially function-based indexes, a Godsend for SQL tuning.

Once you create a workload (called a SQL Tuning Set, or STS), Oracle will repeatedly execute the workload, using sophisticated predictive models (a regression testing approach) to accurately identify the salient changes to SQL execution plans based on your environmental changes. Using the SQL Performance Analyzer (SPA), we can predict the impact of system changes on a workload, and we can forecast changes in response times for SQL after making any change, like parameter changes, schema changes, hardware changes, OS changes, or Oracle upgrades.

Once the environment, instance, and objects have been tuned, the Oracle administrator can focus on what is probably the single most important aspect of tuning an Oracle database: tuning the individual SQL statements. In this final article in my series on Oracle tuning, I will share some general guidelines for tuning individual SQL statements to improve Oracle performance.

Oracle SQL tuning goals



The goals of SQL tuning focus on improving the execution plan to fetch the rows with the smallest number of database "touches" (LIO buffer gets and PIO physical reads).

• Remove unnecessary large-table full-table scans—Unnecessary full-table scans cause a huge amount of unnecessary I/O and can drag-down an entire database. The tuning expert first evaluates the SQL based on the number of rows returned by the query. The most common tuning remedy for unnecessary full-table scans is adding indexes. Standard b-tree indexes can be added to tables, and bitmapped and function-based indexes can also eliminate full-table scans. In some cases, an unnecessary full-table scan can be forced to use an index by adding an index hint to the SQL statement.



• Cache small-table full-table scans—In cases where a full-table scan is the fastest access method, the administrator should ensure that a dedicated data buffer is available for the rows. In Oracle8 and beyond, a small table can be cached by forcing it into the KEEP pool.
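The syntax is a one-liner (a sketch; the table name is invented):

alter table customer storage (buffer_pool keep);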



• Verify optimal index usage—Oracle sometimes has a choice of indexes, and the tuning professional must examine each index and ensure that Oracle is using the proper index.



• Materialize your aggregations and summaries for static tables - One feature of the Oracle 10g SQLAccess advisor is recommendations for new indexes and suggestions for materialized views. Materialized views pre-join tables and pre-summarize data, a real silver bullet for data mart reporting databases where the data is only updated daily. Again, see the book "Oracle Tuning: The Definitive Reference", for complete details on SQL tuning with materialized views.

These are the goals of SQL tuning in a nutshell. However, they are deceptively simple, and to effectively meet them, we need to have a thorough understanding of the internals of Oracle SQL. Let's begin with an overview of the Oracle SQL optimizers.

Oracle SQL optimizers

One of the first things the Oracle DBA looks at is the default optimizer mode for the database. The Oracle initialization parameters offer many cost-based optimizer modes as well as the deprecated yet useful rule-based hint:

The cost-based optimizer uses “statistics” that are collected from the table using the “analyze table” command. Oracle uses these metrics about the tables in order to intelligently determine the most efficient way of servicing the SQL query. It is important to recognize that in many cases, the cost-based optimizer may not make the proper decision in terms of the speed of the query. The cost-based optimizer is constantly being improved, but there are still many cases in which the rule-based optimizer will result in faster Oracle queries.

Prior to Oracle 10g, Oracle's default optimizer mode was called “choose.” In the choose optimizer mode, Oracle will execute the rule-based optimizer if there are no statistics present for the table; it will execute the cost-based optimizer if statistics are present. The danger with using the choose optimizer mode is that problems can occur in cases where one Oracle table in a complex query has statistics and the other tables do not.

Starting in Oracle 10g, the default optimizer mode is all_rows, favoring full-table scans over index access. The all_rows optimizer mode is designed to minimize computing resources, and it favors full-table scans. Index access (first_rows_n) adds additional I/O overhead, but it returns rows faster, back to the originating query:

Hence, many OLTP shops will choose first_rows, first_rows_100 or first_rows_10, asking Oracle to use indexes to reduce block touches:

When only some tables contain CBO statistics, Oracle will use cost-based optimization and estimate statistics for the other tables in the query at runtime. This can cause significant slowdown in the performance of the individual query.

In sum, the Oracle database administrator will always try changing the optimizer mode for queries as the very first step in Oracle tuning. The foremost tenet of Oracle SQL tuning is avoiding the dreaded full-table scan. One of the hallmarks of an inefficient SQL statement is the failure of the SQL statement to use all of the indexes that are present within the Oracle database in order to speed up the query.



Of course, there are times when a full-table scan is appropriate for a query, such as when you are doing aggregate operations such as a sum or an average, and the majority of the rows within the Oracle table must be read to get the query results. The task of the SQL tuning expert is to evaluate each full-table scan and see if the performance can be improved by adding an index.

In most Oracle systems, a SQL statement will be retrieving only a small subset of the rows within the table. The Oracle optimizers are programmed to check for indexes and to use them whenever possible to avoid excessive I/O. However, if the formulation of a query is inefficient, the cost-based optimizer becomes confused about the best access path to the data, and the cost-based optimizer will sometimes choose to do a full-table scan against the table. Again, the general rule is for the Oracle database administrator to interrogate the SQL and always look for full-table scans.

A strategic plan for Oracle SQL tuning

Many people ask where they should start when tuning Oracle SQL. Tuning Oracle SQL is like fishing. You must first fish in the Oracle library cache to extract SQL statements and rank the statements by their amount of activity.



Step 1—Identify high-impact SQL



The SQL statements will be ranked according to the number of executions and will be tuned in this order. The executions column of the v$sqlarea view and the stats$sql_summary or dba_hist_sql_summary tables can be used to locate the most frequently used SQL. Note that we can display SQL statements by:

• Rows processed—Queries that process a large number of rows will have high I/O and may also have impact on the TEMP tablespace.



• Buffer gets—High buffer gets may indicate a resource-intensive query.



• Disk reads—High disk reads indicate a query that is causing excessive I/O.



• Memory KB—The memory allocation of a SQL statement is useful for identifying statements that are doing in-memory table joins.



• CPU secs—This identifies the SQL statements that use the most processor resources.



• Sorts—Sorts can be a huge slowdown, especially if they’re being done on a disk in the TEMP tablespace.



• Executions—The more frequently executed SQL statements should be tuned first, since they will have the greatest impact on overall performance.
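As a starting point, a ranking query against v$sqlarea might look like this (a sketch; swap the ORDER BY column for whichever criterion above you are chasing):

select *
from (
   select sql_text, executions, buffer_gets, disk_reads
   from   v$sqlarea
   order  by executions desc )
where rownum <= 10;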



Step 2—Determine the execution plan for SQL



As each SQL statement is identified, it will be "explained" to determine its existing execution plan. There are a host of third-party tools on the market that show the execution plan for SQL statements. The most common way of determining the execution plan for a SQL statement is to use Oracle's explain plan utility. By using explain plan, the Oracle DBA can ask Oracle to parse the statement and display the execution plan without actually executing the SQL statement.



To see the output of an explain plan, you must first create a “plan table.” Oracle provides a script in $ORACLE_HOME/rdbms/admin called utlxplan.sql. Execute utlxplan.sql and create a public synonym for the plan_table:



sqlplus > @utlxplan

Table created.



sqlplus > create public synonym plan_table for sys.plan_table;

Synonym created.

Most relational databases use an explain utility that takes the SQL statement as input, runs the SQL optimizer, and outputs the access path information into a plan_table, which can then be interrogated to see the access methods. Listing 1 runs a complex query against a database.



EXPLAIN PLAN
SET STATEMENT_ID = 'RUN1'
INTO plan_table
FOR
SELECT 'T' || plansnet.terr_code,
       'P' || detplan.pac1 || detplan.pac2 || detplan.pac3,
       'P1',
       sum(plansnet.ytd_d_ly_tm),
       sum(plansnet.ytd_d_ty_tm),
       sum(plansnet.jan_d_ly),
       sum(plansnet.jan_d_ty)
FROM plansnet, detplan
WHERE
   plansnet.mgc = detplan.mktgpm
AND
   detplan.pac1 in ('N33','192','195','201','BAI',
   'P51','Q27','180','181','183','184','186','188',
   '198','204','207','209','211')
GROUP BY 'T' || plansnet.terr_code,
         'P' || detplan.pac1 || detplan.pac2 || detplan.pac3;

This syntax is piped into the SQL optimizer, which will analyze the query and store the plan information in a row in the plan table identified by RUN1. Please note that the query will not execute; it will only create the internal access information in the plan table. The plan table contains the following fields:

• operation—The type of access being performed. Usually table access, table merge, sort, or index operation



• options—Modifiers to the operation, specifying a full table, a range table, or a join



• object_name—The name of the table being used by the query component



• id—The identifier for the query component



• parent_id—The parent of the query component. Note that several query components may have the same parent.

Now that the plan_table has been created and populated, you may interrogate it to see your output by running the following query in Listing 2.

plan.sql - displays contents of the explain plan table

SET PAGES 9999;

SELECT lpad(' ',2*(level-1)) || operation operation,
       options,
       object_name,
       position
FROM   plan_table
START WITH id = 0
  AND statement_id = 'RUN1'
CONNECT BY prior id = parent_id
  AND statement_id = 'RUN1';

Listing 3 shows the output from the plan table shown in Listing 1. This is the execution plan for the statement and shows the steps and the order in which they will be executed.

SQL> @list_explain_plan



OPERATION                      OPTIONS          OBJECT_NAME       POSITION
------------------------------ ---------------- ----------------- --------
SELECT STATEMENT
  SORT                         GROUP BY                            1
    CONCATENATION                                                  1
      NESTED LOOPS                                                 1
        TABLE ACCESS           FULL             PLANSNET           1
        TABLE ACCESS           BY ROWID         DETPLAN            2
          INDEX                RANGE SCAN       DETPLAN_INDEX5     1
      NESTED LOOPS



From this output, we can see the dreaded TABLE ACCESS FULL on the PLANSNET table. To diagnose the reason for this full-table scan, we return to the SQL and look for any plansnet columns in the WHERE clause. There, we see that the plansnet column called “mgc” is being used as a join column in the query, indicating that an index is necessary on plansnet.mgc to alleviate the full-table scan.



While the plan table is useful for determining the access path to the data, it does not tell the entire story. The configuration of the data is also a consideration. The SQL optimizer is aware of the number of rows in each table (the cardinality) and the presence of indexes on fields, but it is not aware of data distribution factors such as the number of expected rows returned from each query component.



Step 3—Tune the SQL statement



For those SQL statements that possess a sub-optimal execution plan, the SQL will be tuned by one of the following methods:



• Adding SQL “hints” to modify the execution plan



• Re-write SQL with Global Temporary Tables

• Rewriting the SQL in PL/SQL. For certain queries this can result in more than a 20x performance improvement. The SQL would be replaced with a call to a PL/SQL package that contained a stored procedure to perform the query.

Using hints to tune Oracle SQL



Among the most common tools for tuning SQL statements are hints. A hint is a directive that is added to the SQL statement to modify the access path for a SQL query.

select /*+ opt_param('optimizer_mode','first_rows_10') */ col1, col2 . . .

select /*+ opt_param('optimizer_index_cost_adj',20) */ col1, col2 . .

Oracle publishes many dozens of SQL hints, and hints have become increasingly complicated through the various releases of Oracle.

Note: Hints are only used for debugging, and you should adjust your optimizer statistics to make the CBO replicate the hinted SQL. Let's look at the most common hints to improve tuning:

• Mode hints: first_rows_10, first_rows_100

• Oracle leading and ordered hints. Also see how to tune table join order with histograms.



• Dynamic sampling: dynamic_sampling



• Oracle SQL undocumented tuning hints - Guru's only

• The cardinality hint



• Self-order the table joins - If you find that Oracle is joining the tables together in a sub-optimal order, you can use the ORDERED hint to force the tables to be joined in the order that they appear in the FROM clause.



• Try a first_rows_n hint. Oracle has two cost-based optimizer modes, first_rows_n and all_rows. The first_rows mode executes so as to begin returning rows as soon as possible, whereas the all_rows mode is designed to optimize the resources of the entire query before returning rows.



SELECT /*+ first_rows */
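As a hedged illustration (the table and column names are invented), a complete statement with this hint might look like:

select /*+ first_rows_10 */
   empno, ename
from   emp
where  deptno = 20;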





A case study in SQL tuning



One of the historic problems with SQL involves formulating SQL queries. Simple queries can be written in many different ways, each variant of the query producing the same result—but with widely different access methods and query speeds.



For example, a simple query such as "What students received an A last semester?" can be written in three ways, as shown below, each returning an identical result.



A standard join:



SELECT *

FROM STUDENT, REGISTRATION

WHERE

STUDENT.student_id = REGISTRATION.student_id

AND

REGISTRATION.grade = 'A';



A nested query:



SELECT *

FROM STUDENT

WHERE

student_id IN

(SELECT student_id

FROM REGISTRATION

WHERE

grade = 'A'

);



A correlated subquery:



SELECT *

FROM STUDENT

WHERE

0 <

(SELECT count(*)

FROM REGISTRATION

WHERE

grade = 'A'

AND

student_id = STUDENT.student_id

);

Let’s wind up with a review of the basic components of a SQL query and see how to optimize a query for remote execution.



Tips for writing more efficient SQL



Space doesn’t permit me to discuss every detail of Oracle tuning, but I can share some general rules for writing efficient SQL in Oracle regardless of the optimizer that is chosen. These rules may seem simplistic but following them in a diligent manner will generally relieve more than half of the SQL tuning problems that are experienced:

• Rewrite complex subqueries with temporary tables - Oracle created the global temporary table (GTT) and the SQL WITH operator to help divide-and-conquer complex SQL sub-queries (especially those with WHERE clause subqueries, SELECT clause scalar subqueries, and FROM clause in-line views). Tuning SQL with temporary tables (and materializations in the WITH clause) can result in amazing performance improvements, as sketched below.
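For instance (a sketch reusing the book/sales tables from the example further below; the quantity column is assumed), a GTT can materialize an aggregate once and be joined cheaply afterwards:

create global temporary table sales_sum
on commit preserve rows
as
select book_key, sum(quantity) qty
from   sales
group  by book_key;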



• Use minus instead of EXISTS subqueries - Some say that using the minus operator instead of NOT IN and NOT EXISTS will result in a faster execution plan, as sketched below.
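For instance, the NOT IN query shown further below could be expressed with minus (same book/sales tables):

select book_key from book
minus
select book_key from sales;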



• Use SQL analytic functions - The Oracle analytic functions can do multiple aggregations (e.g., ROLLUP, CUBE) with a single pass through the tables, making them very fast for reporting SQL.



• Re-write NOT IN and NOT EXISTS subqueries as outer joins - In many cases of NOT queries (but ONLY where a column is defined as NULL), you can re-write the uncorrelated subqueries into outer joins with IS NULL tests. Note that the sub-query below is non-correlated, but it can be re-written as an outer join.

select book_key from book

where

book_key NOT IN (select book_key from sales);

Below we combine the outer join with a NULL test in the WHERE clause without using a sub-query, giving a faster execution plan.

select b.book_key from book b, sales s

where

b.book_key = s.book_key(+)

and

s.book_key IS NULL;

• Index your NULL values - If you have SQL that frequently tests for NULL, consider creating an index on NULL values. To get around the fact that standard b-tree indexes do not include entirely-NULL keys (i.e., where emp_name IS NULL finds no index entries), we can create a function-based index that applies a built-in function such as NVL so that the NULL columns become indexable, as sketched below.
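A common sketch of this technique (assuming an emp table with a nullable ename column): index an NVL expression, then query with the same expression so the index can be used.

create index emp_null_ename_idx
on emp ( nvl(ename, 'n/a') );

select * from emp
where nvl(ename, 'n/a') = 'n/a';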



• Leave column names alone - Never do a calculation on an indexed column unless you have a matching function-based index (a.k.a. FBI). Better yet, re-design the schema so that common WHERE clause predicates do not need transformation with a built-in function. These are problem predicates (matching indexes are sketched below):

where salary*5 > :myvalue

where substr(ssn,7,4) = '1234'

where to_char(mydate,'mon') = 'january'
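If the predicates cannot be rewritten, matching function-based indexes (a sketch on hypothetical columns) let the CBO use an index anyway:

create index emp_salary_x5_idx  on emp ( salary * 5 );
create index emp_ssn_sub_idx    on emp ( substr(ssn,7,4) );
create index emp_mydate_mon_idx on emp ( to_char(mydate,'mon') );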

• Avoid the use of NOT IN or HAVING. Instead, a NOT EXISTS subquery may run faster (when appropriate).



• Avoid the LIKE predicate - Always replace a "like" with an equality, when appropriate.



• Never mix data types - If a WHERE clause column predicate is numeric, do not use quotes. For char index columns, always use quotes. Here are examples of mixed data type predicates:

where cust_nbr = '123'

where substr(ssn,7,4) = 1234

• Use decode and case - Performing complex aggregations with the “decode” or "case" functions can minimize the number of times a table has to be selected.
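For instance (a sketch against a hypothetical emp table), one pass over the table can produce several conditional counts:

select
   count(case when sal < 1000 then 1 end)                 low_paid,
   count(case when sal between 1000 and 3000 then 1 end)  mid_paid,
   count(case when sal > 3000 then 1 end)                 high_paid
from emp;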



• Don't fear full-table scans - Not all OLTP queries are optimal when they use indexes. If your query will return a large percentage of the table rows, a full-table scan may be faster than an index scan. This depends on many factors, including your configuration (values for db_file_multiblock_read_count, db_block_size), query parallelism, and the number of table/index blocks in the buffer cache.



• Use those aliases - Always use table aliases when referencing columns.

Also, see these related SQL tuning notes:

Oracle automatic SQL tuning

Oracle 10g AWR SQL Tuning Scripts

Oracle SQL Tuning init.ora parameters

Oracle SQL tuning with column histograms

Oracle SQL undocumented tuning hints

SQL tuning using materialized views



Conclusion



This article should provide you with a good overall background in Oracle SQL tuning, although there are many details that are too involved to discuss in one article. For complete steps and automatic SQL tuning scripts, see my book "Oracle Tuning: The Definitive Reference", with 950 pages of SQL tuning tips and scripts. You can buy it directly from the publisher and get instant access to the Oracle SQL tuning scripts.
Pankaj Motewar
Oracle Indexes

Oracle includes numerous data structures to improve the speed of Oracle SQL queries. Taking advantage of the low cost of disk storage, Oracle includes many new indexing algorithms that dramatically increase the speed with which Oracle queries are serviced. This article explores the internals of Oracle indexing; reviews the standard b-tree index, bitmap indexes, function-based indexes, and index-only tables (IOTs); and demonstrates how these indexes may dramatically increase the speed of Oracle SQL queries.



Oracle uses indexes to avoid the need for large-table, full-table scans and disk sorts, which are required when the SQL optimizer cannot find an efficient way to service the SQL query. I begin our look at Oracle indexing with a review of standard Oracle b-tree index methodologies.



The Oracle b-tree index



The oldest and most popular type of Oracle indexing is a standard b-tree index, which excels at servicing simple queries. The b-tree index was introduced in the earliest releases of Oracle and remains widely used with Oracle.



B-tree indexes are used to avoid large sorting operations. For example, a SQL query requiring 10,000 rows to be presented in sorted order will often use a b-tree index to avoid the very large sort required to deliver the data to the end user.



An Oracle b-tree index



Oracle offers several options when creating an index using the default b-tree structure. It allows you to index on multiple columns (concatenated indexes) to improve access speeds. Also, it allows for individual columns to be sorted in different orders. For example, we could create a b-tree index on a column called last_name in ascending order and have a second column within the index that displays the salary column in descending order.

create index

name_salary_idx

on

person

(

last_name asc,

salary desc);



While b-tree indexes are great for simple queries, they are not very good for the following situations:

• Low-cardinality columns—columns with fewer than 200 distinct values do not have the selectivity required to benefit from standard b-tree index structures.



• No support for SQL functions—B-tree indexes are not able to support SQL queries using Oracle's built-in functions. Oracle9i provides a variety of built-in functions that allow SQL statements to query on a piece of an indexed column or on any one of a number of transformations against the indexed column.



Prior to Oracle9i, the Oracle SQL optimizer had to perform time-consuming long-table, full-table scans due to these shortcomings. Consequently, it was no surprise when Oracle introduced more robust types of indexing structures.



Bitmapped indexes



Oracle bitmap indexes are very different from standard b-tree indexes. In bitmap structures, a two-dimensional array is created with one column for every row in the table being indexed. Each column represents a distinct value within the bitmapped index. This two-dimensional array represents each value within the index multiplied by the number of rows in the table. At row retrieval time, Oracle decompresses the bitmap into the RAM data buffers so it can be rapidly scanned for matching values. These matching values are delivered to Oracle in the form of a Row-ID list, and these Row-ID values may directly access the required information.



The real benefit of bitmapped indexing occurs when one table includes multiple bitmapped indexes. Each individual column may have low cardinality. The creation of multiple bitmapped indexes provides a very powerful method for rapidly answering difficult SQL queries.



For example, assume there is a motor vehicle database with numerous low-cardinality columns such as car_color, car_make, car_model, and car_year. Each column by itself contains fewer than 100 distinct values, and a b-tree index would be fairly useless in a database of 20 million vehicles. However, combining these indexes together in a query can provide blistering response times, far faster than the traditional method of reading each one of the 20 million rows in the base table. For example, assume we wanted to find old blue Toyota Corollas manufactured in 1981:

select
   license_plate_nbr
from
   vehicle
where
   color = 'blue'
and
   make = 'toyota'
and
   year = 1981;



Oracle uses a specialized optimizer method called a bitmapped index merge to service this query. In a bitmapped index merge, each Row-ID, or RID, list is built independently by using the bitmaps, and a special merge routine is used in order to compare the RID lists and find the intersecting values. Using this methodology, Oracle can provide sub-second response time when working against multiple low-cardinality columns:





Oracle bitmap merge join





Function-based indexes



One of the most important advances in Oracle indexing is the introduction of function-based indexing. Function-based indexes allow creation of indexes on expressions, internal functions, and user-written functions in PL/SQL and Java. Function-based indexes ensure that the Oracle designer is able to use an index for its query.



Prior to Oracle8, a query that applied a built-in function to an indexed column could not use the index. Consequently, Oracle would perform the dreaded full-table scan. Examples of SQL with function-based predicates include the following:



Select * from customer where substr(cust_name,1,4) = 'BURL';

Select * from customer where to_char(order_date,'MM') = '01';

Select * from customer where upper(cust_name) = 'JONES';

Select * from customer where initcap(first_name) = 'Mike';



Oracle always interrogates the WHERE clause of the SQL statement to see if a matching index exists. By using function-based indexes, the Oracle designer can create an index that exactly matches the predicates within the SQL WHERE clause. This ensures that the query is serviced with a minimal amount of disk I/O and the fastest possible speed.
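For example (a sketch; the expressions mirror the queries above), matching function-based indexes might be:

create index cust_sub_name_idx   on customer ( substr(cust_name,1,4) );
create index cust_order_mm_idx   on customer ( to_char(order_date,'MM') );
create index cust_upper_name_idx on customer ( upper(cust_name) );
create index cust_init_fname_idx on customer ( initcap(first_name) );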

Once a function-based index is created, you need to create CBO statistics, but beware that there are numerous bugs and issues when analyzing a function-based index. See these important notes on statistics and function-based indexes.



Index-only tables



Beginning with Oracle8, Oracle recognized that a table with an index on every column did not require table rows. In other words, Oracle recognized that by using a special table-access method called an index fast full scan, the index could be queried without actually touching the data itself.



Oracle codified this idea with its index-only table (IOT) structure. When using an IOT, Oracle does not create the actual table; instead, it keeps all of the required information inside the Oracle index. At query time, the Oracle SQL optimizer recognizes that all of the values necessary to service the query exist within the index tree. At that point, the cost-based optimizer has a choice: it can read through the index tree nodes to pull the information in sorted order, or it can invoke an index fast full scan, which reads the table in the same fashion as a full table scan, using sequential prefetch (as defined by the db_file_multiblock_read_count parameter). The multiblock read facility allows Oracle to very quickly scan index blocks in linear order, quickly reading every block within the index tablespace. Here is an example of the syntax to create an IOT.

CREATE TABLE emp_iot (

emp_id number,

ename varchar2(20),

sal number(9,2),

deptno number,

CONSTRAINT pk_emp_iot_index PRIMARY KEY (emp_id) )

ORGANIZATION index

TABLESPACE spc_demo_ts_01

PCTHRESHOLD 20 INCLUDING ename;

Index performance

Oracle indexes can greatly improve query performance but there are some important indexing concepts to understand.

• Index clustering

• Index blocksizes

Indexes and blocksize

Indexes that experience lots of index range scans or index fast full scans (as evidenced by multiblock reads) will greatly benefit from residing in a 32k blocksize.

"A bigger block size means more space for key storage in the branch nodes of B-tree indexes, which reduces index height and improves the performance of indexed queries."

In any case, there appears to be evidence that block size affects the B-tree structure, supporting the argument that the data block size influences the shape of the index tree.
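A sketch of the approach (the file path is invented, and a db_32k_cache_size is assumed to be configured so the instance can cache 32k blocks):

create tablespace ts_32k_idx
   datafile '/u01/oradata/prod/ts_32k_idx_01.dbf' size 1024m
   blocksize 32k;

alter index name_salary_idx rebuild tablespace ts_32k_idx;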

Indexes and clustering

The CBO's decision to perform a full-table scan vs. an index range scan is influenced by the clustering_factor (located inside the dba_indexes view), db_block_size, and avg_row_len. It is important to understand how the CBO uses these statistics to determine the fastest way to deliver the desired rows.



A low clustering_factor, where the value approaches the number of blocks in the table, indicates that the rows are stored in roughly the same physical sequence as the index, so index range scans require little additional I/O. Conversely, a high clustering_factor, where the value approaches the number of rows in the table (num_rows), indicates that the rows are not in the same sequence as the index, and additional I/O will be required for index range scans. As the clustering_factor approaches the number of rows in the table, the rows are out of sync with the index.
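A quick way to spot candidates (a sketch using the dictionary views named above; substitute your schema name) is to compare each index's clustering_factor with its table's num_rows and blocks:

select i.index_name,
       i.clustering_factor,
       t.num_rows,
       t.blocks
from   dba_indexes i,
       dba_tables  t
where  i.table_name  = t.table_name
and    i.table_owner = t.owner
and    i.table_owner = 'SCOTT';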

Oracle Metalink Note:223117.1 has some great advice for tuning down "db file sequential read" waits by reorganizing the table in row order:



- If Index Range scans are involved, more blocks than necessary could be being visited if the index is unselective: by forcing or enabling the use of a more selective index, we can access the same table data by visiting fewer index blocks (and doing fewer physical I/Os).



- If the index being used has a large Clustering Factor, then more table data blocks have to be visited in order to get the rows in each Index block: by rebuilding the table with its rows sorted by the particular index columns we can reduce the Clustering Factor and hence the number of table data blocks that we have to visit for each index block.



This validates the assertion that the physical ordering of table rows can reduce I/O (and stress on the database) for many SQL queries.
Pankaj Motewar
Oracle Architecture

The Oracle Relational Database Management System, or RDBMS, is designed to allow simultaneous access to large amounts of stored information. The RDBMS consists of the database (the information) and the instance (the embodiment of the system). The database contains the physical files that reside on the system and the logical pieces such as the database schema. These database files take various forms, as described in the following section. The instance is the method used to access the data and consists of processes and system memory.

Definitions

The Database

The Oracle database has a logical layer and a physical layer. The physical layer consists of the files that reside on the disk; the components of the logical layer map the data to these physical components.

The Physical Layer

The physical layer of the database consists of three types of files:

• One or more datafiles--Datafiles store the information contained in the database. You can have as few as one datafile or as many as hundreds of datafiles. The information for a single table can span many datafiles or many tables can share a set of datafiles. Spreading tablespaces over many datafiles can have a significant positive effect on performance. The number of datafiles that can be configured is limited by the Oracle parameter MAXDATAFILES.

• Two or more redo log files--Redo log files hold information used for recovery in the event of a system failure. Redo log files, known as the redo log, store a log of all changes made to the database. This information is used in the event of a system failure to reapply changes that have been made and committed but that might not have been made to the datafiles. The redo log files must perform well and be protected against hardware failures (through software or hardware fault tolerance). If redo log information is lost, you cannot recover the system.

• One or more control files--Control files contain information used to start an instance, such as the location of datafiles and redo log files; Oracle needs this information to start the database instance. Control files must be protected. Oracle provides a mechanism for storing multiple copies of control files.

The Logical Layer

The logical layer of the database consists of the following elements:

• One or more tablespaces.

• The database schema, which consists of items such as tables, clusters, indexes, views, stored procedures, database triggers, sequences, and so on.

Tablespaces and Datafiles

The database is divided into one or more logical pieces known as tablespaces. A tablespace is used to logically group data together. For example, you can create one tablespace for accounting and a separate tablespace for purchasing. Segmenting groups into different tablespaces simplifies the administration of these groups (see Figure 2.1). Tablespaces consist of one or more datafiles. By using more than one datafile per tablespace, you can spread data over many different disks to distribute the I/O load and improve performance.



The relationship between the database, tablespaces, and datafiles.
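For example (a minimal sketch with invented file paths), a tablespace can be spread over two datafiles on separate disks:

CREATE TABLESPACE ts_accounting
   DATAFILE '/disk1/oradata/prod/acct_01.dbf' SIZE 500M,
            '/disk2/oradata/prod/acct_02.dbf' SIZE 500M;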

As part of the process of creating the database, Oracle automatically creates the SYSTEM tablespace for you. Although a small database can fit within the SYSTEM tablespace, it's recommended that you create a separate tablespace for user data. The SYSTEM tablespace is where the data dictionary is kept. The data dictionary contains information about tables, indexes, clusters, and so on.

Datafiles can be operating system files or, in the case of some operating systems, RAW devices. Datafiles and data access methods are described in detail on Day 12.

The Database Schema

The database schema is a collection of logical-structure objects, known as schema objects, that define how you see the database's data. These schema objects consist of structures such as tables, clusters, indexes, views, stored procedures, database triggers, and sequences.

• Table--A table, which consists of a tablename and rows and columns of data, is the basic logical storage unit in the Oracle database. Columns are defined by name and data type. A table is stored within a tablespace; often, many tables share a tablespace.

• Cluster--A cluster is a set of tables physically stored together as one table that shares a common column. If data in two or more tables is frequently retrieved together based on data in the common column, using a clustered table can be quite efficient. Tables can be accessed separately even though they are part of a clustered table. Because of the structure of the cluster, related data requires much less I/O overhead if accessed simultaneously.

• Index--An index is a structure created to help retrieve data more quickly and efficiently (just as the index in this book allows you to find a particular section more quickly). An index is declared on a column or set of columns. Access to the table based on the value of the indexed column(s) (as in a WHERE clause) will use the index to locate the table data.

• View--A view is a window into one or more tables. A view does not store any data; it presents table data. A view can be queried like a table, and in many cases it can also be updated or deleted from. Views are typically used to simplify the user's perception of data access by providing limited information from one table, or a set of information from several tables transparently. Views can also be used to prevent some data from being accessed by the user or to create a join from multiple tables.

• Stored procedure--A stored procedure is a predefined SQL query that is stored in the data dictionary. Stored procedures are designed to allow more efficient queries. Using stored procedures, you can reduce the amount of information that must be passed to the RDBMS and thus reduce network traffic and improve performance.

• Database trigger--A database trigger is a procedure that is run automatically when an event occurs. This procedure, which is defined by the administrator or developer, fires whenever the event occurs; the event could be an insert, a deletion, or even a selection of data from a table.

• Sequence--The Oracle sequence generator is used to automatically generate a unique sequence of numbers in cache. By using the sequence generator you can avoid the steps necessary to create this sequence on your own such as locking the record that has the last value of the sequence, generating a new value, and then unlocking the record.
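A short sketch of creating and using a sequence (the orders table is hypothetical):

CREATE SEQUENCE order_seq
   START WITH 1
   INCREMENT BY 1
   CACHE 20;

INSERT INTO orders (order_id, order_date)
VALUES (order_seq.NEXTVAL, SYSDATE);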

Segments, Extents, and Data Blocks

Within Oracle, the space used to store data is controlled by the use of logical structures. These structures consist of the following:

• Data blocks--A block is the smallest unit of storage in an Oracle database. The database block contains header information concerning the block itself as well as the data.

• Extents--Extents consist of data blocks.

• Segments--A segment is a set of extents used to store a particular type of data, as shown in Figure 2.2.

Segments, extents, and data blocks.

Segments

An Oracle database can use four types of segments:

• Data segment--Stores user data within the database.

• Index segment--Stores indexes.

• Rollback segment--Stores rollback information used when data must be rolled back.

• Temporary segment--Created when a SQL statement needs a temporary work area; these segments are destroyed when the SQL statement is finished. These segments are used during various database operations, such as sorts.

Extents

Extents are the building blocks of segments; in turn, they consist of data blocks. An extent is used to minimize the amount of wasted (empty) storage. As more and more data is entered into tablespaces in your database, the extents used to store that data can grow or shrink as necessary. In this manner, many tablespaces can share the same storage space without preallocating the divisions between those tablespaces.

At tablespace-creation time, you can specify the minimum number of extents to allocate as well as the number of extents to add at a time when that allocation has been used. This arrangement gives you efficient control over the space used in your database.
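For example (a sketch using the classic dictionary-managed STORAGE clause; the file path is invented), extent allocation can be specified when the tablespace is created:

CREATE TABLESPACE ts_data
   DATAFILE '/disk1/oradata/prod/data_01.dbf' SIZE 100M
   DEFAULT STORAGE (
      INITIAL     1M   -- size of the first extent
      NEXT        1M   -- size of each subsequent extent
      MINEXTENTS  2    -- extents allocated at creation time
      PCTINCREASE 0 );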

Data Blocks

Data blocks are the smallest pieces of an Oracle database; they are physically stored on disk. Although the data block in most systems is 2KB (2,048 bytes), you can change this size for efficiency depending on your application or operating system.

The Oracle Instance

The Oracle instance consists of the Oracle processes and shared memory necessary to access information in the database. The instance is made up of the user processes, the Oracle background processes, and the shared memory used by these processes (see Figure 2.3).

The Oracle Memory Structure

Oracle uses shared memory for several purposes, including caching of data and indexes as well as storing shared program code. This shared memory is broken into various pieces, or memory structures. The basic memory structures associated with Oracle are the System Global Area (SGA) and the Program Global Area (PGA).

The Oracle instance.

The System Global Area (SGA)

The SGA is a shared memory region that Oracle uses to store data and control information for one Oracle instance. The SGA is allocated when the Oracle instance starts and deallocated when the Oracle instance shuts down. Each Oracle instance that starts has its own SGA. The information in the SGA consists of the following elements, each of which has a fixed size and is created at instance startup:

• The database buffer cache--This stores the most recently used data blocks. These blocks can contain modified data that has not yet been written to disk (sometimes known as dirty blocks), blocks that have not been modified, or blocks that have been written to disk since modification (sometimes known as clean blocks). Because the buffer cache keeps blocks based on a most recently used algorithm, the most active buffers stay in memory to reduce I/O and improve performance.

• The redo log buffer--This stores redo entries, or a log of changes made to the database. The redo log buffers are written to the redo log as quickly and efficiently as possible. Remember that the redo log is used for instance recovery in the event of a system failure.

• The shared pool--This is the area of the SGA that stores shared memory structures such as shared SQL areas in the library cache and internal information in the data dictionary. The shared pool is important because an insufficient amount of memory allocated to the shared pool can cause performance degradation. The shared pool consists of the library cache and the data-dictionary cache.

The Library Cache

The library cache is used to store shared SQL. Here the parse tree and the execution plan for every unique SQL statement are cached. If multiple applications issue the same SQL statement, the shared SQL area can be accessed by each to reduce the amount of memory needed and to reduce the processing time used for parsing and execution planning.

The Data-Dictionary Cache

The data dictionary contains a set of tables and views that Oracle uses as a reference to the database. Oracle stores information here about the logical and physical structure of the database. The data dictionary contains information such as the following:

• User information, such as user privileges

• Integrity constraints defined for tables in the database

• Names and data types of all columns in database tables

• Information on space allocated and used for schema objects

The data dictionary is frequently accessed by Oracle for the parsing of SQL statements. This access is essential to the operation of Oracle; performance bottlenecks in the data dictionary affect all Oracle users. Because of this, you should make sure that the data-dictionary cache is large enough to cache this data. If you do not have enough memory for the data-dictionary cache, you see severe performance degradation. If you ensure that you have allocated sufficient memory to the shared pool where the data-dictionary cache resides, you should see no performance problems.

The Program Global Area (PGA)

The PGA is a memory area that contains data and control information for the Oracle server processes. The size and content of the PGA depends on the Oracle server options you have installed. This area consists of the following components:

• Stack space--This is the memory that holds the session's variables, arrays, and so on.

• Session information--If you are not running the multithreaded server, the session information is stored in the PGA. If you are running the multithreaded server, the session information is stored in the SGA.

• Private SQL area--This is an area in the PGA where information such as binding variables and runtime buffers is kept.

Processes

In many operating systems, traditional processes have been replaced by threads or lightweight processes. The term process is used in this book to describe a thread of execution, or a mechanism that can execute a set of code; process refers to the mechanism of execution and can refer to a traditional process or a thread.

The Oracle RDBMS uses two types of processes: user processes and Oracle processes (also known as background processes). In some operating systems (such as Windows NT), these processes are actually threads; for the sake of consistency, I will refer to them as processes.

User Processes

User, or client, processes are the user's connections to the RDBMS system. The user process manipulates the user's input and communicates with the Oracle server process through the Oracle program interface. The user process is also used to display the information requested by the user and, if necessary, can process this information into a more useful form.

Oracle Processes

Oracle processes perform functions for users. Oracle processes can be split into two groups: server processes (which perform functions for the invoking process) and background processes (which perform functions on behalf of the entire RDBMS).

Server Processes (Shadow Processes)

Server processes, also known as shadow processes, communicate with the user and interact with Oracle to carry out the user's requests. For example, if the user process requests a piece of data not already in the SGA, the shadow process is responsible for reading the data blocks from the datafiles into the SGA. There can be a one-to-one correlation between user processes and shadow processes (as in a dedicated server configuration); although one shadow process can connect to multiple user processes (as in a multithreaded server configuration), doing so reduces the utilization of system resources.

Background Processes

Background processes are used to perform various tasks within the RDBMS system. These tasks vary from communicating with other Oracle instances and performing system maintenance and cleanup to writing dirty blocks to disk. Following are brief descriptions of the nine Oracle background processes:

• DBWR (Database Writer)--DBWR is responsible for writing dirty data blocks from the database block buffers to disk. When a transaction changes data in a data block, that data block need not be immediately written to disk. Therefore, the DBWR can write this data to disk in a manner that is more efficient than writing when each transaction completes. The DBWR usually writes only when the database block buffers are needed for data to be read. Data is written in a least recently used fashion. For systems in which asynchronous I/O (AIO) is available, there should be only one DBWR process. For systems in which AIO is not available, performance can be greatly enhanced by adding more DBWR processes.

• LGWR (Log Writer)--The LGWR process is responsible for writing data from the log buffer to the redo log.

• CKPT (Checkpoint)--The CKPT process is responsible for signaling the DBWR process to perform a checkpoint and to update all the datafiles and control files for the database to indicate the most recent checkpoint. A checkpoint is an event in which all modified database buffers are written to the datafiles by the DBWR. The CKPT process is optional. If the CKPT process is not present, the LGWR assumes these responsibilities.

• PMON (Process Monitor)--PMON is responsible for keeping track of database processes and cleaning up if a process prematurely dies (PMON cleans up the cache and frees resources that might still be allocated). PMON is also responsible for restarting any dispatcher processes that might have failed.

• SMON (System Monitor)--SMON performs instance recovery at instance startup. This includes cleaning temporary segments and recovering transactions that have died because of a system crash. The SMON also defragments the database by coalescing free extents within the database.

• RECO (Recovery)--RECO is used to resolve transactions that were left pending in a distributed database. RECO is responsible for committing or rolling back the local portion of these in-doubt transactions.

• ARCH (Archiver)--ARCH is responsible for copying the online redo log files to archival storage when they become full. ARCH is active only when the RDBMS is operated in ARCHIVELOG mode. When a system is not operated in ARCHIVELOG mode, it might not be possible to recover after a system failure. It is possible to run in NOARCHIVELOG mode under certain circumstances, but you should typically operate in ARCHIVELOG mode.

• LCKn (Parallel Server Lock)--Up to 10 LCK processes are used for interinstance locking when the Oracle Parallel Server option is used.

• Dnnn (Dispatcher)--When the Multithreaded Server option is used, at least one Dispatcher process is used for every communications protocol in use. The Dispatcher process is responsible for routing requests from the user processes to available shared server processes and back.

How Transactions Work

To give you a better idea how Oracle operates, this section analyzes a sample transaction. Throughout this book, the term transaction is used to describe a logical group of work that can consist of one or many SQL statements and must end with a commit or a rollback. Because this example is of a client/server application, SQL*Net is necessary. The following steps are executed to complete the transaction:

1. The application processes the user input and creates a connection to the server via SQL*Net.

2. The server picks up the connection request and creates a server process on behalf of the user.

3. The user executes a SQL statement or statements. In this example, the user changes the value of a row in a table.

4. The server process checks the shared pool to see whether there is a shared SQL area that has this identical SQL statement. If it finds an identical shared SQL area, the server process checks whether the user has access privileges to the data. If so, the server process uses the shared SQL area to process the request. If a shared SQL area is not found, a new shared SQL area is allocated, and the statement is parsed and executed.

5. The server process finds the data in the SGA (if it is present there) or reads the data from the datafile into the SGA.

6. The server process modifies the data in the SGA. Remember that the server processes can read only from the datafiles. At some later time, the DBWR process writes the modified blocks to permanent storage.

7. The user executes either the COMMIT or ROLLBACK statement. A COMMIT finalizes the transaction; a ROLLBACK undoes the changes. If the transaction is being committed, the LGWR process immediately records the transaction in the redo log file.

8. If the transaction is successful, a completion code is returned across the network to the client process. If a failure has occurred, an error message is returned.
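
The steps above might correspond to a transaction as simple as the following sketch (the emp table and its columns are hypothetical, used only for illustration):

UPDATE emp
   SET salary = salary * 1.10
 WHERE emp_id = 1001;

COMMIT;

Until the COMMIT is issued, the change exists only in the SGA and the redo log buffer; the LGWR write at commit time is what makes the transaction permanent.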

IMPORTANT : A transaction is not considered committed until the write to the redo log file is complete. This arrangement ensures that in the event of a system failure, a committed transaction can be recovered. If a transaction has been committed, it is set in stone.


While transactions occur, the Oracle background processes do their jobs, keeping the system running smoothly. At the same time, hundreds of other users might be performing similar tasks. Oracle's job is to keep the system in a consistent state, to manage contention and locking, and to perform at the necessary rate.

This overview is intended to give you an understanding of the complexity and amount of interaction involved in the Oracle RDBMS. As you look in detail at the tuning of the server processes and applications later in this book, you can use this overview as a reference to the basics of how the Oracle RDBMS operates. Because of the differences in operating systems, minor variances in different environments will be discussed individually.

RDBMS Functionality

If the RDBMS is to operate, you must provide for certain functions, including data integrity, recovery from failure, error handling, and so on. This is accomplished via events such as checkpointing, logging, and archiving. The following sections list and describe some of these functions.

Checkpointing

You know that Oracle uses either the CKPT background process or the LGWR process to signal a checkpoint; but what is a checkpoint and why is it necessary?

Because all modifications to data blocks are done on the block buffers, some changes to data in memory are not necessarily reflected in the blocks on disk. Because caching is done using a least recently used algorithm, a buffer that is constantly modified is always marked as recently used and is therefore unlikely to be written by the DBWR. A checkpoint is used to ensure that these buffers are written to disk by forcing all dirty buffers to be written out on a regular basis. This does not mean that all work stops during a checkpoint; the checkpoint process has two methods of operation: the normal checkpoint and the fast checkpoint.

In the normal checkpoint, the DBWR merely writes a few more buffers every time it is active. This type of checkpoint takes much longer but affects the system less than the fast checkpoint. In the fast checkpoint, the DBWR writes a large number of buffers at the request of the checkpoint each time it is active. This type of checkpoint completes much more quickly and is more efficient in terms of I/Os generated, but it has a greater effect on system performance while the checkpoint is taking place.

You can tune the time between checkpoints to balance steady-state performance against recovery time: frequent checkpoints reduce the time required to recover in the event of a system failure, at the cost of additional I/O during normal operation. A checkpoint automatically occurs at a log switch.
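
Checkpoint frequency is controlled through initialization parameters. The following init.ora fragment is a sketch only; the values shown are illustrative, not recommendations:

# init.ora -- checkpoint frequency
LOG_CHECKPOINT_INTERVAL = 10000   # redo blocks written between checkpoints
LOG_CHECKPOINT_TIMEOUT  = 1800    # maximum seconds between checkpoints

Smaller values force more frequent checkpoints, trading some steady-state performance for faster instance recovery.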

Logging and Archiving

The redo log records all changes made to the Oracle database. The purpose of the redo log is to ensure that, in the event of the loss of a datafile as a result of some sort of system failure, the database can be recovered. By restoring the datafiles back to a known good state from backups, the redo log files (including the archive log files) can be used to replay all the transactions against the restored datafiles, thus recovering the database to the point of failure.

When a redo log file is filled in normal operation, a log switch occurs and the LGWR process starts writing to a different redo log file. When this switch occurs, the ARCH process copies the filled redo log file to an archive log file. When this archive process has finished copying the entire redo log file to the archive log file, the redo log file is marked as available. It's critical that this archive log file be safely stored because it might be needed for recovery.
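
Enabling archiving involves both initialization parameters and a database-level command. As a minimal sketch (the destination path is hypothetical):

# init.ora -- automatic archiving
LOG_ARCHIVE_START = TRUE
LOG_ARCHIVE_DEST  = /oracle/arch

SQL> STARTUP MOUNT;
SQL> ALTER DATABASE ARCHIVELOG;
SQL> ALTER DATABASE OPEN;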

IMPORTANT : Remember that a transaction has not been committed until the redo log file has been written. Slow I/Os to the redo log files can slow down the entire system.

What Affects Oracle Performance?

Because one of the roles of the DBA is to anticipate, find, and fix performance problems, you must know what types of things affect performance. To understand why these things affect performance, you must first review the basics of how a computer system works.

Overview of Computer Architecture

Your computer system consists of thousands of individual components that work in harmony to process data. Each of these components has its own job to perform, and each has its own performance characteristics.

The brainpower of the system is the Central Processing Unit (CPU), which processes all the calculations and instructions that run on the computer. The job of the rest of the system is to keep the CPU busy with instructions to process. A well-tuned system runs at maximum performance if the CPU or CPUs are busy 100% of the time.

So how does the system keep the CPUs busy? In general, the system consists of different layers, or tiers, of progressively slower components. Because faster components are typically the most expensive, you must perform a balancing act between speed and cost efficiency.

CPU and Cache

The CPU and the CPU's cache are the fastest components of the system. The cache is high-speed memory used to store recently used data and instructions so that it can provide quick access if this data is used again in a short time. Most CPU hardware designs have a cache built into the CPU chip. This internal cache is known as a Level 1 (or L1) cache. Typically, an L1 cache is quite small--8-16KB.

When a certain piece of data is wanted, the hardware looks first in the L1 cache. If the data is there, it's processed immediately. If the data is not available in the L1 cache, the hardware looks in the L2 cache, which is external to the CPU chip but located close to it. The L2 cache is connected to the CPU chip(s) on the same side of the memory bus as the CPU. To get to main memory, you must use the memory bus, which affects the speed of the memory access.

Although the L2 cache is about half as fast as the L1 cache, it's usually much larger. Its larger size means you have a better chance of getting a cache hit. Typical L2 caches range in size from 128KB to 4MB.

Slower yet is the speed of the system memory--it's probably five times slower than the L2 cache. The size of system memory can range from 4MB for a small desktop PC to 2-4GB for large server machines. Some supercomputers have even more system memory than that.

As you can see from the timeline shown in Figure 2.4, there is an enormous difference between retrieving data from the L1 cache and retrieving data from the disk. This is why you spend so much time trying to take advantage of the SGA in memory. This is also why hardware vendors spend so much time designing CPU caches and fast memory buses.

Figure 2.4. Component speed comparison.

CPU Design

Most instruction processing occurs in the CPU. Although certain intelligent devices, such as disk controllers, can process some instructions, the instructions these devices can handle are limited to the control of data moving to and from the devices. The CPU works from the system clock and executes instructions based on clock signals. The clock rate and type of CPU determine how quickly these instructions are executed.

The CPU usually falls into one of two groups of processors: Complex Instruction Set Computer (CISC) or Reduced Instruction Set Computer (RISC).

CISC Processors

CISC processors (like the ones Intel builds) are by far the most popular processors. They are more traditional and offer a large instruction set to the program developer. Some of these instructions can be quite complicated; most instructions require several clock cycles to complete.

CISC processors are complex and difficult to build. Because these chips contain millions of internal components, the components are extremely close together. The physical closeness causes problems because there is no room for error. Each year, technology allows more complex and faster chips to be built, but eventually, physics will limit what can be done.

CISC processors carry out a wide range of tasks and can sometimes perform two or more instructions at a time in parallel. CISC processors perform most tasks, such as RDBMS processing, very well.

RISC Processors

RISC processors are based on the principle that if you can reduce the number of instructions processed by the CPU, the CPU can be simpler to build and can run faster. With fewer internal components inside the chip, the chip can run at a higher speed. One of the most popular RISC chips on the market is the DEC Alpha.

The system compiler determines what instructions are executed on the CPU chips. When the number of instructions was reduced, compilers were written to exploit this and to compensate for the missing instructions.

By reducing the instruction set, RISC manufacturers have been able to increase the clock speed to many times that of CISC chips. Although the faster clock speed is beneficial in some cases, it offers little improvement in others. One effect of a faster CPU is that the surrounding components, such as the L2 cache and memory, must also run faster, at an increase in cost.

One goal of some RISC manufacturers is to design the chip so that the majority of instructions complete within one clock cycle. Some RISC chips can already do this. But because some operations that require a single instruction for a CISC chip might require many instructions for a RISC chip, a speed-to-speed comparison cannot be made.

CISC versus RISC

Both CISC and RISC processors have their advantages and disadvantages; it's up to you to determine whether a RISC processor or a CISC processor will work best for you. When comparing the two types of processors, be sure to look at performance data and not just clock speed. Although the RISC chips have a much faster clock speed, they do less work per instruction. The performance of the system cannot be determined by clock speed alone.

Multiprocessor Systems

Multiprocessor systems can provide significant performance at very good value. With such a system, you can start with one or two processors and add more as needed. Multiprocessors fall into several categories; two of the main types of multiprocessor systems are the Symmetric Multiprocessor (SMP) system and the Massively Parallel Processing (MPP) system.

SMP Systems

SMP systems usually consist of a standard computer architecture with two or more CPUs that share the system memory, I/O bus, and disks. The CPUs are called symmetric because each processor is identical to any other processor in terms of function. Because the processors share system memory, each processor looks at the same data and the same operating system. In fact, the SMP architecture is sometimes called tightly coupled because the CPUs can even share the operating system.

In the typical SMP system, only one copy of the operating system runs. Each processor works independently by taking the next available job. Because the Oracle architecture is based on many processes working independently, you can see great improvement by adding processors.

The SMP system has these advantages:

• It's cost effective--The addition of a CPU or CPU board is much less expensive than adding another entire system.

• It's high performing--Under most applications, additional CPUs provide an incremental performance improvement.

• It's easily upgradable--Simply add a CPU to the system to instantly and significantly increase performance.

A typical SMP system supports between four and eight CPUs. Because the SMP system shares the system bus and memory, only a certain amount of activity can occur before the bandwidth of the bus is saturated. To add more processors, you must go to an MPP architecture.

MPP Systems

MPP systems are based on many independent units. Each processor in an MPP system typically has its own resources (such as its own local memory and I/O system). Each processor in an MPP system runs an independent copy of the operating system and its own independent copy of Oracle. An MPP system is sometimes called loosely coupled.

Think of an MPP system as a large cluster of independent units that communicate through a high-speed interconnect. As with SMP systems, you will eventually hit the bandwidth limitations of the interconnect as you add processors. However, the number of processors with which you hit this limit is typically much larger than with SMP systems.

If you can divide the application among the nodes in the cluster, MPP systems can achieve quite high scalability. Although MPP systems can achieve much higher performance than SMP systems, they are less economical: MPP systems are typically much higher in cost than SMP systems.

CPU Cache

Regardless of whether you use a single-processor system, an SMP system, or an MPP system, the basic architecture of the CPUs is similar. In fact, you can find the same Intel processors in both SMP and MPP systems.

As you learned earlier today, the system cache is important to the system. The cache allows quick access to recently used instructions or data. A cache is always used to store and retrieve data more quickly than the next level of storage (the L1 cache is faster than the L2 cache, the L2 cache is faster than main memory, and so on).

By caching frequently used instructions and data, you increase the likelihood of a cache hit. This can save precious clock cycles that would otherwise have been spent retrieving data from memory or disk.

System Memory Architecture

The system memory is basically a set of memory chips, either protected or unprotected, that stores data and instructions used by the system. System memory can be protected by parity or by a more sophisticated ECC (Error Correcting Code) method. Parity will detect an incorrect value in memory and flag it to the system. An advanced ECC method will not only detect an incorrect value in memory but in many cases can correct it. The system memory can range in size from 4MB on a small PC to 4GB on a large SMP server.

Typically, the more memory available to Oracle, the better your performance. Allocation of a large SGA allows Oracle to cache more data, thus speeding access to that data.

System memory is accessed by the CPUs through a high-speed bus that allows large amounts of data and instructions to be quickly moved from the CPU to L2 cache. Data and instructions are typically read from memory in large chunks and put into the cache. Because the CPU expects that memory will be read sequentially, in most cases it will read ahead the data or instruction that it thinks will be needed next. Sometimes this works, so the data that is needed next is already in cache; sometimes the CPU has guessed incorrectly and other data needs to be retrieved. This process of prereading the data is known as prefetching.

Depending on the specific implementation of an SMP system, the memory bus might be shared by all system processors; alternatively, each processor might have a private bus to memory.

Virtual Memory System

In a virtual memory system, the OS and hardware allow programs and users to use more memory than is actually available in the system hardware. This memory, known as virtual memory, can be mapped to physical memory. Code or data that is being run by the CPU must reside in physical memory. If a program or data that is larger than physical memory is being accessed, the parts of code and data that are not immediately needed by the program can reside in virtual memory, not physical memory. As that bit of code or data is needed, it can be copied into physical memory, and parts no longer needed can be copied to disk. The process of mapping virtual memory onto physical memory by copying the memory to and from disk is called paging or swapping (depending on the OS architecture).

Both paging and swapping serve the same purpose, but each operates slightly differently from the other. In a swapping system, an entire process is swapped out (moved from memory to disk) or swapped in (moved from disk to memory). In a paging system, the movement of data to and from the secondary storage occurs on a memory-page basis; when more memory is needed, one or more pages are paged out (moved from memory to disk) to make room. A memory page is the smallest unit of memory that is used in the operating system. A typical memory page size is 4KB. If data is requested from virtual memory and is not in physical memory, that data is paged in (moved from disk to memory) as needed. The rest of this section uses the term paging to describe both paging and swapping.

Suppose you have a computer system with 16MB of physical memory. If you have a program that needs to access 20MB of data, it obviously won't fit in physical memory. In a virtual memory system, the data is read until little memory remains (the OS reserves some for itself), then the OS copies some of the data pages to disk with the paging mechanism. This is usually done using a least recently used algorithm in which the oldest data is moved out. When some memory has been freed, the program can read more data into memory. As far as the program is concerned, all the data is still in memory; in fact, it is--in virtual memory. As the program begins to reread some of the data and manipulate it, different pieces might be paged in (from disk to physical memory) and paged out (from physical memory to disk).

As you can imagine, paging in or out can be time consuming and uses a lot of system resources. This is why I warn you several times in this book to avoid using so much memory that you cause paging or swapping. Access to disk is approximately 50 times slower than access to memory.

Oracle Resources

The Oracle DBMS allocates resources for various functions, including system memory. Memory might be allocated for database caching or for the data dictionary and library caches. The careful balance of this precious resource is very important in tuning the Oracle RDBMS.

As much data as possible must be cached to avoid the additional cost of going to disk. If you allocate a large Oracle data cache, a higher cache-hit rate can be achieved. A high cache-hit rate indicates that a large percentage of requested data is found in the Oracle cache rather than retrieved from disk.
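
One common way to estimate the buffer cache-hit rate is to query the V$SYSSTAT view. The following is a sketch of that calculation:

SELECT 1 - (phy.value / (cur.value + con.value)) "Cache Hit Ratio"
  FROM v$sysstat cur, v$sysstat con, v$sysstat phy
 WHERE cur.name = 'db block gets'
   AND con.name = 'consistent gets'
   AND phy.name = 'physical reads';

A result close to 1 indicates that most requested data is being found in the Oracle cache rather than read from disk.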

Application Design

Application design can affect performance more than any other factor. In most cases, performance can be severely degraded by an application that does not have well-tuned SQL statements or does not use indexes. A good application design can also significantly improve performance. The application is typically the first place to look when you experience system performance problems.

If a database is built with indexes on a certain set of columns but those columns are not specified in the WHERE clause of the SQL statement, the index probably won't be used. It's not enough to create the correct index on tables; you must ensure that the indexes are used.
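
As a simple illustration, suppose a hypothetical emp table has an index on the emp_id column. The first query below can use the index; the second applies a function to the indexed column, which prevents the index from being used and forces a full table scan:

SELECT name FROM emp WHERE emp_id = 1001;

SELECT name FROM emp WHERE TO_CHAR(emp_id) = '1001';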

Oracle Features

Another way to improve Oracle performance is to enable Oracle performance features. Among the most important of these features (and my personal favorite) is the Oracle Parallel Query option. Other Oracle performance features include partitioned tables and the Oracle index-only table, both new in Oracle8.

The Oracle Parallel Query Option

The Oracle Parallel Query option allows parallelism of many different operations, which greatly enhances performance. The Oracle Parallel Query option consists of several different components, including

• Parallel query

• Parallel index creation

• Parallel recovery

• Parallel table creation

• Parallel index tables

Parallel Query

The Oracle parallel query allows a single query to be divided into components and run in parallel. Because a query spends much of its time waiting for I/O operations to complete, parallelizing queries can greatly improve performance. In a well-tuned system where I/O is not a problem, parallel queries can run many times faster than normal queries. Statements that can be parallelized include the following (a brief example appears after the list):

• Table scans

• Sorts

• Joins

IMPORTANT : You might be wondering why parallelizing operations would help performance; after all, the work must still be done. In a typical Oracle operation (for example, a SELECT statement), the following steps occur:

1. Oracle performs some CPU processing to determine what data is needed.

2. Oracle submits an I/O request to disk (assuming that the data is not already in the SGA) and then waits for that I/O to complete.

3. This operation is repeated until all data is retrieved.

In the case of a parallel query, these steps would be adjusted like so:

1. Oracle performs some CPU processing to determine the query operation.

2. Different Oracle processes or threads receive their instructions on what data is needed.

3. Oracle thread 1 submits an I/O request to disk (if that data is not already in the SGA) and waits for that I/O to complete.

4. Oracle thread 2 submits an I/O request to disk (if that data is not already in the SGA) and waits for that I/O to complete.

5. Oracle thread 3 submits an I/O request to disk (if that data is not already in the SGA) and waits for that I/O to complete.

As shown here, the time-consuming job of retrieving data from disk is divided among several threads working simultaneously, thus improving performance. This parallelism allows the CPU(s) to be utilized while other threads are waiting for I/Os.

Retrieving data from disk is a slow process compared to the activity of the CPU, and your goal is to keep the CPUs busy. Because a significant part of any Oracle operation involves CPU processing and I/Os, it is possible and desirable to keep the CPUs busy while many I/Os are being processed simultaneously. This is the main goal of the Parallel Query option.

Parallel Index Creation

Index creation involves reading from data tables and then writing to the index tables. Because the parallel query allows reading of tables to be accelerated, the index-creation process is sped up. Index creations can be quite time consuming, so this can be a real advantage.
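
As a sketch, the degree of parallelism for an index build can be requested in the CREATE INDEX statement itself (the table, index, and degree shown are hypothetical):

CREATE INDEX emp_region_idx
    ON emp (region)
    PARALLEL (DEGREE 4);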

Parallel Recovery

Recovery from a system failure can be quite time consuming. During recovery, users must usually wait for the system to come back online, so any improvement in performance is an advantage. Parallel recovery can speed the recovery process by parallelizing the read from the redo log files, and the roll forward and rollback process.

Parallel Table Creation

Although the Oracle Parallel Query option does not generally allow table creations to be performed in parallel, an exception is made when a table is created as a subset of other tables. Data is often reduced from several large tables into a smaller summary table, and here parallelism can be beneficial. In such instances, the following statement allows for parallelism:

CREATE TABLE table_name AS SELECT...
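
A slightly fuller sketch, with an explicit degree of parallelism (all names and the degree are hypothetical):

CREATE TABLE emp_summary
    PARALLEL (DEGREE 4)
    AS SELECT region, COUNT(*) AS emp_count
         FROM emp
        GROUP BY region;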

Oracle Index Tables

The index table allows indexes and tables to be stored together; this saves space and improves performance by reducing disk I/O. If you reduce the number of required disk I/Os, data can be accessed much faster.

OS Resources

In most systems, only a few resources can be allocated in the operating system. Most OS parameters are changed only to allocate sufficient resources to Oracle; additional resources usually do not improve performance, but a lack of resources can decrease performance. The term OS resources most often refers to system memory or, in the case of UNIX, shared memory. Other OS resources and tunables include network buffers and disk I/O tunables.

IMPORTANT : Windows NT is fairly self-tuning, but there are a few things, relating primarily to configuration, to look out for:

• Remove unnecessary network protocols--Depending on how the system is configured, several network protocols that you do not use might be configured into your system. These extra protocols use CPU and memory resources.

• Configure the protocols you use in order from most-often used to least-often used--This reduces some of the overhead associated with traversing the infrequently used protocols.

• Keep a close eye on paging--Windows NT treats all memory as virtual. The best way to determine whether your system is paging is to watch Pages/Sec in NT's perfmon. If paging occurs, lower the amount of memory allocated to Oracle.

Hardware

Several hardware factors can affect your system's performance. These factors include

• Memory capacity

• Number of CPUs

• CPU cache

• Memory-bus bandwidth

• I/O capacity

Memory Capacity

Earlier today you saw an overview of how the system hardware operates. Clearly, any operation that must access slower components, such as a disk or network, will slow down processing. Therefore, it is important that you have sufficient memory in your system.

Most hardware architectures are limited to 4GB of physical memory, but some architectures on the market support much more. These architectures are said to support a VLM, or Very Large Memory, architecture. Soon it will be possible to support hundreds of gigabytes of physical memory in a system, allowing for very fast RDBMS operations.

System memory is allocated to Oracle and used for database caching, user memory, and the shared pool, which holds both the data dictionary cache and the library cache. You must have enough memory for the shared pool because an insufficient shared pool can hurt performance. Once the shared pool is adequately sized, the more database buffers you can allocate to the DBMS the better. Be careful, though, to avoid starving the PGA memory needed by your processes, and avoid paging at all costs. You can never have too much memory in your system. Anything that can be cached will reduce system I/O, improving performance.

Number of CPUs

Oracle typically scales well with additional CPUs. By adding CPUs, you can see significant performance improvement with little additional cost. How much improvement you see from adding processors depends on factors such as the CPU cache and the memory-bus bandwidth.

CPU Cache

A large CPU cache allows more data and executable code to be stored on the local processor rather than in main memory. This reduces the number of times the CPU must access main memory. Whenever the CPU accesses memory, a slowdown occurs while the CPU waits for that data or code to be retrieved. The situation is especially bad when the memory bus is busy; the CPU must wait even longer until the bus becomes free.

Memory-Bus Bandwidth

The memory-bus bandwidth determines how quickly data can be transferred between the CPU and memory. If the memory bus is busy when data or code is needed, the CPU stalls waiting for the bus to free. This can severely degrade performance in a multiprocessor computer. A fast memory bus can reduce this problem. A large CPU cache can also reduce this problem by allowing more data and code to be cached.

I/O Capacity

I/O is typically one of the biggest factors limiting system performance. Because most DBMS operations involve retrieving data from disk, I/O can be a limiting factor if you do not have adequate capacity for your system load. Fortunately, you can usually solve this problem by carefully configuring your system for proper I/O distribution and by having sufficient I/O capacity. Simply having adequate disk space is insufficient; you must also have enough disk drives to support the number of disk I/Os that the system requires.

Oracle8 New Features

Oracle8 has introduced many new features, and I would like to focus on a few key features for the Oracle8 DBA:

• Partitioned objects

• Improved parallelism

• New index types

• Enhanced recovery features

Partitioned Objects

Partitioned objects allow Oracle objects, such as tables and indexes, to be broken into smaller, more manageable pieces. Partitioning these objects allows many operations that could normally be performed on only a table or an index to be divided into operations on a partition. By dividing these operations, you can often increase the parallelism of those operations, thus improving performance and minimizing system downtime.

Partitions are enabled via the PARTITION BY RANGE clause of the CREATE TABLE statement. In this manner, ranges of data are assigned to each individual partition like so:

CREATE TABLE emp
(
    name    CHAR(30),
    address CHAR(40),
    region  INTEGER
)
PARTITION BY RANGE (region)
(
    PARTITION VALUES LESS THAN (10) TABLESPACE tbl0,
    PARTITION VALUES LESS THAN (20) TABLESPACE tbl1,
    PARTITION VALUES LESS THAN (30) TABLESPACE tbl2
);


Figure: Table partitioning.

Partitioning is recommended for large tables because it makes them much more manageable. Oracle does not currently support partitioning of clusters. By partitioning a table, you can break that large table into several much smaller pieces. A partitioned table can take advantage of some of the following features:

• Partitioned DML

• Exporting/importing by partition

• Range partitioning

• Local and global indexing

• Parallel loading by partition

Partitioned DML

Parallel INSERT, DELETE, and UPDATE operations can occur on a partition basis. Using partitions allows these operations to be conducted either globally or locally within a partition.
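
In Oracle8, parallel DML must be enabled at the session level before such statements can run in parallel. A minimal sketch (the table name and degree are hypothetical):

ALTER SESSION ENABLE PARALLEL DML;

UPDATE /*+ PARALLEL(emp, 4) */ emp
   SET region = 10
 WHERE region = 11;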

Exporting/Importing by Partition

Partitioning allows operations such as exports and imports to be performed on a partition basis. This can reduce the time required by some maintenance operations, such as reorganization of data or reclustering. This also allows you to change the physical layout of your database on a partition basis. If you limit the scope of export and import operations, they can benefit from a large degree of parallelism.

Range Partitioning

Range partitioning is a method whereby data is assigned to partitions based on the value of the data itself. This allows tremendous flexibility in distributing data based on ranges of data values. Range partitioning lets you partition high-volume data separately from low-volume data or separate current data from old data.

Local and Global Indexing

New Term: A local index indexes data that resides in only one partition. A global index indexes data that resides on more than one partition. This allows for great flexibility in terms of adding new indexes, reducing index sizes, and allowing for partition independence.

An example of where local indexing might be beneficial is a table where sales records are stored. Using table and index partitioning, you can store data and indexes separately based on calendar months; doing this allows reduced index size and faster index lookups for entries of a particular month. If you partition these entries you can add new months and delete outdated entries without reindexing the entire table. You could keep 12 months of partitions and indexes online in this manner.
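
As a sketch, a local index on a hypothetical partitioned sales table is created by adding the LOCAL keyword, which builds one index partition for each table partition:

CREATE INDEX sales_date_idx
    ON sales (sale_date)
    LOCAL;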

Parallel Loading by Partition

With a partitioned table, SQL*Loader can either load an entire table in parallel by partition or simply load a single partition. Either method provides great flexibility.

If you use the conventional path load, the loader automatically distributes the data to the correct partition and updates the local and global indexes. You can also use the loader to load a partitioned table or a partition of a table. Again, indexes are built automatically. It is also possible to direct-load a partition in parallel provided that no global indexes exist, but you must rebuild the local indexes yourself.
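
As a sketch, a conventional path load into a single partition might use a control file such as the following (all file, table, partition, and column names are hypothetical):

-- sales.ctl
LOAD DATA
INFILE 'sales_jan.dat'
INTO TABLE sales PARTITION (sales_jan)
FIELDS TERMINATED BY ','
(sale_date, amount, region)

This would be invoked with a command line such as sqlldr userid=scott/tiger control=sales.ctl.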

Improved Parallelism

The arrival of Oracle8 has heralded tremendous improvement in the area of parallelization. In addition to the new parallel features listed previously, some existing parallel operations have been extended.

Parallel recovery has been improved: parallel DML operations that have failed can now be rolled back in parallel. This parallel transaction recovery is supported for transaction and process failures, but not during instance recovery.

New parallel hints have been added for parallel insert operations. The APPEND hint tells the optimizer to append the insert data beyond the high water mark of the segment.
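
A minimal sketch of an insert using this hint (the table names are hypothetical):

INSERT /*+ APPEND */ INTO emp_history
SELECT * FROM emp;

Because the rows are appended above the high water mark, free space within the segment's existing blocks is not reused.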

New Index Types

The index-only table is new in Oracle8. With traditional indexes and tables, data and indexes are stored separately. With an index-only table, the data to which the index refers is stored in the leaf block or lowest level block of the index, so the data and indexes are stored together. Depending on your application, this can be an advantage.

Applications that access data primarily via a key value can see an advantage from the use of index-only tables. Because the data is stored within the index, the data is immediately available when the index search reaches its lowest level. This can speed data retrieval.
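
A sketch of the Oracle8 syntax, assuming a hypothetical lookup table (an index-only table requires a primary key, because the data is organized within that index):

CREATE TABLE part_lookup
(
    part_no   NUMBER PRIMARY KEY,
    part_name VARCHAR2(40)
)
ORGANIZATION INDEX;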

Applications that do not access data primarily via a key value will see no improvement; indeed, performance will likely be degraded in these applications. Any application that involves table scans or requires multiple indexes will not benefit from the index table. The index table is covered in much more detail on Day 13.

Enhanced Recovery Features

Oracle has made tremendous improvements in the areas of backup and recovery. Most of these new features revolve around the Recovery Manager. Another recovery feature in Oracle8 is the image copy backup, which can improve recovery time in the event of a failure.

Recovery Manager

Recovery Manager is an online utility designed to assist the DBA with all backup and recovery operations. Not only does it perform backup and recovery operations, it also maintains a database, called the recovery catalog, that stores information about these operations.

Image Copy Backup

An image copy backup essentially allows you to copy a datafile to another place on disk or to another disk on your system. In the event of a failure, no restore from backup media is necessary; you simply switch to that image copy. You must, however, perform a recovery to make that copy current. In the event of a failure, this might be the fastest way to recover.

Oracle Products

As part of the overview of the Oracle system, I would like to briefly cover the optional Oracle products that are available. Although many of these products are covered elsewhere in this book, you should at least be aware of their existence. The Oracle product line is divided into three areas:

• The Oracle server

• Development tools

• Applications