Oracle database Performance Tuning FAQ

From Oracle FAQ
Jump to: navigation, search

General Oracle database Performance Tuning FAQ. Remember: The best performance comes from the unnecessary work you don't do.

Why and when should one tune?[edit]

One of the biggest responsibilities of a DBA is to ensure that the Oracle database is tuned properly. The Oracle RDBMS is highly tunable and allows the database to be monitored and adjusted to increase its performance.

One should do performance tuning for the following reasons:

  • The speed of computing might be wasting valuable human time (users waiting for response);
  • Enable your system to keep-up with the speed business is conducted; and
  • Optimize hardware usage to save money (companies are spending millions on hardware).

Never forget that tuning should be aimed at fixing business problems: the overnight batch jobs don't finish till lunchtime; a screen takes 10 seconds to refresh, and SLA says 1 second; a report is needed every 10 minutes, but it takes 20 minutes to generate. Do not tune because you see some figure in a report that you don't like. For example, no end user has ever telephoned the help desk to complain that "there are too many buffer busy wait events".

Although this site is not overly concerned with hardware issues, one needs to remember than you cannot tune a Buick into a Ferrari.

Where should the tuning effort be directed?[edit]

Consider the following areas for tuning. The order in which steps are listed needs to be maintained to prevent tuning side effects. For example, it is no good increasing the buffer cache if you can reduce I/O by rewriting a SQL statement.

  • Database Design (if it's not too late):
Poor system performance usually results from a poor database design. One should generally normalize to the 3NF. Selective denormalization can provide valuable performance improvements. When designing, always keep the "data access path" in mind. Also look at proper data partitioning, data replication, aggregation tables for decision support systems, etc.
  • Application Tuning:
Experience shows that approximately 80% of all Oracle system performance problems are resolved by coding optimal SQL. Also consider proper scheduling of batch tasks after peak working hours.
  • Memory Tuning:
Properly size your database buffers (shared_pool, buffer cache, log buffer, etc) by looking at your wait events, buffer hit ratios, system swapping and paging, etc. You may also want to pin large objects into memory to prevent frequent reloads.
  • Disk I/O Tuning:
Database files needs to be properly sized and placed to provide maximum disk subsystem throughput. Also look for frequent disk sorts, full table scans, missing indexes, row chaining, data fragmentation, etc.
  • Eliminate Database Contention:
Study database locks, latches and wait events carefully and eliminate where possible.
  • Tune the Operating System:
Monitor and tune operating system CPU, I/O and memory utilization. For more information, read the related Oracle FAQ dealing with your specific operating system.

What tools/utilities does Oracle provide to assist with performance tuning?[edit]

Oracle provides the following tools/ utilities to assist with performance monitoring and tuning:

  • ADDM (Automated Database Diagnostics Monitor) introduced in Oracle 10g - Tuning Pack (cost option)
  • AWR - Tuning Pack (cost option)
  • Statspack - Free
  • TKProf - Free
  • Oracle Enterprise Manager - Tuning Pack (cost option)
  • Old UTLBSTAT.SQL and UTLESTAT.SQL - Begin and end stats monitoring, and report generation

When is cost based optimization triggered?[edit]

It's important to have statistics on all tables for the CBO (Cost Based Optimizer) to work correctly. If one table involved in a statement does not have statistics, and optimizer dynamic sampling isn't performed, Oracle has to revert to rule-based optimization for that statement. So you really want for all tables to have statistics right away; it won't help much to just have the larger tables analyzed.

Generally, the CBO can change the execution plan when you:

  • Change statistics of objects by doing an ANALYZE;
  • Change some initialization parameters (for example: hash_join_enabled, sort_area_size, db_file_multiblock_read_count).

How can one optimize %XYZ% queries?[edit]

It is possible to improve %XYZ% (wildcard search) queries by forcing the optimizer to scan all the entries from the index instead of the table. This can be done by specifying hints.

If the index is physically smaller than the table (which is usually the case) it will take less time to scan the entire index than to scan the entire table.

Where can one find I/O statistics per table?[edit]

The STATSPACK and UTLESTAT reports show I/O per tablespace. However, they do not show which tables in the tablespace has the most I/O operations.

The $ORACLE_HOME/rdbms/admin/catio.sql script creates a sample_io procedure and table to gather the required information. After executing the procedure, one can do a simple SELECT * FROM io_per_object; to extract the required information.

For more details, look at the header comments in the catio.sql script.

My query was fine last week and now it is slow. Why?[edit]

The likely cause of this is because the execution plan has changed. Generate a current explain plan of the offending query and compare it to a previous one that was taken when the query was performing well. Usually the previous plan is not available.

Some factors that can cause a plan to change are:

  • Which tables are currently analyzed? Were they previously analyzed? (ie. Was the query using RBO and now CBO?)
  • Has OPTIMIZER_MODE been changed in INIT<SID>.ORA?
  • Has the DEGREE of parallelism been defined/changed on any table?
  • Have the tables been re-analyzed? Were the tables analyzed using estimate or compute? If estimate, what percentage was used?
  • Have the statistics changed?
  • Has the SPFILE/ INIT<SID>.ORA parameter DB_FILE_MULTIBLOCK_READ_COUNT been changed?
  • Has the INIT<SID>.ORA parameter SORT_AREA_SIZE been changed?
  • Have any other INIT<SID>.ORA parameters been changed?

What do you think the plan should be? Run the query with hints to see if this produces the required performance.

It can also happen because of a very high high water mark. Typically when a table was big, but now only contains a couple of records. Oracle still needs to scan through all the blocks to see if they contain data.

Does Oracle use my index or not?[edit]

One can use the index monitoring feature to check if indexes are used by an application or not. When the MONITORING USAGE property is set for an index, one can query the v$object_usage to see if the index is being used or not. Here is an example:

SQL> CREATE TABLE t1 (c1 NUMBER);
Table created.

SQL> CREATE INDEX t1_idx ON t1(c1);
Index created.

SQL> ALTER INDEX t1_idx MONITORING USAGE;
Index altered.

SQL>
SQL> Prompt this view should be consulted as the owner of the object of interest (e.g. system will mostly see an empty view).
SQL> SELECT table_name, index_name, monitoring, used FROM v$object_usage;
TABLE_NAME                     INDEX_NAME                     MON USE
------------------------------ ------------------------------ --- ---
T1                             T1_IDX                         YES NO

SQL> SELECT * FROM t1 WHERE c1 = 1;
no rows selected

SQL> SELECT table_name, index_name, monitoring, used FROM v$object_usage;
TABLE_NAME                     INDEX_NAME                     MON USE
------------------------------ ------------------------------ --- ---
T1                             T1_IDX                         YES YES

To reset the values in the v$object_usage view, disable index monitoring and re-enable it:

ALTER INDEX indexname NOMONITORING USAGE;
ALTER INDEX indexname MONITORING   USAGE;

Why is Oracle not using the darn index?[edit]

This problem normally only arises when the query plan is being generated by the Cost Based Optimizer (CBO). The usual cause is because the CBO calculates that executing a Full Table Scan would be faster than accessing the table via the index. Fundamental things that can be checked are:

  • USER_TAB_COLUMNS.NUM_DISTINCT - This column defines the number of distinct values the column holds.
  • USER_TABLES.NUM_ROWS - If NUM_DISTINCT = NUM_ROWS then using an index would be preferable to doing a FULL TABLE SCAN. As the NUM_DISTINCT decreases, the cost of using an index increase thereby making the index less desirable.
  • USER_INDEXES.CLUSTERING_FACTOR - This defines how ordered the rows are in the index. If CLUSTERING_FACTOR approaches the number of blocks in the table, the rows are ordered. If it approaches the number of rows in the table, the rows are randomly ordered. In such a case, it is unlikely that index entries in the same leaf block will point to rows in the same data blocks.
  • Decrease the INIT<SID>.ORA parameter DB_FILE_MULTIBLOCK_READ_COUNT - A higher value will make the cost of a FULL TABLE SCAN cheaper.

Remember that you MUST supply the leading column of an index, for the index to be used (unless you use a FAST FULL SCAN or SKIP SCANNING).

There are many other factors that affect the cost, but sometimes the above can help to show why an index is not being used by the CBO. If from checking the above you still feel that the query should be using an index, try specifying an index hint. Obtain an explain plan of the query either using TKPROF with TIMED_STATISTICS, so that one can see the CPU utilization, or with AUTOTRACE to see the statistics. Compare this to the explain plan when not using an index.

When should one rebuild an index?[edit]

You can run the ANALYZE INDEX <index> VALIDATE STRUCTURE command on the affected indexes - each invocation of this command creates a single row in the INDEX_STATS view. This row is overwritten by the next ANALYZE INDEX command, so copy the contents of the view into a local table after each ANALYZE. The 'badness' of the index can then be judged by the ratio of 'DEL_LF_ROWS' to 'LF_ROWS'.

For example, you may decide that index should be rebuilt if more than 20% of its rows are deleted:

select del_lf_rows * 100 / decode(lf_rows,0,1,lf_rows) from index_stats
where name = 'index_ name';

However, generally speaking, indexes should not be rebuilt because performance will be degraded for a considerable time afterwards. The issue is that after rebuild, all blocks are full up to the percent free setting. Subsequent DML will therefore cause many block splits, until the index stabilizes with an appropriate amount of free space.

How does one tune Oracle Wait event XYZ?[edit]

Here are some of the wait events from V$SESSION_WAIT and V$SYSTEM_EVENT views:

  • db file sequential read: Tune SQL to do less I/O. Make sure all objects are analyzed. Redistribute I/O across disks.
  • buffer busy waits: Increase DB_CACHE_SIZE (DB_BLOCK_BUFFERS prior to 9i)/ Analyze contention from SYS.V$BH ("buffer busy waits" was replaced with "read by other session" Oracle 10g).
  • log buffer space: Increase LOG_BUFFER parameter or move log files to faster disks
  • log file sync: If this event is in the top 5, you are committing too often (talk to your developers)
  • log file parallel write: deals with flushing out the redo log buffer to disk. Your disks may be too slow or you have an I/O bottleneck.

Two useful sections in Oracle's Database Performance Tuning Guide:

What is the difference between DBFile Sequential and Scattered Reads?[edit]

Both "db file sequential read" and "db file scattered read" events signify time waited for I/O read requests to complete. Time is reported in 100's of a second for Oracle 8i releases and below, and 1000's of a second for Oracle 9i and above. Most people confuse these events with each other as they think of how data is read from disk. Instead they should think of how data is read into the SGA buffer cache.

db file sequential read:

A sequential read operation reads data into contiguous memory (usually a single-block read with p3=1, but can be multiple blocks). Single block I/Os are usually the result of using indexes. This event is also used for rebuilding the controlfile and reading datafile headers (P2=1). In general, this event is indicative of disk contention on index reads.

db file scattered read:

Similar to db file sequential reads, except that the session is reading multiple data blocks and scatters them into different discontinuous buffers in the SGA. This statistic is NORMALLY indicating disk contention on full table scans. Rarely, data from full table scans could be fitted into a contiguous buffer area, these waits would then show up as sequential reads instead of scattered reads.

The following query shows average wait time for sequential versus scattered reads:

prompt "AVERAGE WAIT TIME FOR READ REQUESTS"
select a.average_wait "SEQ READ", b.average_wait "SCAT READ"
from   sys.v_$system_event a, sys.v_$system_event b
where  a.event = 'db file sequential read'
and    b.event = 'db file scattered read';

How does one tune the Redo Log Buffer?[edit]

The size of the Redo log buffer is determined by the LOG_BUFFER parameter in your SPFILE/INIT.ORA file. The default setting is normally 512 KB or (128 KB * CPU_COUNT), whichever is greater. This is a static parameter and its size cannot be modified after instance startup.

SQL> show parameters log_buffer
NAME                                 TYPE        value
------------------------------------ ----------- ------------------------------
log_buffer                           integer     262144

When a transaction is committed, info in the redo log buffer is written to a Redo Log File. In addition to this, the following conditions will trigger LGWR to write the contents of the log buffer to disk:

  • Whenever the log buffer is MIN(1/3 full, 1 MB) full; or
  • Every 3 seconds; or
  • When a DBWn process writes modified buffers to disk (checkpoint).

Larger LOG_BUFFER values reduce log file I/O, but may increase the time OLTP users have to wait for write operations to complete. In general, values between the default and 1 to 3MB are optimal. However, you may want to make it bigger to accommodate bulk data loading, or to accommodate a system with fast CPUs and slow disks. Nevertheless, if you set this parameter to a value beyond 10M, you should think twice about what you are doing.

SQL> SELECT name, value
  2    FROM SYS.v_$sysstat
  3   WHERE NAME in ('redo buffer allocation retries',
  4                  'redo log space wait time');
NAME                                                                  value
---------------------------------------------------------------- ----------
redo buffer allocation retries                                            3
redo log space wait time                                                  0

Statistic "REDO BUFFER ALLOCATION RETRIES" shows the number of times a user process waited for space in the redo log buffer. This value is cumulative, so monitor it over a period of time while your application is running. If this value is continuously increasing, consider increasing your LOG_BUFFER (but only if you do not see checkpointing and archiving problems).

"REDO LOG SPACE WAIT TIME" shows cumulative time (in 10s of milliseconds) waited by all processes waiting for space in the log buffer. If this value is low, your log buffer size is most likely adequate.