DB_BLOCK_SIZE (2048 bytes) - 100 bytes overhead = 1948 bytes available for data
if avg row size = 195 bytes then each row takes ~10% of the 1948 usable bytes
set DB_BLOCK_SIZE larger than the 2K default (4K, 8K, etc.) - especially when mostly batch operations - note it can only be set at database creation
use high PCTINCREASE if table will grow significantly; otherwise, since growth rates are linear over long periods, set PCTINCREASE to 0 for segments but properly size the INITIAL and NEXT values (since PCTINCREASE is 0 SMON won't coalesce neighboring free extents into 1 large free extent, so periodically run ALTER TABLESPACE with the COALESCE option)
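e.g. a sketch (tablespace name data01 is an assumption):
ALTER TABLESPACE data01 COALESCE;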
PCTUSED + PCTFREE should = 90% (instead of 100%)
PCTFREE
better to have small PCTFREE and some migrated rows than high PCTFREE and only a few rows/block
low PCTFREE if rows will not be updated much
increase PCTFREE to decrease row migration
if update-intensive set PCTFREE that reflects avg growth rate of rows in table
set to 0 for DSS applications (completely static tables) - OLTP needs higher or will cause row migration
PCTFREE rule of thumb: ((avg row length - avg row length when inserted) / (avg row length)) * 100
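worked example (numbers assumed): rows insert at 90 bytes and grow to an avg of 120 bytes, so PCTFREE = ((120 - 90) / 120) * 100 = 25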
PCTUSED
low PCTUSED if there will be excessive insert activity
high PCTUSED if there will be excessive delete activity
basically irrelevant for DSS applications
PCTUSED for tables with avg activity: 100 - PCTFREE - ((avg row length * 100) / block size) (only relevant for tables that undergo deletes)
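worked example (numbers assumed): avg row length 195 bytes, 2048-byte blocks, PCTFREE 20, so PCTUSED = 100 - 20 - ((195 * 100) / 2048) = ~70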
use ANALYZE to identify chained and migrated rows
chained rows - when rows are too large to fit into a single data block (if so increase DB_BLOCK_SIZE)
row migration - when an update increases a row's length so that it no longer fits in its data block - the row is moved to another block and the original block keeps a pointer to the new location - so now 2 I/Os instead of 1 - set PCTFREE to allow enough space for updates
ANALYZE TABLE tablename LIST CHAINED ROWS
stores them in the CHAINED_ROWS table in the schema (or name a different table w/ the INTO TABLE clause) - need to run utlchain.sql first in the schema
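full sequence as a sketch (table name emp is an assumption; utlchain.sql lives under $ORACLE_HOME/rdbms/admin):
@?/rdbms/admin/utlchain.sql
ANALYZE TABLE emp LIST CHAINED ROWS INTO CHAINED_ROWS;
SELECT owner_name, table_name, head_rowid FROM CHAINED_ROWS WHERE table_name = 'EMP';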
MAXEXTENTS - extents can't span data files and multiblock reads can't span extents, so make sure the size of extents is a multiple of the size of each read that is performed (DB_FILE_MULTIBLOCK_READ_COUNT * DB_BLOCK_SIZE) so full table scan performance isn't degraded
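worked example (parameter values assumed): DB_FILE_MULTIBLOCK_READ_COUNT = 16 and DB_BLOCK_SIZE = 2048 gives 32K reads, so size extents in multiples of 32K (e.g. INITIAL 320K NEXT 320K)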
Create separate tablespaces for specific functions, should have SYSTEM, RBS (when create non-system tablespaces you need to create a non-system RBS), TEMP, DATA (for each application), and USERS (for user's personal tables) tablespaces. Use 'create user' or 'alter user' statements so that users access appropriate tablespaces.
Sample disk configuration for tablespaces (DBA_TABLESPACES) - spread across devices - example 9-disk environment:
disk 8 - all 3 redo-log files and export dump files
disk 9 - application software and archived redo logs
Put the TEMP tablespace (for temporary segments) on a disk w/ low activity. Temporary segments are used for sorting, index creation, and joins that can't be done in memory. Then change users so that they use it via 'create user' or 'alter user' statements. The INITIAL parameter for the temp tablespace should be moderately sized via trial and error. The NEXT parameter should be twice as large as INITIAL and PCTINCREASE should be 50 so that normal transactions only need 1 extent.
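sketch (file path, sizes, and username are assumptions):
CREATE TABLESPACE temp DATAFILE '/u03/oradata/prod/temp01.dbf' SIZE 100M
  DEFAULT STORAGE (INITIAL 256K NEXT 512K PCTINCREASE 50);
ALTER USER scott TEMPORARY TABLESPACE temp;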
Put index and data tablespaces on separate disks. Then 1 disk head can read the index information and then pass it off to the other disk head to read the data. This is only if data is large b/c otherwise everything will fit in memory anyway.
Create at least 2 user-defined rollback tablespaces on separate disks to hold rollback segments. Also, create multiple rollback segments in each rollback tablespace - all segments should be private - use is determined via init.ora. Can also create 1 large RBS w/ larger initial and next parameters and in code use 'set transaction use rollback segment' to utilize it for larger transactions. Make sure optimal value for RBS isn't too high so that additional overhead isn't incurred after each transaction.
Order your rollback segments in init.ora so that they toggle across multiple disks. For instance rollback_segments = (rbs01, rbs04, rbs02, rbs05, rbs03, rbs06) if 1,2,3 on disk 1 and 4,5,6 on disk 2.
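sketch of creating one private segment (names and sizes are assumptions):
CREATE ROLLBACK SEGMENT rbs01 TABLESPACE rbs_ts1
  STORAGE (INITIAL 512K NEXT 512K MINEXTENTS 20 OPTIMAL 10M);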
Oracle optimizer
- rule - fixed set of rules based on SQL statement syntax - doesn't require gathering of any statistics - in init.ora have OPTIMIZER_MODE=RULE or type ALTER SESSION SET OPTIMIZER_GOAL=RULE
- cost - uses previously gathered statistics from running the ANALYZE command (exclusively locks objects so run at off-peak hours - results stored in data dictionary) - need to regularly re-run ANALYZE (example below) - if no statistics have been gathered the rule method will be used instead - can estimate or compute exact statistics
- set by OPTIMIZER_MODE=CHOOSE (default) or by ALTER SESSION SET OPTIMIZER_GOAL=CHOOSE
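e.g. gathering statistics for the cost-based optimizer (table name is an assumption):
ANALYZE TABLE orders COMPUTE STATISTICS;
ANALYZE TABLE orders ESTIMATE STATISTICS SAMPLE 20 PERCENT;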
- use hints if the optimizer isn't doing what you want - for example, with the ORDERED hint the join order follows the FROM clause, so list the driving table (the one with fewer rows) first:
SELECT /*+ ORDERED */ COUNT(*)
FROM ORDERS A, ORDER_DETAILS B
WHERE A.ORDERID = B.ORDERID
Indexes
- contain entries mapping each key value to the rowids of matching rows
- 3 types - unique, non-unique, composite (max of 16 columns + column order matters)
ALTER TABLE emp ADD CONSTRAINT emp_pk_empno PRIMARY KEY (empno)
CREATE INDEX i_emp_ename ON emp(ename)
CREATE INDEX i_emp_deptno_ename ON emp(deptno, ename)
- Oracle uses B-tree indexes (balanced tree indexes) - faster queries, but inserts will be slower as you have more indexes on the table - updates and deletes may also be slower
- as indexes grow they run out of space and then branch levels (upper tiers) are created which identify the contents of each block on the next tier, which finally point to leaf levels (final tier) where index keys and row pointers are stored - deep or fragmented indexes need to be rebuilt
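sketch of checking and rebuilding an index (uses i_emp_deptno_ename from above; when to rebuild is a judgment call):
ANALYZE INDEX i_emp_deptno_ename VALIDATE STRUCTURE;
SELECT height, lf_rows, del_lf_rows FROM INDEX_STATS;
ALTER INDEX i_emp_deptno_ename REBUILD;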
Clusters
- store table data in a group (or cluster) based on a key value
index cluster (or just cluster)
- key value is indexed - need index on it referred to as a cluster index
- tables in it have 1 or more columns in common, are often joined together, and are stored in the same data blocks
- will also save disk space - when multiple rows have the same key value they are stored in the same data block
- good for large non-volatile tables (infrequently updated), tables with redundant values within a key column (but a wide range of values - not just M or F), and tables that are often joined in queries or accessed by the key column
- the cluster index is a B-tree index on the cluster key columns - different from a basic table index b/c null keys still have an entry in the cluster index, the table can't be accessed w/o the cluster index, the index must exist before data insertion, and the cluster index points to blocks (not a rowid)
- you should put the index in a different tablespace than the cluster data to avoid contention
steps to create it:
CREATE CLUSTER personnel (deptno number)
CREATE INDEX idx_personnel ON CLUSTER personnel
CREATE TABLE dept (deptno number, dname varchar2(15)) CLUSTER personnel(deptno)
CREATE TABLE emp (ename varchar2(25) not null check(ename=UPPER(ename)), deptno number not null, ...) CLUSTER personnel(deptno)
hash cluster
- doesn't use an index - uses a mathematical hash function to locate or distribute hashed data - rows with the same key values are stored in the same block
- quicker b/c no implied sort or range search and no need to scan blocks for the row - rather the hash function determines the exact location of the desired row
- goal is that duplicate values will generate the same block address, but a collision happens when 2 different inputs give the same address - when the hash cluster is built you tell it how many unique outputs you want it to handle (if more are needed, collisions can't be avoided) - this can't be changed later - excessive collisions lead to block overflow and chaining
- cluster key is the input to the function (1 or more columns)
- max # of hashkeys tells it how many unique outputs (block ids) are needed
- use a hash cluster when the clustered tables are very large, you know the table data very well, the data is fairly static, the key values are unique and evenly distributed, adequate space is available from the start, and queries are equality ones - otherwise must scan all blocks
steps to create it:
- 2000 rows - 100 uniques values - each row is 100 bytes long
- each Oracle block holds 4000 bytes of row data
- size = (# of rows / # of unique key values) x (bytes/row) = 20 x 100 = 2000 bytes per key
CREATE CLUSTER emp (deptno number) SIZE 2000 HASH IS deptno HASHKEYS 100
CREATE TABLE emp (ename varchar2(25) not null check(ename=UPPER(ename)), deptno number not null, ...) CLUSTER emp(deptno)
tune memory allocation
- reduce SQL statement parsing and paging or swapping
- what to tune in the SGA (in order)
shared pool
- library cache - stores SQL and PL/SQL statements
- first ensure apps use identical SQL statements
- tune library cache to keep cache miss rate < 5%
SELECT sum(pins) "Cache Requests", sum(reloads) "Cache Misses", (sum(reloads)/sum(pins))*100 "% Misses" FROM V$LIBRARYCACHE
- if > 5% then increase SHARED_POOL_SIZE parameter in init.ora
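e.g. in init.ora (the value is only an assumption - size it to your system):
shared_pool_size = 20971520   # 20 MB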
data dictionary cache - info about database, structures, and users
- info in it is used to parse SQL statements
- keep miss ratio in 7-10% range
SELECT sum(gets) "Gets", sum(getmisses) "Misses", (sum(getmisses)/sum(gets))*100 "% Misses" FROM V$ROWCACHE
- if > 10% then increase SHARED_POOL_SIZE parameter in init.ora
database buffer cache - each buffer is a copy of a single data block
- keep hit ratio in 85-95% range
SELECT name, value FROM V$SYSSTAT where name in('db block gets', 'consistent gets','physical reads')
hit ratio = 1 - (physical reads/(db block gets + consistent gets))
- if < 75% increase db_block_buffers parameter in init.ora
- to estimate what the gain will be, first set the DB_BLOCK_LRU_EXTENDED_STATISTICS parameter in init.ora to the max # of blocks that you could add
- then SELECT * FROM X$KCBRBH - it has one row per block you could add - COUNT shows the # of additional cache hits you would get by adding that buffer - INDX is 1 less than the # of blocks to add
SELECT sum(count) "Addtl Hits" FROM SYS.X$KCBRBH WHERE indx < 1000
- need to sum up to get total # of extra hits that you will get for specific # of blocks you want to add
hit ratio = 1 - ((physical reads - addtl cache hits)/(db block gets + consistent gets))
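the base hit ratio can also be computed in one query - a sketch:
SELECT 1 - (phy.value / (cur.value + con.value)) "Hit Ratio"
FROM V$SYSSTAT cur, V$SYSSTAT con, V$SYSSTAT phy
WHERE cur.name = 'db block gets' AND con.name = 'consistent gets' AND phy.name = 'physical reads';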
tune disk i/o (among server, DBWR, and LGWR)
- reduce disk contention by distributing i/o
- ensure that disk activity is spread evenly among all active disks
SELECT df.name, fs.phyrds, fs.phywrts FROM V$DATAFILE df, V$FILESTAT fs WHERE df.file# = fs.file#
- i/o rate for each disk = (phyrds + phywrts) / the interval of time the statistics were taken over
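worked example (numbers assumed): 12000 phyrds + 3000 phywrts over a 3600-second sampling interval = ~4.2 i/os per second for that disk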
- the SYSTEM tablespace should have a lower i/o rate b/c of the memory caches
- compare the rate to the capacity of your disks in the hardware documentation (remember that other apps may be accessing the disk also)
- if there is contention, rearrange data so as to distribute i/o evenly across disks by putting data files and redo log files on separate disks (LGWR writes on every commit whereas DBWR writes in batches (and at checkpoints)), spreading data from large files over several disks (stripe the table), putting tables and associated indexes on different disks, and reducing i/o not related to Oracle
- store data efficiently in data blocks to prevent chaining and create large enough extents to avoid fragmentation
- when an update adds data to a row, the row may have to be migrated to fit, or if enough space isn't found in a block the row gets chained into 2 or more blocks - chaining also happens when an insert's row size > DB_BLOCK_SIZE
ANALYZE TABLE emp LIST CHAINED ROWS INTO CHAINED_ROWS (created w/ utlchain.sql)
- simplest way to fix it is to create a copy of the table w/ corrected storage parameters, then delete the original table and rename the new one
- can also lower PCTFREE to prevent chaining for tables with large rows that are rarely modified, or increase PCTFREE to lower migration for tables w/ a lot of nulls or variable fields - a good estimate is that PCTFREE should be twice the size of the avg row
- also a lower PCTUSED keeps the block off the free list longer so it reduces migration - use for volatile tables, increase PCTUSED for static tables
tune contention
- for rollback segments (provide read consistency)
SELECT name, waits, gets from V$ROLLSTAT, V$ROLLNAME WHERE V$ROLLSTAT.usn = V$ROLLNAME.usn
"wait to get" ratio = sum(waits)/sum(gets) - should never > 1% - any waits are bad
- increase the # of rollback segments, which creates more header slots so more simultaneous transactions can take place - with the CREATE ROLLBACK SEGMENT command
# of concurrent transactions   # of rollback segments
n < 16                         4
16 <= n < 32                   8
32 <= n                        n/4 (but no more than 50)
- assign updates affecting data accessed by long queries to large rollback segments - b/c may need read-consistent image
- assign long transactions to large rollback segments - won't need to do dynamic extent allocation as much
- assign OLTP transactions to small rollback segments (only if not simultaneously queried)
- usually a transaction is automatically assigned to the least busy RBS, but can also use SET TRANSACTION USE ROLLBACK SEGMENT segment_name (sketch below)
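sketch of assigning a large batch transaction (segment name rbs_big and the UPDATE columns are assumptions; SET TRANSACTION must be the first statement of the transaction):
COMMIT;
SET TRANSACTION USE ROLLBACK SEGMENT rbs_big;
UPDATE orders SET status = 'CLOSED' WHERE orderdate < '01-JAN-90';
COMMIT;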
- for multi-threaded server processes
- covered in 'SQL*Net and Multi-threaded Server' class
always do 'analyze estimate' before querying a temporary table
Oracle data files should be < 2 GB - to check, run: select file_name, bytes from sys.dba_data_files;
make sure NTFS allocation unit size <= Oracle db_block_size (default is 2048 bytes) - to find db_block_size run: select name, value from v$parameter where name = 'db_block_size';