Database Definitions
Asynchronous Read-Ahead
Used on large table scans so that additional data is synchronously retrieved from disk while the current set is being processed - improves performance. Achieves overlap between I/O and processing so increases performance of parallel query operations.
B-Tree
The most important access path structure in database and file systems. Each node has manu seccessors (children) and each possible search path has the same length measured in terms of nodes (pages). The structure of a B-Tree is a branch block whose rows hold (separator key, kdba) pairs which guide the B-Tree to a row in a leaf block, and a leaf block whose rows hold the (KEY, KEYDATA) pairs stored by the B-Tree. B-Trees are automatically balanced by definition and always grow in an upward direction.
Check Constraint
A set of rules that govern whether a value can be updated or deleted.
Client
A software application (q.v.) that works on your behalf to extract a service from a server somewhere on the network. Think of your telephone as a client and the telephone company as a server to get the idea.
Client-Server (C/S) Architecture
Software architecture based on a separation of processing between two CPUs, one acting as the server that provides services in a transaction and the other as the client in the transaction, requesting and receiving services. This architecture allows for separate processing between the database server and the client application programs.
Clustered Parallel System
A loosely coupled hardware architecture where each node is a separate machine. Parallel system where a few SMP machines are linked together by an interconnect. Each node has one or more CPUs with its own private memory so messages are sent across an interconnect (specialized cabling) so nodes can communicate. A CPU is one node can't access memory in another node. A separate copy of the OS controls each node.
Clustering
Oracle's multitable clustering ability allows the server to store rows physically close together from multiple tables. Stores tables pre-joined. Requires that user be able to choose multiple tables that are queries together at least 80% of the time. Form of server-level data partitioning unlike doing a UNION of smaller tables views which is a form of application-level partitioning. A cluster index does clustering on 1 table - stores data pre-ordered.
Column
A vertical space in a database table that represents a particular domain of data. A column has a column name and a specific datatype. For example, in a table of employee information, all of the employees' dates of hire would constitute one column.
Commit
To make permanent changes to data (inserts, updates, deletes) in the database. Before changes are committed, both the old and new data exist so that changes can be stored or the data can be restored to its prior state.
Concurrency
The simultaneous access of the same data by multiple users executing different database applications or the same application. In database software, concurrency requires additional logic to ensure that all users see consistent data and that all changes are made in proper order.
Constraint
A rule or restriction concerning a piece of data (such as a NOT NULL restriction on a column) that is enforced at the database level, rather than the object or application level.
CURSORs
CURrent Set Of Rows - used to return multi-row sets from stored procedures.
Data Dictionary
A comprehensive set of tables and views automatically created and updated by the Oracle Server, which contains administrative information about users, data storage, and privileges. It is installed when Oracle is initially installed and is a central source of information for the Oracle Server itself and for all users of Oracle. The tables are automatically maintained by Oracle. It is sometimes referred to as the catalog.
Data Mart
Data Mining
Data Warehouse
Database
(1) A set of operating system files, treated as a unit, in which an Oracle Server stores a set of data dictionary tables and user tables. A database requires three types of files: database files, redo log fields, and control files. (2) The disk space corresponding to this set of files. (3) A subset of database objects necessary to support a single database application.
Datacube
A potential implementation for a star query - each Dimension table is a dimension of the cube while the cube's contents are the Fact table. It is usually utilized by OLAP tools for multidimensional analysis.
Datatype
(1) A standard form of data. The Oracle datatypes are CHAR, DATE, NUMBER, LONG, RAW, and LONG RAW; however, the Oracle Server recognizes and converts other standard datatypes. (2) A named set of fixed attributes that can be associated with an item as a property
DBMS
Database Management Software - software that manages a large amount of data in a multi-user environment so that many users can concurrently access the same data. All this must be managed while delivering high performance to the users of the database.
DDL
Data Definition Language - The category of SQL statements that define or delete database objects such as tables or views. Examples are the CREATE, ALTER, and DROP statements. It does DMLs against the data dictionary
Deadlock
A situation in which two or more users of a database cannot complete their transactions, because each user is holding a resource that the other user requires in order to complete. The Oracle Server automatically avoids and resolves deadlocks.
Disk Affinity
For shared-nothing or hybrid architectures. The query coordinator knows which disks (data) are associated with which nodes (CPUs) and therefore ships queries to these nodes directly rather than uses the shared interconnect. Get benefits of shared-nothing architecture without the overhead and costs.
Distributed Database Computing
Distribute data across physically separate machines transparently to networked users as a single logical database. Can be done either synchronously or asynchronously. Synchronous updates ensures application integrity, decreases complexity, increases response time for the application, and decreases the availability in case the network link goes down. Asynchronous updates (symmetric replication) increases availability, decreases response time, increases the complexity to try and ensure data integrity. Try to optimize resource utilization by distributing application workloads across operating environments and networks - try for increased throughput and decreased cost. Goals are to ensure data integrity and simplify application development. Use to treat physically distributed databases as a single logical database. A query can span multiple machines and databases - updates and deletes are protected by a two-phase commit (commit points are used on the selected node to ensure that the commit on this node doesn't have to wait for the commit on others). They protect mainframe updates from PC updates - the commit point database can be non-Oracle. If want to access non-Oracle databases then use Oracle Gateways.
DML
Data Manipulation Language - the category of SQL statements that query and update the database data. Common DML statements are SELECT, INSERT, UPDATE, and DELETE.
Fail-Over Configuration
Protects against the failure of a primary system by enabling continued processing on a secondary fail-over system. Either use parallel server or a standby database to do it.
Fast Commits
Committed transactions aren't necessarily written immediately to database files; however they are written to redo log files (fast, sequential, multiblock I/O)
Group Commits
If transaction rate is high and they are many updates hitting the database then each write to the redo log will commit several transactions.
Indexes
Instance
SGA and minimum of 4 background processes (PMON, SMON, DBWriter, and LogWriter).
Integrated
No need for specialized servers - all functionality included in one server.
Integrity Constraint
A business rule applying to one or more columns in a table that defines the valid values for those columns. For example, a column value cannot be null, or it must have one of a predefined set of values (as in state or area codes), or it must be in a certain numeric range, or it must exist in another table (as in foreign key). Integrity constraints are defined for column when the table is created or altered. There are six possible types (CHECK, UNIQUE, PRIMARY, FOREIGN, DEFAULT, and NULL).
Joins
Lock
(1) (noun) A temporary "hold" on database resources, acquired by a user while using that resource, to support concurrency. A lock is a mechanism intended to prevent destructive interaction between users accessing Oracle data.(Destructive interaction can be interpreted as any interaction that incorrectly updates data or incorrectly alters underlying data structures). Locks are used to achieve consistency and integrity. There are numerous types of locks. (2) (verb) To request a lock on a database table or row with the SQL statement LOCK, in order to temporarily restrict other user's access to it and to perform updates or queries.
MTTR
Mean Time To Recovery
Non-Blocking Queries
Simultaneously run queries in the background while continuing to do other work.
Normalization
A step-by-step process that produces either entity or table definitions that have no repeating groups, the same kind of values assigned to attributes or columns, a distinct name, distinct and uniquely identifiable rows. Use normalization for administrative reasons - de-normalize data for performance reasons.
OLAP
OLTP
Partition Views
Allows a large table to be broken up into multiple smaller tables - the use a UNION ALL view to see the entire table. Improves availability, administration, and performance since can operate at partition level instead of entire table.
Precompiler
A tool that allows a user to embed any SQL statement in a 3GL (host language) program. The precomplier takes this program as input and produces as output another program, in the same 3GL, in which all the embedded SQL calls are translated into host language procedure calls. This resulting program can be compiled, linked, and executed. Oracle supports precompilers for the languages C, FORTRAN, Pascal, P/I, COBOL, and Ada.
Primary Key
In a database table, a set of columns used to enforce uniqueness of rows. The combination of column values is unique for each row in the table. They primary key is the most frequently used means of accessing rows.
Query
A SQL SELECT statement that retrieves data, in any combination, expression, or order. Queries are read-only operations; they do not change any data, they only retrieve data. Queries are often considered to be DML statements
Query Optimizer
Oracle has a parallel-aware cost-based optimizer which bases it on the number of available processors and disk devices that store the data. Decides on the access path choices like table scan vs. index scan. Creates histograms to estimate the distribution of column data to see if have a data skew. Also recognizes star queries and partition views (can eliminate the reading of some partitions). Use EXPLAIN PLAN facility to see SQL execution plan - use SQL hints to override. Use ANALYZE TABLE command first to gather statistics on the table populations.
RBS
rollback segment - created whenever a DML is issued
RDBMS
Relational DataBase Management System - database system which organizes and accesses data as two-dimensional rows and columns. Use SQL to load, manage, update, and access information - they drive DSS or OLTP applications.
Read Consistency
Query executed by a transaction sees only data that was committed at the time of the query. If selecting sum of inventory then while it is doing that someone updates inventory it won't take that into consideration. Otherwise "dirty reads" so read locks are released and writers of data modify the information being read. No read locks in Oracle so reads don't block writes and writes don't block reads. Multiple concurrent transactions can modify different rows in the same block and there is no limit on the number of row locks - they also never escalate.
Referential Integrity
A condition that guarantees that the values in one column also exist in another column. The relationship between a primary key (unique, not null) in one table and the foreign key it refers to in another table.. This guarantee is enforced through the use of integrity constraints.
Replication
Enables the placement of shared data at locations throughout the organization. Provides "fast" local access an helps ensure against single system failures. Different architectures are distribution, consolidation, data off-loading, disaster site maintenance, workflow, and multi-site data sharing. Can be Row-change or procedural call replication and can be synchronous (real-time - uses distributed locks on data) or asynchronous (store-and-forward). Important for data warehousing applications. Needed when local copies can increase performance and availability, but they don't have to be current. It copies tables from one node on a network to another. Synchronous (real-time) methods do it through the use of triggers - however there are drawbacks because of the overhead and availability requirements. To do it at specific times and intervals (asynchronously) use 'CREATE SNAPSHOT' or deferred RPCs- this is the preferred method usually since real-time updates aren't always demanded. Oracle's is peer-to-peer, not slave/master so it is more robust. You can also write customized conflict resolution routines.
ROLAP
Rollback (noun)
The second half of the recovery procedures. After the roll forward, any changes that were not committed must be undone. After the redo log files have been applied, then the rollback segments are used to identify and undo transactions that were never committed, yet were recorded in the redo log. Oracle completes this step automatically.
Row
(1) Synonym for record; one row of data in a database table, having values for one or more columns. (2) One set of field values in the output of a query.
Schema
A collection of table definitions. What a user owns or has access to in the database.
Serializable Transactions
Server
(a) Software that allows a computer to offer a service to another computer. Other computers contact the server program by means of matching client (q.v.) software. (b) The computer on which the server software runs.
SGA
System Global Area - shared memory area for query execution plans, data, etc. Also includes Shared SQL area for parsed SQL statements and procedures which reduces memory utilization and application startup time. Great for OLTP systems running many users. The key to tuning Oracle is the size of the SGA.
Snowflake Schema
It is a complex star schema that has more Dimension tables and more foreign key/primary key relationships. However, all Dimension tables should still be joined first.
SQL
Structured Query Language - The internationally accepted standard for relational systems covering not only query, but also data definitions, manipulation, security and some aspects of referential integrity. It's non-procedural - the RDBMS determines how to do things - 1 line is run at a time and there is no connection between them
Star Query
Use for data warehouse queries (with a star schema representation). In this type a query it is more efficient to defer the costly operation of accessing the Fact table until the end - it will do a Cartesian join between the Dimension tables first since join predicates don't exist between them. Oracle's cost optimizer can tell if it is a star query by seeing if 1 table is significantly larger than the others and if the larger table has a concatenated index. Use a hint is it can't recognize it. Relational OLAP tools use them to have OLAP-style processing on top of relational databases.
Star Schema
A natural representation for a data warehouse - it will have one very large Fact table like sales, and multiple smaller Dimension tables like products, customers, suppliers, stores, etc. Each Dimension table will share a primary key/foreign key relationship with the Fact table.
Dimension Table Dimension Table
Fact Table
Dimension Table Dimension Table
Stored Procedure
A stored procedure is a group of pre-compiled and pre-optimized SQL statements that performs some database operation for repeated execution. They are not automatically enabled like triggers - they are called to perform specific work. Reduces network traffic/performance b/c code will execute on server, not client.
Symmetric Replication
Asynchronous distributed computing. Called symmetric since all replicates are potentially updateable (through 2 different mechanisms - multiple masters or updateable snapshots (both done through deferred RPCs)). Automatically propagates data to other databases using Oracle Gateways - a two-phase commit guarantees consistent data. Used for off-loading data to a data warehouse, backing up data, and disseminating and collecting data to remote sites. Can support both conflict avoidance (primary-site ownership (local sites only can change their accounts) or dynamic ownership (workflow)) and conflict detection and resolution (timestamp, priority, additive, min, max, avg, discard, etc.). A replication catalog provides a single repository that contains metadata (like objects, sites, mechanisms) that define the replicated environment.
Synonym
An alias for a table, view, sequence , or program unit; a synonym is not actually an object itself, it is a direct reference to its base object. Synonyms are used to mask the real name and owner of an object, provide public access to an object, provide location transparency for tables, views, or program units of a remote database and simplify the SQL statements for database users. A synonym can be public or private.
3-Tier
A classic 3-tier is a client - client - server setup. You get extended scalability, a thin client, load balancing, architectural insurance, and centralized management. The client in the middle is your application server and the server on the back-end is the database server. Traditionally in the middle will sit a TP monitor or DCE, CORBA, or a 3GL. It extends scalability, but you have to do a lot more coding.
Table
The basic unit of storage in a relational database management system. A table represents entities and relationships, and consists of one or more units of information (rows), each of which contains the same kinds of values (columns) Each column is given a column name, a datatype (such as CHAR, DATE, or NUMBER), and a width (the width may be predetermined by the datatype, as in DATE). Once a table is created, valid rows of data can be inserted into it. Table information can then be queried, deleted, or updated. To enforce defined business rules on a table's data, integrity constraints and triggers can also be defined for a table. In SQL* forms, a block in a form is associated with one base table.
Table Constraint
An integrity constraint that applies to multiple columns of the same table.
Table Partitioning
Dynamic table partitioning divides work at runtime according to the degree of concurrent processing desired. It tries to keep all of the query processors busy and have them all finish at the same time. Physical table partitioning require unloading, repartitioning, reloading, and reindexing the data if you want to modify it. The best performance full table scan on an SMP machine is the time it takes a single CPU to process the largest partition. Both physical and dynamic can be offered at the same time.
Tablespace
A logical portion of an Oracle database used to allocate storage for table and index data. Each tablespace corresponds to one or more physical database files. Every Oracle database has a tablespace called SYSTEM and may have additional tablespaces. A tablespace is used to group related logical structures together. For example, tablespaces commonly group all of an application's objects to simplify certain administrative operations.
TPC
Transaction Processing Performance Council -
TPC-A = lightweight - simulation of banking system (1 transaction type)
TPC-B = rarely used
TPC-C = measures OLTP workload containing a mixture of read-only and update-intensive transactions simulating an order-entry and order-processing system (9 tables, 5 transaction types) - heavyweight benchmark - includes tpmC (transactions per minute) and $tpmC (price/performance based on 5-year of ownership including maintenance).
TPC-C/S = client/server benchmark
TPC-D = new decision-support benchmark (17 queries against large database)
TPC-E = enterprise benchmark
TPC-S = server benchamrk
Transaction
A logical unit of work that comprises one or more SQL statements executed by a single user. According to the ANSI/ISO SQL standard, with which Oracle is compatible, a transaction begins with the user's first executable SQL statement. A transaction ends when it is explicitly committed or rolled back by the user. Has the following ACID properties:
Atomicity - either all changes happen or none.
Consistency - no integrity constraints are violated by the transactions' actions.
Isolation - can be executed concurrently with other transactions.
Durability - once committed, changes to the state survive failures.
Trigger
(1) A PL/SQL procedure that is executed, or "fired," at runtime. (2) A stored procedure that is fired (implicitly executed) when an INSERT, UPDATE, or DELETE statement is issued against the associated table. Database triggers can be used to customize a database management system with such features as value-based auditing and enforcement of complex security checks and integrity rules. For example, a database trigger might be created to allow a table to be modified only during normal business hours. Used for business rules like to make sure a salary is within the range for that specific job grade.
Two-phase Commit
A method used by a distributed DBMS (database management system) to ensure that a transaction is valid at all sites by the time it commits (to make a change to data permanent) or rolls back (to restore data to its prior state before the change was made) . All sites either commit or roll back together, no matter what errors occur in the network or on the machines tied together by the network. Used to control and monitor distributed transactions - it maintains integrity. Commit or rollback of a distributed transaction must be coordinated over a network so that the participating nodes will either all commit or all rollback the transaction. Has two phases - prepare (transaction coordinator asks all participants to get ready), and commit phase (if can't doesn't happen on all coordinator asks all nodes to rollback). A non-Oracle database can also participate via Gateways. It is the building block upon which other distributed strategies like data replication are built.
VLDB