Search This Blog
Friday, September 18, 2009
Physical Structure
The Oracle database consists of several types of operating system files, which it uses to manage itself. Although you call these files physical files to distinguish them from the logical entities they contain, understand that from an operating system point of view, even these files are not really physical, but rather logical components of the physical disk storage supported by the operating system.
Oracle Data Files
Data files are the way Oracle allocates space to the tablespaces that make up the logical database. The data files are the way Oracle maps the logical tablespaces to the physical disks. Each tablespace consists of one or more data files, which in turn belong to an operating system file system. Therefore, the total space allocated to all the data files in an Oracle database will give you the total size of that database.
When you create your database, your system administrator assigns you a certain amount of disk space based on your initial database sizing estimates. All you have at this point are the mount points of the various disks you are assigned (e.g., /prod01, /prod02, /prod03, and so on). You then need to create your directory structure on the various files. After you install your software and create the Oracle administrative directories, you can use the remaining file system space for storing database objects such as tables and indexes. The Oracle data files make up the largest part of the physical storage of your database. However, other files are also part of the Oracle physical storage, as you'll see in the following sections.
The Control File
Control files are key files that the Oracle DBMS maintains about the state of the database. They are a bit like a database within a database, and they include, for example, the names and locations of the data files and the all-important System Change Number (SCN), which indicates the most recent version of committed changes in the database. Control files are key to the functioning of the database, and recovery is difficult without access to an up-to-date control file. Oracle creates the control files during the initial database creation process.
Due to its obvious importance, Oracle recommends that you keep multiple copies of the control file. If you lose all your control files, you won't be able to bring up your database without re-creating the control file using special commands. You specify the name, location, and the number of copies of the control file in the initialization file (init.ora). Oracle highly recommends that you provide for more than one control file in the initialization file. Here's a brief list of the key information captured by the Oracle control file:
Checkpoint progress records
Redo thread records
Log file records
Data file records
Tablespace records
Log file history records
Archived log records
Backup set and data file copy records
Control files are vital during the time the Oracle instance is operating and during database recovery. During the normal operation of the database, the control file is consulted periodically for necessary information on data files and other things. Note that the Oracle server process continually updates the control file during the operation of the database. During recovery, it's the control file that has the data file information necessary to bring the database up to a stable mode.
The Redo Log Files
The Oracle redo log files are vital during the recovery of a database. The redo log files record all the changes made to the database. In case you need to restore your database from a backup, you can recover the latest change made to the database from the redo log files.
Redo log files consist of redo records, which are groups of change vectors, each referring to a distinct atomic change made to a data block in the Oracle database. Note that a single transaction may have more than one redo record. If your database comes down without warning, the redo log helps you determine if all transactions were committed before the crash or if some were still incomplete. Initially, the contents of the log will be in the redo log buffer (a memory area), but they will be transferred to disk very quickly.
Oracle redo log files contain the following information regarding database changes made by transactions:
Indicators to let you know when the transaction started
Name of the transaction
Name of the data object (e.g., an application table) that was being updated
The "before image" of the transaction (i.e., the data as it was before the changes were made)
The "after image" of the transaction (i.e., the data after the transaction changed the data)
Commit indicators that let you know if and when the transaction completed
Logical Structure
Oracle9i uses a set of logical database storage structures to use disk space, whether the database uses operating system files or "raw" database files. These logical structures, which primarily include tablespaces, segments, extents, and blocks, control the allocation and use of the "physical" space allocated to the Oracle database. Note that Oracle database objects such as tables, indexes, views, sequences, and others are also logical entities. These logical objects make up the relational design of the database, which I discuss in detail later on.
You can look at the logical composition of Oracle databases from either a top-down viewpoint or a bottom-up viewpoint. Let's use the bottom-up approach by first looking at the smallest logical components of an Oracle database and progressively move up to the largest entities. Before you begin learning about the logical components, remember that Oracle database objects such as tables, indexes, and packaged SQL code are actually logical entities. Taken together, a set of related database logical objects is called a schema. Dividing a database's objects among various schemas promotes ease of management and a higher level of security.
The Oracle data block is the foundation of the database storage hierarchy. The Oracle data block is the basis of all database storage in an Oracle database. Two or more contiguous Oracle data blocks form an extent. A set of extents that you allocate to a table or an index (or some other object) is termed a segment. A tablespace is a set of one or more data files, and it usually consists of related segments. The following sections explore each of these logical database structures in detail.
Data Blocks
The smallest logical component of an Oracle database is a data block. Data blocks are defined in terms of operating system bytes, and they are the smallest unit of space allocation in an Oracle database. For example, you can size an Oracle data block in units of 2KB, 4KB, 8KB, 16KB, or 32KB (or ever larger chunks). It is common to refer to the data blocks as Oracle blocks.
The storage disks on which the Oracle blocks reside are themselves divided into disk blocks, which are areas of contiguous storage containing a certain number of bytes—for example, 4096 or 32768 bytes (4KB or 32KB, because each kilobyte has 1024 bytes). Note that if the Oracle block size is smaller than the operating system file system buffer size, you may be wasting the capacity of the operating system to read and write larger chunks of data for each I/O. Ideally, the Oracle block size should be a multiple of the operating system block size. On an HP-UX system, for example, if you set your Oracle block size to a multiple of the operating system block size, you gain 5 percent in performance.
Multiple Oracle Block Sizes
The initialization parameter db_size determines the size of the Oracle data block in the database. Unlike previous Oracle database versions, Oracle9i lets you specify up to four additional nonstandard block sizes in addition to the standard block size. Thus, you can have a total of five different block sizes in an Oracle9i database. For example, you can have 2KB, 4KB, 8KB, 16KB, and 32KB block sizes all within the same database. If you choose to configure multiple Oracle block sizes, you must also configure corresponding subcaches in the buffer cache of the System Global Area (SGA, Oracle's memory allocation), as you'll see shortly. Multiple data block sizes aren't always necessary, and you'll do just fine with one standard Oracle block size.
Extents
When you combine several contiguous data blocks, you get an extent. Remember that you can allocate an extent only if you can find enough contiguous data blocks. Your choice of tablespace type will determine how the Oracle database allocates the extents. The traditional dictionary-managed tablespaces allow you to specify both the beginning allocation of space and further increments as needed.
All database objects are allocated an initial amount of space, called the initial extent, when they are created. When you create an object, you specify the size of the next and subsequent extents as well as the maximum number of extents for that object, in the object creation statement. Once allocated to a table or index, the extents remain allocated to the particular object, unless you drop the object from the database, at which time the space will revert to the pool of allocatable free space in the database.
The locally managed tablespaces (explained later in this chapter) use the simpler method of allocating a uniform extent size, which is chosen automatically by the database. Therefore, you don't have to worry about setting the actual sizes for future allocation of extents to any particular tablespace.
Segments
A set of extents forms the next higher unit of data storage, the segment. In an Oracle database you can store different kinds of objects, such as tables and indexes, as you'll see in the next few chapters. Oracle calls all the space allocated to any particular database object a segment. That is, if you have a table called customer, you simply refer to the space you allocate to it as the "customer segment."
Each object in the database has its own segment. For example, the customer table is associated with the customer segment. When you create a table called customer, Oracle will create a new segment named customer also and allocate a certain number of extents to it based on the table creation specifications for the customer table. When you create an index, it will have its own segment named after the index name.
You can have several types of segments, with the most common being the data segment and the index segment. Any space you allocate to these two types of segments will remain intact even if you truncate (remove all rows) a table, for example, as long as the object is part of the database. However, as you'll see later on, there's something called a temporary tablespace in Oracle, and Oracle deallocates the temporary segments as soon the transactions or sessions that are using the segments are completed.
Tablespaces
A tablespace is defined as the combination of a set of related segments. For example, all the data segments belonging to the out-of-town sales team can be grouped into a tablespace called out_of_town_sales. You can have other segments belonging to other types of data in the same tablespace, but usually you try to keep related information together in the same tablespace. Note that the tablespace is a purely logical construct, and it is the primary logical structure of an Oracle database. All tablespaces need not be of the same size within a database. For example, it is quite common to have tablespaces that are 100GB in size coexisting in the same database with tablespaces as small as 1GB. The size of the tablespace depends on the current and expected size of the objects included in the various segments of the tablespace.
As you have already seen, Oracle9i lets you have multiple block sizes, in addition to the default block size. Because tablespaces ultimately consist of data blocks, this means that you can have tablespaces with different block sizes in the same database. This is a great new feature, and it gives you the opportunity to pick the right block size for a tablespace based on the data structure of the tables within that tablespace. This customization of the block size for tablespaces provides several benefits:
· Optimal disk I/O: Remember that the Oracle server has to read the table data from mechanical disks into the buffer cache area for processing. One of your primary goals as a DBA is to optimize the expensive I/O involved in the reads and writes to disk. If you have tables that have very long rows, you're better off with a larger block size. Each read, for example, will fetch more data than with a smaller block size, and you'll need fewer read operations to get the same amount of data. Tables with large object (LOB) data will also benefit from a very high block size. Similarly, tables with small row lengths can have a small block size as the building block for the tablespace. If you have large indexes in your database, you need to have a large block size for their tablespace, so each read will fetch a larger number of index pointers for the data.
· Optimal caching of data: The Oracle9i feature of separate pools for the various block sizes leads to a better use of the buffer cache area.
· Easier to transport tablespaces: If you have tablespaces with multiple block sizes, it's easier to use the "transport tablespaces" feature.
Each Oracle tablespace consists of one or more operating system (or raw) files called data files, and a data file can only belong to one tablespace. An Oracle database can have any number of tablespaces in it. You could manage a database with ten or several hundred tablespaces, all sized differently. However, for every Oracle database, you'll need a minimum of two tablespaces: the System tablespace and the temporary tablespace. Later on, you can add and drop tablespaces as you wish, but you can't drop the System tablespace. At database creation time, the only tablespace you must have is the System tablespace, which contains Oracle's data dictionary. However, users need a temporary location to perform certain activities such as sorting, and if you don't provide a default temporary tablespace for them, they end up doing the temporary sorting in the System tablespace.
When you consider the importance of the System tablespace, which contains the data dictionary tables along with other important information, it quickly becomes obvious why you must have a temporary tablespace. Oracle9i allows you to create this temporary tablespace at database creation time, and all users will automatically have this tablespace as their default temporary tablespace. In addition, if you choose the Oracle-recommended Automatic Undo Management over the traditional manual rollback segment management mode, you'll also need to create the undo tablespace at database creation time. Thus, although only the System tablespace is absolutely required by Oracle, your database should have at least the System, temporary, and undo tablespaces when you initially create it.
Tuesday, June 10, 2008
Types of Databases
In many organizations, as a DBA you can face on a daily basis the possibility of working with different types of databases and, as a result, different types of data and management requirements. You may find yourself working on simple SQL queries with users and simultaneously wrestling with decision-support systems for management. Databases perform a variety of functions, but you can group all of those functions into two broad categories: online transaction processing (OLTP) and decision support system (DSS) (which is sometimes also called online analytical processing [OLAP]). Let's take a quick look at some of the basic classifications of Oracle databases.
Online Transaction Processing and Decision Support System Databases
The online transaction processing (OLTP) category of databases represents the bread and butter of most consumer- and supplier-oriented databases. This category includes order entry, billing, customer, supplier, and supply-chain databases. These databases are characterized by rapid transactions and a need to be online continuously, which today (given the use of the Internet to access such systems) means 24/7/365 availability, short maintenance intervals, and low tolerance for breakdowns in the system.
Decision support systems (DSSs) range from small databases to large data warehouses. These are typically not 24/7 operations and they can easily manage with regularly scheduled downtime and maintenance windows. The extremely large size of some of these data warehouses necessitates the use of special techniques both to load and to use the data.
There isn't a whole lot of difference between the administration of a DSS-oriented data warehouse and a transaction-oriented OLTP system from the DBA's perspective. The backup and recovery methodology is essentially the same, and database security and other related issues are also very similar. The big difference between the two types of databases occurs at the design and implementation stages. DSS systems usually involve a different optimization strategy for queries and different physical storage strategies. Oracle9i provides you a choice between using an OLTP database and a DSS database using the same database server software.
Performance design considerations that may work well with one type of database may be entirely inappropriate for another type of database. For example, a large number of indexes can help you query a typical data warehouse efficiently while you are getting some reports out of that database. If you have the same number of indexes on a live OLTP system with a large number of concurrent users, you may see a substantial slowing down of the database, because the many updates, inserts, and deletes on the OLTP system require more work on the part of the database. So, know what you are trying to achieve with your new system and design accordingly in terms of choosing indexes, index types, and space parameters for the various objects in your database.
Development, Test, and Production Databases
Oracle databases can generally be grouped into three categories: development, test, and production. Applications are developed, tested, and put into production. A firm usually has all three database versions in use at any given time, although for smaller companies, the test and development versions of the database may be integrated in one database.
Development databases are usually owned by the development team, which has full privileges to access and modify data and objects in those databases. The test databases are designed to simulate actual production databases and are used to test the functionality of code after it comes out of the development databases. No new code is usually implemented in the production or the "real" databases of the company unless it has been successfully tested in the test databases.
When a new application is developed, tested, and put into actual business use (production), the development and production cycle does not end. Application software is always being modified, for two reasons: to fix bugs and to improve the functionality of the application. Although most applications go through several layers of testing before they move into production, coding errors and the pressure to meet deadlines contribute to actual errors in software, which are sometimes caught later when the application is already in use. In addition, users continually request (or, more appropriately, demand) modifications in the software to improve the functionality of the application. Consequently, application code does not remain static; rather, developers and testers are always working on it.