Inexpensive hard drive space has reduced the significance of normalisation. Oracle is taking advantage of added space by offering strategies to streamline database performance.
The process of removing redundancy from tables is called data normalisation, which attempts to minimise the amount of duplication within the database design. Although normalisation was an excellent technique during the 1980s, when disk space was very expensive, the rules have changed in the 21st century, with disk costs dramatically lower. Today, adding redundancy is a very important aspect of designing high-performance Oracle databases.
The introduction of redundancy to avoid costly table joins can dramatically improve the speed at which Oracle SQL queries are serviced. It is the challenge of the Oracle design professional to choose the appropriate database design to ensure that SQL queries are serviced as quickly as possible. Instead of removing redundancy, the Oracle designer controls the introduction of redundancy using specific rules.
When to add redundancy
Essentially, the decision to introduce redundancy is a function of the size of the redundant column and the frequency with which the column is updated. The ideal candidates for duplication are table columns that meet the following criteria:
1. The introduction of redundancy will eliminate the need to repeatedly join two tables together.
2. The data column is small.
3. The data column is static and rarely updated.
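The three criteria can be illustrated with a minimal sketch. The customer and state tables and their columns are hypothetical and are assumed here purely for illustration; state_name is used because it is small, static, and would otherwise require a join on every lookup.

```sql
-- Hypothetical schema (assumed for illustration only):
--   state(state_code, state_name)
--   customer(cust_id, cust_name, state_code)

-- Normalised design: every report that needs the state name pays for a join.
select c.cust_name, s.state_name
from customer c, state s
where c.state_code = s.state_code;

-- Denormalised design: copy the small, static column into customer.
alter table customer add (state_name varchar2(30));

update customer c
set c.state_name = (select s.state_name
                    from state s
                    where s.state_code = c.state_code);

-- The same report now reads a single table.
select cust_name, state_name
from customer;
```

The cost is that any change to state.state_name must also be applied to customer, which is why rarely updated columns make the best candidates.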
Planned data denormalisation
Oracle was one of the first databases to introduce tools for planned data denormalisation. As hard drives became cheaper throughout the 1990s, Oracle recognised that significant performance improvements could be achieved by deliberately introducing redundant data items into the Oracle table and index structures.
Snapshots
One of Oracle's first forays into data redundancy was the introduction of Oracle snapshots. With Oracle's advanced replication option, copies of tables could be made on remote database servers and refreshed at specific intervals. This redundant duplication of Oracle tables across widely dispersed geographical areas ensured that users were able to retrieve information quickly on a local server without the need to travel across a large network.
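A snapshot of this kind might be defined roughly as follows. This is a sketch only: the customer table, the snapshot name, the hq_link database link, and the daily refresh interval are all assumptions made for illustration.

```sql
-- Run on the remote database that is local to the users.
-- Assumes a database link named hq_link to the master site already exists.
create snapshot customer_snap
refresh complete
start with sysdate
next sysdate + 1          -- refresh once a day
as
select *
from customer@hq_link;

-- Local queries now read the copy instead of crossing the network.
select count(*)
from customer_snap;
```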
VARRAYs
Oracle also allows the introduction of redundant information using VARRAY table structures. In a VARRAY table, Oracle provides for non-first-normal-form data structures by storing repeating groups of values directly within a single Oracle table row. This avoids the overhead of joining the base table to a subordinate table to retrieve the result set.
Let's look at a simple VARRAY table example. Assume we have a Student table for a university. One of the table requirements is storing student SAT and ACT scores. The students may take the test only three times, and the test scores are a very small repeating group that repeats for only a specific number of values. Using traditional database design structures, we would be required to create a Test_scores table and join the Student table with it to see both the student data and the repeating values of their SAT and ACT test scores. Using Oracle8 VARRAY tables, you can create a table structure where repeating groups are automatically stored within the Oracle table itself (Figure A).
Figure A
An Oracle8 VARRAY table
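A minimal sketch of such a table is shown here. The type, table, and column names are hypothetical and are not taken from Figure A, and the table() operator in the final query assumes Oracle8i or a later release.

```sql
-- A collection type holding at most three scores (the test may be taken three times).
create type score_list as varray(3) of number(4);
/

-- The repeating groups live inside the student row itself.
create table student (
  student_id    number primary key,
  student_name  varchar2(40),
  sat_scores    score_list,
  act_scores    score_list
);

insert into student (student_id, student_name, sat_scores, act_scores)
values (1, 'Jane Doe', score_list(1180, 1250, 1310), score_list(27, 29));

-- Unnest the repeating group without joining to a separate Test_scores table.
select s.student_name, t.column_value as sat_score
from student s, table(s.sat_scores) t;
```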
Large, frequently updated data columns are cumbersome to keep in Oracle VARRAY tables. The test scores in our example, by contrast, are small and rarely change, so the VARRAY table allows Oracle to retrieve both the student and test information within a single disk I/O operation.
Another important VARRAY table characteristic is that the repeating information may be stored in presorted order. A VARRAY preserves the order of its elements, so values stored in sorted order are always retrieved in that order. This avoids the overhead of re-sorting the test scores every time a student row is retrieved.
Oracle materialised views
After the popularity of snapshot replication, Oracle recognised that complex queries could be prebuilt to provide end users with the illusion of instantaneous response time. Prebuilding allows five-way table joins, complex presummarisation of aggregation operations, and a host of other time-consuming and I/O-expensive SQL queries to be precalculated.
Basically, materialized views boil down to a "build it now or build it later" philosophy. Using this philosophy, you can preexecute Oracle queries in anticipation of the end user's query, thereby allowing the end user to retrieve complex information on a single disk I/O.
However, simply prebuilding complex queries is only a portion of the answer. A mechanism had to be created to make Oracle SQL aware of a query that had been prebuilt and to tell it to use the precreated summary. Oracle called this feature query rewrite. When the Oracle parameter query_rewrite_enabled is set, Oracle automatically compares each incoming SQL statement against the available materialised views. If it finds that the requested information has already been presummarised, the cost-based optimiser goes directly to the presummarised information, potentially saving thousands of expensive disk I/Os. For data warehouse applications and Oracle systems requiring complex SQL queries, materialised views can be the difference between subsecond response times and queries that may run for 30 minutes.
Here is a simple example of a materialized view:
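-- The SQL keyword is spelled MATERIALIZED; a GROUP BY is required
-- because product_nbr is selected alongside the aggregate.
create materialized view sum_sales
build immediate
refresh complete
enable query rewrite
as
select product_nbr, sum(sales) sum_sales
from sales
group by product_nbr;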
When any query summarizes sales, that query will be dynamically rewritten to reference the summary table:
alter session set query_rewrite_enabled=true;
set autotrace on
select sum(sales)
from sales;
In the execution plan for this query, we see that the sum_sales materialised view is referenced instead of the sales table:
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=83)
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (FULL) OF 'SUM_SALES' (Cost=1 Card=423 Bytes=5342)
Because materialised views are redundant copies, they must be updated when their base tables change. Just as a snapshot specifies a refresh interval, an Oracle materialised view specifies how often it is rebuilt once the underlying information has changed. Oracle offers a wealth of refresh options, ranging from immediate refresh when a transaction commits (refresh on commit) to scheduled refresh intervals chosen according to the volatility of the base data.
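The two ends of that range might look like the following sketch, which reuses the sales table from the earlier example. The view names sum_sales_daily and sum_sales_commit and the daily interval are assumptions made for illustration, not settings recommended by the article.

```sql
-- Scheduled refresh: rebuild the summary completely once a day.
create materialized view sum_sales_daily
build immediate
refresh complete
start with sysdate
next sysdate + 1
as
select product_nbr, sum(sales) sum_sales
from sales
group by product_nbr;

-- Refresh on commit: the summary is maintained incrementally as transactions commit.
-- Fast refresh needs a materialized view log on the base table, and an
-- aggregate view of this kind also needs the count columns shown.
create materialized view log on sales
with rowid, sequence (product_nbr, sales)
including new values;

create materialized view sum_sales_commit
build immediate
refresh fast on commit
as
select product_nbr,
       sum(sales)   sum_sales,
       count(sales) cnt_sales,
       count(*)     cnt_rows
from sales
group by product_nbr;

-- On-demand refresh from SQL*Plus ('C' requests a complete refresh).
execute dbms_mview.refresh('SUM_SALES_DAILY', 'C');
```

Refresh on commit keeps the summary current at the price of extra work in every transaction against sales, so a scheduled refresh is usually the cheaper choice for volatile base tables.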
Conclusion
Because disk prices continue to fall sharply, Oracle professionals are very conscious of introducing redundancy into their Oracle data models to improve performance. A third-normal-form database design in the 21st century may be very efficient from a disk-storage point of view, but it can perform very poorly because everything has to be built from its atomic pieces every time the queries are executed. Using Oracle's denormalisation tools such as replication, VARRAY tables, and materialised views, the Oracle database designer can deliberately introduce redundancy into the data model, thereby avoiding the expensive table joins and full-table scans of large tables that are required to recompute the information at runtime.