Tuesday, January 30, 2007

Interesting things while housekeeping a database

These days I've been housekeeping a database. Some tablespaces were created inefficiently: they all have an unnecessarily huge 16MB uniform extent size. As a result, even though most of the segments in those tablespaces have only a few rows, each still occupies at least 16MB. Since there are a lot of such segments, the database has grown unnecessarily large. My job is to move these segments out of those tablespaces and shrink the database.
  • Move small tables to the small-extent 128KB tablespaces, and big tables and table partitions to the big-extent 16MB tablespaces. The tables must be moved first, as the row movement invalidates all the associated indexes.
select 'alter table '||owner||'.'||segment_name||' move tablespace '||
case when bytes/1048576 < 1 then 'ts_small' else 'ts_big' end||';'
from dba_segments
where tablespace_name = 'ts_unnecessary_big' and segment_type = 'TABLE'
union
select 'alter table '||owner||'.'||segment_name||' move partition '||
partition_name||' tablespace ts_big;'
from dba_segments
where tablespace_name = 'ts_unnecessary_big' and segment_type = 'TABLE PARTITION'
order by 1;
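Spooling that query produces one statement per segment. Under the tablespace names used above (with SCOTT.SMALL_TAB and SCOTT.BIG_TAB as made-up examples, not segments from my database), the generated output looks like:

alter table SCOTT.BIG_TAB move partition P2006 tablespace ts_big;
alter table SCOTT.SMALL_TAB move tablespace ts_small;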

  • Rebuild small indexes into the small-extent 128KB tablespaces. Rebuild big indexes and index partitions into the big-extent 1MB tablespaces.
select 'alter index '||owner||'.'||segment_name||' rebuild tablespace '||
case when bytes/1048576 < 1 then 'ts_small_idx' else 'ts_big_idx' end||' nologging;'
from dba_segments
where tablespace_name = 'ts_unnecessary_big' and segment_type = 'INDEX'
union
select 'alter index '||owner||'.'||segment_name||' rebuild partition '||
partition_name||' tablespace ts_big_idx nologging;'
from dba_segments
where tablespace_name = 'ts_unnecessary_big' and segment_type = 'INDEX PARTITION'
order by 1;
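Since every table move leaves its indexes unusable, a quick check against the standard dictionary views before and after the rebuilds confirms nothing was missed:

select owner, index_name, status
from dba_indexes
where status = 'UNUSABLE'
union all
select index_owner, index_name||' ('||partition_name||')', status
from dba_ind_partitions
where status = 'UNUSABLE';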

  • Even though the tablespaces into which those segments are being moved are locally managed with uniformly small extents, the moved segments are created with a big initial extent equal to their original size. After moving the segments, I need to deallocate the unused blocks and shrink the initial extent size. First, I need to find out which segments are candidates for deallocation: probably those whose initial extent is larger than the tablespace's uniform extent size and whose initial extent size equals the segment size.
select 'alter table '||owner||'.'||segment_name||' deallocate unused keep 128k;'
from dba_segments
where tablespace_name = 'ts_small'
and segment_type = 'TABLE'
and initial_extent != 128*1024 and initial_extent = bytes
order by initial_extent;
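Spooling this gives one deallocate statement per candidate, e.g. (SCOTT.SMALL_TAB is a made-up name):

alter table SCOTT.SMALL_TAB deallocate unused keep 128k;

and the effect can then be verified by re-querying dba_segments:

select segment_name, bytes/1024 as kb, initial_extent/1024 as init_kb
from dba_segments
where tablespace_name = 'ts_small' and segment_type = 'TABLE'
order by kb desc;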

Some interesting things show up during deallocating the segments:
  1. Some segments can be deallocated down to 128k, the tablespace's uniform extent size; some can NOT, even though their high water mark is not above 128k.
  2. Some segments can be deallocated down to the size nearest their high water mark; some can NOT.
  3. Some statements will tell you how many unused blocks there are above the segment high water mark, while some won't.
  4. After more aggressively changing 'keep 128k' to 'keep 64k' or lower, some of the segments from point 1 that could not be shrunk to 128k were deallocated down to 128k.
  5. After the same more aggressive change, some of the segments from point 2 that could not be shrunk to their smallest possible size were deallocated further.
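One way to dig into points 1-3 is to ask Oracle directly where the high water mark sits. DBMS_SPACE.UNUSED_SPACE reports the blocks above the HWM for a segment; a sketch (SCOTT.SMALL_TAB is a placeholder name, not a segment from my database):

set serveroutput on
declare
  l_total_blocks  number;
  l_total_bytes   number;
  l_unused_blocks number;
  l_unused_bytes  number;
  l_file_id       number;
  l_block_id      number;
  l_last_block    number;
begin
  dbms_space.unused_space(
    segment_owner => 'SCOTT',       -- placeholder owner
    segment_name  => 'SMALL_TAB',   -- placeholder segment
    segment_type  => 'TABLE',
    total_blocks  => l_total_blocks,
    total_bytes   => l_total_bytes,
    unused_blocks => l_unused_blocks,
    unused_bytes  => l_unused_bytes,
    last_used_extent_file_id  => l_file_id,
    last_used_extent_block_id => l_block_id,
    last_used_block           => l_last_block);
  dbms_output.put_line('unused blocks above HWM: '||l_unused_blocks);
end;
/

Comparing this figure with what 'deallocate unused' actually releases should show which segments fall into which of the cases above.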
I need to invest more time to figure out these interesting things.

Forward or backward?

Recently I was assigned to a project. The project itself is actually pretty simple in terms of business logic. There are only 4 screens to be maintained on the front-end web pages, and the maintenance is also straightforward: users just want to insert, update, delete or report on the data. What about the data? The data itself is relatively static as well. By now, everybody would think this project is easy to do and the data easy to manipulate. I would think 4 or 5 tables in the database would be enough.

Wrong! The data model designed by somebody is extremely complicated. There are tens of tables in the database to support his idea and back the front-end interface. This guy seems to live in a perfect world. He wants everything to be flexible. He wants users to control which table columns are displayed on the web interface. He wants everything to be stored as IDs rather than real values; to find a real value, I need to go down through 3 or more tables to fetch it. I would think even the simplest SQL needs to join 5 or more tables to get what we want. This guy wants to build an RDBMS instead of an application on top of the Oracle RDBMS!
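To illustrate (every table and column name below is invented, not taken from the actual project), fetching a single human-readable attribute in such an everything-is-an-ID model ends up looking roughly like this, where a straightforward design would need no join at all:

select i.item_id, an.attr_name, lv.display_value
from items i
join item_attributes ia on ia.item_id  = i.item_id
join attribute_names an  on an.attr_id  = ia.attr_id
join lookup_values   lv  on lv.value_id = ia.value_id
where i.item_id = :id;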

Yes, he can say the model is very flexible. Users can control almost everything, even the data types and some constraints. But the price we pay is a hard-to-understand data model (sometimes even the designer himself has no idea what he's doing during project meetings, funny!), inevitable database performance overhead, and unnecessary flexibility. Users may not want the flexibility we introduce at all! Actually, he complicates things completely!

I've been thinking about this. Database technology has been evolving for decades: from the hierarchical database to the network database to the relational database to the object-oriented database. One of the most important goals has been to simplify the data model as much as possible so that developers can focus on the real business logic instead of a complicated data model. We do ours and you do yours! In this project, everything seems reversed! Instead of putting our effort into the business logic, we have to spend 90% of the project time understanding the data model and manipulating the data to keep it consistent. Is it really worth doing all of this? Are we going forward or backward? Time will tell!