One of my first programming jobs was at a large company – it had many programming teams, and entirely separate teams for system, network and storage administration. As a programmer, especially a ‘green’ one, the database was pretty much a big black box – accessed not via SQL but through a fairly well-defined API. When a program we were working on needed to store something, we’d call the ‘store’ API call; when we needed to update it, we’d call the ‘update’ call; and when we needed the data back, we’d use the ‘retrieve’ call. Simple!
Skip ahead several years, and the database had grown huge for the time. Retrievals were taking longer and longer, and talking to some of the sysadmins late at night around the coffee machine, I learned that the backup had grown so large it could no longer complete within a week. The application was hitting a ‘brick wall’: it fell further and further behind, and something really needed to be done.
Management recognized that there was indeed a big problem and created a team drawn from sysadmin, storage, the DBAs, the COO, and the programmers to find a solution – and the big question from the top was “Do we really need to spend a considerable amount of money on hardware to make this work?”. I was, of course, one of the lucky people assigned to that team.
It actually only took us a week to find the problem: there was no ‘cleanup’ or ‘delete’ call in the API, and there were no business rules in place for how long data needed to be retained – either from an operational standpoint (what is needed to keep the business functioning) or from a legal standpoint. Those business concerns and legal requirements can be quite different, as I later learned.
If all a program can do is add and update data, then it is pretty much inevitable that the datastore will eventually become too big. After working out the business requirements for data retention, and creating a new API that would flag rows for deletion, export them, and then remove them, the row count in the database in question shrank to about one-tenth of its original size – and the good news was that everything started to work much better. Except, of course, the backups.
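The flag–export–delete cycle can be sketched in plain SQL. To be clear, the table and column names below (`orders`, `orders_archive`, `purge_flag`) are hypothetical illustrations, not the actual API we built:

```sql
-- Hypothetical retention sweep; names and the 7-year (84-month)
-- retention window are illustrative assumptions.

-- 1. Flag rows older than the agreed retention period.
UPDATE orders
   SET purge_flag = 'Y'
 WHERE order_date < ADD_MONTHS(SYSDATE, -84);

-- 2. Archive the flagged rows before touching them.
INSERT INTO orders_archive
SELECT * FROM orders
 WHERE purge_flag = 'Y';

-- 3. Only after the archive succeeds, delete from the live table.
DELETE FROM orders
 WHERE purge_flag = 'Y';

COMMIT;
```

Doing it in three explicit steps means the delete can be audited and, if the archive step fails, nothing has been lost.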
I should add that this was on Oracle v7, which didn’t have the ability to shrink and relocate segments, rebuild indexes online, or shrink datafiles.
The system administrators were at first confused: “If we deleted almost 90% of the data, why is the database still so large?”
The answer, of course, was that when you delete rows from a table, Oracle does not shrink the table (or associated index) segments. There will be more empty or partially-full blocks available, but those blocks are still owned by the table (or index) segment and cannot be used by any other segment.
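You can see this for yourself by checking `DBA_SEGMENTS` before and after a large delete (the `ORDERS` table here is just an example):

```sql
-- Size as allocated on disk, which is not the same as the row count.
SELECT segment_name,
       bytes / 1024 / 1024 AS size_mb
  FROM dba_segments
 WHERE segment_name = 'ORDERS';

-- Delete a large fraction of the rows...
DELETE FROM orders
 WHERE order_date < ADD_MONTHS(SYSDATE, -84);
COMMIT;

-- ...then re-run the query above: size_mb is unchanged.
-- The freed blocks sit below the segment's high water mark,
-- reusable by future inserts into ORDERS, but by nothing else.
```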
So, even after deleting the majority of the data, we still had a pretty big backup problem. Fixing that took a huge amount of effort – several months and many man-hours, involving people from both the business side and the technical side.
I learned a few very good lessons over the course of that problem:
- Databases that grow too large can be very difficult to shrink without affecting uptime/availability.
- If someone had understood the problem before it became critical, and cleanup had been done regularly from the beginning, a lot of work (and downtime) could have been avoided.
- The effort to define and program those retention rules was going to be required eventually anyway, so all the work spent on shrinking the database was pure extra expense on the company’s bottom line.
- Understanding what can be deleted, what must be backed up or archived before being deleted, and what must not be deleted, will very likely require people with business expertise – for example, the COO and the legal team. This is often not a simple question at all, and it can take considerable effort just to understand which rules apply.
- Testing the delete and delete-plus-archive processes is very, very important. Further, if archived data may be required in the future, it is worth doing the work up front so that retrieving it in a usable form does not take much effort or require programming resources.
- If retention policies are clearly worked out early in a product’s life, and suitable programming is done so that cleanup works reliably, the problem will likely never arise.
- Unless it’s a pretty trivial database, it’s very, very important not to forget the above lessons!
Now that we have Oracle 11g, it is possible to shrink segments online, rebuild indexes online, migrate segments between tablespaces, and even shrink datafiles – provided the Enterprise Edition is in use. If it is not, reorganization will likely not be possible online while the database is up; it may even require a full export/import (expdp/impdp), which can mean a considerable amount of downtime.
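For reference, the relevant commands look roughly like this (object names and the datafile path are placeholders; `REBUILD ONLINE` is an Enterprise Edition feature, and `SHRINK SPACE` requires an ASSM tablespace):

```sql
-- Reclaim space below the high water mark, online.
ALTER TABLE orders ENABLE ROW MOVEMENT;
ALTER TABLE orders SHRINK SPACE CASCADE;

-- Rebuild an index without blocking DML (Enterprise Edition).
ALTER INDEX orders_pk REBUILD ONLINE;

-- Move a segment to another tablespace. In 11g this locks the
-- table and invalidates its indexes; a fully online move needs
-- the DBMS_REDEFINITION package instead.
ALTER TABLE orders MOVE TABLESPACE new_ts;

-- Finally, shrink the datafile itself.
ALTER DATABASE DATAFILE '/u01/oradata/ORCL/users01.dbf' RESIZE 500M;
```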
Even with Enterprise Edition, if a segment occupies the high end of a datafile, it is not possible to shrink the datafile past that point, so you can still run into problems; see Mike Dean’s blog post on Resizing Datafiles and Understanding the High Water Mark.
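A quick way to see which segments sit at the high end of a datafile – and are therefore blocking a `RESIZE` – is to query `DBA_EXTENTS` (the file number here is an example):

```sql
-- The segments owning the highest-numbered extents in file 4
-- are the ones preventing that datafile from shrinking further.
SELECT owner, segment_name, block_id
  FROM (SELECT owner, segment_name, block_id
          FROM dba_extents
         WHERE file_id = 4
         ORDER BY block_id DESC)
 WHERE ROWNUM <= 5;
```

Moving or shrinking those specific segments first is what frees the tail of the file for a resize.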
TL;DR – it is quite important to ensure that suitable data-cleanup processes are in place early in a database’s life, so that performance doesn’t hit a ‘brick wall’ later in its life.