Tips and wisdom from 15 years of DBA experience

Monday, November 10, 2008

"SLRP" - the 4 major OLTP scalability inhibitors

OLTP databases have particular behavior that needs to be tended to. Below I will outline the four major categories that in my experience have been the greatest inhibitors to scalability. This is a holistic approach that will be useful when thinking big picture for your OLTP systems. It is not intended to provide a performance tuning method - that is a solved problem: read Cary Millsap's "Optimizing Oracle Performance."

Transactional databases capture data - sometimes LOTS of data. If your OLTP system is storing transactional data that would best be archived off to a data warehouse, do so. It will improve your mean time to recover during an emergency, and keep your sysadmin happy - or less miserable, at least.

If you often see "latch free" waits in V$SESSION_WAIT, check to see if it is latch# 66. (in Oracle 8i - number changed in 10g and possibly 9i). If so, that is most likely contention for one or more child latches of the "cache buffers chains" latch. Abusing these latches is the fastest way to spike the cpu, because each waiter for the latch spins the cpu in a well-defined algorithm. For high-transaction volume systems, even small reductions in the number of block gets can produce dramatic reductions in cpu load due to the exponential nature of queueing.

You may be over-commiting, or your system may just do a lot of updates, which generate redo. Explore ways to do operations that commonly generate redo as NOLOGGING operations - Inserting /*+ APPEND */ into a temp table, then using some exchange partition trickery to "flip" the tables. If you ever have to recover and roll through a lot of archive logs, you know how important it is to reduce redo production. It is a critical factor in reducing MTTR.

Hard parsing is bad. Use bind variables - I think if Tom Kyte needs to say this once more, he will burst a vein. Soft parsing is better...but the "softer soft parse" that Tom speaks of is the best. What is it? It is a session cursor cache hit. How do you achieve it? execute the same SQL through the *same* Oracle session. That means the same V$SESSION entry, not some other definition that the application thinks a session is. You may want to try re-architecting things to send particular statements through the same Oracle sessions. Using stored procedures/functions can help immensely also when done right.The next time you hit a scalability barrier, chances are it falls into one of these categories.

No comments: