"Richard Foote" <richard.foote@tbigpond.nospam.com> wrote in message
news:yUIkc.6619$TT.2282@news-server.bigpond.net.au...
> many have got it so wrong. So I focused on those "experts" that seem to
> publish a lot of "stuff" ...
you mean the quantity of books published is not in direct
proportion to expertise? Now now, next you're gonna tell us that
BCHR is not the most performance gauge...
> No, it's definitely by volume. I showed this in a thread here a while back
> where after a leaf block split, the number of index entries in each block
> varied but the amount of used space sat at the 50% mark in each block.
Great prezzie, Richard. Excellent info. I do particularly like the point
you raise about clustering the table data, not the index! I've been advising
people to load their data in physical sequence of range scan keys for years
and have had tremendous performance improvements from just physically
sorting data in tables where range scans are involved. Now I know precisely
the reason why.
After reading it, I've got a question for you if I may:
Let's assume a situation where we have a root block, three branch blocks
and a bucketload of leaf blocks, like in your slide #22.
Assume as well that this index was created on a sequence-generated
column (regular but not necessarily consecutive increase). As such in the
pictorial representation the index leaf blocks would be organised left to
right in increase value of the key.
Now let's look at the physical (in the disk) distribution of this thing.
Presumably the root block and the first branch and a few leaf blocks
would be on consecutive blocks on the disk partition - assuming a raw
disk, an "intelligent" file system might do other things...
Followed by another branch block and more leaf blocks,then another branch
block and more leaf blocks. Correct assumption?
If so, then let's assume that later interspersed insertions of new keys
cause the third block from the left to split. Like you say in your slides,
no problemo: the new block is allocated from index freelist, the initial
block gets "emptied" 50% into the new one and we do not get an increase in
index level. Fine and dandy.
But now consider the physical distribution. Isn't that new block coming
from the freelist? Which might be pointing to a first free block on "the
other side of the disk", so to speak. This block will now LOGICALLY be
between the original third and fourth leaf blocks, but physically very
remote from these two original blocks.
You know where I'm getting at now, don't you? If someone is now doing an
index range scan - which for the sake of example will involve the third,
new fourth and old fourth leaf blocks using the block-to-block link
pointers, wouldn't it involve a heap of I/O wait while the arm got
repositioned to get the new fourth block from "the other side of the disk"?
Wouldn't that be a case for an index rebuild, based purely on physical
distribution rather than hazy "broken b-tree" concepts? What would be
a way of finding out if this was indeed the case, apart from the obvious
system I/O wait queue increasing in this device?
Thanks in advance for any feedback.
--
Cheers
Nuno Souto
wizofoz2k@yahoo.com.au.nospam