• the cost of changing the physical order within a page is not that much; the page is in cache anyway, it is alwayys read and written as a whole, so the only cost is that of moving at most 8,000 bytes around in a memory block.

    I have to agree with Hugo on this. I've done a lot of work in C++ with parsing text files and sorting and caching data. I/O (disk/network) is where you really take the hit. In-memory manipulations using efficient, low-level code can be very fast.