You're underestimating the importance of the software environment and config variability.
There is no indication that scrypt mining puts any significant load on even a single PCIe lane. I've tested this myself across many machines and lane-width settings and always get the same results.
The reason you often cannot run high aggression and high thread concurrency at the same time is that GCN still has optimizations to be done with regard to addressing and rewriting its large buffer. The 13.2 driver is supposed to bring some of those optimizations.
CPU speed also has nothing to do with it. Even an Atom could probably handle shares from several GPUs, at least. Very little data is being dealt with: just the completed share being moved out to the pool. The GPU does all the math and randomly runs a few reads against the secondary DDR3 system buffer. And yes, even DDR2 yields exactly the same results.
Focus on the software environment and config variability. Currently, the cgminer 512MB buffer limitation (implemented for stability) will hold back 6970s and 7950s/7970s. It's your decision: stability and features vs. raw hashing speed.
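To get a rough feel for how thread concurrency runs into that 512MB limit, here is a back-of-the-envelope sketch in Python. It is not cgminer's code. It assumes the usual Litecoin scrypt parameters (N = 1024, r = 1, so 128 bytes per scratchpad entry), assumes a lookup gap of 2 as the default, and treats the cap as 512 MiB. The function names are made up for illustration.

```python
# Rough estimate only. Assumptions (not taken from cgminer itself):
# scrypt with N = 1024 and r = 1, and a buffer cap of 512 MiB.
SCRYPT_N = 1024                        # scratchpad entries per hash
ENTRY_BYTES = 128                      # 128 * r bytes per entry, with r = 1
BUFFER_CAP_BYTES = 512 * 1024 * 1024   # the 512MB limit mentioned above


def scratchpad_bytes(thread_concurrency, lookup_gap=2):
    """Estimated buffer size for a given thread concurrency and lookup gap."""
    # A lookup gap of g stores only every g-th entry (ceiling division).
    entries_per_thread = -(-SCRYPT_N // lookup_gap)
    return ENTRY_BYTES * entries_per_thread * thread_concurrency


def max_thread_concurrency(cap_bytes=BUFFER_CAP_BYTES, lookup_gap=2):
    """Largest thread concurrency whose estimated buffer fits under the cap."""
    return cap_bytes // scratchpad_bytes(1, lookup_gap)


if __name__ == "__main__":
    for tc in (4096, 8192, 16384):
        size = scratchpad_bytes(tc)
        verdict = "fits" if size <= BUFFER_CAP_BYTES else "exceeds cap"
        print(tc, size // (1024 * 1024), "MiB", verdict)
```

Under these assumptions the cap is reached at a thread concurrency of 8192 with a lookup gap of 2, which is roughly where the bigger cards start wanting more.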
PandaMiner, downgrading below 12.6 with a GCN card is not recommended, and it may not work at all. Even 12.6 is pre-GCN.
I suggest everyone download the AMD cleanup utility if you're going to be upgrading/downgrading while looking for optimal hashrates. It was released recently, so there is no need for manual driver cleaning or bloated 'driver sweepers'; it does a far better job in a much smaller package.