CXPACKET is a parallelism wait. You might want to experiment with the MAXDOP query hint to see whether using fewer processors speeds things up.
I see this with poor or complex SQL where the estimated cost is high enough to generate a parallel plan, but the parallelism actually slows the query down.
I've just applied this to a 12-table cross-database report: with all processors it took 18 to 20 seconds, with the MAXDOP hint 1 second. (This is a box with 8 physical processors, running as 16 with Hyper-Threading.)
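As a rough sketch of what that looks like, the hint goes in an OPTION clause at the end of the statement. The database, table and column names below are made up for illustration and are not from my report; MAXDOP 1 is just one value to try, so it's worth testing 2 or 4 as well and comparing timings.

```sql
-- Hypothetical cross-database query; object names are placeholders.
SELECT o.CustomerID,
       SUM(d.LineTotal) AS Total
FROM   SalesDb.dbo.Orders AS o
JOIN   ArchiveDb.dbo.OrderDetails AS d
       ON d.OrderID = o.OrderID
GROUP  BY o.CustomerID
OPTION (MAXDOP 1);  -- cap this one statement at a single processor (serial plan)
```

The hint only affects the statement it is attached to, so the server-wide "max degree of parallelism" setting stays as it is for everything else.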
It's not that there is anything drastically wrong with the query, it's just messy <grin>. Sometimes an unexpected parallel plan may also be an indication of missing indexes.
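If you want to check the missing index angle, one possible starting point is the missing index DMVs (SQL Server 2005 and later, needs VIEW SERVER STATE permission). This is only a sketch: the TOP (20) cut-off and the ordering expression are my own arbitrary choices, and the suggestions are hints to investigate, not indexes to create blindly.

```sql
-- List the optimizer's missing-index suggestions, roughly ordered by potential benefit.
SELECT TOP (20)
       mid.statement          AS table_name,
       mid.equality_columns,
       mid.inequality_columns,
       mid.included_columns,
       migs.user_seeks,
       migs.avg_user_impact   -- estimated % improvement if the index existed
FROM   sys.dm_db_missing_index_details AS mid
JOIN   sys.dm_db_missing_index_groups AS mig
       ON mig.index_handle = mid.index_handle
JOIN   sys.dm_db_missing_index_group_stats AS migs
       ON migs.group_handle = mig.index_group_handle
ORDER  BY migs.user_seeks * migs.avg_user_impact DESC;
```

These figures are reset when the instance restarts, so an empty result on a freshly started server doesn't prove anything.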
The GrumpyOldDBA
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/