This is a very good point. This is where the hardware architecture makes a difference...
For instance, on an ES7000, you could affinitize SQL Server to run on only 12 processors out of 16, and specify a MAX DOP of 4. However, there is no way to tell SQL Server which 4 processors are to be used for a given query. So SQL Server could pick 4 processors that are all on different sub-pods, or 4 processors that are all on the same sub-pod.
You would get very different results since in the first case the hardware has to do a lot of 3rd-level cache synchronization across sub-pods, whereas in the second scenario it would be minimized, and hence reducing concurrency issues. What makes it worse is that this is highly unpredictable.
Since you really never know exactly which processors are going to be used (because of the underlying hardware and the current CPU load), it is difficult to predict the performance outcome of any given statement, with or without the DOP implemented.
Because of the hardware-dependent nature of parallel queries, it is important to look at the DOP from a macro standpoint. What is the "overall" impact of its implementation? This question is valid even if the DOP is implemented for a single query.
I agree with you that you don't want to change the DOP unless necessary. SQL Server 2000 seems to implement the DOP in a very stable fashion. I know of a client that has successfully implemented this technique in production.
All-in-all, it is difficult to talk about parallel queries without considering the hardware SQL Server is running on. The underlying hardware architecture can have a very dramatic impact on SQL Server's behavior.
MAXDOP well where to start.
The first encounter of this was with SQL 7. a very badly written statment resulted in an almost infite loop due to the parallel processes not getting back together.
My view is that if you have a well defined statement then the default works well. However if you are resulting in large rowsets in your query i.e two sub queries returning lots or rows that then have to be joined, then using MAXDOP is a serious consideration.
When handling large sets of data within a query I have often seen SQL choose a parallel query plan that results in the query not completing.
Co-author of SQL Server 2000 XML Distilled
MCDBA, MCSE, MCSD
SQL Server Database Proxy/Firewall and Auditing