What's the downside of having duplicate rows in nonleaf nodes? Aren't they still going to point to the correct child nodes that they're the parent of?
Yes, they will. However, it can mean more work when modifying or deleting rows. Imagine an index on gender, and a row is deleted. That row needs to be located in the nonclustered index so that it can be removed from the leaf level. If all we had at the intermediate levels was the gender, SQL would have to scan roughly half the index to find the row that it needs to remove. If the index rows are unique (made so by the clustered key or RID), SQL can navigate straight to the row it needs to remove.
http://technet.microsoft.com/en-us/sqlserver/gg508878.aspx (about 24 minutes in)
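To make the difference concrete, here is a rough sketch in Python. It is not how SQL Server is implemented: the index is modelled as a plain sorted list rather than a b-tree, and the names (leaf, find_by_gender_only, find_by_full_key) and the 1,000-row data set are made up for illustration. It just contrasts finding one row when only gender is available to navigate on with finding it when the clustering key makes every entry unique.

```python
from bisect import bisect_left

# Hypothetical leaf level of a nonclustered index on gender, kept sorted by
# (gender, clustering key). Odd ids are "F", even ids are "M" (made-up data).
leaf = sorted(("F" if i % 2 else "M", i) for i in range(1, 1001))


def find_by_gender_only(rows, gender, row_id):
    """Navigate on gender alone: seek to the first entry for that gender,
    then scan entry by entry until the wanted row turns up."""
    pos = bisect_left(rows, (gender,))  # first entry with this gender
    steps = 0
    while pos < len(rows) and rows[pos][0] == gender:
        steps += 1
        if rows[pos][1] == row_id:
            return pos, steps
        pos += 1
    return -1, steps


def find_by_full_key(rows, gender, row_id):
    """Navigate on (gender, clustering key): a single seek lands on the row."""
    pos = bisect_left(rows, (gender, row_id))
    if pos < len(rows) and rows[pos] == (gender, row_id):
        return pos
    return -1


if __name__ == "__main__":
    pos, steps = find_by_gender_only(leaf, "M", 1000)
    print("gender only: position", pos, "after scanning", steps, "entries")
    print("full key:    position", find_by_full_key(leaf, "M", 1000))
```

With gender alone, the worst case walks every entry for that gender (half of this list). With the full key, the lookup is a single binary search, which is the list equivalent of navigating straight down the tree.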