Thanks, I got the values:
DATASOURCE= 2 distinct values
DATATYPE = 5 distinct values
FILEREFRECID = 3820131 distinct values
AXMATCHINGCODE = 12 distinct values
DATAAREAID = 2 distinct values
PARTITION = 1 distinct values.
A non-clustered index with PARTITION, DATAAREAID, FILEREFRECID columns seems to be the best fit for my query and indeed I get great performance when the optimizer picks that Index.
My real problem I guess is why does SQL pick sometimes the other terrible index? Even the execution plan does not make sense(why there is no sort operation) ?
Statistics are updated. Does not seem to be a parameter sniffing issue.