I receive data from an external data provider which comes in as a single large table. The data represents balance sheet information for several different companies.
The table structure boils down to 3 columns:
1. PK Company identifier (each different company has a different id number)
2. PK Field id number (each item on the balance sheet has a different id number, so total assets might be '15', total liabilities '23' etc.)
3. The data itself
For my OLAP system, I want to populate a table that contains a single row for each company. This table will be viewed regularly by users. The columns in the table would hold all the different balance sheet items for each company.
The SELECT statement I use to insert data from the large source table into my destination table joins the source table to itself, using the different field ID numbers, as follows:
FROM SourceTable a
LEFT JOIN SourceTable a1
    ON a.FieldIDNumber = '1'
    AND a.CompanyIdentifier = a1.CompanyIdentifier
    AND a1.FieldIDNumber = '2'
LEFT JOIN SourceTable a2
    ON a.CompanyIdentifier = a2.CompanyIdentifier
    AND a2.FieldIDNumber = '3'
LEFT JOIN SourceTable a3
    ON a.CompanyIdentifier = a3.CompanyIdentifier
    AND a3.FieldIDNumber = '4'
LEFT JOIN SourceTable a4
    ON a.CompanyIdentifier = a4.CompanyIdentifier
    AND a4.FieldIDNumber = '5'
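For reference, here is a sketch of how the same pivot could be written as conditional aggregation in a single pass over the source table, which avoids the chained self-joins entirely (the DataValue column name and the output column names are assumptions based on the description above):

```sql
-- One scan of SourceTable instead of one self-join per field.
-- DataValue and the FieldN output names are assumed for illustration.
SELECT
    CompanyIdentifier,
    MAX(CASE WHEN FieldIDNumber = '1' THEN DataValue END) AS Field1,
    MAX(CASE WHEN FieldIDNumber = '2' THEN DataValue END) AS Field2,
    MAX(CASE WHEN FieldIDNumber = '3' THEN DataValue END) AS Field3,
    MAX(CASE WHEN FieldIDNumber = '4' THEN DataValue END) AS Field4,
    MAX(CASE WHEN FieldIDNumber = '5' THEN DataValue END) AS Field5
FROM SourceTable
GROUP BY CompanyIdentifier;
```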
This SELECT statement takes a very long time to run. I tried to identify the bottleneck, and I believe I'm CPU constrained, for the following reasons:
1. Pattern of system resource use while query is running:
At the very beginning of the query execution, I see heavy reads on the database hard disks as the source table is loaded into memory, which I expected. Then for the vast majority of the time the query is running, hard disk read/write goes to zero and a single core on the server is 100% utilized. There is an occasional brief read of the database hard disk, maybe once every 20 minutes or so for about 2 or 3 minutes. The utilization of the remaining 5 cores is close to 0. Note that I have max degree of parallelism set to zero, so all six cores should be available to SQL Server.
Previously on the same machine, I had hyper-threading enabled. Back then, when I ran the same query, only 1 of the 12 threads was fully utilized. So CPU utilization went up when I disabled hyper-threading, but only to the point where 1 core was 100% utilized.
2. If I'm reading diagnostic checks from Glenn Berry correctly (and my correct reading isn't a given because I'm self-taught in all things programming), I don't think I'm memory constrained:
When I run this code:
-- SQL Server Process Address space info
-- (shows whether locked pages is enabled, among other things)
SELECT process_physical_memory_low, process_virtual_memory_low
FROM sys.dm_os_process_memory WITH (NOLOCK) OPTION (RECOMPILE);
The result is 0 for process_physical_memory_low and 0 for process_virtual_memory_low.
When I run this code:
SELECT total_physical_memory_kb, available_physical_memory_kb,
       system_memory_state_desc
FROM sys.dm_os_sys_memory WITH (NOLOCK) OPTION (RECOMPILE);
system_memory_state_desc says 'Available physical memory is high'
When I look at the top wait types for the server, I see that the top wait type is CXPACKET, which is what the server is waiting on over 99% of the time.
I would be grateful if anyone had suggestions on how to improve the speed of execution, or could answer the following related questions:
1. My primary key on the source table includes both the company Id and the field ID number. Each of the joined tables has a different field ID number. Would the query run faster if I placed a non-clustered index on the field ID number, despite the fact that it is part of the primary key?
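In case it helps, the index I have in mind would look something like this (the DataValue column name is an assumption; the idea is to key on the field ID and cover the other columns so no lookups are needed):

```sql
-- Hypothetical non-clustered index keyed on FieldIDNumber,
-- with CompanyIdentifier as a second key and DataValue covered.
CREATE NONCLUSTERED INDEX IX_SourceTable_FieldIDNumber
    ON SourceTable (FieldIDNumber, CompanyIdentifier)
    INCLUDE (DataValue);
```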
2. Would the following result in higher CPU utilization?
-- Use one statement to join tables a1 and a2 to table a into a temporary table.
-- Use a second statement, running concurrently, to join tables a3 and a4 to table a into a second temporary table. Perhaps this would utilize a different core from the first statement?
-- Use a third statement once the first two have finished to join both temporary tables together and insert into my destination table.
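As a sketch, the three steps would look roughly like this (temp table, column, and destination names are made up for illustration; note I would need global temp tables, ##, rather than session-local ones, #, since the first two statements would run in different sessions):

```sql
-- Step 1 (session 1): join fields 2 and 3 to the anchor rows.
SELECT a.CompanyIdentifier,
       a1.DataValue AS Field2,
       a2.DataValue AS Field3
INTO ##Part1
FROM SourceTable a
LEFT JOIN SourceTable a1
    ON a.CompanyIdentifier = a1.CompanyIdentifier AND a1.FieldIDNumber = '2'
LEFT JOIN SourceTable a2
    ON a.CompanyIdentifier = a2.CompanyIdentifier AND a2.FieldIDNumber = '3'
WHERE a.FieldIDNumber = '1';

-- Step 2 (session 2, concurrently): join fields 4 and 5.
SELECT a.CompanyIdentifier,
       a3.DataValue AS Field4,
       a4.DataValue AS Field5
INTO ##Part2
FROM SourceTable a
LEFT JOIN SourceTable a3
    ON a.CompanyIdentifier = a3.CompanyIdentifier AND a3.FieldIDNumber = '4'
LEFT JOIN SourceTable a4
    ON a.CompanyIdentifier = a4.CompanyIdentifier AND a4.FieldIDNumber = '5'
WHERE a.FieldIDNumber = '1';

-- Step 3: once both halves have finished, combine and insert.
INSERT INTO DestinationTable (CompanyIdentifier, Field2, Field3, Field4, Field5)
SELECT p1.CompanyIdentifier, p1.Field2, p1.Field3, p2.Field4, p2.Field5
FROM ##Part1 p1
JOIN ##Part2 p2
    ON p1.CompanyIdentifier = p2.CompanyIdentifier;
```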