May 23, 2018 at 8:12 am
Hello,
I have to count the number of persons in a company according to multiple dimensions / filters. The output must be a table with all combinations and the number of persons for each combination. If 0, it doesn't have to be in the table.
For example, I have to find:
- Number ALL men
- Number ALL men who are from Germany
- Number ALL men who are from France
- Number ALL men who are from X
- Number ALL women
- Number ALL women who are from Germany
- Number ALL women who are from France
- Number ALL women who are from X
This is only with two dimensions. I could have betwen 5 and 15, which gives in the worst case 65 000 000 posible combinations.
For now I'm doing it with Excel / VBA, with up to 15 "For each".
But I feel like I could do something more efficient with a Select Count in SQL Server. I just have no idea how to do those loops in SQL, and if it's worth it.
Any help on this?
Thanks.
May 23, 2018 at 8:16 am
"Dimensions"? Is this in a cube and thus you need an MDX query, or in a table in the data engine and thus you'd need T-SQL. You've posted in a T-SQL forum, which would suggest the latter, however, you can't use T-SQL to query a Dimension in a cube, hence my confusion.
I must say, I don't really understand your requirement here; I have no idea how you suddenly got to "65M possible combinations". Could you provide some sample data and expected results to show what you're after here please? There's a link in my signature on how to post DDL and sample data for questions in my signature.
Thanks.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
May 23, 2018 at 8:42 am
allard.julien.70 - Wednesday, May 23, 2018 8:12 AMHello,I have to count the number of persons in a company according to multiple dimensions / filters. The output must be a table with all combinations and the number of persons for each combination. If 0, it doesn't have to be in the table.
For example, I have to find:
- Number ALL men
- Number ALL men who are from Germany
- Number ALL men who are from France
- Number ALL men who are from X
- Number ALL women
- Number ALL women who are from Germany
- Number ALL women who are from France
- Number ALL women who are from XThis is only with two dimensions. I could have betwen 5 and 15, which gives in the worst case 65 000 000 posible combinations.
For now I'm doing it with Excel / VBA, with up to 15 "For each".
But I feel like I could do something more efficient with a Select Count in SQL Server. I just have no idea how to do those loops in SQL, and if it's worth it.Any help on this?
Thanks.
Have a look at GROUP BY in Books Online (the SQL Server help system)
SELECT Gender, Country, Count = COUNT(*)
FROM table
GROUP BY Gender, Country
WITH ROLLUP / WITH CUBE
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 23, 2018 at 8:43 am
Try this. If you wanted to go to an indeterminate number of dimensions, you'd need to use dynamic SQL. I don't know whether, or at what point, performance will become a problem.
-- Test harness
CREATE TABLE #People (
FullName varchar(100)
, Gender char(1)
, Country char(20)
);
TRUNCATE TABLE #People;
SELECT TOP 100
CAST(NEWID() AS varchar(100))
, CASE ABS(CHECKSUM(NEWID())) % 2
WHEN 0 THEN 'M'
ELSE 'F'
END
, CASE ABS(CHECKSUM(NEWID())) % 3
WHEN 0 THEN 'Germany'
WHEN 1 THEN 'France'
ELSE 'X'
END
FROM sys.all_columns;
-- Eyeball the test data
SELECT * FROM #People;
SELECT
Gender
, COUNT(*)
FROM #People
GROUP BY Gender;
SELECT
Country
, COUNT(*)
FROM #People
GROUP BY Country;
-- Get the breakdown
WITH Genders AS (
SELECT DISTINCT Gender
FROM #People
UNION ALL SELECT NULL
)
, Countries AS (
SELECT DISTINCT Country
FROM #People
UNION ALL SELECT NULL
)
SELECT
g.Gender
, c.Country
, COUNT(*)
FROM Genders g
CROSS JOIN Countries c
JOIN #People p
ON COALESCE(g.Gender,p.Gender) = p.Gender
AND COALESCE(c.Country,p.Country) = p.Country
GROUP BY
g.Gender
, c.Country;
May 23, 2018 at 1:35 pm
John Mitchell-245523 - Wednesday, May 23, 2018 8:43 AMTry this. If you wanted to go to an indeterminate number of dimensions, you'd need to use dynamic SQL. I don't know whether, or at what point, performance will become a problem.
John, your last statement can be reduced drastically using WITH CUBE.
SELECT
Gender
, Country
, COUNT(*)
FROM #People
GROUP BY Gender,
Country WITH CUBE
Viewing 5 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply