Retrieve more than 1 million rows from Cube

Question

Post reply

Retrieve more than 1 million rows from Cube

jondy87

SSC-Addicted

Points: 426
More actions
May 30, 2013 at 3:55 pm

#278008

Hello All,
Am new to ssas and been trying to implement it for our current system
Here is what i have built
Fact table: 70 million rows(potentially grows to 4 to 5 Billion)
10 Dimensions(1 dimension having 5 million rows and others vary from 1-10000 rows)
I have processed the cube and my results sets involving less than 10000 rows are amazingly fast.(With the huge dimension included in the MDX query)
What am trying to Do:
- I need to retrieve data based on this huge dimension and the result set will have close to 1 million rows. This is taking me close to 1 minute to retrieve this kind of data
My question is:
- Can the performance of the cube to retrieve data(in millions) be optimized? like less than 5 seconds
- Or Is it a wrong approach by using a cube to pull this kind of data?
Note:
-I have also used the aggregation wizard ans UBO that brought my query time from 65 seconds to 54 seconds:-)
-The MDX queries am running are from script view of the Cube Browser and other MDX designers i found online
Any Ideas would be of great help

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

richykong SSCommitted Points: 1502 More actions · Answer 1

Sounds like you're building a pretty large cube and have a pretty powerful server to run it on.

Couple of things that worries me.

1. Why do you have a 5 million record dimension? Is this 5 million record dimension expected to grow? If it grow linearly as the fact grows, you're going to have an issue since the dimension only supports a little over 2 billion members. You'll also face extremely high processing times on the dimension unless your doing process add.

2. Why do you need to pull 1 million rows from SSAS? Can you just pull it form SQL?

SSAS is meant to provide a way to quickly analyze aggregate data. I can't imagine a scenario where any person is doing an analysis and needs to look at a million records at a time.

I have a feeling you're designing a cube to support something like a SalesOrder with SalesLine fact data, where you have a single SalesOrder with mulitple SalesLines and have made the SalesOrderID into a dimension. I'm assuming in this case, the business wants to be able to look analyze the data and be able to get the specific SalesOrders related to the slice of data. I've dealt with something like this by removing the SalesOrderID dimension and using ASSP[/url] in the drilldown actions to run a SQL query to retrieve the detailed SalesID form the Fact table. The idea is that the SSAS allows you to slice and dice aggregate data and then from there drilldown to the detailed records by passing in the Dimension filters as a parameter to a stored proc that returns the detail records.

jondy87 SSC-Addicted Points: 426 More actions · Answer 2

Hello richykong ,

Thanks a lot for your reply

I have looked up on ASSP you were talking about and It has Solved my problem in obtaining detail level information for the Dimension Values.:-)

Other than that,

1. The dimension with 5 million rows will not be growing. It will be fixed. All my measures are aggregated at this dimension level. So i need to include this dimension whenever User hits the Cube.

2. The row count user tries to access from Cube varies approx from 1 row - 1 million rows. Here, We use both the aggregated values and details for analysis.Since the aggregated values are based on this huge dimension, We have bigger data sets returned back depending on the user criteria.

So, Will it Still be a good idea to go for a cube for my scenario?

Can HOLAP storage be of any help?

Thankyou

RonKyle SSC-Dedicated Points: 31537 More actions · Answer 3

SSAS is meant to provide a way to quickly analyze aggregate data. I can't imagine a scenario where any person is doing an analysis and needs to look at a million records at a time.

I agree. Some of what you have said sounds like you understand this well. But others not so much. When I have users pull even 10,000 rows, I ask them what they need that for. Normally there's some calculation that I can provide for them so that they don't need to do that. Strictly speaking from performance, however, pulling a million rows back in one minute seems pretty fast to me.

jondy87 SSC-Addicted Points: 426 More actions · Answer 4

Thank you guys for Your replies

I can't imagine a scenario where any person is doing an analysis and needs to look at a million records at a time

Each row from my result set transforms as a point on the map and hence the larger data set depending on the geographical area.

So am still thinking if SSAS would be a viable solution to my problem