Selecting columns based on user input?

  • I am not sure how to describe this and I'm sorry if the subject isn't very clear.

    Here is what I want to do: I have to generate reports for my employer and sometimes he wants to see things at varying degrees of detail. For example:

    (The example will have a Table called "Cars" with columns "Yr", "Make", "Model", and "Price")

    Sometimes I need to know, say...the average price of all cars for a year (this query):

    SELECT Yr, AVG(Price)

    FROM Cars

    GROUP BY Yr

    And Sometimes I need the average price for all cars of a year and make (this query):

    SELECT Yr, Make, AVG(Price)

    FROM Cars

    GROUP BY Yr, Make

    Is there any way using just SQL to write a query that could handle both cases? I am integrating this within a .NET application so of course I could always build up the query string in my C# file depending on the requirements, but that feels so messy and is a real pain to debug, especially when you start adding rollups and whatnot. Also it seems that doing as much in SQL and as little in .NET is always best for performance.

    One thing I've thought of is to write a procedure for each "level" of detail, but the amount of redundancy within those procedures would be really high, and the last thing I want to do is update 4 procedures when I need to make a change.

    I've also thought of writing a query that returns all of the data at every level and then just picking what is needed out of that, but that takes too much time.

    If you have any suggestions, throw them my way! Thanks in advance for any help,

  • Your options are to limit the selections the manager has and run a specific stored procedure for each option, to return all the detail data and do all the manipulation and roll-up in your application, or to write a single stored procedure that uses dynamic SQL to return the data as desired. The downside to the third option is that you need to be sure to protect against SQL injection. Here are a couple of links to articles acknowledged to be some of the best about dynamic SQL.

    http://www.sommarskog.se/dynamic_sql.html

    http://www.sommarskog.se/dyn-search-2005.html
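
    As a rough sketch of the dynamic SQL option and the injection concern: bound parameters can carry values but not column names, so one common approach is to check the requested grouping columns against a fixed whitelist before splicing them into the query text. The example uses Python's built-in sqlite3 module only so that it is runnable. The helper names (build_avg_price_query, demo_connection) and the sample rows are made up for illustration, and the same idea should carry over to a stored procedure or a C# query builder.

```python
import sqlite3

# Columns a caller may group by; anything else is rejected.
ALLOWED_GROUP_COLUMNS = ("Yr", "Make", "Model")


def build_avg_price_query(group_by):
    """Return SQL averaging Price over the requested whitelisted columns."""
    if not group_by:
        raise ValueError("at least one grouping column is required")
    for name in group_by:
        if name not in ALLOWED_GROUP_COLUMNS:
            raise ValueError(f"unsupported grouping column: {name!r}")
    cols = ", ".join(group_by)
    # Only whitelisted identifiers ever reach the SQL text.
    return (
        f"SELECT {cols}, AVG(Price) AS AvgPrice "
        f"FROM Cars GROUP BY {cols} ORDER BY {cols}"
    )


def demo_connection():
    """Create an in-memory Cars table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Cars (Yr INTEGER, Make TEXT, Model TEXT, Price REAL)"
    )
    conn.executemany(
        "INSERT INTO Cars VALUES (?, ?, ?, ?)",
        [
            (1999, "Ford", "Focus", 10000.0),
            (1999, "Ford", "Taurus", 14000.0),
            (1999, "Honda", "Civic", 12000.0),
            (2000, "Ford", "Focus", 11000.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    # Same builder, two levels of detail.
    print(conn.execute(build_avg_price_query(["Yr"])).fetchall())
    print(conn.execute(build_avg_price_query(["Yr", "Make"])).fetchall())
```

    Both report levels go through one code path, and a request for a column outside the whitelist fails before any SQL is run.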

  • The thing I'd suggest is that you write two stored procedures, which is essentially two methods. Then if you want to build an abstraction, write a third that takes both parameters: if only one comes in (check for NULL), call one proc; otherwise call the other.

    That seems like work, but if you build it up in your app, you're essentially doing the same thing. By encapsulating this into multiple procs, you can easily call them from other places, and also it is tunable if more results or columns are needed.

  • Aha, dynamic SQL is the term I needed.

    I think I'm pretty well protected from SQL injection by using .NET's SqlParameter object and specifying the type. In other projects I've seen it prevent some potentially malicious injection attacks.

    I'll get to reading...thanks a bunch!

  • Steve Jones - Editor (7/20/2009)


    The thing I'd suggest is that you write two stored procedures, which is essentially two methods. Then if you want to build an abstraction, write a third that takes both parameters: if only one comes in (check for NULL), call one proc; otherwise call the other.

    That seems like work, but if you build it up in your app, you're essentially doing the same thing. By encapsulating this into multiple procs, you can easily call them from other places, and also it is tunable if more results or columns are needed.

    That seems like a good solution, but it would ultimately lead to duplicating a lot of the same code. But maybe if I put all of the complex joins and temporary tables into a view and then use the various procedures to aggregate the data from the view, I could avoid having to rewrite all the crap. That might be just what I need.
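
    A minimal sketch of that idea, assuming the shared logic can live in a view: each level of detail becomes a thin, fixed query over the view, and a small wrapper picks between them. Python with sqlite3 is used only to keep it runnable. The view name ReportBase, the function names, and the sample rows are all hypothetical, and a simple by_make flag stands in for the NULL check on a parameter.

```python
import sqlite3


def make_demo_db():
    """In-memory database with a made-up Cars table and a view over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Cars (Yr INTEGER, Make TEXT, Model TEXT, Price REAL)"
    )
    conn.executemany(
        "INSERT INTO Cars VALUES (?, ?, ?, ?)",
        [
            (1999, "Ford", "Focus", 10000.0),
            (1999, "Ford", "Taurus", 14000.0),
            (1999, "Honda", "Civic", 12000.0),
            (2000, "Ford", "Focus", 11000.0),
            (2000, "Honda", "Civic", None),  # unpriced row, hidden by the view
        ],
    )
    # Stand-in for the view that would hide the complex joins; here it
    # only drops rows that have no price.
    conn.execute(
        "CREATE VIEW ReportBase AS "
        "SELECT Yr, Make, Model, Price FROM Cars WHERE Price IS NOT NULL"
    )
    return conn


def avg_price_by_year(conn):
    """Fixed query for the year level of detail."""
    return conn.execute(
        "SELECT Yr, AVG(Price) FROM ReportBase GROUP BY Yr ORDER BY Yr"
    ).fetchall()


def avg_price_by_year_and_make(conn):
    """Fixed query for the year-and-make level of detail."""
    return conn.execute(
        "SELECT Yr, Make, AVG(Price) FROM ReportBase "
        "GROUP BY Yr, Make ORDER BY Yr, Make"
    ).fetchall()


def avg_price(conn, by_make=False):
    """Wrapper that dispatches to one of the two fixed queries."""
    if by_make:
        return avg_price_by_year_and_make(conn)
    return avg_price_by_year(conn)


if __name__ == "__main__":
    conn = make_demo_db()
    print(avg_price(conn))
    print(avg_price(conn, by_make=True))
```

    A change to the joins or filters then happens once, in the view, while each aggregate query stays static and can be tuned on its own.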

  • You're not necessarily duplicating lots of code. You might dup some, but you're also separating out and abstracting the functions to do a specific thing. That's what's done in many methods in OOP. You separate out those different items that perform different functions.

    Don't forget, you copy this code and alter it a couple of times. It takes very little of your time. Trying to munge things together will make the end user suffer later, over and over again.

  • I understood you to mean that you could have several different ways the manager wants to look at the data. If it is only those 2 then I would do as Steve has suggested.

    To be honest, I have always preferred to write static reports for specific purposes. This way you can more easily optimize the T-SQL.

  • Jack Corbett (7/20/2009)


    I understood you to mean that you could have several different ways the manager wants to look at the data. If it is only those 2 then I would do as Steve has suggested.

    To be honest, I have always preferred to write static reports for specific purposes. This way you can more easily optimize the T-SQL.

    It's possible...One minute he wants one thing and then 2 years later he could decide he wants something completely different 😛

    In general though, the data can really be seen on 4 levels: customer, item category, item subcategory, and item. So there are potentially 4 of them in the specific case I'm working on now. I'd kind of like to set up a best practice sort of thing though, for all of my future reports because that's really a lot of my job and it's not something that's ever been done very well here.

    I definitely see your point about optimization though. At least you've both given me a good bit to chew on 😀

  • I'd be curious to know what you decide to do and why. It's a good debate, and others can learn from this. If you have a blog, post it there and link from here. Or write us an article about why you did what you did.

  • I definitely agree with Steve that you should post your choice. Especially why you chose that route.

  • I'd be happy to oblige! I don't have a blog, but I could whip something up and post it here. Maybe it's time to start a blog anyway.

  • Alright, so this ended up causing me quite the dilemma today. I'm not sure anything I write up would be extremely useful to anyone, though. I suppose I could make a comparison of the two methodologies (dynamic SQL vs. .NET query string building) since I spent a bit of time with them.

    Essentially the problem is a lot more complicated than I originally stated here. I've determined that dynamic SQL or query string building in ASP.NET is probably the only way to do what I want to do, and it's pretty much ugly no matter what you try.

    My original goal was to try and simplify everything so that I could still somewhat read my SQL at the end, but I don't believe there's really any way to make that happen. I think the best idea, and the one I'm going with, is to use .NET and rely on private functions to do a lot of the really ugly work.

    Here are the things that ultimately prevented me from using any of the methods we talked about here:

    1) The SQL needs to be able to filter out various things or nothing at all. In other words, the code needs to be able to handle

    WHERE Make = 'Ford' AND Yr = '1999'

    as well as just

    WHERE Yr = '1999'

    or any possible combination of the columns (including nothing). This alone would prevent me from making a bunch of stored procedures as there are a lot of filters that can be applied.

    2) The SQL needs to be able to use arrays/comma-delimited lists. That pretty much kills the standard stored procedure. From my research, you need to use some kind of dynamic SQL or .NET code to do this (well, I did talk to someone about using a cursor and a temporary table, but that seemed unnecessary and resource-intensive to me).

    3) I need to make use of quite a few if statements. I did some of this with dynamic SQL and ultimately decided I preferred writing ifs in .NET because I could do a better job of code reuse.

    Consider this case (the SQL might have a couple of errors in it, but it should still illustrate my point):

    SELECT

    CASE WHEN GROUPING(Yr) = 1 THEN 'Totals' ELSE CONVERT(varchar(50), Yr) END AS Yr,

    CASE WHEN GROUPING(Make) = 1 THEN 'Year Totals' ELSE CONVERT(varchar(50), Make) END AS Make,

    CASE WHEN GROUPING(Model) = 1 THEN 'Make Totals' ELSE CONVERT(varchar(50), Model) END AS Model,

    AVG(Price)

    FROM Cars

    GROUP BY Yr, Make, Model WITH ROLLUP

    So now if the report needs to have the aggregates appear in any order combination, I have to put if statements around both the CASE statements and the GROUP BYs, and I'd need some kind of flag to tell me whether or not the possible aggregate field should be used.

    Anyway, to make a long story long(er)...I've decided to continue building strings in .NET, but try really hard to make sure it is coded and commented in a way that is somewhat easy to follow. In any event, at least I checked out the other options out there...

    Thanks again for the advice you guys gave (that article on dynamic SQL in particular is really thorough!)
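
    For points 1) and 2) above, one way to keep the string building readable is a single helper that adds a WHERE condition only for the filters actually supplied and expands a list into one placeholder per value. This is a minimal sketch in Python with sqlite3, chosen only so it is runnable. build_filtered_query, make_demo_db, and the sample rows are hypothetical, and the column names come from the Cars example.

```python
import sqlite3

# Columns a caller may filter on; column names never come from user text.
FILTERABLE_COLUMNS = ("Yr", "Make", "Model")


def build_filtered_query(filters):
    """Build (sql, params) from a dict of column -> value or list of values.

    An empty dict produces no WHERE clause at all.
    """
    clauses = []
    params = []
    for column, value in filters.items():
        if column not in FILTERABLE_COLUMNS:
            raise ValueError(f"unsupported filter column: {column!r}")
        if isinstance(value, (list, tuple)):
            if not value:
                raise ValueError(f"empty list for {column}")
            # One placeholder per list element: Make IN (?, ?, ...)
            placeholders = ", ".join("?" for _ in value)
            clauses.append(f"{column} IN ({placeholders})")
            params.extend(value)
        else:
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT Yr, Make, Model, Price FROM Cars"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY Yr, Make, Model"
    return sql, params


def make_demo_db():
    """Create an in-memory Cars table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Cars (Yr INTEGER, Make TEXT, Model TEXT, Price REAL)"
    )
    conn.executemany(
        "INSERT INTO Cars VALUES (?, ?, ?, ?)",
        [
            (1999, "Ford", "Focus", 10000.0),
            (1999, "Ford", "Taurus", 14000.0),
            (1999, "Honda", "Civic", 12000.0),
            (2000, "Ford", "Focus", 11000.0),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    sql, params = build_filtered_query({"Yr": 1999, "Make": ["Ford", "Honda"]})
    print(sql)
    print(conn.execute(sql, params).fetchall())
```

    The values always travel as bound parameters, so only the shape of the WHERE clause is assembled as text.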

  • gage.trader (7/20/2009)


    It's possible...One minute he wants one thing and then 2 years later he could decide he wants something completely different 😛

    In general though, the data can really be seen on 4 levels: customer, item category, item subcategory, and item. So there are potentially 4 of them in the specific case I'm working on now. I'd kind of like to set up a best practice sort of thing though, for all of my future reports because that's really a lot of my job and it's not something that's ever been done very well here.

    I definitely see your point about optimization though. At least you've both given me a good bit to chew on 😀

    Frankly - if you're dealing with SSRS or some other reporting tool - play with it there. Dynamic grouping levels might work better for you than trying to write something ugly in C#.

    And it essentially allows you to leave the base query alone.

