Extract data with variable column names and order

  • Hi guys,  Just looking for any good suggestions on how to tackle this problem.

    I have a table of attributes which are held in very wide sparsely populated table (at least it's not EAV!).  For display purposes in the application there is a separate table that defines which columns are being used, their display order and the label attached to each column.

    Attributes
    AttributeType OccurrenceID Text1 Text2 Text3 Date1 Date2 Date3 Integer1 Integer2 Integer3
    17 123 Hello World 22-02-17 42
    17 124 Goodbye Dolly 22-02-19 99
    18 125 BOO! 01-01-18 -99

    Config
    AttributeType ColumnName Position DisplayLabel
    17 Integer1 1 MembershipID
    17 Date2 2 Date Joined
    17 Text1 3 Locker Code
    17 Text3 4 Reset Code
    18 Text2 1 Greeting
    18 Integer3 2 Credit
    19 Date3 3 Last Updated

    I need to extract the data with the relevant column DiplayLabel in the correct position.  The only way I can think of to do this reliably is using a CURSOR for each attribute type with another CURSOR for config table to generate a Dynamic SQL statement then I then sp_execute_sql to a temp table and then BCP export the temp table to a CSV file or in XML format.  At the moment I only need to show the data in Excel so I can write some VBA code to pick up all the output files and generate the Excel file with the appropriate headings.  The business MAY want to put this into a reporting tool (possbily QlikView or Reporting Services).  I can't think of a way that SSRS can handle this for dynamic data as I won't know the data types or ordinal positions of the columns in advance.

    Can anyone come up with a suggestion other than using the dreaded CURSOR.  The data sets are not huge, only 300 attribute types and about 150K occurrences and the extracts will be ad-hoc so I'm not concerned about speed, rather about maintainability.

  • I would create a view matching each AttributeType.  I think that would be the easiest to use.  Is there some reason that wouldn't work for what you need here?

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • That is, something like this.  I had to set up directly usable data myself to verify the code, which, btw, took me longer than writing the code itself!  In the future, please provide CREATE TABLE and INSERT statements for your sample data.

    IF OBJECT_ID('tempdb.dbo.#Attributes') IS NOT NULL
    DROP TABLE #Attributes;
    CREATE TABLE #Attributes (
    AttributeType int NOT NULL,
    OccurrenceID bigint NOT NULL,
    Text1 varchar(8000) NULL,
    Text2 varchar(8000) NULL,
    Text3 varchar(8000) NULL,
    Date1 date NULL,
    Date2 date NULL,
    Date3 date NULL,
    Integer1 int NULL,
    Integer2 int NULL,
    Integer3 int NULL
    )
    SET NOCOUNT ON;
    INSERT INTO #Attributes VALUES
    (17, 123, 'Hello', 'World', NULL, '2022-02-17', NULL, NULL, 42, NULL, NULL),
    (17, 124, 'Goodbye', 'Dolly', NULL, NULL, '2022-02-19', NULL, NULL, 99, NULL),
    (18, 125, NULL, NULL, 'BOO!', NULL, NULL, '2018-01-18', NULL, NULL, -99)
    SET NOCOUNT OFF;

    IF OBJECT_ID('tempdb.dbo.#Config') IS NOT NULL
    DROP TABLE #Config;
    CREATE TABLE #Config (
    AttributeType int NOT NULL,
    ColumnName varchar(128) NOT NULL,
    Position smallint NOT NULL,
    DisplayLabel varchar(200) NULL
    )
    SET NOCOUNT ON;
    INSERT INTO #Config VALUES
    (17, 'Integer1', 1, 'MembershipID'),
    (17, 'Date2', 2, 'Date Joined'),
    (17, 'Text1', 3, 'Locker Code'),
    (17, 'Text3', 4, 'Reset Code'),
    (18, 'Text2', 1, 'Greeting'),
    (18, 'Integer3', 2, 'Credit'),
    (19, 'Date3', 3, 'Last Updated')
    SET NOCOUNT OFF;

    DECLARE @AttributeType int
    DECLARE @column_list varchar(max)
    DECLARE @sql varchar(max)
    DECLARE @sql_template varchar(max)

    SET @AttributeType = 17

    ----------------------------------------------------------------------------------------------------

    SET @sql_template = '
    CREATE VIEW dbo.config_$AttributeType$
    AS
    SELECT $column_list$
    FROM #Attributes
    WHERE AttributeType = $AttributeType$
    '

    SELECT @column_list = STUFF((
    SELECT ', [' + c.ColumnName + '] AS [' + ISNULL(c.DisplayLabel, c.ColumnName) + ']'
    FROM #Config c
    WHERE c.AttributeType = @AttributeType
    ORDER BY c.Position
    FOR XML PATH(''), TYPE).value('.', 'varchar(max)'), 1, 2, '')

    SET @sql = REPLACE(REPLACE(@sql_template,
    '$AttributeType$', CAST(@AttributeType AS varchar(10))),
    '$column_list$', @column_list)

    PRINT @sql
    --EXEC(@sql)

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • aaron.reese wrote:

    (at least it's not EAV!)

    Actually, too bad it's not! 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Scott, you have waaay too much time on your hands.

    That was effectivley what I was going to do but rather than create 300 views, just dump the data sets, or write a sproc that takes the attribute type and generates the relevant results for that attribute.

    The actual scripts will be a little more complicated as some of the values are actually codes in one of two lookup tables so for some records in the config there will be an additional joins to be made.

    I like the use of STUFF though, cleaner and more performant than dynamically appending to a string.

     

     

  • not wanting to get off-topic and there are dozens of threads on EAV on this site, why would you prefer EAV for this problem.  I can turn it into EAV with a PIVOT.

  • aaron.reese wrote:

    not wanting to get off-topic and there are dozens of threads on EAV on this site, why would you prefer EAV for this problem.  I can turn it into EAV with a PIVOT.

    Because even some of the miseries provided by an EAV table are better than a too wide and too sparse denormalized table, some of which you're going through now.  I also wouldn't use a PIVOT... old CROSSTAB method, sure... just not PIVOT.  It's comparatively slow even though machines have gotten  fast enough to sometimes make it look otherwise.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • aaron.reese wrote:

    Thanks Scott, you have waaay too much time on your hands.

    That was effectivley what I was going to do but rather than create 300 views, just dump the data sets, or write a sproc that takes the attribute type and generates the relevant results for that attribute.

    The actual scripts will be a little more complicated as some of the values are actually codes in one of two lookup tables so for some records in the config there will be an additional joins to be made.

    I like the use of STUFF though, cleaner and more performant than dynamically appending to a string.

    Hmm, then I'm not sure why you asked the q.  Or at least didn't state your intentions in the q.

    Calling a proc every time seems far more overhead and difficulty to me than just creating a view that can be directly referenced.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • aaron.reese wrote:

    ...the extracts will be ad-hoc so I'm not concerned about speed, rather about maintainability.

    Right now, the "config" table is setup only to provide a list of columns by attribute type and so the dynamic SQL will only be a basic SELECT.  What happens if someone throws a monkey wrench into the works by saying they need an aggregate query of some kind for (just an example) all rows where the AttributeType is "20" or something crazy like a request to display the sum of 2 or more INT columns for each row?

    My suggestion would be to have a "config" table that contains the actual SQL to do the job rather than building it on the fly all the time.  It would be kind of like the suggestion Scott made about building a view for each query but without actually having to materialize a gazillion different views.  That would allow you to copy and paste the code from the SQL column into SSMS in a readable format (maybe even with a comment or two :D), make and test the changes, and then store the modified code back into the config table.

    Like I said, just a suggestion but that's the way I'd actually do it.  It might take a wee bit longer to do upfront but could save a huge amount of time after deployment when it came to troubleshooting and maintenance all while making it unnecessary to recalculate the SQL every time you wanted to run it AS WELL AS making it infinitely more flexible as to what the code could do.  Just select it into a variable and execute the variable.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • @Aaron.reese ,

    If you're absolutely stuck with doing the way that you've outlined, you've posted in a 2017 forum so lets use some of the functionality available.

    First, here's the test data for your original post in a readily consumable format.  Scott did a similar thing but I wanted to add some stuff in case I need to use it as an example somewhere.

    TEST DATA:

    --=====================================================================================================================
    -- Create and populate the "Attributes" test table.
    -- I named it "TestAttributes" because it needs to be a non-temporary table to demonstrate use with an iTVF
    -- and I didn't want to take the chance of someone dropping the real table by mistake.
    -- Added a clustered PK, which may be incorrect because I don't know your data.
    -- Also, just guessing what the datatypes may be here based on the screenshot in the orginal post.
    --=====================================================================================================================
    DROP TABLE IF EXISTS dbo.TestAttributes; --Just to make reruns in SSMS easier
    GO
    CREATE TABLE dbo.TestAttributes
    (
    AttributeType INT NOT NULL
    ,OccurrenceID BIGINT NOT NULL
    ,Text1 VARCHAR(8000) NULL
    ,Text2 VARCHAR(8000) NULL
    ,Text3 VARCHAR(8000) NULL
    ,Date1 DATE NULL
    ,Date2 DATE NULL
    ,Date3 DATE NULL
    ,Integer1 INT NULL
    ,Integer2 INT NULL
    ,Integer3 INT NULL
    ,CONSTRAINT PK_TestAttributes PRIMARY KEY CLUSTERED (AttributeType,OccurrenceID)
    )
    ;
    INSERT INTO dbo.TestAttributes
    (AttributeType,OccurrenceID,Text1,Text2,Text3,Date1,Date2,Date3,Integer1,Integer2,Integer3)
    VALUES (17,123,'Hello' ,NULL ,'World',NULL,'2022-02-17',NULL ,42 ,NULL,NULL)
    ,(17,124,'Goodbye',NULL ,'Dolly',NULL,'2022-02-19',NULL ,99 ,NULL,NULL)
    ,(18,125,NULL ,'BOO!',NULL ,NULL,NULL ,'2018-01-18',NULL,NULL,-99 )
    ;
    --=====================================================================================================================
    -- Create and populate the "Config" test table.
    -- I named it "TestAttributeConfig" because it needs to be a non-temporary table to demonstrate use with an iTVF
    -- and I didn't want to take the chance of someone dropping the real table by mistake.
    -- Added a clustered PK, which may be incorrect because I don't know your data.
    --=====================================================================================================================
    DROP TABLE IF EXISTS dbo.TestAttributeConfig; --Just to make reruns in SSMS easier
    GO
    CREATE TABLE dbo.TestAttributeConfig
    (
    AttributeType INT NOT NULL CHECK (LEN(AttributeType) > 0)
    ,ColumnName VARCHAR(128) NOT NULL CHECK (LEN(ColumnName) > 0)
    ,Position SMALLINT NOT NULL CHECK (Position > 0)
    ,DisplayLabel VARCHAR(200) NOT NULL CHECK (LEN(DisplayLabel) > 0)
    ,CONSTRAINT PK_TestAttributeConfig PRIMARY KEY CLUSTERED (AttributeType,Position)
    )
    ;
    INSERT INTO dbo.TestAttributeConfig
    (AttributeType,ColumnName,Position,DisplayLabel)
    VALUES (17,'Integer1',1,'MembershipID')
    ,(17,'Date2' ,2,'Date Joined' )
    ,(17,'Text1' ,3,'Locker Code' )
    ,(17,'Text3' ,4,'Reset Code' )
    ,(18,'Text2' ,1,'Greeting' )
    ,(18,'Integer3',2,'Credit' )
    ,(19,'Date3' ,3,'Last Updated')
    ;
    GO

    Here's a function that uses some of the newer capabilities of SQL Server 2017.  I made it so the code it creates is nicely formatted because, if the proverbial poo hits the fan and you need to do troubleshooting or some modification, it's a whole lot easier to do with formatted code than reading one, possible huge, single line of code.  Details for everything are in the comments.

    Easy to Use iTVF Function (no cursor)

     CREATE FUNCTION dbo.GetAttributeDisplaySQL
    /**********************************************************************************************************************
    Purpose:
    Given a valid AttributeType, return the dynamic SQL to return the defined column names (according to the
    dbo.TestAttributeConfig table) from the dbo.TestAttributes table.
    -----------------------------------------------------------------------------------------------------------------------
    Example Usage:
    --===== Return and execute the dynamic SQL to display the columns according to the given AttributeType and
    -- related column configuration.
    DECLARE @SQL NVARCHAR(MAX) --Needs to be NVARCHAR to support sp_executesql
    ,@AttributeType INT = 17
    ;
    SELECT @SQL = DisplaySQL
    FROM dbo.GetAttributeDisplaySQL(@AttributeType)
    ;
    PRINT @SQL --Can be removed. Just showing the code in the Messages tab of SSMS for demonstration purposes.
    ;
    EXEC sp_executesql @SQL -- sp_executesql used to hopefully cache the executed SQL so recompiles won't be necessary.
    ;
    -----------------------------------------------------------------------------------------------------------------------
    Dependencies:
    Table: dbo.TestAttributeConfig
    -----------------------------------------------------------------------------------------------------------------------
    Reference:
    https://www.sqlservercentral.com/forums/topic/extract-data-with-variable-column-names-and-order#new-post
    -----------------------------------------------------------------------------------------------------------------------
    Revision History:
    Rev 00 - 11 Dec 2020 - Jeff Moden
    - Initial creation an unit test.
    **********************************************************************************************************************/
    (@pAttributeType INT)
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    SELECT DisplaySQL =
    CONCAT(
    CONVERT(NVARCHAR(MAX),' SELECT ') --The conversion to NVARCHAR(MAX) forces CONCAT to return NVARCHAR(MAX).
    ,STRING_AGG(
    CONCAT(QUOTENAME(ColumnName),' AS ',QUOTENAME(DisplayLabel)) --SELECT List of Columns/Aliases
    ,CONCAT(CHAR(10),SPACE(8),',') --Delimiter, which also formats the code for troubleshooting purposes.
    ) WITHIN GROUP (ORDER BY Position) --Enforces the left to right order of columns by Position.
    ,CHAR(10) --Just to make the output easily readable
    ,' FROM dbo.TestAttributes'
    ,CHAR(10) --Just to make the output easily readable
    ,CONCAT(' WHERE AttributeType = ',@pAttributeType)
    ,CHAR(10),';'
    )
    FROM dbo.TestAttributeConfig
    WHERE AttributeType = @pAttributeType
    ;
    GO

    If you run the code in the "Example Usage" section of the flower-box in the function code, it'll print out the dynamic SQL it created (which doesn't need to be included in the example code) so here's that...

    Dynamic SQL Printed:

     SELECT  [Integer1] AS [MembershipID]
    ,[Date2] AS [Date Joined]
    ,[Text1] AS [Locker Code]
    ,[Text3] AS [Reset Code]
    FROM dbo.TestAttributes
    WHERE AttributeType = 17
    ;

    ... and here's the output of the actual code run...

    Results of Dynamic SQL:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I admit that I prefer using a cursor here to make it easier to adjust and customize the code.  As the OP stated, there will be special requirements for some of the AttributeTypes.

    If necessary, you can even write the code to a temp table so that you can add TSQL earlier in the code than the current point, allowing you to "go back" and DECLARE variables in code if certain conditions arise.  I prefer that flexibility in this case since the number of fetches will be very limited anyway.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Btw, how then do you exec the code the function created?  Wouldn't you some type of loop / cursor to do that anyway?

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    Btw, how then do you exec the code the function created?  Wouldn't you some type of loop / cursor to do that anyway?

    Based on the original request, I assumed that it would be for singleton use by a GUI to populate a screen and wrote it that way.  Your code seems to be written in a similar fashion in that the correct name of a view would have to be stored somewhere.   Like I said, I wouldn't actually do it this way... rather, I do it the other way where the correct query was stored rather than a list of column names to be used for the dynamic creation of the necessary SQL.

    As for how to use for the intended purpose, I almost always include 1 or more usage examples in the flower-box of the code and this code is no exception to that.  Look in the flower-box for "Example Usage".

    If Aaron comes back with additional information/requirements, then another method can be explored but it's not likely that a single query solution can be rendered because of the nature of the changing position of columns from one AttributeType to another.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Actually I intended by code to be run anytime the Config for a given AttributeType was modified.  After the initial build for every AttributeType in Config, I expected it typically to run for only one AttributeType at a time.  Probably as part of a DML trigger, to insure that the code gets recreated when necessary.  Of course the code must be able to handle multiple AttributeType changes at a time, but I would expect that to be the rare exception not the common method of changing Configs.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Just checking so correct me if I'm wrong, please... The end result of your code was to build a view for each AttributeType and then something would have to figure out which view to call from the front end.  No matter what either of us do, that's still going to be the case whether it's calling view that's been created, calling something that creates dynamic SQL on the fly to return a result set or, like the view, calling stored SQL to return the result set.  They all have equal merit with slightly different problems for this particular problem.

    The only reason why I posted what I posted was because Aaron didn't seem to like the multiple views method.  I did see that as a problem but offered an alternative if he's not the one large and in charge of design , etc.  Mine's no better than yours.  With yours they'll have, what, 300 or so views that they'll need to decide which to use.  The same holds true for mine... it's actually identical to the view except the code is based on selecting the desired attribute type by number rather by a view name and then uses, dynamic SQL that is hopefully cached about as good as views.

    I really wish I knew how the table of "Attributes" was built and why they thought such a utility from such a wide but sparse table was going to be a good idea.  I'm certainly missing the vision behind that.  Like I said in my first post, I actually do think an EAV would be better that this particular table.  I think it would be better still, though, just to query the data from whatever table(s) the Attributes table was created from.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply