Help with a query

  • Dear SQL experts,

    I need some help to solve the following problem:

    I've created a table made by an unique_id, a date (YYYYMMDD), a customer_id and a status columns.

    This status may switch between true and false over the time for each customer. I could log into this table something like the following (for each customer):

    unique_id,date,customer_id,status

    1,20120101,123,true

    2,20120102,123,true

    3,20120103,123,true

    10,20120104,123,true

    11,20120105,123,true

    12,20120106,123,false

    13,20120107,123,false

    25,20120108,123,false

    26,20120109,123,true

    28,20120110,123,true

    32,20120111,123,true

    41,20120112,123,true

    110,20120113,123,false

    I would like to get for each customer (in this case, only for the customer # 123), all the date when the status column changes, like the following:

    20120101,123,true

    20120106,123,false

    20120109,123,true

    20120113,123,false

    Using group and min(date) it groups the 2 different status and shows me instead the following two records:

    20120101,123,true

    20120106,123,false

    How can I solve this problem, in order to list the data in the desired form?

    Thank you so much for your kind support.

    Best regards.

  • This was removed by the editor as SPAM

  • with cte as (

    select 1 as id ,20120101 as dates,123 as username,1 as active union all

    select 2,20120102,123,1 union all

    select 3,20120103,123,1 union all

    select 10,20120104,123,1 union all

    select 11,20120105,123,1 union all

    select 12,20120106,123,0 union all

    select 13,20120107,123,0 union all

    select 25,20120108,123,0 union all

    select 26,20120109,123,1 union all

    select 28,20120110,123,1 union all

    select 32,20120111,123,1 union all

    select 41,20120112,123,1 union all

    select 110,20120113,123,0

    )

    , cte2 as (

    select ROW_NUMBER() OVER ( Partition by username , active order by dates) as cnt ,* from cte

    )

    select * from cte2 where cnt =1

    order by dates asc

    Jayanth Kurup[/url]

  • I believe we can add an "OR " clause to the query to get the 20120101 record into the result set.

  • Can't say if it is the most efficient solution or not but this should work:

    DECLARE @tbl TABLE (id INT, dates DATETIME, username INT, active INT)

    INSERT @tbl

    select 1 as id ,'20120101' as dates,123 as username,1 as active union all

    select 2,'20120102',123,1 union all

    select 3,'20120103',123,1 union all

    select 10,'20120104',123,1 union all

    select 11,'20120105',123,1 union all

    select 12,'20120106',123,0 union all

    select 13,'20120107',123,0 union all

    select 25,'20120108',123,0 union all

    select 26,'20120109',123,1 union all

    select 28,'20120110',123,1 union all

    select 32,'20120111',123,1 union all

    select 41,'20120112',123,1 union all

    select 110,'20120113',123,0

    SELECT * from @tbl

    select id, dates, username, active

    from @tbl t1

    WHERE active <> ISNULL((

    SELECT TOP 1 active

    FROM @tbl t2

    WHERE t2.dates < t1.dates and t1.username = t2.username

    ORDER BY t2.dates DESC),2)


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Dear sql experts,

    Thank you so much for your valuable help!!

    I've just tried the last solution (because to me, it was the easier one to understand) and I can confirm that it solved my problem.

    Thank you all so very much.

    Best regards.

  • CELKO (2/27/2012)


    CREATE TABLE Customer_History

    (customer_id CHAR(9) NOT NULL,

    start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,

    end_date DATETIME UNIQUE, --null means current

    CHECK (start_date <= end_date),

    foobar_status CHAR(??) NOT NULL,

    PRIMARY KEY (customer_id, start_date));

    Why should end_date be UNIQUE ? can't two customers have same end_date ? isn't your design/assumption wrong ?

  • ColdCoffee (2/27/2012)


    CELKO (2/27/2012)


    CREATE TABLE Customer_History

    (customer_id CHAR(9) NOT NULL,

    start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,

    end_date DATETIME UNIQUE, --null means current

    CHECK (start_date <= end_date),

    foobar_status CHAR(??) NOT NULL,

    PRIMARY KEY (customer_id, start_date));

    Why should end_date be UNIQUE ? can't two customers have same end_date ? isn't your design/assumption wrong ?

    And not to start a null war but using null to mean something kind of goes against the concept. The lack of a value should not be used a value. In this case Joe is suggesting that when end_date is null it means the row is current. I realize that in this example is PROBABLY works but we are making more assumptions. This is the kind of logic that gets people tripped up with the bit datatype. They assume it has 2 values "checked" and "unchecked" but it can also be NULL. The inverse holds true for using NULL to indicate a status. It could be NULL because there is no end_date but it could also be NULL because it was incorrectly not entered. It would seem that the reason end_date is NULL is unknown....hhhhhmmmmmm..........

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (2/27/2012)


    ColdCoffee (2/27/2012)


    CELKO (2/27/2012)


    CREATE TABLE Customer_History

    (customer_id CHAR(9) NOT NULL,

    start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,

    end_date DATETIME UNIQUE, --null means current

    CHECK (start_date <= end_date),

    foobar_status CHAR(??) NOT NULL,

    PRIMARY KEY (customer_id, start_date));

    Why should end_date be UNIQUE ? can't two customers have same end_date ? isn't your design/assumption wrong ?

    And not to start a null war but using null to mean something kind of goes against the concept. The lack of a value should not be used a value. In this case Joe is suggesting that when end_date is null it means the row is current. I realize that in this example is PROBABLY works but we are making more assumptions. This is the kind of logic that gets people tripped up with the bit datatype. They assume it has 2 values "checked" and "unchecked" but it can also be NULL. The inverse holds true for using NULL to indicate a status. It could be NULL because there is no end_date but it could also be NULL because it was incorrectly not entered. It would seem that the reason end_date is NULL is unknown....hhhhhmmmmmm..........

    I can understand why CELKO wanted NULL for end_date, any you explain it clearly , but why UNIQUE. if we expand the customer to inlcude another customer, then it is plausible to have other customers with the same end_date (even with the same milliseconds). My concern is , why UNIQUE for end_date ??

  • ColdCoffee (2/27/2012)


    Sean Lange (2/27/2012)


    ColdCoffee (2/27/2012)


    CELKO (2/27/2012)


    CREATE TABLE Customer_History

    (customer_id CHAR(9) NOT NULL,

    start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,

    end_date DATETIME UNIQUE, --null means current

    CHECK (start_date <= end_date),

    foobar_status CHAR(??) NOT NULL,

    PRIMARY KEY (customer_id, start_date));

    Why should end_date be UNIQUE ? can't two customers have same end_date ? isn't your design/assumption wrong ?

    And not to start a null war but using null to mean something kind of goes against the concept. The lack of a value should not be used a value. In this case Joe is suggesting that when end_date is null it means the row is current. I realize that in this example is PROBABLY works but we are making more assumptions. This is the kind of logic that gets people tripped up with the bit datatype. They assume it has 2 values "checked" and "unchecked" but it can also be NULL. The inverse holds true for using NULL to indicate a status. It could be NULL because there is no end_date but it could also be NULL because it was incorrectly not entered. It would seem that the reason end_date is NULL is unknown....hhhhhmmmmmm..........

    I can understand why CELKO wanted NULL for end_date, any you explain it clearly , but why UNIQUE. if we expand the customer to inlcude another customer, then it is plausible to have other customers with the same end_date (even with the same milliseconds). My concern is , why UNIQUE for end_date ??

    I guess that way you would ensure that one and only customer can be active at any one time??? 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (2/27/2012)


    I guess that way you would ensure that one and only customer can be active at any one time??? 😛

    :hehe:

  • >> I've created a table made by an unique_id, a date (YYYY-MM-DD), a customer_id and a status columns.<

    Microsoft is catching up with ANSI and other SQLs, so you need to start inserting the dashes in dates. Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are.

    What the original poster provided was sufficient for me to figure out what he was looking for. But a definite thanks to Jayanth_Kurup for putting the test data into the SELECT/UNION ALL/SELECT statements!

    >> This status may switch between TRUE and FALSE over the time for each customer. <<

    NO! The design is wrong. A status is what ISO calls an attribute property, so it has to be a “<something in particular>_status”; a status is a state of being so it has to have a duration. Your silly Booleans are flags and not valid values for a status. Would you use “Red” for a temporal value?

    Easy to say from a textbook point of view. Like it or not we all must deal with the poor design skills of our predecessors. At least though, with information like this we can get it right the next time.

    The name “unique_id” is also silly. Think about “non-unique_id” and giggle.

    Perhaps the poster was simply obfuscating his problem, which is something I often hear recommended on this site.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Dear friends,

    I'm very sorry, I didn't mean to hurt Mr. Celko with my very first post, anyway, as always, every constructive suggestions that helps me growing my knowledge are highly welcomed.

    Since I'm a very positive person, indeed this means to me that some further explanation is needed.

    I apologize also for my poor english language skills that could lead to some misunderstandings and, even if I've already solved my specific problem, I hope the following clarification could help people to better understand my first post and maybe find a solution to a similar problem, like I did.

    European Union's new VAT regulations, forces all members, before to issue an invoice to another European Union's member, to check if his VAT number is part of the vies system, to evince the need to indicate the VAT percentage and value related to the issuer's country inside the invoice. The web service provided by the EU site returns, for each VAT number checked, a "status" that declares if the specific VAT number is active into the vies system. This status may vary over the time and switch between "true" and "false" (both are returned in text format) and it could happen that a firm, once registered to the vies system (on an hypotetical history file: false->true), suddenly closes so that its status inside vies system, switches back to false (on an hypotetical history file: false->true->false).

    My transaction job, made with spoon (part of the pentaho business intelligence suite) on a linux server, every morning runs for each of our EU customers vat numbers and populates an "history" table with a behavior very similar to the aforementioned one. Its main purpose is to collect all the available data from the web service, for future reference. Since this web service works on the 50% of cases, the records on EU countries remote databases are often off-line and the returned status is only the last known one, I got to run this process more than once per day to be sure to get at least one record for each VAT "number" provided... (Unfortunately, facing real life cases, we are used to ran through plenty of issues to get a usable result...).

    I got one table holding the last "status" provided by the vies web service for each customer but, we would like now to monitor, for each customer, how often the status switch happens and to monitor every possible cases, I started months ago to populate an history table.

    I must say that I've not listed the real "column" names of my table in the first post, since maybe the real name could be not so informative (FSSVSIIDX,FSSVSICCOD,... 🙂 and maybe could not help in leading smarter people to a useful reply (as acutally and luckily happens). And the sample provided was ment as the result of a simple "virtual" (SELECT * FROM [fake_vies_history_table] where fake_customer_id=123); this to justify in part some of Mr. Celko's complains, since it seems to me that it wasn't so clear the "unique_id" column weird values and name...

    For the date format complains, indeed they are in a standard ISO8601 basic format.

    A big thanks to all the smart people that provide with an answer to solve my problem even with the poorer material provided by me at first place.

    Best regards to all.

  • .

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply