Need help to form query to get missing month data

  • I want to know what are the customers having missing month data based on the following data pattern. My data is present in table in below format. Please help me to form a SQL query to find out how we can get missing month flag value.

    Condition is : If a customer is having any single or multiple month missing in between of start and end date then that customer would be consider as having missing month data. Start Date and end date both could be vary based on customer to customer data. Here is a sample of data:

    SampleData

     

  • anuxps wrote:

    Please help me to form a SQL query to find out how we can get missing month flag value.

    Did you mean how to set the 'month skip' column or do you want to display customers who have 'yes' in the month skip column?

    Also please see here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help. If you can provide the information using the method shown on the link it's easier for people to help you.

     

  • This was removed by the editor as SPAM

  • >> I want to know what are the customers having missing month data based on the following data pattern. My data is present in table in below format. Please help me to form a SQL query to find out how we can get missing month flag value. <<

    In SQL, unlike a spreadsheet, data is represented in tables. What you posted is a display spreadsheet. It also happens to have the wrong format for dates. Please read the part about posting DDL you want help. You'll find out very quickly that you need a good working DDL and that most of the problems come from bad DDL.  Also, flags are assembly language,  not SQL.

    For example, it would really help if we knew what your data was. We don't even know the name of your table! Your unit of measure seems to be a month according to your narrative. But you display only vague dates not associated with any year. I'm going to make a suggestion that you use the MySQL convention four months. It is based on ISO- 8601 standards used in SQL, but it is not yet official. The format is "yyyy-mm-00" and you'll have to use the string for where the double zeroes replace what would've been the day with in month.

    CREATE TABLE Foobar

    (customer_name CHAR(10) NOT NULL,

    posting_month CHAR(10) NOT NULL

    CHECK (posting_month LIKE '[12][0-9][0-9][0-9]-[0-1][0-9]-00)',

    PRIMARY KEY (customer_name, posting_month),

    foo_cnt INTEGER CHECK (foo_cnt >0)

    );

    Following your design, I will assume that every month that we consider, but has not counted its foos yet defaults to a NULL.

    SELECT customer_name

    FROM Foobar

    WHERE posting_month BETWEEN @start_month AND @end_month

    GROUP BY customer_name

    HAVING COUNT(*) > COUNT(customer_name)'

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • anuxps wrote:

    I want to know what are the customers having missing month data based on the following data pattern. My data is present in table in below format. Please help me to form a SQL query to find out how we can get missing month flag value.

    Condition is : If a customer is having any single or multiple month missing in between of start and end date then that customer would be consider as having missing month data. Start Date and end date both could be vary based on customer to customer data. Here is a sample of data:

    SampleData

    The data you posted is the result of a Pivot or a CrossTab.  It would be easier to calculate your "Month Skip" column if you would share the code that did the Pivot or CrossTab.  It would also be helpful if you posted some readily consumable data.  Please see the the first link under "Helpful Links" in my signature line below for one way to do that so it's useful.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcelko212 32090 wrote:

    >>But you display only vague dates not associated with any year. I'm going to make a suggestion that you use the MySQL convention four months. It is based on ISO- 8601 standards used in SQL, but it is not yet official. The format is "yyyy-mm-00" and you'll have to use the string for where the double zeroes replace what would've been the day with in month.

    The dates in the picture the OP posted clearly have the year contained in them in the format of MMM-YY.

    As for your MySQL convention, certainly, it's not necessary for the display of spreadsheet data especially if it's public facing where "It Depends" on the target audience.  I will agree, however, that it would be better in nearly every case to use a 4 digit year.  Still, "It Depends".

    jcelko212 32090 wrote:

    Also, flags are assembly language, not SQL.

    I agree that it's serious overkill and bad design to have(for example) a "flag" column such as "IsTerminated" (or the much worse "IsNotTerminated") when there should be a column with a termination date or a history row containing such a label but I'll continue to disagree with the idea that flags have no place in SQL or databases.  And, again... what the OP has displayed isn't a database... it's a desired output (whether it's stored in a table or not) where users frequently demand such things so that they don't actually have to analyze each row of data to come to the correct conclusion that there's some nested months that don't have data.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    jcelko212 32090 wrote:

    >>But you display only vague dates not associated with any year. I'm going to make a suggestion that you use the MySQL convention four months. It is based on ISO- 8601 standards used in SQL, but it is not yet official. The format is "yyyy-mm-00" and you'll have to use the string for where the double zeroes replace what would've been the day with in month.

    The dates in the picture the OP posted clearly have the year contained in them in the format of MMM-YY.

    As for your MySQL convention, certainly, it's not necessary for the display of spreadsheet data especially if it's public facing where "It Depends" on the target audience.  I will agree, however, that it would be better in nearly every case to use a 4 digit year.  Still, "It Depends".

    jcelko212 32090 wrote:

    Also, flags are assembly language, not SQL.

    I agree that it's serious overkill and bad design to have(for example) a "flag" column such as "IsTerminated" (or the much worse "IsNotTerminated") when there should be a column with a termination date or a history row containing such a label but I'll continue to disagree with the idea that flags have no place in SQL or databases.  And, again... what the OP has displayed isn't a database... it's a desired output (whether it's stored in a table or not) where users frequently demand such things so that they don't actually have to analyze each row of data to come to the correct conclusion that there's some nested months that don't have data.

    I see no need to use a 4 digit year here.  Almost no businesses go back and analyze 100-year-old data, so there's no ambiguity.  Even more so if you get used to using military/European-style format where if day is given it is first (i.e. 19-Oct-20 is Oct 19, 1920, and 20-Oct-19 is Oct 20, 2019, and the 19 and 20 are still not ambiguous).

    An is_active flag (someone can be inactive and still not "terminated", e.g. "suspended"), specifically, makes perfect sense to me because the specific inactive date / conditions may be on a need-to-know-only basis.  That is, the vast majority of apps only need to know if the employee/customer/whatever is currently active, not the date and/or conditions they went inactive.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    I see no need to use a 4 digit year here.  Almost no businesses go back and analyze 100-year-old data, so there's no ambiguity.  Even more so if you get used to using military/European-style format where if day is given it is first (i.e. 19-Oct-20 is Oct 19, 1920, and 20-Oct-19 is Oct 20, 2019, and the 19 and 20 are still not ambiguous).

    An is_active flag (someone can be inactive and still not "terminated", e.g. "suspended"), specifically, makes perfect sense to me because the specific inactive date / conditions may be on a need-to-know-only basis.  That is, the vast majority of apps only need to know if the employee/customer/whatever is currently active, not the date and/or conditions they went inactive.

    Agreed on the first paragraph above.  Like I said, "It Depends" on the target audience.

    On the second part, understood but wow... that's confusing.  No wonder Developers lose their minds. 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ScottPletcher wrote:

    8<

    >8

    An is_active flag (someone can be inactive and still not "terminated", e.g. "suspended"), specifically, makes perfect sense to me because the specific inactive date / conditions may be on a need-to-know-only basis.  That is, the vast majority of apps only need to know if the employee/customer/whatever is currently active, not the date and/or conditions they went inactive.

    One of our systems makes extensive use of IsActive columns, mainly for a "soft delete" function. The devs have a standardised method which allows "Power Users" of the system to toggle it to shift records between the Active list to the Inactive list, on records where there is no audit trail required.

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

  • david.edwards 76768 wrote:

    ScottPletcher wrote:

    8<

    >8

    An is_active flag (someone can be inactive and still not "terminated", e.g. "suspended"), specifically, makes perfect sense to me because the specific inactive date / conditions may be on a need-to-know-only basis.  That is, the vast majority of apps only need to know if the employee/customer/whatever is currently active, not the date and/or conditions they went inactive.

    One of our systems makes extensive use of IsActive columns, mainly for a "soft delete" function. The devs have a standardised method which allows "Power Users" of the system to toggle it to shift records between the Active list to the Inactive list, on records where there is no audit trail required.

    Be careful with overusing is_active flags, as they can cause the optimizer to be misdirected sometimes.

    That will be less of an issue on small tables, such as employees and, for most companies, customers/clients.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    david.edwards 76768 wrote:

    ScottPletcher wrote:

    8<

    >8

    An is_active flag (someone can be inactive and still not "terminated", e.g. "suspended"), specifically, makes perfect sense to me because the specific inactive date / conditions may be on a need-to-know-only basis.  That is, the vast majority of apps only need to know if the employee/customer/whatever is currently active, not the date and/or conditions they went inactive.

    One of our systems makes extensive use of IsActive columns, mainly for a "soft delete" function. The devs have a standardised method which allows "Power Users" of the system to toggle it to shift records between the Active list to the Inactive list, on records where there is no audit trail required.

    Be careful with overusing is_active flags, as they can cause the optimizer to be misdirected sometimes.

    That will be less of an issue on small tables, such as employees and, for most companies, customers/clients.

    Interesting, under what sort of circumstances? We're not talking massive tables here, but I don't like relying on that if something is sub-optimal....

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

  • david.edwards 76768 wrote:

    ScottPletcher wrote:

    david.edwards 76768 wrote:

    ScottPletcher wrote:

    8<

    >8

    An is_active flag (someone can be inactive and still not "terminated", e.g. "suspended"), specifically, makes perfect sense to me because the specific inactive date / conditions may be on a need-to-know-only basis.  That is, the vast majority of apps only need to know if the employee/customer/whatever is currently active, not the date and/or conditions they went inactive.

    One of our systems makes extensive use of IsActive columns, mainly for a "soft delete" function. The devs have a standardised method which allows "Power Users" of the system to toggle it to shift records between the Active list to the Inactive list, on records where there is no audit trail required.

    Be careful with overusing is_active flags, as they can cause the optimizer to be misdirected sometimes.

    That will be less of an issue on small tables, such as employees and, for most companies, customers/clients.

    Interesting, under what sort of circumstances? We're not talking massive tables here, but I don't like relying on that if something is sub-optimal....

    Most notably when is_active is not placed into a non-clus index but the query contains "WHERE is_active = 1" (as almost all queries against tables with this type of flag do).  This forces SQL to revert to a full table scan.

    Since 95+% of typical processing uses "WHERE is_active = 1" anyway, move all rows with the soft delete flag on to another table.  Then create a view that combines those two tables for any processing that actually might need to see deleted rows.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I did an article called Constraint Yourself! back in October 2008 on how to use DDL constraints to assure data integrity. One of the topics in that piece was a look at state transition constraints via an auxiliary table. I did not go into much detail since this was one of several topics I was covering. I introduced the topic of transition constraints to show how such constraints could be modeled as a state-transition diagram in order to enforce the rules when an entity can be updated only in certain ways. There is an initial state, flow lines that show what are the next legal states, and one or more termination states. The original example was a simple state change diagram of possible marital states, you can get some more details on this technique at

    https://www.red-gate.com/simple-talk/sql/t-sql-programming/state-transition-constraints/

    (08 October 2010),

    This method avoids problems with simply using assembly language style flags. I've also done articles on why I don't like bit flags and RDBMS. See: https://www.red-gate.com/simple-talk/sql/t-sql-programming/bit-of-a-problem/

     

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 wrote:

    I did an article called Constraint Yourself! back in October 2008 on how to use DDL constraints to assure data integrity. One of the topics in that piece was a look at state transition constraints via an auxiliary table. I did not go into much detail since this was one of several topics I was covering. I introduced the topic of transition constraints to show how such constraints could be modeled as a state-transition diagram in order to enforce the rules when an entity can be updated only in certain ways. There is an initial state, flow lines that show what are the next legal states, and one or more termination states. The original example was a simple state change diagram of possible marital states, you can get some more details on this technique at

    https://www.red-gate.com/simple-talk/sql/t-sql-programming/state-transition-constraints/

    (08 October 2010),

    This method avoids problems with simply using assembly language style flags. I've also done articles on why I don't like bit flags and RDBMS. See: https://www.red-gate.com/simple-talk/sql/t-sql-programming/bit-of-a-problem/

    Totally agreed on BIT Flags and numeric flags where you can extract more than one state by the use of bit-wise ANDs but no way to do seeks on such things.  I think the latter is a left over from the days of needing to penny pinch every byte in RAM and on disk.

    Your description above does have me interested in "State Transistions".  Thank you for the link.  I'll take a look.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    anuxps wrote:

    I want to know what are the customers having missing month data based on the following data pattern. My data is present in table in below format. Please help me to form a SQL query to find out how we can get missing month flag value.

    Condition is : If a customer is having any single or multiple month missing in between of start and end date then that customer would be consider as having missing month data. Start Date and end date both could be vary based on customer to customer data. Here is a sample of data:

    SampleData

    The data you posted is the result of a Pivot or a CrossTab.  It would be easier to calculate your "Month Skip" column if you would share the code that did the Pivot or CrossTab.  It would also be helpful if you posted some readily consumable data.  Please see the the first link under "Helpful Links" in my signature line below for one way to do that so it's useful.

    How about it, @anuxps???  The best time to solve for the  Month Skip column is before the pivot.  Got data?  Got code?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply