SQL query help

  • Hello,

    I need help with a sql query.

    CREATE TABLE #Orders

    (

    OrderID INT,

    ProjectID INT,

    ProductID INT NULL,

    TotalMiles FLOAT,

    EMiles FLOAT

    )

    INSERT INTO #Orders ( OrderID,ProjectID,ProductID,TotalMiles,EMiles)

    SELECT 53152,2351,NULL,0.0037780058,NULL UNION ALL

    SELECT 53152,2681,1,269.99,269.99 UNION ALL

    SELECT 53152,2681,NULL,269.99,269.99 UNION ALL

    SELECT 53152,2873,652,269.99,269.99 UNION ALL

    SELECT 53152,2873,1,269.99,269.99 UNION ALL

    SELECT 15552,1159,526,14.02,NULL UNION ALL

    SELECT 15552,1159,NULL,14.02,NULL

    SELECT * FROM #Orders

    DROP TABLE #Orders

    --For every OrderId,ProjectID combination where we have ProductID = 1 , I want to make TotalMiles and EMiles 0 for other rows of that combination.

    --Example :

    --For example : 53152,2873 we have a ProductId 1, so I want to make the other row 0 for TotalMiles and EMiles

    SELECT 53152,2873,652,0,0

    SELECT 53152,2873,1,269.99,269.99

    --Similarly : 53152,2681 we have a ProductId 1, so I want to make the other row 0 for TotalMiles and EMiles

    SELECT 53152,2681,NULL,0,0

    SELECT 53152,2681,1,269.99,269.99

    --Leave others as is where we donot have ProductID =1 for OrderId,ProjectID combination

    SELECT 53152,2351,NULL,0.0037780058,NULL

    SELECT 15552,1159,526,14.02,NULL

    SELECT 15552,1159,NULL,14.02,NULL

    Thanks,

    PSB

  • Here's my quick stab at it. Not sure about performance on a super large data set, but maybe the framework will help.

    - Adam

    ifOBJECT_ID('tempdb..#Orders') IS NOT NULL DROP TABLE #Orders;

    CREATE TABLE #Orders

    (

    OrderID INT,

    ProjectID INT,

    ProductID INT NULL,

    TotalMiles FLOAT,

    EMiles FLOAT

    );

    INSERT INTO #Orders ( OrderID,ProjectID,ProductID,TotalMiles,EMiles)

    SELECT 53152,2351,NULL,0.0037780058 ,NULL UNION ALL

    SELECT 53152,2681,1,269.99 ,269.99 UNION ALL

    SELECT 53152,2681,NULL,269.99 ,269.99 UNION ALL

    SELECT 53152,2873,652,269.99 ,269.99 UNION ALL

    SELECT 53152,2873,1,269.99 ,269.99 UNION ALL

    SELECT 15552,1159,526,14.02 ,NULL UNION ALL

    SELECT 15552,1159,NULL,14.02 ,NULL ;

    SELECT * FROM #Orders; -- Show table before change

    UPDATEOrd

    SETTotalMiles = 0,

    EMiles=0

    from#OrdersOrd

    join#OrdersOrd1onOrd1.OrderID=Ord.OrderID and Ord1.ProjectID=Ord.ProjectID and Ord1.ProductID=1

    where

    (Ord.ProductID is null or Ord.ProductID<>1);

    SELECT * FROM #Orders; -- Show table after change

  • Thanks. Worked great performance wise.

  • CELKO (4/21/2015)


    >> I need help with an SQL query. <<

    No, you need help with DDL. You have no idea how to make a valid data model A table has to have a key by definition and a key is NOT NULL. This is not a table; it is called a “pile” in the slang (usually with a derogatory adjective) Since we do not do math with identifiers, they are strings. We do not store computed values in a table and we seldom use FLOAT for anything; the rounding errors are illegal in most commercial applications. Do you really measure miles in nanometers?

    The product_id should reference an inventory table of some kind. The key there would be a SKU or an industry standard like UPC, GTIN, etc. A product cannot be NULL. Likewise, a mileage can be zero, but never NULL.

    But wait! In this non-table, the product_id is really a non-relational flag that changes the meaning of the row where appears. The data element name is garbage! This is a gross violation of First Normal Form! We do not use flags in RDBMS; that was assembly language programming.

    Why do you still use the old Sybase dialect syntax for insertion? Nobody else does, so this is like “hillbilly SQL” that MS gave up years ago.

    Let me get you started with a skeleton

    CREATE TABLE Orders

    (order_id CHAR(5) NOT NULL,

    project_id CHAR(4) NOT NULL,

    PRIMARY KEY (order_id, project_id), – is this the key??

    something_mileage_tot DEFAULT 0.00 DECIMAL (6,2)NOT NULL

    CHECK (something_mileage_tot >= 0.00),

    e_mileage DEFAULT 0.00 DECIMAL (6,2) NOT NULL

    CHECK (e_mileage >= 0.00)

    );

    INSERT INTO Orders (order_id, project_id, sku, something_mileage_tot, e_mileage)

    VALUES

    ('53152', '2351', 0.0037780058, 0.00),

    ('53152', '2681', 269.99, 269.99),

    ('53152', '2681', 269.99, 269.99),

    ('53152', '2873', 269.99, 269.99),

    ('53152', '2873', 269.99, 269.99),

    ('15552', '1159', 14.02, 0.00),

    ('15552', '1159', 14.02, 0.00);

    >> --For every order_id, project_id combination where we have product_id = 1, I want to set something_mileage_tot and e_mileage to for other rows of that combination.<<

    NO!! The flags to change the meaning of rows. In RDBMS, each row is a fact in itself.

    Why don't you write a spec we can understand in terms of a valid RDBMS model?

    Why are you changing obvious integer data types to character data types? The mileage types I'll give you should probably be decimal values not floats.

    It would also be nice if you'd stop bashing people over the head and start being more of mentor to less experienced individuals.

  • It has to be the pointy-haired boss types that hire Celko. No one who actually knew any practical implementations would ever change obvious numeric values to character. There'd be so much garbage in such a column after 2 years it would be unusable.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    How exactly does one compute a check digit on a string-only characters? I don't believe that MC or VISA are spending hundreds of millions of extra dollars on storage just to store a credit card numbers in a character column.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • CELKO (4/22/2015)


    How exactly does one compute a check digit on a string-only characters? I don't believe that MC or VISA are spending hundreds of millions of extra dollars on storage just to store a credit card numbers in a character column.

    Ever write COBOL? The language for 78%+ of all commercial banking programs? All the data, numeric and string, is in strings! There are two methods for computing the check digit:

    1) You do math by casting the digits to integers, multiply by a positional weight, add them and take the MOD().

    2) Do table look-ups on substrings. The dihderal-5 algorithm uses this method. It lets you do trickier math. http://en.wikipedia.org/wiki/Verhoeff_algorithm

    Sure, I wrote COBOL, but I wrote it in the real world where there are numbers in COBOL. Such as PIC 99V999 COMP-3, which was stored in a packed decimal format, not a string format. Your ignorance in storing that data in strings was just forcing COBOL to convert it to decimal, do the arithmetic, then convert it back to string to store it for every mathematical operation.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

  • Technically everything is stored as bits and bytes. Number storage is quite dense (efficient). I would expect credit card numbers to be stored in an encrypted varbinary column. Storing numbers in a numeric type makes sense when math operations will be performed on them.

    I have to say I really liked the old Celko in the early 2000's when I felt he was actually helping the industry gain some insight and quality and accuracy.

  • Lynn Pettis (4/23/2015)


    CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

    Zip codes have to be displayed/output with leading zeros, but that doesn't necessarily mean they have to be stored that way. With potentially 9 digits being used now for zips, the space saving from numeric could be noticeable again.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

    Zip codes have to be displayed/output with leading zeros, but that doesn't necessarily mean they have to be stored that way. With potentially 9 digits being used now for zips, the space saving from numeric could be noticeable again.

    True, but not everyone knows their full zip+4 zip code. But why strip the leading zero on entry just to have to add it back on display? Just seems like extra work that isn't needed.

  • Lynn Pettis (4/23/2015)


    ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

    Zip codes have to be displayed/output with leading zeros, but that doesn't necessarily mean they have to be stored that way. With potentially 9 digits being used now for zips, the space saving from numeric could be noticeable again.

    True, but not everyone knows their full zip+4 zip code. But why strip the leading zero on entry just to have to add it back on display? Just seems like extra work that isn't needed.

    People don't know 9-digit zips, but machines do. We do a huge amount of EDI and other electronic processing. As to strip/restore, I view it similar to dates. I strip dashes, etc., from the date to store it numerically, as a date/datetime column, and then convert to put them back in if I need to for display/output. Seems to me, I both save space by making zip numeric and have somewhat less overhead to search/process/group it.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

    Zip codes have to be displayed/output with leading zeros, but that doesn't necessarily mean they have to be stored that way. With potentially 9 digits being used now for zips, the space saving from numeric could be noticeable again.

    True, but not everyone knows their full zip+4 zip code. But why strip the leading zero on entry just to have to add it back on display? Just seems like extra work that isn't needed.

    People don't know 9-digit zips, but machines do. We do a huge amount of EDI and other electronic processing. As to strip/restore, I view it similar to dates. I strip dashes, etc., from the date to store it numerically, as a date/datetime column, and then convert to put them back in if I need to for display/output. Seems to me, I both save space by making zip numeric and have somewhat less overhead to search/process/group it.

    But datetime types tend to have a lot of build in formating functions in SQL Server and oracle for that matter, zipcodes not so much so which means you're likely writing your own custom code to convert from a numeric to varchar.

  • ZZartin (4/23/2015)


    ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

    Zip codes have to be displayed/output with leading zeros, but that doesn't necessarily mean they have to be stored that way. With potentially 9 digits being used now for zips, the space saving from numeric could be noticeable again.

    True, but not everyone knows their full zip+4 zip code. But why strip the leading zero on entry just to have to add it back on display? Just seems like extra work that isn't needed.

    People don't know 9-digit zips, but machines do. We do a huge amount of EDI and other electronic processing. As to strip/restore, I view it similar to dates. I strip dashes, etc., from the date to store it numerically, as a date/datetime column, and then convert to put them back in if I need to for display/output. Seems to me, I both save space by making zip numeric and have somewhat less overhead to search/process/group it.

    But datetime types tend to have a lot of build in formating functions in SQL Server and oracle for that matter, zipcodes not so much so which means you're likely writing your own custom code to convert from a numeric to varchar.

    So if someone enters a 9-digit zip, you store the dash in your db? That's wasted space. As can be leading zeros, particularly if you compress the row/page in SQL. You can add a computed column that does the formatting if you need it.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (4/23/2015)


    ZZartin (4/23/2015)


    ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    ScottPletcher (4/23/2015)


    Lynn Pettis (4/23/2015)


    CELKO (4/22/2015)


    Why are you changing obvious integer data types to character data types?

    Please read any book on basic data modelling and look at the chapters on scales and measurements. An identifier (<something>_id) is measured on a nominal scale. Math makes no sense on such scales -- what is your credit card number divided by the UPC code on your candy bar?

    Therefore, identifiers are strings. The convention in most ISO and other standards is to use a fixed length made up of digits and basic Latin letters (the original ASCII stuff that UNICODE requires to be included in every language character on Earth. This makes printing them predictable for screen and forms (unlike the display of integers and other numeric values).

    Furthermore, a string can get validation by a regular expression (think about ZIP codes with the simple CHECK (zip_code LIKE '[0-9][0-9][0-9][0-9][0-9]') in your DDL. Or you can add a check digit -- got a book with an ISBN sitting on your desk? The last digit is the Luhn check digit.

    The mileage types I'll give you should probably be decimal values not floats.

    Yes, because they are measured on ratio scales. Those scales have to worry about range, precision and accuracy. Math makes sense with ratio and interval scales; you really can divide miles by hours and get speed! 🙂

    I don't need to read books on data modeling. Also, just because you don't do math on something that is all numeric doesn't mean it can't be stored as numeric values. You don't do math on invoice numbers but if they don't require leading zeros there is no reason NOT to store them as integer values. They can take less space in the database that way.

    Zip codes have leading zeros, so yes you store that as a string.

    Also, since you are so picky about using the correct terms, distance divided by time equals velocity!

    Maybe you should brush up on your math/science terms.

    Zip codes have to be displayed/output with leading zeros, but that doesn't necessarily mean they have to be stored that way. With potentially 9 digits being used now for zips, the space saving from numeric could be noticeable again.

    True, but not everyone knows their full zip+4 zip code. But why strip the leading zero on entry just to have to add it back on display? Just seems like extra work that isn't needed.

    People don't know 9-digit zips, but machines do. We do a huge amount of EDI and other electronic processing. As to strip/restore, I view it similar to dates. I strip dashes, etc., from the date to store it numerically, as a date/datetime column, and then convert to put them back in if I need to for display/output. Seems to me, I both save space by making zip numeric and have somewhat less overhead to search/process/group it.

    But datetime types tend to have a lot of build in formating functions in SQL Server and oracle for that matter, zipcodes not so much so which means you're likely writing your own custom code to convert from a numeric to varchar.

    So if someone enters a 9-digit zip, you store the dash in your db? That's wasted space. As can be leading zeros, particularly if you compress the row/page in SQL. You can add a computed column that does the formatting if you need it.

    Not all countries use the same format for postal codes. I guess it's a matter of saving a couple bytes of space vs. maintainability and flexibility.

Viewing 15 posts - 1 through 15 (of 76 total)

You must be logged in to reply to this topic. Login to reply