Splitting a crazy column into two new columns

  • Hi all,

    I have some data which is provided, unfortunately I have no control over the format.

    One crazy column contains some data i need to split into two columns, i can't even think how to do this...

    So this is the crazy data in that column: Booked on: 20/03/2020 10:52 by: A1234567

    Is there a way to Split the column so that the Date and time is joined and column named as "Booking date" and then a column called "By Who" containing just the ID at the end - ID will be a fixed length from checking the data.

    I tried messing with split string, but i was hopeless

    Any help appreciated.

    Thanks

  • Look for defining characters, and split the string by those characters

    DECLARE @OriginalColumn varchar(50) = 'Booked on: 20/03/2020 10:52 by: A1234567';

    SELECT OriginalColumn = @OriginalColumn
    , BookingDate = CONVERT(datetime, SUBSTRING(@OriginalColumn, 11, PATINDEX('%by:%', @OriginalColumn)-11), 103)
    , ByWho = LTRIM(RTRIM(SUBSTRING(@OriginalColumn, PATINDEX('%by:%', @OriginalColumn)+3, LEN(@OriginalColumn))))
  • rkelly58 wrote:

    Hi all,

    I have some data which is provided, unfortunately I have no control over the format.

    One crazy column contains some data i need to split into two columns, i can't even think how to do this...

    So this is the crazy data in that column: Booked on: 20/03/2020 10:52 by: A1234567

    Is there a way to Split the column so that the Date and time is joined and column named as "Booking date" and then a column called "By Who" containing just the ID at the end - ID will be a fixed length from checking the data.

    I tried messing with split string, but i was hopeless

    Any help appreciated.

    Thanks

    If the format is fixed then stuff out the first 11 characters, grab the left 16 for the date and the right eight characters for the Account ID.

    😎

     

  • Hi Des,

    Thanks for that. How do i connect that code to use my column from a table, rather than the example test you used? Sorry if its obvious

     

  • rkelly58 wrote:

    Hi Des,

    Thanks for that. How do i connect that code to use my column from a table, rather than the example test you used? Sorry if its obvious

    SELECT OriginalColumn
    , BookingDate = CONVERT(datetime, SUBSTRING(OriginalColumn, 11, PATINDEX('%by:%', OriginalColumn)-11), 103)
    , ByWho = LTRIM(RTRIM(SUBSTRING(OriginalColumn, PATINDEX('%by:%', OriginalColumn)+3, LEN(OriginalColumn))))
    FROM YourSchema.YourTable
  • Des,

    That is spot on, worked wonderfully, so thanks.

    One last thing, can i do a WHERE clause on the "BookingDate" field to only show bookings for the last two hours?

    I tried adding this to your code (i know its not two hours, just to test): Where BookingDate > '2020-03-19'

    But it fails with Msg 207, Level 16, State 1, Line 25

    Invalid column name 'BookingDate'.

    I googled it and it said i can't use Where as it's not a proper column.

    Any workaround?

    • This reply was modified 3 years, 6 months ago by  rkelly58.
  • rkelly58 wrote:

    Des,

    That is spot on, worked wonderfully, so thanks.

    One last thing, can i do a WHERE clause on the "BookingDate" field to only show bookings for the last two hours?

    I tried adding this to your code (i know its not two hours, just to test): Where BookingDate > '2020-03-19'

    But it fails with Msg 207, Level 16, State 1, Line 25

    Invalid column name 'BookingDate'.

    I googled it and it said i can't use Where as it's not a proper column.

    Any workaround?

     

    The column BookingDate does not exist in the table that you are querying, so you cannot use it as is.

    You need to force SQL to create a hidden table behind the scenes with the field names that you want to use, then query that.

    WITH cteBooking AS (
    SELECT OriginalColumn
    , BookingDate = CONVERT(datetime, SUBSTRING(OriginalColumn, 11, PATINDEX('%by:%', OriginalColumn)-11), 103)
    , ByWho = LTRIM(RTRIM(SUBSTRING(OriginalColumn, PATINDEX('%by:%', OriginalColumn)+3, LEN(OriginalColumn))))
    FROM YourSchema.YourTable
    )
    SELECT *
    FROM cteBooking
    --WHERE BookingDate >= '' -- Uncomment this line for a specific date/time
    WHERE BookingDate >= DATEADD(hh, -2, GETDATE()) -- Uncomment this line for the last 2 hours
  • Or make a computed column (or two) with the date. Then use the WHERE on that.

  • Thank You Des,

    That works well. Could i now use this hidden table with the results to then populate another table using SSIS? Would there be any issues with it being a hidden table?

    My intention was to get the columns in the right format and then use these columns to then populate a central table elsewhere

    Thanks

     

  • rkelly58 wrote:

    Thank You Des,

    That works well. Could i now use this hidden table with the results to then populate another table using SSIS? Would there be any issues with it being a hidden table?

    My intention was to get the columns in the right format and then use these columns to then populate a central table elsewhere

    Thanks

    I have no working knowledge of SSIS.

    Now, as Steve pointed out above, you can add these 2 new fields to your table as calculated fields.  Then there is no reason why you cannot use these fields in any query from with SQL or even SSIS.

    ALTER TABLE YourSchema.YourTable
    ADD BookingDate AS CONVERT(datetime, SUBSTRING(OriginalColumn, 11, PATINDEX('%by:%', OriginalColumn)-11), 103) PERSISTED
    , ByWho AS LTRIM(RTRIM(SUBSTRING(OriginalColumn, PATINDEX('%by:%', OriginalColumn)+3, LEN(OriginalColumn)))) PERSISTED;

    Then you simply query the table directly

    SELECT *
    FROM YourSchema.YourTable
    --WHERE BookingDate >= '' -- Uncomment this line for a specific date/time
    WHERE BookingDate >= DATEADD(hh, -2, GETDATE()) -- Uncomment this line for the last 2 hours
  • Hi Des,

    Unfortunately i am unable to add any new columnsto the table, as it is a third party DB and we only have read permissions to the source data.

    I added your code to an SSIS package to transfer the SQL results to a table, but it complains about "ByWho" and my destination columb "UserID cannot convert between unicode and non-unicode string data types.

    Is there a way of using a CAST on your code to change the ByWho to a nvarchar(50) like the destination UserID column?

    I had a go with , CAST( [ByWho] AS Varchar(50)) but didn't work

    Thanks again, really appreciate your help with this.

    • This reply was modified 3 years, 6 months ago by  rkelly58.
    • This reply was modified 3 years, 6 months ago by  rkelly58.
  • So, why not copy the original data to your own table in your own DB, then manipulate it from there?

    If you need to cast the value, then simply do it

    , ByWho = CAST(LTRIM(RTRIM(SUBSTRING(OriginalColumn, PATINDEX('%by:%', OriginalColumn)+3, LEN(OriginalColumn)))) AS nvarchar(50))

    What about doing the splitting in the SSIS?

  • Is this the same table that has the weird DateTime column and time as INT column in your recent post?

    Does this third-party database allow creation of Indexed Views?

    If so, might be worth creating one (as mentioned in the other topic) and adding Des's code to create the column you need here.

    "Knowledge is of two kinds. We know a subject ourselves, or we know where we can find information upon it. When we enquire into any subject, the first thing we have to do is to know what books have treated of it. This leads us to look at catalogues, and at the backs of books in libraries."
    — Samuel Johnson
    I wonder, would the great Samuel Johnson have replaced that with "GIYF" now?

  • thanks Des,

    asked third party aboyut the indexed views, they said absolutely not, will void contract, lol

    Can't add to my own table, as i want to use the source, SQL to create the temp columns and then insert into a MAIN table via SSIS (thats another challenge)

    Your Cast statement worked great, so i started to add other columns i need.

    This got me puzzled. There is a column called Attended (bit datatype) and has values of 0 or 1. Is there a way of making a new column using your same sql code and a new temp column to say where there is a 0 it says "not attended", and where there is a 1 it says "attended"?

    I tried myself using a SET and THEN statement as folows:

    ,tester = Set "1" then "Attended" but no joy. I looked at case but not sure how it works with this? Is there an easy way to do this?

    • This reply was modified 3 years, 6 months ago by  rkelly58.
  • I got my first bit of code working myself:

    CASE When AttendedCol = 1 THEN 'Attended' ELSE 'Not attended' END AS OMGThisWorked

    But can't quite add the result to another column as a string of two columns

    , CASE When AttendedCol = 1 THEN 'Attended' ELSE 'Not attended' END AS Attendence

    , AdditionalInfo = (Notes) + ':' + Attendence

    Invalid column name: Attendence

    I basically the column to say Notes (column) + : + Attended or Not Attended

    had a play, got it working, put the case on the same time and enclosed in brackets and removed the AS Attendence at the end...

    Great learning curve, thanks

    • This reply was modified 3 years, 6 months ago by  rkelly58.
    • This reply was modified 3 years, 6 months ago by  rkelly58.
    • This reply was modified 3 years, 6 months ago by  rkelly58.
    • This reply was modified 3 years, 6 months ago by  rkelly58.

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply