comma delimited name column

  • I am not a DBA, but rather a .NET developer who has been thrust into working on a SQL database created using documentation and modeling from another database. The data is provided by the state of NC in a fixed-length format. Getting the data into the database is no problem. The problem is that one of the searchable columns, the name column, is populated with the full name delimited by commas. My first thought was to create columns for the different name parts. The problem with that is that the names sometimes do not follow the "last, first, middle, suffix" pattern. Some names in the column are from other nationalities and may consist of five or six name parts. On top of that, there are instances where there may be two or three commas before, after, or in the middle of the name data.

    Searching the data as it is was simplified by creating a full-text index and searching the data with the CONTAINSTABLE and NEAR predicates and functions. The issue comes in when the searcher needs to account for different spellings, introduced either by the end user or by the person who entered the data. Example: "Keith" or "Keeth". The FORMSOF function doesn't seem to do the trick when searching the name column.

    I have experimented with the SOUNDEX function provided in SQL Server, but that doesn't really seem to work on the comma-delimited column data either. I get far too many useless results to deal with, considering there are over 30 million rows of data to search.

    Does anyone have any suggestions on the best approach for this problem?

  • Without seeing actual examples of the data you are looking at, this may be a bit difficult, but I have to ask: is your data comma delimited by any chance?

    If this is the case, then set the delimiter to comma when importing, and have the data imported into the separate fields that you described wanting to create.

    Then you may want to place full-text indexes on the columns to speed up your search functions.

    Not that I know too much about full-text indexes, but if I had searches on strings, that's the first place I would investigate.

  • No, the raw data comes as a fixed length file. The name column just contains commas. It could look like the following.

    smith,john,thomas,jr

    smith,john,thomas,sr

    smith,john,thomas,,,

    ,,smith,john,thomas,

    smith,john,,thomas,

    smith,john,t,,,,,

    ,,,smith,,,john,thomas,

    smith,john,thomas,AKA,John,thomas

    There is no consistency at all, and that is really frustrating.
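
    For what it's worth, the stray commas themselves can probably be normalized before worrying about the name parts. The following is only a rough sketch, not something proven against the full 30 million rows: the table variable and the RawName and CleanName columns are made up for illustration, TRIM with a character list assumes SQL Server 2017 or later, and the nested REPLACE trick assumes no name ever contains a < or > character.

    ```sql
    -- Hypothetical sample data copied from the rows above
    DECLARE @Names TABLE (RawName VARCHAR(100));

    INSERT INTO @Names (RawName) VALUES
        ('smith,john,thomas,jr'),
        (',,smith,john,thomas,'),
        ('smith,john,,thomas,'),
        (',,,smith,,,john,thomas,');

    SELECT RawName,
           -- Outer step: strip whatever commas remain at the start or end
           -- (TRIM with a character list assumes SQL Server 2017+)
           TRIM(',' FROM
               -- Inner step: collapse each run of commas into a single comma.
               -- ',' becomes '<>', adjacent '><' pairs are removed,
               -- and the surviving '<>' is turned back into ','
               REPLACE(REPLACE(REPLACE(RawName, ',', '<>'), '><', ''), '<>', ',')
           ) AS CleanName
    FROM @Names;
    ```

    The last three sample rows should all come out as smith,john,thomas, which at least makes the column consistent. It does not say which part is the last name, and a row like the AKA one would still need separate handling.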

  • You can actually do it like this: replace each “,” with a single quote, a comma, and another single quote, and each part comes back as its own column. Along with that, you have to prepend “SELECT ” plus a single quote and append a single quote at the end.

    The final code looks something like this:

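    DECLARE @P NVARCHAR(MAX);
    DECLARE @STRSQL NVARCHAR(MAX);

    SELECT @P = 'AFGSDFGSDF,BSDF,CSDF,D';

    -- Double any embedded single quotes (e.g. O'BRIEN) so they cannot break
    -- the dynamic statement, then turn every comma into ',' so that each
    -- name part becomes its own quoted column in the SELECT list.
    SET @STRSQL = 'SELECT ''' + REPLACE(REPLACE(@P, '''', ''''''), ',', ''',''') + '''';

    EXEC (@STRSQL);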
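
    If dynamic SQL is not an option, a set-based alternative may be worth trying. This is only a sketch under a few assumptions: it relies on STRING_SPLIT, which needs SQL Server 2016 or later with database compatibility level 130 or higher, and the variable name @NameList is invented for the example. It returns one row per name part, skips the empty parts produced by repeated commas, and computes SOUNDEX per part rather than for the whole comma-delimited string, which may give fewer useless matches for spellings like "Keith" and "Keeth".

    ```sql
    -- Hypothetical input taken from the sample rows earlier in the thread
    DECLARE @NameList NVARCHAR(MAX) = N',,smith,john,,thomas,';

    SELECT s.value          AS NamePart,
           SOUNDEX(s.value) AS PartSoundex   -- phonetic code for this part only
    FROM STRING_SPLIT(@NameList, N',') AS s
    WHERE s.value <> N'';                    -- drop empty parts from repeated commas
    ```

    Note that STRING_SPLIT called this way does not guarantee the order of the returned rows, so it suits searching on individual parts better than reconstructing the "last, first, middle" positions.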
