Count Individual Words in freetext fields

  • Intresting bit of help required would prefer to do this in SQL especially as now one is working on the server.

    I have a table(activity) with 2 fields  (RowId,notes)

    What I want is a count on all the unique words in the notes field

    so for example

    rowid              

    x11102 =This is the first test

    x11103 =This is the Second test

    ------------------

    Id want my result set like

    Word      Count

    -----------------

    This          2

    is              2

    the            2

    First          1

    Test           2

    Second        1

    Any help would be much appreciated

     

  • A simple, brute force strategy would be to write a parser to read each NOTES and split the words into a single vertical table.  Then simply run a count for each of the distinct words in the vertical table.  An alternative implementation is to have two columns in the vertical table (WORD and COUNT) and just update count if the word already exists.

    The same results can be achieved by using one of the "split" functions available on this site inside of a derived subquery.

    Have a good day

    Wayne

  • The problem with most split functions is that they can handle only a single character, e.g. ',' -- I think you would have to modify it to handle a pattern instead, so that you could consider all punctuation, white space, etc, characters.

    --
    Adam Machanic
    whoisactive

  • Okay, I just posted a UDF to help solve this problem:

    http://sqljunkies.com/WebLog/amachanic/articles/PatternSplitString.aspx

    ... To use it for this problem, you would have to cursor/loop through the table for each row you want, insert the words into a temp table, and then take the final count from there. I'm not aware of a set-based way to handle the problem.

    --
    Adam Machanic
    whoisactive

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply