Full Text Search of Microsoft Word Documents

  • I have approximately 5,000 Microsoft Word documents stored in a folder on my C: drive. Each document is named by its case number, which is also the primary key on almost all of the tables in my SQL Server 2005 database. Can I do a full-text search on these Word documents from SQL Server? Thank you in advance.

  • You will need to import the documents to a column in a SQL Server table before SQL Server can create a full text index on them.
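
    As a rough illustration of the import step, here is a minimal Python sketch that gathers the rows to load. It assumes each file name (without extension) is the case number, and it uses a hypothetical table named CaseDocuments with a varbinary(max) Content column and a FileExtension column; full-text indexing of binary documents needs such a type column so SQL Server knows which filter to apply. The table, column, and function names are illustrative only.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Hypothetical table and columns (not from the original thread):
# CaseDocuments(CaseNumber, FileExtension, Content varbinary(max))
INSERT_SQL = (
    "INSERT INTO CaseDocuments (CaseNumber, FileExtension, Content) "
    "VALUES (?, ?, ?)"
)


def document_rows(folder: str) -> Iterator[Tuple[str, str, bytes]]:
    """Yield (case_number, extension, content) for each Word file in folder.

    Assumes the file name without its extension is the case number.
    """
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in (".doc", ".docx"):
            yield path.stem, path.suffix.lower(), path.read_bytes()
```

    Each yielded tuple could then be passed, together with INSERT_SQL, to whatever database driver you use; the `?` markers are ODBC-style parameter placeholders.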

    Alternatively, you can use the Windows Indexing Service to create a Windows full-text index. Your applications would then need to know whether the full-text index is inside or outside SQL Server in order to issue the right search commands.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • OK Ed thanks for your reply. I was not sure if I needed to have the Word docs inside SQL Server or not. I may try the alternate method and let Windows do the full text search. Thank You.

  • We have a similar issue, but the application is like a document management system (DMS), and the returned document list needs to reference the data from the database and NOT display the file system data. We are trying to keep the file system information hidden from the user at all times.

    Happy to hear any ideas

  • It is possible to extract the text from a Word document, then import the text into a SQL table and create a Full Text index on it. This can allow the required searching without making the full document available to users.

    There are a number of ways to extract the text...

    * Use the Filtdump utility from the Windows Server 2003 SDK to front-end the IFilter DLL, and wrap this up in a command script. Google can find more info on Filtdump.

    * Write a .NET front end to the IFilter DLL that does all you require.

    * Automate Word to do a Save As .txt (this can be unreliable in a server environment)

    * Buy a vendor tool that can repurpose Word documents (and other types) as text
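
    The "wrap it in a script" idea from the first option could look something like the Python sketch below. It makes no claim about Filtdump's actual command line: the extractor command is passed in by the caller, and the sketch assumes only that the tool accepts the file path as its last argument and writes the extracted text to standard output. The function name is hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Dict, List


def extract_text(folder: str, extractor_cmd: List[str]) -> Dict[str, str]:
    """Run extractor_cmd + [file path] on every .doc file in folder.

    Returns a mapping of file name without extension (assumed to be the
    case number) to whatever the extractor wrote to standard output.
    """
    texts: Dict[str, str] = {}
    for path in sorted(Path(folder).glob("*.doc")):
        result = subprocess.run(
            extractor_cmd + [str(path)],
            capture_output=True,  # collect stdout instead of printing it
            text=True,            # decode the output as text
            check=True,           # raise an error if the extractor fails
        )
        texts[path.stem] = result.stdout
    return texts
```

    The resulting text could then be loaded into a plain text column keyed by case number, so that searches return database rows and never expose the original files.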

