I want to store a Word document in SQL Server and be able to search on its content, e.g. SELECT Content FROM DocTable WHERE Content LIKE '%this doc is good%'.
I also want to retain basic formatting like bold, underline, and italics.
Is there a way to achieve this without saving the document as HTML?
You'll want to use Full-Text Indexing, which has the option to scan various types of documents. You definitely don't want to reinvent the wheel when someone has already built a race car for the same issue.
As far as formatting goes, the formatting in the stored document is still in place; it's only the index that breaks the text up into search words.
I'm not sure what else you are after as far as formatting goes. From http://msdn.microsoft.com/en-us/library/ms142571.aspx:
Filters. Some data types require filtering before the data in a document can be full-text indexed, including data in varbinary, varbinary(max), image, or xml columns. The filter used for a given document depends on its document type. For example, different filters are used for Microsoft Word (.doc) documents, Microsoft Excel (.xls) documents, and XML (.xml) documents. Then the filter extracts chunks of text from the document, removing embedded formatting and retaining the text and, potentially, information about the position of the text. The result is a stream of textual information. For more information, see Configure and Manage Filters for Search.
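To make that concrete, here is a minimal sketch of the setup, not a tested solution for your environment. DocTable and Content come from your question; DocId, FileExtension, PK_DocTable, and DocCatalog are names I made up for illustration, and LANGUAGE 1033 (US English) is an assumption about your documents. The type column tells the full-text engine which filter to load for each row.

```sql
-- Hypothetical table: the Word file is stored unchanged in Content,
-- so bold/underline/italics survive in the original document.
CREATE TABLE dbo.DocTable
(
    DocId         INT IDENTITY(1,1) NOT NULL,
    FileExtension NVARCHAR(8)       NOT NULL,  -- e.g. '.doc'; selects the filter
    Content       VARBINARY(MAX)    NOT NULL,
    CONSTRAINT PK_DocTable PRIMARY KEY CLUSTERED (DocId)
);

-- Catalog name is an assumption; any name works.
CREATE FULLTEXT CATALOG DocCatalog AS DEFAULT;

-- Full-text index on the binary column, keyed on the primary key.
CREATE FULLTEXT INDEX ON dbo.DocTable
(
    Content TYPE COLUMN FileExtension LANGUAGE 1033
)
KEY INDEX PK_DocTable
ON DocCatalog
WITH CHANGE_TRACKING AUTO;

-- Check which document types have a filter installed on this instance.
SELECT document_type, path
FROM sys.fulltext_document_types;

-- Phrase search through the full-text index instead of LIKE '%...%'.
SELECT DocId
FROM dbo.DocTable
WHERE CONTAINS(Content, '"this doc is good"');
```

The query returns the key of each matching row; your application then hands the stored file back to Word, formatting intact. LIKE cannot search inside a varbinary Word file in any useful way, which is why CONTAINS is used here.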