How to efficiently perform search on my DB and Word documents on my server?

  • I'm building a system for a client who wants search functionality across all the data their users and clients upload to the service. That data includes Word documents and user text (plain text that will be stored in a MySQL DB).

    So what are the industry standards when it comes to search functionality for both Word docs and regular plain text data in a DB?

    Of course, I want the search to perform well so it doesn't overload the server.

    BTW, I use Node.js as the backend, if that matters (I don't see why it would).

    Thanks in advance 🙂

  • I'm sorry, but this is a Microsoft SQL Server forum, not a MySQL one. These are completely different products.

  • My opinion: those are two distinct problems. The first is searching Word documents, and the second is searching user text stored in the MySQL DB. If the Word documents are stored in MySQL as well, you are going to have a lot of problems, because a Word document is not plain text and there is nothing native in MySQL (or in any database language that I am aware of) that can read a Word document.

    I would try to find a tool (something like Elasticsearch) and tackle the problems one at a time.

    I think Elasticsearch can handle both MySQL data and Word documents (with third-party plugins), but I have never set it up that way. If your Word documents are stored in the database, I am not sure whether Elasticsearch can handle that, and you may need to dump them to disk.

    An alternative would be to use a tool that converts Word documents to plain text and to store ONLY the plain text in the database.
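
    To illustrate the conversion idea, here is a minimal TypeScript sketch, not a production converter. It assumes the file is a .docx (which is a zip archive) and that you have already unzipped its word/document.xml part into a string with whatever zip library you prefer. The function names are hypothetical, and it ignores headers, footnotes, numeric character references, and edge cases such as self-closing or nested paragraphs.

```typescript
// Hypothetical helper: decodes the five predefined XML entities.
// "&amp;" is handled last so "&amp;lt;" becomes "&lt;", not "<".
function decodeXmlEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Hypothetical helper: turns the XML of word/document.xml into plain text,
// one output line per Word paragraph.
function docxXmlToPlainText(documentXml: string): string {
  // Each <w:p ...>...</w:p> element is one paragraph. Requiring whitespace
  // or ">" after "<w:p" avoids matching <w:pPr> (paragraph properties).
  const paragraphPattern = /<w:p[\s>][\s\S]*?<\/w:p>/g;
  const paragraphs: string[] = [];

  for (const paragraph of documentXml.match(paragraphPattern) || []) {
    // Each <w:t> element holds a run of visible text. The pattern skips
    // similarly named tags such as <w:tab/> and <w:tbl>.
    const textPattern = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g;
    let text = "";
    let run: RegExpExecArray | null;
    while ((run = textPattern.exec(paragraph)) !== null) {
      text += decodeXmlEntities(run[1]);
    }
    paragraphs.push(text);
  }

  return paragraphs.join("\n").trim();
}
```

    The resulting string can then go into an ordinary text column next to the user text, so both kinds of content are searched the same way. Older binary .doc files have a different format and would need a different tool.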

    I have never used Elasticsearch; I just know of it as a very flexible search tool. If memory serves, it is a community-supported tool, so, as with most free tools, if you use it for an enterprise-level application, be prepared for days of downtime while you wait for a reply to a forum post that may never get one.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum, you shouldn't blindly follow it. Always test on a test server to see whether there are negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • OK, I got it.

