Insert data from .pdf files

  • Hi everyone,

    We are having trouble loading data from .pdf files that we receive from a vendor.

    The .pdf files contain data in tabular format, and we would like to insert those fields into a SQL table.

    We do not want to store the physical location of the file; we need to insert the data within the file.

    How can we read a PDF file?

    Thanks & Regards

  • SSC experts will definitely have an answer to this. My wild guess, though, is one of the following: convert the PDF to Excel first and then read the Excel file, write some code in a language such as Java, or use third-party software to read the text from the PDF.

    Please see if the following help:

    http://stackoverflow.com/questions/4784825/how-to-read-pdf-files-using-java

    http://www.a-pdf.com/data-extractor/

  • Maybe with iText:

    http://itextpdf.com/

    Regards

  • Confusing Queries (3/23/2014)


    Hi everyone,

    We are facing a problem with loading data from .pdf files from vendor.

    .pdf files have data in tabular format and we would like to insert those fields into a SQL table.

    We do not want to insert the physical location of the file but, we need to insert the data within the file.

    How can we read a pdf file?

    Thanks & Regards

    If you want to read a PDF file, I think you could use a PDF reading utility. For that part of the question, you may find an answer in this post:

    http://www.sqlservercentral.com/Forums/Topic1339455-148-1.aspx

    As for inserting data fields that are in tabular format into a SQL table, maybe you can check this post:

    http://social.msdn.microsoft.com/Forums/sqlserver/en-US/01bf1171-6165-4c29-9242-d7f11f9662d3/insert-pdf-fields-into-sql-table?forum=sqlintegrationservices

    Hope this helps.

  • In Adobe:

    File>Save As>Text

    A PDF table will convert like this, with each cell on its own line:

    Arizona

    5

    Alabama

    4

    Kansas

    9

    Missouri

    3

    Montana

    2

    Then read the text file and parse it out.
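    As a rough illustration of the parsing step, here is a minimal Python sketch. It assumes the exported text alternates between a name line and a numeric line, as in the sample above. The table name `state_counts`, its column names, and the helper functions are made up for the example, and an in-memory SQLite database stands in for the real SQL table only so the sketch runs on its own.

```python
import sqlite3


def parse_pairs(text):
    """Turn alternating name/value lines into (name, int) tuples."""
    # Drop blank lines and surrounding whitespace left over from the export.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected an even number of non-blank lines")
    # Even positions hold names, odd positions hold the matching values.
    return [(name, int(value)) for name, value in zip(lines[0::2], lines[1::2])]


def load_rows(conn, rows):
    """Insert the parsed rows into a hypothetical state_counts table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS state_counts (state TEXT, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO state_counts (state, total) VALUES (?, ?)", rows
    )
    conn.commit()


# Sample text in the same shape as the Save As > Text output shown above.
sample = (
    "Arizona\n\n5\n\nAlabama\n\n4\n\nKansas\n\n9\n\n"
    "Missouri\n\n3\n\nMontana\n\n2\n"
)

rows = parse_pairs(sample)
conn = sqlite3.connect(":memory:")  # stand-in for the real database
load_rows(conn, rows)
print(conn.execute("SELECT COUNT(*), SUM(total) FROM state_counts").fetchone())
# prints (5, 23)
```

    Real exports are rarely this tidy. Tables with more than two columns, empty cells, or cells that wrap across lines would need a different grouping rule than simple pairs.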

    Or, you might look at this application: WinAutomation. It is macro software that can read from and write to a SQL database. It's an excellent application; I've used it to do some web scraping and store the results in SQL.

  • This is just a shot in the dark, but try Googling or Binging the following:

    +"sql server" +iFilter +PDF +text filestream semantic

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • On a non-technical level, has anyone asked the vendor what other formats they can send the data in? PDF is a print format for humans; having a computer pull data out of it is less than ideal compared to getting a fixed width text file, a delimited text file, or a variety of other formats.
