
Extracting from multiple tables on a single page

Hi. I have a use case as follows:

  1. Extract 6 data points from each of 4 identical grids on every page. The grids are well-structured but are NOT laid out in rows/columns. The documents are native PDFs, but I suppose I could convert them from PDF to JPG or another image format.
  2. Each document has anywhere from 10 to 100 pages. Every page is identically structured.
  3. I need the output in table form, where ideally all of the data of each type ends up in a single column. So, for a 100-page document, the ideal output would be 6 columns (one for each type of data) and 400 rows: 100 pages x 4 instances of each data point per page. Worst case, I would be OK with 24 columns (each of the four instances of each data type on a page treated as a separate item) x 100 rows.
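
To make the two output shapes concrete, here is a minimal Python sketch of how the worst-case layout (24 columns x 100 rows) could be reshaped into the ideal one (6 columns x 400 rows) after extraction. The field names, the `gridN_field` column naming, and the placeholder values are all hypothetical; this only illustrates the reshaping and does not reflect any actual extractor output.

```python
# Hypothetical names for the six data points; the real ones are not given here.
FIELDS = ["field_1", "field_2", "field_3", "field_4", "field_5", "field_6"]
GRIDS_PER_PAGE = 4


def make_wide_row(page):
    """Build one placeholder 'wide' row: 4 grids x 6 fields = 24 columns."""
    return {
        f"grid{grid}_{field}": f"p{page}_g{grid}_{field}"
        for grid in range(1, GRIDS_PER_PAGE + 1)
        for field in FIELDS
    }


def wide_to_long(wide_rows):
    """Turn one row per page (24 columns) into one row per grid (6 columns)."""
    long_rows = []
    for row in wide_rows:
        for grid in range(1, GRIDS_PER_PAGE + 1):
            # Strip the assumed 'gridN_' prefix so each field type shares a column.
            long_rows.append({field: row[f"grid{grid}_{field}"] for field in FIELDS})
    return long_rows


# Placeholder data standing in for a 100-page document.
wide = [make_wide_row(page) for page in range(1, 101)]
long_rows = wide_to_long(wide)

print(len(wide), len(wide[0]))            # 100 rows, 24 columns
print(len(long_rows), len(long_rows[0]))  # 400 rows, 6 columns
```

Rows come out in page order, then grid order within each page, so the 24-column fallback would still be workable for me as long as the column naming is consistent.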

Can formx handle this use case? I tried to do it myself and almost immediately got tripped up. I would be happy to send a sample of the kind of document I'm working with. Thank you.