We could provide a new extract-tables subcommand that uses camelot to extract tables from PDF files.
The PR implementing this should include:
- unit tests
- documentation: docstrings & a new page in docs/user/
- the command output should display the .parsing_report from camelot
- it should be possible to target specific PDF pages
- various export options should be available, using the corresponding camelot methods: to_csv(), to_json(), to_excel(), to_html(), to_markdown() & to_sqlite()
- other options could be implemented immediately or in further PRs: --password for decryption, --flavor, --parallel, --split-text
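
To make the list above more concrete, here is a minimal sketch of how such a subcommand might be wired. It is an illustration under assumptions, not a proposed final design: it uses argparse because the project's actual CLI framework is not stated here, and the `--pages`, `--format` and `--output-prefix` option names, as well as the `build_parser`, `export_tables` and `main` helpers, are hypothetical. Only `camelot.read_pdf()`, `.parsing_report` and the `to_*()` export methods come from camelot itself.

```python
import argparse

# Maps a (hypothetical) --format value to the camelot Table export method
# named in the list above, and to a file extension for the output path.
EXPORTERS = {
    "csv": ("to_csv", "csv"),
    "json": ("to_json", "json"),
    "excel": ("to_excel", "xlsx"),
    "html": ("to_html", "html"),
    "markdown": ("to_markdown", "md"),
    "sqlite": ("to_sqlite", "db"),
}


def build_parser():
    """Build the argument parser; all option names are assumptions."""
    parser = argparse.ArgumentParser(prog="extract-tables")
    parser.add_argument("pdf", help="path to the input PDF file")
    parser.add_argument("--pages", default="1",
                        help="pages to process, e.g. '1,3-5' or 'all'")
    parser.add_argument("--format", choices=sorted(EXPORTERS), default="csv")
    parser.add_argument("--output-prefix", default="table")
    parser.add_argument("--password", default=None, help="for decryption")
    parser.add_argument("--flavor", default="lattice")
    return parser


def export_tables(tables, fmt, prefix):
    """Print each table's parsing report and export it to its own file."""
    method_name, extension = EXPORTERS[fmt]
    paths = []
    for index, table in enumerate(tables, start=1):
        print(table.parsing_report)
        path = f"{prefix}-{index}.{extension}"
        getattr(table, method_name)(path)
        paths.append(path)
    return paths


def main(argv=None):
    args = build_parser().parse_args(argv)
    import camelot  # imported lazily so the parser works without camelot
    tables = camelot.read_pdf(args.pdf, pages=args.pages,
                              password=args.password, flavor=args.flavor)
    return export_tables(tables, args.format, args.output_prefix)
```

Calling `main(["report.pdf", "--pages", "1,3-5", "--format", "json"])` would then write one JSON file per detected table. Dispatching through a name-to-method table keeps the six export formats in one place, and `--parallel` and `--split-text` could be added later as further parser arguments forwarded to `camelot.read_pdf()`.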