Unable to parse Document result in Python #39

@anking

Using textract-trp 0.1.3.

When parsing a "get_document_analysis" response with Document, the following exception is raised:

Traceback (most recent call last):
  File "G:\dev\OCR\main.py", line 17, in <module>
    result = (textract.receive_document_result('52c4a450c667a18d89f4e26a1cf4b56859ad239f1a63279bec8f60458ae2284e'))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\textract.py", line 62, in receive_document_result
    return Document(response)
           ^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 633, in __init__
    self._parse()
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 667, in _parse
    page = Page(documentPage["Blocks"], self._blockMap)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 516, in __init__
    self._parse(blockMap)
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 530, in _parse
    l = Line(item, blockMap)
        ^^^^^^^^^^^^^^^^^^^^
  File "G:\dev\OCR\venv\Lib\site-packages\trp\__init__.py", line 142, in __init__
    if(blockMap[cid]["BlockType"] == "WORD"):
       ~~~~~~~~^^^^^
KeyError: '9e2f5e38-f865-4b79-a37b-ac8ed7a19f02'
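
One possible cause, offered as an assumption since the report does not include the code that fetches the response: get_document_analysis returns paginated results, and a LINE block on one page of results can list a child WORD block that only arrives on a later page. If only the first response is parsed, that child ID is absent from the block map and the lookup fails with a KeyError like the one above. The sketch below collects every page by following NextToken and checks for dangling child IDs. The helper names fetch_all_pages and missing_child_ids are hypothetical, and client is assumed to be a boto3 Textract client.

```python
def fetch_all_pages(client, job_id):
    """Collect every page of a paginated get_document_analysis result.

    `client` is assumed to be boto3.client("textract"); any object with a
    compatible get_document_analysis method works.
    """
    pages = []
    next_token = None
    while True:
        kwargs = {"JobId": job_id}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_document_analysis(**kwargs)
        pages.append(response)
        # A missing NextToken means this was the last page of results.
        next_token = response.get("NextToken")
        if not next_token:
            break
    return pages


def missing_child_ids(pages):
    """Return child block IDs that are referenced but not present in `pages`."""
    blocks = [b for page in pages for b in page.get("Blocks", [])]
    known = {b["Id"] for b in blocks}
    missing = set()
    for block in blocks:
        for rel in block.get("Relationships", []):
            if rel["Type"] == "CHILD":
                missing.update(i for i in rel["Ids"] if i not in known)
    return missing
```

If missing_child_ids returns an empty set for the full list of pages, passing that list to Document (which, as far as I can tell, accepts a list of response pages as well as a single response) may avoid the error. If IDs are still missing after all pages are fetched, the cause is something else.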
