Document Understanding
Description
The Document Understanding operation is designed to extract answers to user-defined questions from provided text.
The input text can be derived from PDF files, images, or other supported document formats. Both processed text and raw text can be used as input.
Multiple questions can be submitted at once for a single document.
Number of Parameters: 4
Document Understanding Parameters
Parameter: Processing Text
Specifies the key that contains the text to be analyzed.
The key name must be quoted and enclosed in square brackets.
Example:
['processed_text']
Nested JSON keys can also be provided.
Example:
['data_response']['processed_text']
Each pair of square brackets may contain only one key name.
Invalid example:
['processed','text']
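The bracket rules above can be illustrated with a short Python sketch. This is not how eZintegrations resolves keys internally; `parse_key_path` and `resolve` are hypothetical helper names, the payload is made-up sample data, and the sketch assumes every bracket holds exactly one single-quoted key.

```python
import re

# One bracket holding exactly one single-quoted key, e.g. ['processed_text'].
_SEGMENT = re.compile(r"\['([^',\[\]]+)'\]")

def parse_key_path(path):
    """Split a bracketed key path into its key names (hypothetical helper)."""
    keys = []
    pos = 0
    while pos < len(path):
        match = _SEGMENT.match(path, pos)
        if match is None:
            # Covers forms such as ['processed','text'] with two keys in one bracket.
            raise ValueError(f"invalid key path: {path!r}")
        keys.append(match.group(1))
        pos = match.end()
    if not keys:
        raise ValueError("key path is empty")
    return keys

def resolve(payload, path):
    """Follow the parsed keys through nested dictionaries."""
    value = payload
    for key in parse_key_path(path):
        value = value[key]
    return value

# Made-up sample payload, for illustration only.
payload = {"data_response": {"processed_text": "Sample document text."}}
print(resolve(payload, "['data_response']['processed_text']"))
```

Passing `['processed','text']` to `parse_key_path` raises `ValueError`, which mirrors the invalid example above.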
Parameter: Question
Defines the list of questions whose answers must be extracted from the document.
Each question should be provided as a separate entry.
Examples:
- What is the name of State?
- Which is the capital of State?
- What is the language of State?
- What is the population of State?
Parameter: Attribute
Specifies the attribute name used to store the answer for each corresponding question.
Each question must have a matching attribute.
Example:
Question: What is the name of State?
Attribute: State
Parameter: Understanding Key
Defines the key that stores the extracted answers for all questions.
Example:
Answers
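To show how Question, Attribute, and Understanding Key relate, here is a minimal Python sketch. The output layout (one object under the Understanding Key, with one entry per attribute) is an assumption made for illustration and is not confirmed by this document. `build_result` is a hypothetical helper, `Capital` is a hypothetical attribute name, and the answers are placeholders.

```python
def build_result(questions, attributes, answers, understanding_key):
    """Store each answer under its question's attribute (assumed layout)."""
    if not (len(questions) == len(attributes) == len(answers)):
        raise ValueError("each question needs exactly one attribute and one answer")
    # Assumption: answers are grouped in one object under the Understanding Key.
    return {understanding_key: dict(zip(attributes, answers))}

questions = ["What is the name of State?", "Which is the capital of State?"]
attributes = ["State", "Capital"]  # "Capital" is a hypothetical attribute name
# Placeholder strings stand in for whatever the operation would extract.
answers = ["<answer to question 1>", "<answer to question 2>"]

print(build_result(questions, attributes, answers, "Answers"))
```

The length check reflects the rule that each question must have a matching attribute.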
Image Understanding
Description
The Image Understanding operation extracts important information from image-based data by answering user-defined questions.
Images must be provided in Base64-encoded format.
The system analyzes the image content and returns structured answers.
Number of Parameters: 4
Image Understanding Parameters
Parameter: Processing Image
Specifies the key that contains the Base64-encoded image.
Example:
['processed_text']
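If the image is not yet Base64-encoded, it can be converted before it reaches this operation. The sketch below uses Python's standard `base64` module; `encode_image` is a hypothetical helper, the bytes are stand-in data rather than a real image, and the payload key simply reuses the example key name above.

```python
import base64

def encode_image(image_bytes):
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(image_bytes).decode("ascii")

# The eight-byte PNG file signature, used here as stand-in image data.
sample_bytes = b"\x89PNG\r\n\x1a\n"

encoded = encode_image(sample_bytes)
payload = {"processed_text": encoded}  # key name taken from the example above

# Decoding the stored text recovers the original bytes.
assert base64.b64decode(payload["processed_text"]) == sample_bytes
```

For a real file, the bytes would typically come from reading it in binary mode, for example `open(path, "rb").read()`.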
Parameter: Questions
Defines the questions used to extract information from the image.
Example:
- What is the name of State?
Parameter: Attribute
Specifies the attribute name that stores the extracted answer.
Example:
State
Parameter: Understanding Key
Defines the key that stores all extracted answers from the image.
Example:
Answers
Frequently Asked Questions
What is Document Understanding in eZintegrations?
It is a deep learning operation that extracts answers from text-based documents using predefined questions.
What input formats are supported for Document Understanding?
Both processed text and raw text are supported. The text can be extracted from PDF files, images, or other supported document formats.
Can I ask multiple questions in one operation?
Yes. Multiple questions can be configured and processed in a single execution.
What format is required for Image Understanding?
Images must be provided in Base64-encoded format.
Where are extracted answers stored?
All extracted answers are stored under the configured Understanding Key.
Notes
- Ensure correct key paths are provided for text and image inputs.
- Each question must have a corresponding attribute.
- Use clear and specific questions for accurate results.
- Validate extracted data before downstream processing.
- Test configurations with sample inputs before production use.
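One way to act on the validation note is to check that every configured attribute came back with a non-empty answer. The Python sketch below assumes the answers arrive as an object under the Understanding Key with one entry per attribute; that layout, the helper name `missing_answers`, and the sample values are all assumptions made for illustration.

```python
def missing_answers(result, understanding_key, attributes):
    """List attributes whose answer is absent or blank (assumed result layout)."""
    answers = result.get(understanding_key) or {}
    missing = []
    for name in attributes:
        value = answers.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing

# Made-up result for illustration; "Capital" and "Language" are hypothetical attributes.
result = {"Answers": {"State": "<extracted answer>", "Capital": ""}}
print(missing_answers(result, "Answers", ["State", "Capital", "Language"]))
```

A non-empty return value signals that the result should be reviewed before it is passed to downstream steps.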