Output Formatter

Use the formatter to transform and post-process extracted data before it is returned in the JSON response.

Formatter Setup

To set up the formatter, navigate to the "Formatter" tab on the extractor page.

You can add multiple steps to the formatter and customize each step with its individual configurations. Once you save the setup, the JSON response from the FormX API will undergo transformation according to the rules you define here. To test the extractor, proceed to the "Test" tab.

Supported Actions

Remove Characters

Remove characters from field values based on predefined types, a list of strings, or a regular expression. This action is useful for data cleanup when you are familiar with the character set of the data. For example, you can remove whitespace and newline characters.

Predefined types: Numbers, Space, New Line, Special Characters (punctuation), English Alphabets, Extended Latin Alphabets, Chinese Characters

Custom Value: A list of strings to remove, one string per line.

Regular Expression: Find and remove characters using a regex pattern.
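The effect of the three options can be illustrated with a short Python sketch. This is a local illustration only, not FormX's implementation: the function name remove_characters, the PREDEFINED table, and the patterns behind each type are assumptions made for the example.

```python
import re

# Hypothetical stand-ins for a few predefined types; the patterns are assumptions.
PREDEFINED = {
    "numbers": r"[0-9]",
    "space": r"[ \t]",
    "new_line": r"[\r\n]",
}

def remove_characters(value, predefined=(), custom=(), pattern=None):
    """Remove predefined character types, then literal strings, then regex matches."""
    for name in predefined:
        value = re.sub(PREDEFINED[name], "", value)
    for literal in custom:
        value = value.replace(literal, "")
    if pattern is not None:
        value = re.sub(pattern, "", value)
    return value

# Strip a trailing newline and a literal suffix from a vendor name.
cleaned = remove_characters("Acme Inc.\n", predefined=["new_line"], custom=[" Inc."])
print(cleaned)  # Acme
```

In the formatter itself, these choices are made in the step's configuration rather than in code.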

Keep Only One Language

Retain only one language from the value and remove characters from other languages. Currently, English and Chinese are supported.
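As a rough sketch of the idea, the Python below drops characters belonging to the other language. The character ranges (basic Latin letters for English, the CJK Unified Ideographs block for Chinese) and the name keep_only are assumptions for illustration; how FormX classifies characters, digits, and punctuation may differ.

```python
import re

# Assumed character classes to remove for each language that is kept.
OTHER_LANGUAGE = {
    "english": r"[\u4e00-\u9fff]",  # keeping English: drop Chinese characters
    "chinese": r"[A-Za-z]",         # keeping Chinese: drop English letters
}

def keep_only(value, language):
    """Remove characters of the other language and trim leftover whitespace."""
    return re.sub(OTHER_LANGUAGE[language], "", value).strip()

print(keep_only("Acme 有限公司", "english"))  # Acme
print(keep_only("Acme 有限公司", "chinese"))  # 有限公司
```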

Date Format Correction (Declare Date Format)

By default, FormX detects dates in the UK format (day-first). If the extracted date value contains errors with swapped day and month, you can use this action to instruct FormX to detect dates in the US format (month-first) instead.

This action is particularly useful in combination with "Condition Matching." For example, if you process receipts from various vendors, some using the UK date format and others the US date format, you can apply this action conditionally based on the extracted vendor name. This ensures accurate date extraction regardless of the vendor's format.
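To see why the declared format matters, consider an ambiguous date such as 03/04/2024. The sketch below uses Python's standard datetime module to parse it both ways; parse_date and its month_first flag are hypothetical names for this example and do not describe FormX's internal date detection.

```python
from datetime import datetime

def parse_date(text, month_first=False):
    """Parse a slash-separated date as day-first (UK) or month-first (US)."""
    fmt = "%m/%d/%Y" if month_first else "%d/%m/%Y"
    return datetime.strptime(text, fmt).date()

print(parse_date("03/04/2024"))                    # 2024-04-03 (day-first)
print(parse_date("03/04/2024", month_first=True))  # 2024-03-04 (month-first)
```

The same string yields two different dates, which is the kind of swapped day and month error this action is meant to correct.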

Condition Matching

You can configure the transformation to apply only when a certain condition is met in another field. For instance, in an invoice, you can set FormX to clean up the value in the product items only if the vendor name matches a specific company.
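The invoice example can be sketched in Python as follows. The helper apply_if, the field names, and the use of exact string equality as the condition are all assumptions for illustration; the matching options actually offered in the formatter may be different.

```python
def apply_if(fields, condition_field, expected, target_field, transform):
    """Apply transform to target_field only when condition_field equals expected."""
    result = dict(fields)  # leave the input unchanged
    if result.get(condition_field) == expected:
        result[target_field] = transform(result[target_field])
    return result

fields = {"vendor_name": "Acme Inc.", "product_item": " Widget \n"}

# The condition is met, so the product item is cleaned up.
matched = apply_if(fields, "vendor_name", "Acme Inc.", "product_item", str.strip)
print(matched["product_item"])  # Widget

# The condition is not met, so the value is left as extracted.
skipped = apply_if(fields, "vendor_name", "Other Ltd.", "product_item", str.strip)
```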

Output Settings

After formatting, you have the option to either replace the original value or return it in a new field. The output can be used in subsequent steps as well.

If "Return in New Field" is selected, the field appears under formatter_output in the JSON response, alongside the other auto extraction items:

{
  "auto_extraction_item": [
    {
      "name": "vendor_name",
      "value": "Acme Inc."
    }
  ],
  "formatter_output": [
    {
      "name": "cleaned_up_vendor_name",
      "value": "Acme"
    }
  ]
}
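A client can read the formatted values from a response like the sample above with a few lines of Python. This is a minimal sketch that assumes the response body has exactly the shape shown in the sample; the variable names are illustrative, and the sample is embedded as a string so the snippet runs without calling the API.

```python
import json

# The sample response from above, embedded as text for a self-contained example.
response_text = """
{
  "auto_extraction_item": [
    {"name": "vendor_name", "value": "Acme Inc."}
  ],
  "formatter_output": [
    {"name": "cleaned_up_vendor_name", "value": "Acme"}
  ]
}
"""

response = json.loads(response_text)

# Index each list by field name; tolerate a missing formatter_output key.
original = {item["name"]: item["value"] for item in response["auto_extraction_item"]}
formatted = {item["name"]: item["value"] for item in response.get("formatter_output", [])}

print(original["vendor_name"])              # Acme Inc.
print(formatted["cleaned_up_vendor_name"])  # Acme
```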