Building a dataset

To build Typless models for data extraction, you need to build a dataset of documents for your document type.

Using your existing data

📘
Use it for pre-training
You can use existing data to achieve state-of-the-art accuracy for your data extraction.

Use the data from documents that have already been manually processed and stored in your database to build a dataset for your document type.
Upload each original file together with its correct values from your database to train Typless before going to production.
Use the recipe to start:
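Separately from the official recipe, here is a minimal sketch of the preparation step: pairing each original file with the correct values stored for it. The helper names (`build_training_payload`, `build_dataset`), the record layout, and the payload keys (`file`, `file_name`, `document_type_name`, `learning_fields`) are assumptions made for illustration, not a confirmed Typless request format. Check the API reference for the exact fields before sending anything.

```python
import base64
import json


def build_training_payload(file_bytes, file_name, document_type_name, stored_values):
    """Pair one original file with the correct values stored for it.

    The payload keys are hypothetical and only illustrate the idea.
    """
    return {
        # Binary files are commonly sent as base64 text inside JSON.
        "file": base64.b64encode(file_bytes).decode("ascii"),
        "file_name": file_name,
        "document_type_name": document_type_name,
        # One entry per field, sorted by name for a stable order.
        "learning_fields": [
            {"name": name, "value": str(value)}
            for name, value in sorted(stored_values.items())
        ],
    }


def build_dataset(records, document_type_name):
    """Build one payload per processed document, skipping records with no file."""
    return [
        build_training_payload(
            r["file_bytes"], r["file_name"], document_type_name, r["values"]
        )
        for r in records
        if r.get("file_bytes")
    ]


# Made-up records standing in for rows from your database.
records = [
    {
        "file_name": "invoice_1.pdf",
        "file_bytes": b"%PDF-1.4 example",
        "values": {"invoice_number": "INV-001", "total_amount": 120.5},
    },
    {
        "file_name": "invoice_2.pdf",
        "file_bytes": b"",  # original file missing, so this record is skipped
        "values": {"invoice_number": "INV-002"},
    },
]

dataset = build_dataset(records, "simple-invoice")
print(json.dumps(dataset[0]["learning_fields"], indent=2))
```

Records without an original file are skipped, because a training example needs both the document and its correct values.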

Using live data

📘
Use it in a live environment
Using live data allows you to improve your data extraction continuously and automate new suppliers on the fly.

Typless improves continuously through a closed feedback loop in which you provide the correct values for each extracted document. Check out the example below.
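To make the feedback loop concrete, the sketch below merges a reviewer's corrections over the extracted values and reports which fields changed. It performs no network calls. The function names, the field layout, and the `document_object_id` and `learning_fields` keys are assumptions for illustration only, not a confirmed Typless response or request format.

```python
def build_feedback(extracted_fields, corrected_values):
    """Merge reviewer corrections over extracted values to get the confirmed result."""
    feedback = []
    for field in extracted_fields:
        name = field["name"]
        # Keep the extracted value unless the reviewer supplied a correction.
        value = corrected_values.get(name, field["value"])
        feedback.append({"name": name, "value": value})
    return feedback


def changed_fields(extracted_fields, feedback):
    """Return the names of fields whose confirmed value differs from the extraction."""
    extracted = {f["name"]: f["value"] for f in extracted_fields}
    return [f["name"] for f in feedback if extracted[f["name"]] != f["value"]]


# Made-up extraction result and reviewer correction.
extracted = [
    {"name": "supplier_name", "value": "ACME Ltd"},
    {"name": "total_amount", "value": "102.50"},
]
corrections = {"total_amount": "120.50"}

feedback = build_feedback(extracted, corrections)

# Hypothetical payload shape for returning the confirmed values.
payload = {"document_object_id": "hypothetical-id-123", "learning_fields": feedback}

print(changed_fields(extracted, feedback))
```

Sending back every field, including the ones that were already correct, confirms the whole document rather than only the corrections. Whether the service expects all fields or only the changed ones is an assumption to verify against the API reference.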

Using the training room

For smaller volumes of documents and for testing purposes, you can use the training room. In the training room, you can train on documents for your document type and run test extractions to see results quickly. Each document type has its own training room. The data you confirm there as correct is used to train your document type.