Maximize extraction coverage in the business domain of RealEstate / Startup
Read all essential contract elements from scanned contracts, specializing in purchase and lease contracts under Austrian law.
Enrich a contract database for statistical analysis and demographic allocation of the data within a web service.
A key goal was to reduce the effort of the previously manual extraction process.
At a glance - essential project data
From 9/1/2020 to 10/31/2020, with about 2 months of full engagement
Data and Tools
Market - RealEstate
• 23,237 contracts with 25,295 properties
• Average number of figures per contract: approx. 400
• Average number of price quotes per contract: approx. 13!
• We identify the right one!
• There are contracts with more than one object
• Unsupervised labeling for automated model training
Web API for Backoffice System
• NLP
• Information Extraction
Automated reading and extraction of features from home purchase contracts.
Client motivation / Solution aims
Increase quality in the process.
Increase throughput in processing.
AI key technology used in our solution
Multiple objects per contract, distinction between inclusive and exclusive prices.
• NLP
ML Integration and ML Operations
• Operation Integration API
Insights and Details
Unsupervised information extraction from contracts requires not only a large number of contracts but also the ability to vary the modeling approach.
Examples of content problems in purchase contracts - variant 1
Examples of content problems in purchase contracts - variant 2: the context of a figure only becomes apparent in the following sentence!
To vary how much preceding and following text the NLP interpretation considers, we always pair this approach with quality ranges within which we optimize the window size.
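The effect of the window size can be illustrated with a minimal sketch. It is not the project's actual model: the cue words, the tokenizer, and the function names (`PRICE_CUES`, `tokenize`, `context_windows`, `price_candidates`) are assumptions made for this example. It collects a configurable number of tokens before and after each figure and flags the figure as a price candidate when a cue word falls inside that window.

```python
import re

# Hypothetical cue words; a real system would learn or curate these.
PRICE_CUES = {"purchase", "price", "kaufpreis"}

# A token is either a figure such as 250.000 or a run of letters.
TOKEN_RE = re.compile(r"\d+(?:[.,]\d+)*|[^\W\d_]+")


def tokenize(text):
    """Lower-case the text and split it into figures and words."""
    return TOKEN_RE.findall(text.lower())


def context_windows(tokens, before, after):
    """Yield (figure, left_context, right_context) for every numeric token."""
    for i, tok in enumerate(tokens):
        if tok[0].isdigit():
            left = tokens[max(0, i - before):i]
            right = tokens[i + 1:i + 1 + after]
            yield tok, left, right


def price_candidates(text, before, after, cues=PRICE_CUES):
    """Return the figures that have at least one cue word in their window."""
    tokens = tokenize(text)
    return [
        figure
        for figure, left, right in context_windows(tokens, before, after)
        if cues & set(left + right)
    ]


text = (
    "The apartment has 85 square metres. 250.000 is due on signing. "
    "This amount is the purchase price."
)

for after in (2, 10, 12):
    print(after, price_candidates(text, before=2, after=after))
```

In this toy text a trailing window of 2 tokens misses the price because its meaning only appears in the following sentence, a window of 10 finds it, and a window of 12 also flags the floor area by mistake. That trade-off is why the window size is tuned against a quality measure rather than fixed up front.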
Even context that spans several objects within a contract is recognized and can then be interpreted in turn.
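One simple way to picture the multi-object case is to assign each price to the most recently mentioned object. The sketch below is only an illustration under strong assumptions: the patterns `OBJECT_RE` and `PRICE_RE` and the function `prices_by_object` are hypothetical, and real contracts would need far more robust object and price detection.

```python
import re

# Hypothetical patterns: objects appear as "Object <n>", prices as "EUR <amount>".
OBJECT_RE = re.compile(r"\bobject\s+(\d+)\b", re.IGNORECASE)
PRICE_RE = re.compile(r"\bEUR\s+(\d+(?:[.,]\d+)*)")


def prices_by_object(text):
    """Map each object number to the prices mentioned after it."""
    events = [(m.start(), "object", m.group(1)) for m in OBJECT_RE.finditer(text)]
    events += [(m.start(), "price", m.group(1)) for m in PRICE_RE.finditer(text)]

    result = {}
    current = None
    # Walk through the mentions in reading order.
    for _, kind, value in sorted(events):
        if kind == "object":
            current = value
            result.setdefault(current, [])
        elif current is not None:
            # A price belongs to the most recently mentioned object.
            result[current].append(value)
    return result


contract = (
    "Object 1 is sold for EUR 200.000. "
    "Object 2, the garage, is sold for EUR 15.000."
)
print(prices_by_object(contract))
```

Prices that appear before any object mention are dropped in this sketch; a production system would have to decide how to handle them.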