Reducing Errors & Manual Effort in Document Processing Using Vision and NLP
The Challenge
A real-estate technology firm relied on an inefficient, semi-automated process to match and map data from inspection reports, which arrived in many different formats, to an estimate of the work needed. This process was time-consuming, error-prone, and expensive. Tatras was asked to develop a machine learning pipeline to make estimate creation more efficient, accurate, and scalable.
Hypothesis
- Deep Learning algorithms can identify inspection reports with similar layouts.
- A single model for each layout can extract various components of each inspection note and group them with high accuracy.
- Inspection notes contain enough semantic information to map them using NLP onto work items.
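To illustrate the third hypothesis, here is a minimal sketch of mapping an inspection note onto a work item by text similarity. It is not the pipeline described in this case study: it uses plain word counts and cosine similarity as a stand-in for learned embeddings, and the work-item catalogue, the function names, and the `min_score` threshold are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical work-item catalogue; the real catalogue is not given in the source.
WORK_ITEMS = {
    "roof_repair": "repair or replace damaged roof shingles and flashing",
    "plumbing_leak": "fix leaking pipe faucet or plumbing fixture",
    "electrical_outlet": "replace faulty electrical outlet or wiring",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def map_note_to_work_item(note, work_items=WORK_ITEMS, min_score=0.1):
    """Return (work_item_id, score) for the best match, or (None, score)
    when no description is similar enough (min_score is an assumed cutoff)."""
    note_vec = Counter(tokenize(note))
    scored = [
        (cosine(note_vec, Counter(tokenize(desc))), item_id)
        for item_id, desc in work_items.items()
    ]
    score, best = max(scored)
    if score < min_score:
        return None, score
    return best, score


item, score = map_note_to_work_item("Leaking pipe under kitchen sink")
print(item, round(score, 2))
```

Notes that fall below the threshold come back unmatched, so they can be routed to manual review instead of being mapped to the wrong work item.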
Execution
- Layout data tagged.
- Pretrained models fine-tuned to generate visual embeddings of documents. Embeddings used for clustering documents.
- OCR model trained for each cluster.
- Extracted text and associated images used to map each inspection note to a work item.
- Tools and techniques: PyTorch, TensorFlow, LayoutLM, clustering, Transformers, fastText, FastAPI.
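The clustering step above can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the project: the source does not name the clustering algorithm, so k-means is assumed here, the two-dimensional vectors are made-up stand-ins for the visual embeddings a fine-tuned model would produce, and all function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(dim) / n for dim in zip(*points)]


def init_centroids(points, k):
    """Deterministic farthest-point initialisation: start at the first point,
    then repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain k-means: returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: distance(p, centroids[i])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its assigned points.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = mean(members)
    return labels, centroids


# Made-up 2-D "embeddings" for five documents with two distinct layouts.
embeddings = [[0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.88, 0.85], [0.12, 0.18]]
labels, centroids = kmeans(embeddings, k=2)
print(labels)
```

Documents that share a label would then be handled by the same per-cluster extraction model. In practice the number of clusters is not known in advance and would have to be chosen or estimated from the data.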
Outcomes
- 95% accuracy on OCR-extracted data.
- 65% reduction in manual effort to date.
- 60% coverage of report layouts to date.
Project Highlights
65% reduction in manual effort to date.