Portfolio - TATRAS DATA

From Product Zero to Agentic Chat: A Five-Year Build

The Challenge: The client set out to build a content intelligence platform that could deliver real-time, context-aware answers drawn from diverse data sources. Early retrieval systems were too slow, with response times stretching to tens of seconds. This created a poor user experience and added significant […]

AI-Powered Industrial Incident Analysis and Root Cause Automation

The Challenge: Industrial incident investigations relied on manual and inconsistent data collection. Witness statements often remained unstructured, slowing root cause analysis and corrective action planning. The customer, a U.S.-based occupational safety copilot backed by private equity, aimed to modernize this process with an AI-guided digital system that […]

Transforming 1099 Financial Workflows with Dataiku Automation

The Challenge: The current 1099 contractor payment process is slow and inefficient, taking 4–5 days per report due to heavy manual work. Multiple contributors increase risks of errors, duplicates, and missed payments, undermining trust and creating rework. As contractor volume grows, the process cannot scale, leading to bottlenecks […]

AI-Powered Hail Damage Detection with Dataiku

The Challenge: Traditionally, insurance companies have relied on contractors to physically climb onto roofs and manually mark areas damaged by hail. This process is time-consuming and resource-heavy, prone to human error and inconsistencies, and risky for inspectors. It also slows down the claims process and hurts customer satisfaction. […]

Scaling Enterprise Reporting with Tatras Data and Dataiku

The Challenge: A global risk and benefits management firm relied heavily on manual reporting processes to track performance across departments. Reports ranged from daily operational updates to complex quarterly analyses. This manual system created several challenges: time-consuming and error-prone reporting workflows; high opportunity cost for skilled employees […]

Self-Healing Knowledge Base with AI-Driven Metadata and Taxonomy at Scale

The Challenge: Document bases with potentially millions of documents are very challenging to manage, a classic knowledge base scalability problem. Managing them well is required for proper organization, searchability, deduplication, and more. Most of the documents do not have any tags or categories to identify their content […]

Vision-Language OCR for PDFs and Spreadsheets Elevates Multimodal Q&A from 45% to 75%

The Challenge: The system faced significant limitations in handling multimodal documents. PDFs: No text extraction was available for non-editable and scanned PDFs, making Q&A impossible and highlighting a gap in AI for PDFs within broader multimodal document processing. Visuals: Images, figures, and […]

LLM-based evaluation pipeline to reduce human effort on answer validation

The Challenge: Manually validating answers after each application update was time-consuming and repetitive, and it required significant human effort. This bottleneck made rapid iteration and quality control difficult. Hypothesis: Leveraging an LLM to evaluate answer quality by […]

HTML to Markdown and Table Chunking Achieve 20% RAG Accuracy Gain

The Challenge: The initial HTML ingestion pipeline extracted only raw text, losing critical structural elements like links, formatting, and hierarchy. Additionally, answers derived from long tables were often incomplete or inaccurate because their structure was lost during ingestion. This highlighted a […]

LLM-Grounded Citations for Trustworthy RAG Answers

The Challenge: Citations are a key factor for explainable AI question answering, yet the citations returned alongside answers in the RAG pipeline were often incorrect or misleading. The LLM didn’t consistently utilize all retrieved documents, making it hard to trace answers back to sources. Additionally, hallucinations and […]
