Tracking Manufacturing Throughput with Vision-Based Deep Learning

Quick Summary

Challenge
Manual tracking of stitching table output and worker breaks was inefficient, leading to productivity blind spots in the manufacturing process.
Solution
Tatras Data developed a video analysis system using pose estimation and object detection to track towel folding actions and log workstation-level performance.
Result
92% of employee line activity successfully captured

Tech Stack

AI: Custom neural networks, pose estimation, action classifiers | ML: Human role identification, sequence classification | Data & Retrieval: Live camera stream processing | Dev: OpenCV, PyTorch, custom DL modules, SAP integration | Viz: Line activity dashboards, workstation heatmaps | Security: On-prem inference with restricted data storage

The Challenge

On the factory floor, every stitch — and every pause — adds up.

A leading towel manufacturer had installed cameras above their stitching tables.

The goal was simple: maximize productivity.

But the footage sat unused, offering no insights into actual throughput or worker behavior.

They needed a way to track two things:

  • How many towels were folded at each table
  • When workers were active, idle, or on break

Manual monitoring was out of the question.

It was time to teach machines to watch, understand, and report.

A Day in the Life: Before Our Solution

Supervisors walked the floor every hour with clipboards in hand.

Some workstations moved faster than others, but no one had clear numbers. Counting folded towels relied on end-of-day samples. Breaks were estimated. And the entire reporting process was disconnected from their operations software.

On paper, throughput looked consistent.

In reality, productivity fluctuated — and no one could say why.

Pain Points:

  • No automated way to count towel output from stitching tables
  • Worker breaks went untracked or misestimated
  • Real-time insights into productivity were unavailable
  • Data couldn’t be fed into downstream systems like SAP
  • Supervisors spent time manually observing instead of optimizing

Solution

1. Core Innovation

Tatras Data developed a deep learning system tailored for industrial vision environments:

  1. Each camera feed was parsed to localize workstations using object detection
  2. Pose estimation tracked the towel folder’s movement
  3. A sequence classifier turned hand positions into folding actions
  4. Custom neural architectures overcame occlusion and motion blur
  5. Break times were inferred from inactivity patterns and visual gaps
  6. Output was mapped per table and pushed into SAP for central visibility
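Steps 3 and 5 above can be sketched in plain Python. This is an illustrative simplification, not the production system: it assumes an upstream classifier has already labeled each frame (e.g. "folding", "idle", "absent"), and the helper names (`count_folds`, `infer_breaks`) and thresholds are hypothetical.

```python
def count_folds(labels):
    """Count completed folds: each maximal run of 'folding' frames is one fold."""
    folds = 0
    prev = None
    for lab in labels:
        if lab == "folding" and prev != "folding":
            folds += 1
        prev = lab
    return folds


def infer_breaks(labels, fps=5, min_break_s=120):
    """Return (start_idx, end_idx) frame spans of inactivity long enough to
    count as a break, given per-frame labels and the camera frame rate."""
    breaks = []
    run_start = None
    for i, lab in enumerate(labels + ["active"]):  # sentinel flushes a trailing run
        if lab in ("idle", "absent"):
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and (i - run_start) / fps >= min_break_s:
                breaks.append((run_start, i))
            run_start = None
    return breaks
```

The key design point is that neither helper needs raw video: once the sequence classifier emits per-frame labels, throughput and break detection reduce to run-length logic over that label stream.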

2. Key Features

  • Real-time folding action detection via pose-based modeling
  • Human role identification across shared workstations
  • Per-table throughput counts logged to central systems
  • Break detection for individual employees
  • SAP integration for operations and HR reporting

3. Workflow Integration

The system runs continuously on-prem. It captures video from overhead cameras, performs on-device inference, and streams counts and time logs into SAP dashboards.
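The continuous loop described above can be outlined as follows. Everything here is a placeholder sketch: `classify_frame` stands in for the real pose/action models, `post_to_sap` for the actual SAP integration, and the batching interval is illustrative.

```python
import json


def run_pipeline(frames, classify_frame, post_to_sap, flush_every=100):
    """Consume (table_id, frame) pairs, accumulate per-table fold counts,
    and periodically flush a JSON payload to the downstream system."""
    counts = {}
    for i, (table_id, frame) in enumerate(frames, start=1):
        if classify_frame(frame) == "fold_completed":
            counts[table_id] = counts.get(table_id, 0) + 1
        if i % flush_every == 0 and counts:
            post_to_sap(json.dumps(counts))
            counts = {}
    if counts:  # flush any remainder at end of stream
        post_to_sap(json.dumps(counts))
```

Batching counts per table before pushing keeps the integration lightweight: the downstream dashboard receives small, structured payloads rather than raw video or per-frame events.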

No manual intervention. Just clear, structured insights from every workstation.

Outcomes

✅ 92% of employee line activity successfully tracked
📊 Accurate towel output counts by stitching table
⏱️ Break durations mapped without wearables
🧵 Process visibility unlocked for both ops and HR
📈 Increased productivity through targeted floor-level adjustments

Ready to build your AI system?

Let's discuss how our pipeline can accelerate your path to production.

Start a Conversation