BIG DATA ENGINEERING FOR ADVANCED ANALYTICS

THE CHALLENGE

Our client, a US-based HR portal, had data spread across multiple sources and platforms. Their existing ‘query and reporting’ systems could not give them an integrated view of the business, much less run useful analytics.

They approached Tatras to help them undertake a technology transformation that would enable advanced analytics.

THE SOLUTION

Tatras examined multiple data sources, including:
  • Document data: textual and very large in volume
  • Transactional data: coming from their OLTP system
  • Subscription and Registration data
  • Events data
Tatras evaluated multiple big-data infrastructure providers, including MapR, Cloudera, Hortonworks and Pivotal, and ran a Proof-of-Concept (PoC) on the shortlisted platforms. The PoC covered:
  • Ingesting data from the OLTP system into Hadoop, including real-time pipelines built with Storm and Spark's real-time API
  • Applying incremental updates to data tables using reconciliation views
  • Exporting data to visualisation tools
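To illustrate the incremental-update step, here is a minimal sketch of what a reconciliation view typically computes: the base table and the newly ingested incremental rows are combined, and only the most recent version of each record is kept. This is not the client's implementation. The column names (`id`, `modified_at`), the function name `reconcile` and the sample rows are all hypothetical, and in a Hadoop deployment this logic would more commonly be expressed as a Hive view over the two tables.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def reconcile(base: Iterable[Row],
              incremental: Iterable[Row],
              key: str = "id",
              modified: str = "modified_at") -> List[Row]:
    """Return the latest version of each record across both inputs.

    Hypothetical helper: mimics a reconciliation view that unions a base
    table with an incremental table and keeps, per key, the row with the
    highest modification marker.
    """
    latest: Dict[Any, Row] = {}
    # Base rows are visited first, so on a tie the incremental row wins.
    for row in list(base) + list(incremental):
        current = latest.get(row[key])
        if current is None or row[modified] >= current[modified]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


# Hypothetical sample data: modified_at is a simple increasing counter.
base_rows = [
    {"id": 1, "status": "active", "modified_at": 1},
    {"id": 2, "status": "active", "modified_at": 1},
]
incremental_rows = [
    {"id": 2, "status": "cancelled", "modified_at": 2},  # update to record 2
    {"id": 3, "status": "active", "modified_at": 2},     # new record
]

reconciled = reconcile(base_rows, incremental_rows)
for row in reconciled:
    print(row)
```

In this sketch, record 1 is carried over unchanged, record 2 is replaced by its newer version, and record 3 is added. The reconciled result can then periodically replace the base table so that the incremental table stays small.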

TECHNIQUES, TECHNOLOGIES, TOOLS

Hadoop, Hive, HBase, Spark, MLlib, R, Oozie, Storm

RESULT/IMPACT

Completed in 3 months, this PoC led to a final proposed big data architecture for the client and to a data science program with Tatras, building analytics schema and predictive models.