Our client, a US-based HR portal, had multiple data sources and data platforms. Their existing ‘query and reporting’ systems could not give them an integrated view of the business, much less run useful analytics.
They approached Tatras to help them carry out a technology transformation that would enable advanced analytics.
Tatras examined multiple data sources, including:
- Document data: large volumes of text
- Transactional data: from their OLTP system
- Subscription and Registration data
- Events data
Tatras evaluated multiple big-data infrastructure providers such as MapR, Cloudera, Hortonworks and Pivotal. We ran a Proof-of-Concept with the shortlisted platforms, which covered:
- Ingesting data from the OLTP system into Hadoop, including real-time pipelines built with Storm and Spark's real-time API
- Applying incremental updates to data tables using reconciliation views
- Exporting data to visualisation tools
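
The idea behind incremental updates with a reconciliation view can be sketched in a few lines. This is a minimal illustration in plain Python, not the views built during the PoC: a reconciliation view typically combines a base table with newly ingested change records and keeps only the most recent version of each record. The column names `id` and `modified_at`, the `reconcile` function and the sample rows are all assumptions made for the example.

```python
from datetime import datetime
from typing import Dict, Iterable, List

Row = Dict[str, object]


def reconcile(base: Iterable[Row], incremental: Iterable[Row],
              key: str = "id", modified: str = "modified_at") -> List[Row]:
    """Return one row per key, keeping the most recently modified version."""
    latest: Dict[object, Row] = {}
    # Base rows are visited first, then incremental rows.
    for row in list(base) + list(incremental):
        current = latest.get(row[key])
        # ">=" lets an incremental row win a timestamp tie with a base row.
        if current is None or row[modified] >= current[modified]:
            latest[row[key]] = row
    return sorted(latest.values(), key=lambda r: r[key])


# Hypothetical sample data: the base table as of the last full load.
base_rows = [
    {"id": 1, "status": "active", "modified_at": datetime(2015, 1, 1)},
    {"id": 2, "status": "active", "modified_at": datetime(2015, 1, 1)},
]

# Hypothetical change records pulled from the OLTP system since that load.
incremental_rows = [
    {"id": 2, "status": "cancelled", "modified_at": datetime(2015, 1, 5)},
    {"id": 3, "status": "active", "modified_at": datetime(2015, 1, 6)},
]

merged = reconcile(base_rows, incremental_rows)
for row in merged:
    print(row["id"], row["status"], row["modified_at"].date())
```

In this sketch, record 1 is carried over unchanged, record 2 is replaced by its newer version, and record 3 is added. Downstream consumers read the reconciled result, so the base table does not have to be rewritten on every load.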
TECHNIQUES, TECHNOLOGIES, TOOLS
- Hadoop, Hive, HBase, Spark, MLlib, R, Oozie, Storm
Completed in 3 months, this PoC led to a final proposed big data architecture for the client and to a data science program with Tatras to build analytics schemas and predictive models.