Client Challenge
The client’s growth created a variety of issues that its existing platform could not handle effectively. The issues included:
- Difficulty integrating and reconciling data across multiple IT systems and databases
- Merging of inconsistent data sets, leading to poor data quality and accuracy
- Lack of standardization in data format, structure and content
- Data privacy and security concerns
- Limited visibility into cross-departmental data, hindering decision-making
- Data silos and duplication of information
Milestone Solution
Milestone’s Digital Platform Engineering Services worked with the client and determined that an AWS cloud-based platform (a data lake) built with cloud-native technologies and Spark would provide a configurable ETL pipeline that handles different kinds of files, parses CSV, XML, XLSX and custom formats, anonymizes PCI/PII data, and transforms data as needed. The solution also included an ETL framework for building self-healing data pipelines that support multiple sources and destinations, along with built-in Datapost and intelligent orchestration capabilities.
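The idea of a configurable pipeline that parses several file formats and masks sensitive fields can be sketched in a few lines. This is a minimal illustration in plain Python, not the client's Spark implementation: the parser registry, the config keys (`format`, `sensitive_fields`, `salt`), the flat XML layout and the choice of a salted SHA-256 digest are all assumptions made for the example. A salted hash is a simple form of pseudonymization; a production system handling card data would more likely rely on tokenization or a keyed hash.

```python
import csv
import hashlib
import io
import xml.etree.ElementTree as ET


def parse_csv(text):
    # Parse CSV text with a header row into a list of row dictionaries.
    return list(csv.DictReader(io.StringIO(text)))


def parse_xml(text):
    # Assumes a flat layout: <rows><row><field>value</field>...</row></rows>
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in row} for row in root]


# Hypothetical parser registry: supporting a new format means adding an entry.
PARSERS = {"csv": parse_csv, "xml": parse_xml}


def anonymize(value, salt):
    # One-way salted SHA-256 digest (illustrative masking choice only).
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def run_pipeline(text, config):
    # Pick the parser named in the config, then mask the configured fields.
    rows = PARSERS[config["format"]](text)
    result = []
    for row in rows:
        clean = dict(row)
        for field in config["sensitive_fields"]:
            if clean.get(field) is not None:
                clean[field] = anonymize(clean[field], config["salt"])
        result.append(clean)
    return result
```

With this shape, the same `run_pipeline` call serves every source: only the config dictionary changes between a CSV feed and an XML feed.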
Result
The result was a tailored AWS data lake platform that:
- Allowed data from six different business units to be integrated into one format and one data warehouse.
- Enabled data to be collected through files, databases, Kafka streams and other sources.
- Put data privacy and security checks in place to ensure PCI DSS compliance.
- Integrated data across the merchant life cycle from marketing to servicing.
- Included built-in data quality checks, with alerts raised within an hour of an inconsistency.
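Rule-based data quality checks of the kind listed above could look like the following sketch. It is plain Python with hypothetical rule names and fields (`merchant_id`, `amount`), not the platform's actual checks; it only collects alert messages, and it assumes that a scheduler runs it at least hourly and forwards the messages to whatever alerting channel is in use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


def is_non_negative(value: Optional[str]) -> bool:
    # True only when the value parses as a number that is zero or greater.
    try:
        return float(value) >= 0
    except (TypeError, ValueError):
        return False


@dataclass
class Rule:
    name: str
    check: Callable[[Dict[str, str]], bool]  # returns True when the row is valid


# Hypothetical rules; real ones would come from the data contract.
RULES = [
    Rule("merchant_id_present", lambda row: bool(row.get("merchant_id"))),
    Rule("amount_is_non_negative", lambda row: is_non_negative(row.get("amount"))),
]


def find_inconsistencies(rows: List[Dict[str, str]], rules: List[Rule]) -> List[str]:
    # Apply every rule to every row and collect one message per failure.
    alerts = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row):
                alerts.append(f"row {index}: failed {rule.name}")
    return alerts
```

Keeping each rule as a named function makes the alert text self-explanatory and lets new checks be added without touching the loop.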