Azure Data Lake implementation for a major retail group

August 9, 2022

Client Overview

The client is a major Pan-Asian retail group operating more than 10,000 supermarkets, hypermarkets, convenience stores, health and beauty stores, and home furnishings stores worldwide.

The Problem

The client was facing issues with its Data Factory pipelines, Databricks notebooks, and SQL DWH artifacts. The finance report logic contained several bugs, and the structure of the generated report needed modification. The client wanted a Finance Reward Offer Recon developed to reconcile the generated reports. Data Lake Anfield’s report format and data types were inaccurate, and multiple discrepancies in master and transactional data needed to be resolved. The client also needed a system that generates SQL queries from JSON files.
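To make the reconciliation requirement concrete, here is a minimal sketch of a keyed comparison between a source dataset and a generated report. It is an illustration only, not the client's implementation: the offer IDs, amounts, and the `reconcile` function are hypothetical, and plain Python is used instead of Pandas or PySpark to keep the example dependency-free.

```python
from decimal import Decimal

# Hypothetical sample rows of (offer_id, amount); real data would come from the data lake.
source_rows = [("OFF-1", Decimal("100.00")), ("OFF-2", Decimal("55.50")), ("OFF-3", Decimal("20.00"))]
report_rows = [("OFF-1", Decimal("100.00")), ("OFF-2", Decimal("50.50")), ("OFF-4", Decimal("10.00"))]


def reconcile(source, report):
    """Compare two keyed datasets and list missing keys and amount mismatches."""
    src, rep = dict(source), dict(report)
    return {
        # Keys present on one side only
        "missing_in_report": sorted(src.keys() - rep.keys()),
        "missing_in_source": sorted(rep.keys() - src.keys()),
        # Keys present on both sides whose amounts differ: key -> (source, report)
        "amount_mismatch": {
            key: (src[key], rep[key])
            for key in sorted(src.keys() & rep.keys())
            if src[key] != rep[key]
        },
    }


result = reconcile(source_rows, report_rows)
print(result)
```

Amounts are held as `Decimal` rather than `float` so that equality checks on monetary values are exact.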

Our Solution

PureSoftware engaged with the client to address these challenges and handled multiple change requests arising from changes in business logic. Our team of experts modified the finance report and fixed multiple bugs in it, and tested the components that verify email notifications in the Data Factory pipelines. To meet the client’s requirements, we built the system and tested the JSON configuration files that implement the functionality of the stored procedures. We also added own-brand classification to the existing structure.
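The idea of generating SQL from JSON configuration can be sketched as follows. This is a simplified illustration under stated assumptions, not the delivered system: the configuration schema (`table`, `columns`, `filters`), the table and column names, and the helper functions are all hypothetical. The project's stack lists PyPika as the query builder; this sketch uses plain string building from the standard library instead so that it runs without extra dependencies.

```python
import json

# Hypothetical configuration; the schema and all names are illustrative only.
CONFIG_JSON = """
{
  "table": "finance_report",
  "columns": ["store_id", "offer_id", "reward_amount"],
  "filters": [
    {"column": "report_date", "op": ">=", "value": "2022-01-01"},
    {"column": "reward_amount", "op": ">", "value": 0}
  ]
}
"""

ALLOWED_OPS = {"=", "<>", ">", ">=", "<", "<="}


def quote_identifier(name: str) -> str:
    """Quote an identifier, rejecting anything but letters, digits, and underscores."""
    if not name.replace("_", "").isalnum():
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'


def quote_literal(value) -> str:
    """Render a value as a SQL literal; strings are single-quoted and escaped."""
    if isinstance(value, bool):
        raise TypeError("boolean literals are not handled in this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_select(config: dict) -> str:
    """Build a SELECT statement from a parsed JSON configuration."""
    columns = ", ".join(quote_identifier(c) for c in config["columns"])
    sql = f"SELECT {columns} FROM {quote_identifier(config['table'])}"
    conditions = []
    for flt in config.get("filters", []):
        if flt["op"] not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {flt['op']!r}")
        conditions.append(
            f"{quote_identifier(flt['column'])} {flt['op']} {quote_literal(flt['value'])}"
        )
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql


query = build_select(json.loads(CONFIG_JSON))
print(query)
```

Keeping the query definition in JSON means a change in business logic can be handled by editing a configuration file rather than a stored procedure. In production code, parameterized queries or a query builder would be preferable to hand-rolled literal quoting.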

The Result

  • All bugs assigned to or identified by the PureSoftware team were fixed.
  • Discrepancies in the data were removed.
  • The format of the final reports was corrected.
  • All findings and issues identified within the scope of the project were documented.

Technology Stack

  • Azure: Azure Data Factory, Databricks
  • Python: Pandas, PySpark, PyPika
  • SQL: SQL DWH