Akash Patel

Case study

Amazon Reviews ETL Pipeline

Processed 34.7M JSON review records using Pig, Hive, SSIS, SQL Server, and Power BI to create analysis-ready datasets for sentiment and product trend reporting.

Business Impact

Reduced stakeholder data preparation time by 70%.

Tools

Pig, Hive, SSIS, SQL Server, Power BI, Big Data ETL

34.7M records processed

70% data prep time reduction

Analysis-ready reporting dataset

Problem

Large-scale JSON review data was too raw and noisy for fast stakeholder analysis or reporting.

Data / Architecture

The pipeline uses Pig and Hive for large-scale processing, SSIS and SQL Server for structured loads, and Power BI for reporting.

Process

Processed nested JSON review records into structured analytical tables.

Built ETL transformations for sentiment and product trend reporting.

Created Power BI outputs for stakeholder-ready exploration.
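
To illustrate the flattening and trend steps above, here is a minimal Python sketch. It is not the project's Pig or Hive code; it only shows the same idea on a handful of in-memory records. The field names (asin, overall, unixReviewTime, helpful), the sample records, and the helper names flatten_review and monthly_trend are all assumptions made for illustration, and average rating stands in as a simple trend metric rather than a real sentiment score.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample input: one JSON document per line. The field names are
# assumptions for illustration, not the project's actual schema.
SAMPLE_LINES = [
    '{"reviewerID": "A1", "asin": "B001", "overall": 5.0, '
    '"unixReviewTime": 1388534400, "helpful": [2, 3]}',
    '{"reviewerID": "A2", "asin": "B001", "overall": 2.0, '
    '"unixReviewTime": 1389744000, "helpful": [0, 1]}',
    'not valid json',
    '{"reviewerID": "A3", "asin": "B002", "overall": 4.0, '
    '"unixReviewTime": 1391212800}',
]


def flatten_review(line):
    """Turn one raw JSON line into a flat row, or None if it is unusable."""
    try:
        rec = json.loads(line)
    except json.JSONDecodeError:
        return None  # skip noisy, unparseable lines
    if not isinstance(rec, dict) or "asin" not in rec or "overall" not in rec:
        return None  # skip records missing the key analytical fields
    helpful = rec.get("helpful") or [0, 0]  # nested array -> two flat columns
    ts = rec.get("unixReviewTime")
    month = (
        datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
        if ts is not None
        else None
    )
    return {
        "product_id": rec["asin"],
        "reviewer_id": rec.get("reviewerID"),
        "rating": float(rec["overall"]),
        "review_month": month,
        "helpful_votes": helpful[0],
        "total_votes": helpful[1],
    }


def monthly_trend(rows):
    """Aggregate flat rows into review count and average rating per product-month."""
    totals = defaultdict(lambda: [0, 0.0])  # key -> [count, rating sum]
    for row in rows:
        key = (row["product_id"], row["review_month"])
        totals[key][0] += 1
        totals[key][1] += row["rating"]
    return {
        key: {"review_count": n, "avg_rating": s / n}
        for key, (n, s) in totals.items()
    }


rows = [r for r in map(flatten_review, SAMPLE_LINES) if r is not None]
trend = monthly_trend(rows)
```

In a pipeline like the one described, the flat rows would correspond to the structured tables loaded into SQL Server, and the per-product monthly aggregate to the kind of dataset a Power BI report could read directly.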

Business Impact

Reduced stakeholder data preparation time by 70% and enabled faster review trend analysis from large-scale customer feedback.

Screenshots

Figure: Amazon Reviews ETL pipeline, from JSON records to Power BI reporting.