Overview
Automation and validation of an enterprise retail data and search ecosystem, focused on ensuring correctness, consistency, and reliability across large-scale data ingestion and search platforms.
This project addressed high-risk data flows where incorrect transformations or indexing could directly impact product discovery, SEO, and customer experience.
What I worked on
- ETL validation for large retail datasets flowing from upstream commerce systems into a search platform
- Verified data extraction, transformation, and loading logic to ensure accuracy across ingestion stages
- Validated field mapping and data integrity between source systems and the search index
- Ensured data consistency in DynamoDB after ingestion and transformation (a consistency-check sketch follows this list)
- Automated validation of search metadata, including:
  - Metadata attributes
  - URL slugs
  - SEO-related data (SEODS)
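
To illustrate the kind of post-ingestion DynamoDB consistency check involved, here is a minimal sketch in TypeScript using the AWS SDK v3 document client. The table name (`product-catalog`), key attribute (`sku`), and the lower-casing transformation rule are hypothetical placeholders, not the production schema.

```typescript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";

// Hypothetical table and key names; the real pipeline's schema differs.
const TABLE_NAME = "product-catalog";

const client = DynamoDBDocumentClient.from(new DynamoDBClient({ region: "us-east-1" }));

interface SourceProduct {
  sku: string;
  title: string;
  categoryPath: string;
}

/**
 * Verifies that a product emitted by the upstream commerce system
 * landed in DynamoDB with the expected transformed attributes.
 */
export async function assertProductIngested(source: SourceProduct): Promise<void> {
  const { Item } = await client.send(
    new GetCommand({ TableName: TABLE_NAME, Key: { sku: source.sku } })
  );

  if (!Item) {
    throw new Error(`Product ${source.sku} missing from ${TABLE_NAME} after ingestion`);
  }
  if (Item.title !== source.title) {
    throw new Error(`Title mismatch for ${source.sku}: expected "${source.title}", got "${Item.title}"`);
  }
  // Example transformation rule (assumed): category path is lower-cased during ingestion.
  if (Item.categoryPath !== source.categoryPath.toLowerCase()) {
    throw new Error(`Category transformation incorrect for ${source.sku}`);
  }
}
```

Checks like this compare each indexed record against its upstream source rather than asserting on counts alone, so a silent transformation bug surfaces as a concrete field-level failure.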
Search & sitemap automation
- Designed and implemented a scalable automation framework for:
  - Sitemap validation
  - Search metadata correctness
  - SEO-critical attributes
- Built the framework from scratch using Playwright (TypeScript) for reliability and extensibility; a simplified sitemap check is sketched after this list
- Ensured automated coverage for scenarios impacting search discoverability and indexing
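
A simplified version of such a sitemap check, written with Playwright Test in TypeScript. The sitemap URL, the sample size, and the specific meta tags asserted are illustrative assumptions rather than the actual framework's configuration.

```typescript
import { test, expect } from "@playwright/test";

// Hypothetical sitemap URL; the real framework would read this from environment config.
const SITEMAP_URL = "https://www.example-retailer.com/sitemap.xml";

test("sitemap entries resolve and expose SEO metadata", async ({ request, page }) => {
  // Fetch the sitemap and pull out the <loc> entries.
  const sitemap = await request.get(SITEMAP_URL);
  expect(sitemap.ok()).toBeTruthy();
  const urls =
    (await sitemap.text())
      .match(/<loc>(.*?)<\/loc>/g)
      ?.map((loc) => loc.replace(/<\/?loc>/g, "")) ?? [];
  expect(urls.length).toBeGreaterThan(0);

  // Spot-check a sample of entries: each page must load and carry SEO-critical tags.
  for (const url of urls.slice(0, 5)) {
    const response = await page.goto(url);
    expect(response?.status()).toBe(200);
    await expect(page.locator('link[rel="canonical"]')).toHaveAttribute("href", /.+/);
    await expect(page.locator('meta[name="description"]')).toHaveAttribute("content", /.+/);
  }
});
```

Driving the check from the live sitemap rather than a hard-coded URL list keeps coverage aligned with what crawlers actually see.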
Engineering approach
- Developed Cypress-based automation frameworks for API and data validation
- Automated backend REST API testing using Cypress and the AWS SDK (a combined sketch follows this list)
- Validated DynamoDB tables after ingestion and scan operations
- Tested AWS Lambda and Step Functions responsible for data processing workflows
- Integrated automation into GitLab CI/CD pipelines with:
  - Allure reporting
  - Clear execution signals
  - CI-friendly test orchestration
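
A rough sketch of how Cypress can be paired with the AWS SDK: DynamoDB access runs on the Node side via `cy.task`, and the spec cross-checks the scanned items against a REST search endpoint. The table name, region, endpoint, and response shape here are assumptions for illustration only.

```typescript
// cypress.config.ts: minimal sketch; table name, region, and baseUrl are hypothetical.
import { defineConfig } from "cypress";
import { DynamoDBClient, ScanCommand } from "@aws-sdk/client-dynamodb";

export default defineConfig({
  e2e: {
    baseUrl: "https://search.example-retailer.com",
    setupNodeEvents(on) {
      // AWS SDK calls execute in the Node process and are exposed to specs via cy.task.
      on("task", {
        async scanIngestedItems(tableName: string) {
          const client = new DynamoDBClient({ region: "us-east-1" });
          const result = await client.send(new ScanCommand({ TableName: tableName }));
          return result.Items ?? [];
        },
      });
    },
  },
});
```

```typescript
// ingestion.cy.ts: cross-checks DynamoDB contents against the search API.
describe("post-ingestion data validation", () => {
  it("exposes every ingested SKU through the search API", () => {
    cy.task<any[]>("scanIngestedItems", "product-catalog").then((items) => {
      expect(items).to.have.length.greaterThan(0);
      cy.request("GET", "/api/search?query=*").then((response) => {
        expect(response.status).to.eq(200);
        // Hypothetical response shape: { results: [{ sku: ... }] }
        const indexedSkus: string[] = response.body.results.map((r: any) => r.sku);
        items.forEach((item) => {
          expect(indexedSkus, `SKU ${item.sku?.S} indexed`).to.include(item.sku?.S);
        });
      });
    });
  });
});
```

Keeping the AWS access in a task keeps credentials out of the browser context and lets the same helper serve Lambda- and Step Functions-related specs.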
Quality & reliability focus
- Emphasized data correctness over UI-only validation
- Built tests to detect (one such check is sketched after this list):
  - Broken transformations
  - Incomplete ingestion
  - Incorrect field mappings
  - Search metadata regressions
- Ensured pipelines provided fast, deterministic feedback for data and search-related changes
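
One way such mapping regressions can be surfaced is a declarative source-to-index field map checked against the indexed document. A minimal sketch follows; the field names and mapping rules are hypothetical.

```typescript
// Minimal sketch of a field-mapping check; field names and rules are illustrative only.
type SourceRecord = Record<string, unknown>;
type IndexedDocument = Record<string, unknown>;

// Expected source-field -> index-field mapping the ingestion pipeline should apply.
const FIELD_MAP: Record<string, string> = {
  productTitle: "title",
  productUrlSlug: "slug",
  metaDescription: "seo_description",
};

/** Returns human-readable mapping violations; empty when the document is consistent. */
export function findMappingRegressions(source: SourceRecord, indexed: IndexedDocument): string[] {
  const issues: string[] = [];
  for (const [sourceField, indexField] of Object.entries(FIELD_MAP)) {
    if (!(indexField in indexed)) {
      issues.push(`Missing indexed field "${indexField}" (mapped from "${sourceField}")`);
    } else if (indexed[indexField] !== source[sourceField]) {
      issues.push(
        `Value drift on "${indexField}": expected "${source[sourceField]}", got "${indexed[indexField]}"`
      );
    }
  }
  return issues;
}
```

A test then asserts that `findMappingRegressions` returns an empty array, so a single failure message lists every broken mapping at once.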
Outcome
- Improved confidence in enterprise data ingestion and search reliability
- Reduced production issues related to incorrect indexing and metadata
- Established a scalable automation foundation for validating future data pipelines and search enhancements