
Data Engineer

Sai Kumar Sura

I'm a Data Engineer based in Hyderabad, India, specializing in building scalable ETL pipelines, cloud data solutions, and analytics workflows.


About me

Data Engineer with hands-on experience in building and maintaining ETL pipelines, data processing workflows, and scalable data solutions. Skilled in Python, SQL, Pandas, and PySpark for data transformation and analytics. Experienced with cloud-based data platforms including AWS S3, AWS Glue, and AWS Lambda.

3+
Years Experience
5+
Technologies
2
Companies

My skills

Python
SQL
Pandas
PySpark
AWS
Snowflake
ETL Pipelines
Data Modeling

// Work History

Experience

Wipro Limited
Oct 2023 – Present
Engineer – Data Engineering
  • Develop and maintain ETL workflows for processing structured and semi-structured datasets.
  • Write optimized SQL queries and Python scripts to extract, transform, and load data for analytics systems.
  • Support data pipeline development for collecting and processing large datasets used for reporting and analysis.
  • Perform data validation, cleansing, and transformation to ensure high data quality and accuracy.
  • Collaborate with cross-functional teams to troubleshoot data issues and improve pipeline reliability.
  • Assist in integrating cloud-based storage solutions for scalable data processing.
Concentrix
Oct 2021 – Apr 2023
Analyst – Data & Operations
  • Analyzed large volumes of structured and unstructured data to ensure compliance with platform policies.
  • Conducted data quality checks and validation processes to maintain high accuracy standards.
  • Generated reports and documentation to support operational insights and decision-making.
  • Identified patterns and inconsistencies in datasets to improve data review processes.
  • Worked collaboratively with internal teams to improve operational workflows and data reliability.

// Portfolio

My projects

⚙️
ETL Pipeline Development

Built end-to-end ETL pipelines to ingest CSV datasets and perform complex transformations using Python and Pandas. Implemented data cleaning techniques including missing value handling, duplicate removal, and schema standardization. Used PySpark to process larger datasets with distributed computing.

Python Pandas PySpark ETL
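The cleaning steps this project describes (missing value handling, duplicate removal, schema standardization) can be sketched in Pandas. This is a minimal illustration over a made-up inline CSV, not the project's actual data or column names:

```python
import io

import pandas as pd

# Hypothetical raw CSV standing in for an ingested dataset:
# messy column names, a missing amount, and a duplicated row.
raw_csv = """Order ID, Amount ,City
1,100,Hyderabad
2,,Mumbai
2,,Mumbai
3,250,Delhi
"""

df = pd.read_csv(io.StringIO(raw_csv))

# Schema standardization: trim whitespace and snake_case the column names.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Missing value handling: fill absent amounts with 0 (one possible policy;
# dropping or imputing are equally valid depending on the downstream use).
df["amount"] = df["amount"].fillna(0)

# Duplicate removal.
df = df.drop_duplicates()

print(df)
```

The same three steps scale to PySpark with `withColumnRenamed`, `fillna`, and `dropDuplicates` when the dataset outgrows a single machine.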
☁️
Cloud-Based Data Pipeline (AWS)

Designed a scalable cloud-based data pipeline using AWS services. Stored datasets in AWS S3 and performed transformations using Python scripts. Utilized AWS Glue for data processing and workflow orchestration, demonstrating scalable data ingestion and processing workflows.

AWS S3 AWS Glue AWS Lambda Python
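The S3-plus-Glue flow described above can be sketched with boto3. The bucket name, Glue job name, key layout, and `--input_path` argument below are hypothetical placeholders, not the project's real configuration:

```python
# Hedged sketch of the pipeline flow: upload a raw CSV to S3, then
# trigger a Glue job on it. All names here are illustrative only.
BUCKET = "example-data-lake"    # hypothetical S3 bucket
GLUE_JOB = "csv-transform-job"  # hypothetical Glue job name


def s3_key_for(dataset: str, date: str) -> str:
    """Build a date-partitioned S3 key for a raw dataset drop."""
    return f"raw/{dataset}/dt={date}/{dataset}.csv"


def run_pipeline(dataset: str, date: str, local_path: str) -> str:
    """Upload a local CSV to S3, then start the Glue job against it."""
    import boto3  # imported lazily; calling this needs AWS credentials

    key = s3_key_for(dataset, date)
    boto3.client("s3").upload_file(local_path, BUCKET, key)
    run = boto3.client("glue").start_job_run(
        JobName=GLUE_JOB,
        Arguments={"--input_path": f"s3://{BUCKET}/{key}"},
    )
    return run["JobRunId"]
```

Partitioning keys by date (`dt=...`) keeps Glue crawls and downstream queries cheap, since each run only touches one partition.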

Contact me

I'm open to data engineering roles, freelance projects, and collaborations. Feel free to reach out anytime — I'd love to connect!

📍 Hyderabad, India
📞 +91 8500035791
saikumarsura10@gmail.com