
Data Engineer

Sai Kumar Sura

I'm a Data Engineer based in Hyderabad, India, specializing in building scalable ETL pipelines, cloud data solutions, and analytics workflows.


About me

Data Engineer with hands-on experience in building and maintaining ETL pipelines, data processing workflows, and scalable data solutions. Skilled in Python, SQL, Pandas, and PySpark for data transformation and analytics. Experienced with cloud-based data platforms including AWS S3, AWS Glue, and AWS Lambda.

3+
Years Experience
5+
Technologies
2
Companies

My skills

Python
SQL
Pandas
PySpark
AWS
Snowflake
ETL Pipelines
Data Modeling

// Work History

Experience

Wipro Limited
Oct 2023 – Present
Engineer – Data Engineering
  • Develop and maintain ETL workflows for processing structured and semi-structured datasets.
  • Write optimized SQL queries and Python scripts to extract, transform, and load data for analytics systems.
  • Support data pipeline development for collecting and processing large datasets used for reporting and analysis.
  • Perform data validation, cleansing, and transformation to ensure high data quality and accuracy.
  • Collaborate with cross-functional teams to troubleshoot data issues and improve pipeline reliability.
  • Assist in integrating cloud-based storage solutions for scalable data processing.
Concentrix
Oct 2021 – Apr 2023
Analyst – Data & Operations
  • Analyzed large volumes of structured and unstructured data to ensure compliance with platform policies.
  • Conducted data quality checks and validation processes to maintain high accuracy standards.
  • Generated reports and documentation to support operational insights and decision-making.
  • Identified patterns and inconsistencies in datasets to improve data review processes.
  • Worked collaboratively with internal teams to improve operational workflows and data reliability.

// Portfolio

My projects

⚙️
ETL Pipeline Development

Built end-to-end ETL pipelines to ingest CSV datasets and perform complex transformations using Python and Pandas. Implemented data cleaning techniques including missing value handling, duplicate removal, and schema standardization. Used PySpark to process larger datasets with distributed computing.

Python Pandas PySpark ETL
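The cleaning steps this project describes (missing value handling, duplicate removal, schema standardization) can be sketched in Pandas. This is a minimal illustration over a made-up inline CSV, not the project's actual data or column names:

```python
import io

import pandas as pd

# Hypothetical raw CSV standing in for an ingested dataset:
# messy column names, a missing amount, and a duplicated row.
raw_csv = """Order ID, Amount ,City
1,100,Hyderabad
2,,Mumbai
2,,Mumbai
3,250,Delhi
"""

df = pd.read_csv(io.StringIO(raw_csv))

# Schema standardization: trim whitespace and snake_case the column names.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")

# Missing value handling: fill absent amounts with 0 (one possible policy;
# dropping or imputing are equally valid depending on the downstream use).
df["amount"] = df["amount"].fillna(0)

# Duplicate removal.
df = df.drop_duplicates()

print(df)
```

The same three steps scale to PySpark with `withColumnRenamed`, `fillna`, and `dropDuplicates` when the dataset outgrows a single machine.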
☁️
Cloud-Based Data Pipeline (AWS)

Designed a scalable cloud-based data pipeline using AWS services. Stored datasets in AWS S3 and performed transformations using Python scripts. Utilized AWS Glue for data processing and workflow orchestration, demonstrating scalable data ingestion and processing workflows.

AWS S3 AWS Glue AWS Lambda Python
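The S3-plus-Glue flow described above can be sketched with boto3. The bucket name, Glue job name, key layout, and `--input_path` argument below are hypothetical placeholders, not the project's real configuration:

```python
# Hedged sketch of the pipeline flow: upload a raw CSV to S3, then
# trigger a Glue job on it. All names here are illustrative only.
BUCKET = "example-data-lake"    # hypothetical S3 bucket
GLUE_JOB = "csv-transform-job"  # hypothetical Glue job name


def s3_key_for(dataset: str, date: str) -> str:
    """Build a date-partitioned S3 key for a raw dataset drop."""
    return f"raw/{dataset}/dt={date}/{dataset}.csv"


def run_pipeline(dataset: str, date: str, local_path: str) -> str:
    """Upload a local CSV to S3, then start the Glue job against it."""
    import boto3  # imported lazily; calling this needs AWS credentials

    key = s3_key_for(dataset, date)
    boto3.client("s3").upload_file(local_path, BUCKET, key)
    run = boto3.client("glue").start_job_run(
        JobName=GLUE_JOB,
        Arguments={"--input_path": f"s3://{BUCKET}/{key}"},
    )
    return run["JobRunId"]
```

Partitioning keys by date (`dt=...`) keeps Glue crawls and downstream queries cheap, since each run only touches one partition.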

Contact me

I'm open to data engineering roles, freelance projects, and collaborations. Feel free to reach out anytime — I'd love to connect!

📍 Hyderabad, India
📞 +91 8500035791
saikumarsura10@gmail.com