Sofia Escobedo
Big Data ETL migrations

Executed multiple ETL migration projects from Oracle to AWS Glue, with a strong emphasis on detailed requirements gathering and comprehensive planning.

XalDigital

Technologies Used

Python, AWS Glue, AWS Redshift, AWS CloudFormation, AWS Step Functions, Project Planning, Amazon CloudWatch, AWS SNS


During my two years at XalDigital, I took part in several challenging projects focused on data migration and automation in the cloud. They ranged from lead management to the integration of customer and employee surveys, as well as the creation of a master customer record. Here’s a summary of these projects:

  • Lead Management:

    • One of the first projects I worked on was the automation of lead processing.

    • This involved migrating an existing project from SAS Enterprise Guide 7.1 to a serverless infrastructure on AWS.

    • The workflow included:

      • Downloading CSV files from an SFTP server.

      • Cleaning and transforming the data.

      • Generating delta tables.

      • Updating the master database.

    • Additionally, we implemented continuous deployment and process orchestration using AWS Step Functions.
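
The cleaning, delta, and master-update steps of the lead workflow can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual Glue code: the `lead_id` key column, the sample columns, and the function names are all hypothetical, and a "delta" is taken here to mean the rows that are new or changed relative to the master.

```python
import csv
import io


def read_rows(csv_text, key="lead_id"):
    """Parse CSV text into a dict keyed by the (hypothetical) key column.

    Cleaning here is limited to trimming whitespace from names and values.
    """
    rows = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        cleaned = {
            name.strip(): (value or "").strip()
            for name, value in row.items()
            if name is not None
        }
        rows[cleaned[key]] = cleaned
    return rows


def compute_delta(master, incoming):
    """Keep only the incoming rows that are new or differ from the master."""
    return {k: row for k, row in incoming.items() if master.get(k) != row}


def apply_delta(master, delta):
    """Upsert the delta into a copy of the master, leaving the input untouched."""
    updated = dict(master)
    updated.update(delta)
    return updated


# Made-up sample data: lead 1 is unchanged, lead 2 changed, lead 3 is new.
master = read_rows(
    "lead_id,email,status\n"
    "1,a@example.com,new\n"
    "2,b@example.com,contacted\n"
)
incoming = read_rows(
    "lead_id,email,status\n"
    "1,a@example.com,new\n"
    "2, b@example.com ,qualified\n"
    "3,c@example.com,new\n"
)
delta = compute_delta(master, incoming)
updated_master = apply_delta(master, delta)
```

Keeping the delta as a separate step means only changed rows need to be written to the master, which matters more as the master grows.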

  • Customer Survey:

    • In another project, we automated the processing of customer survey results received in CSV format from Medallia.

    • We used EventBridge to trigger the Step Functions flow, AWS Glue to extract the information to Amazon S3, and then integrated it into Redshift.

    • At the end of the process, a notification was sent via SNS.
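
A flow like the one described above could be expressed as a Step Functions state machine definition (Amazon States Language). The sketch below builds such a definition as a Python dictionary. It assumes, purely for illustration, that the extract and the Redshift load run as two separate Glue jobs; the job names, topic ARN, state names, and the function name are hypothetical. The EventBridge rule that starts the execution is outside the sketch.

```python
import json


def build_survey_state_machine(extract_job, load_job, topic_arn):
    """Return an Amazon States Language definition as a Python dict.

    Two Glue jobs run in sequence, then an SNS message is published.
    The ".sync" suffix makes Step Functions wait for each Glue job to finish.
    """
    return {
        "Comment": "Illustrative survey pipeline: extract, load, notify",
        "StartAt": "ExtractToS3",
        "States": {
            "ExtractToS3": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": extract_job},
                "Next": "LoadToRedshift",
            },
            "LoadToRedshift": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": load_job},
                "Next": "NotifyDone",
            },
            "NotifyDone": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sns:publish",
                "Parameters": {
                    "TopicArn": topic_arn,
                    "Message": "Customer survey load finished",
                },
                "End": True,
            },
        },
    }


# Hypothetical names; replace with real resources before deploying.
definition = build_survey_state_machine(
    "survey-extract-job",
    "survey-load-job",
    "arn:aws:sns:us-east-1:123456789012:survey-notifications",
)
definition_json = json.dumps(definition, indent=2)
```

The resulting JSON string is the kind of value a CloudFormation template or the Step Functions API accepts as a state machine definition.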

  • Employee Survey:

    • This project mirrored the customer survey pipeline, but processed the results of employee surveys instead.

    • The architecture and technologies were very similar: EventBridge, Step Functions, AWS Glue, and Redshift.

  • Master Customer Record:

    • One of the most challenging projects was the creation of a master customer record.

    • Here, we integrated information from different internal systems (such as Rackspace and Aeroméxico) with the AWS cloud while keeping existing workflows running without interruption.

    • We used AWS Glue and Step Functions to orchestrate the entire process.

  • Post-Purchase Data Flows:

    • Finally, I worked on a project where CSV files generated by SAS on an internal server were uploaded to the AWS cloud for processing.

    • We also used AWS Glue and Step Functions, along with Lambda for integration with SAS.
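
The article does not say how the Lambda function connected the SAS output to the rest of the pipeline. One common pattern, assumed here only for illustration, is a Lambda function that receives an S3 event notification when a CSV file is uploaded and turns it into the input for a downstream workflow. The function names and the shape of the returned payload are hypothetical; the event fields follow the standard S3 notification structure, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def csv_inputs_from_s3_event(event):
    """Extract bucket/key pairs for uploaded CSV files from an S3 event.

    Keys are URL-decoded, and records for non-CSV objects are ignored.
    """
    inputs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".csv"):
            inputs.append({"bucket": bucket, "key": key})
    return inputs


def handler(event, context):
    """Lambda entry point: returns the files a downstream workflow would process.

    Starting a Step Functions execution is deliberately left out of this sketch.
    """
    return {"files": csv_inputs_from_s3_event(event)}
```

Keeping the parsing in its own function makes it easy to test locally with a hand-written event, without any AWS credentials.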

These projects allowed me to develop key skills in data flow automation, system integration, continuous deployment, and process orchestration in the cloud. Optimizing and scaling these workflows was a constant challenge, but an incredibly rewarding one.

If you want to learn more about each of these projects, feel free to click on the corresponding links.
