Job Description
We are seeking a highly skilled Python Data Engineer with a strong background in designing, developing, and maintaining scalable data solutions. The ideal candidate will have 5+ years of professional experience in Python and SQL, with proven expertise in building ETL/ELT pipelines and managing data streaming solutions. This role requires a hands-on engineer who can translate business requirements into reliable data systems, optimize performance, and ensure high data quality.
Job Responsibilities
- Develop and maintain data pipelines and ETL/ELT processes using Python
- Design and implement scalable, high-performance applications
- Work collaboratively with cross-functional teams to define requirements and deliver solutions
- Develop and manage near real-time data streaming solutions using Pub/Sub or Apache Beam
- Contribute to code reviews, architecture discussions, and continuous improvement initiatives
- Monitor and troubleshoot production systems to ensure reliability and performance
Basic Qualifications
- 5+ years of professional software development experience with Python and SQL
- Strong understanding of software engineering best practices (testing, version control, CI/CD)
- Experience building and optimizing ETL/ELT processes and data pipelines
- Proficiency with SQL and database concepts
- Experience with data processing frameworks (e.g., Pandas)
- Understanding of software design patterns and architectural principles
- Ability to write clean, well-documented, and maintainable code
- Experience with unit testing and test automation
- Experience working with any cloud provider (GCP is preferred)
- Experience with CI/CD pipelines and infrastructure as code
- Experience with containerization technologies such as Docker or Kubernetes
Preferred Qualifications
- Experience with GCP services, particularly Cloud Run and Dataflow
- Experience with stream processing technologies (e.g., Pub/Sub)
- Familiarity with big data and workflow orchestration technologies (e.g., Airflow)
- Experience with data visualization tools and libraries
- Knowledge of CI/CD pipelines with GitLab and infrastructure as code with Terraform
- Familiarity with platforms like Snowflake, BigQuery, or Databricks

