Kafka and Data Lake Engineer
You will build, manage, and operate scalable data platforms centered on Kafka and data lakes.
Responsibilities
- Design data pipelines to ingest, process, and move data from various sources into the data lake using Kafka.
- Deploy, configure, and maintain Kafka clusters, including Kafka Connect and Schema Registry, ensuring high availability.
- Oversee the architecture and governance of the data lake, managing storage (e.g., S3/ADLS), security, and metadata.
- Develop producers and consumers to interact with Kafka topics using Python, Java, or Scala.
- Implement data quality checks, manage lineage, and enforce security controls across data flows.
Required Skills
- 5+ years of proven experience designing and managing data platforms with Apache Kafka and big data technologies.
- Strong proficiency in Python, Java, or Scala.
- Expertise in big data processing frameworks like Apache Spark and Apache Flink.
- Hands-on experience with cloud environments (AWS, Azure, or GCP) and services like S3 or Azure Data Lake Storage.
- Solid understanding of data lake design principles, including Delta Lake or Apache Iceberg.
- Familiarity with infrastructure-as-code tools like Terraform or Ansible and containerization with Docker and Kubernetes.
- Experience with SQL and NoSQL database systems.