[Remote] Senior Data Engineer
Note: This is a remote position open to candidates in the USA. The company is a global Physical AI company using data and AI to improve critical industries. The Senior Data Engineer will be responsible for designing, building, and maintaining scalable data architecture that supports various decision-support applications, collaborating closely with cross-functional teams.
Responsibilities
- Architect and build a secure, scalable urban data platform integrating multi-agency and infrastructure datasets at scale
- Design resilient architectures supporting batch, streaming, and near-real-time operational workloads
- Lead development of high-performance ingestion and transformation pipelines across legacy systems, APIs, IoT/telemetry, and structured data sources
- Implement distributed and event-driven processing systems (e.g., Spark, Kafka or equivalent) for large-scale analytical and operational use cases
- Establish platform reliability standards, including observability, automated data quality validation, monitoring, and defined SLAs/SLOs
- Design and enforce strong data governance and access control frameworks, including RBAC, encryption, auditability, and secure data handling practices
- Build modern lakehouse or equivalent architectures that support advanced analytics, GIS, and production-grade machine learning
- Partner closely with data scientists, ML engineers, and senior stakeholders to operationalize AI and analytics at scale
- Optimize platform performance, scalability, and cost efficiency as adoption grows
- Contribute to long-term architectural direction and mentor engineering team members
Skills
- 6+ years designing and operating large-scale semi-distributed data platforms (hybrid centralised and distributed) in cloud or hybrid environments
- Proven experience architecting modern data systems (lakehouse, data warehouse, or equivalent) supporting both analytical (descriptive and predictive) and operational workloads
- Deep hands-on expertise with distributed processing frameworks (e.g., Spark) and streaming/event systems (e.g., Kafka or similar)
- Strong experience building secure, governed data environments with robust access controls, encryption, and audit capabilities
- Experience designing secure data platforms in regulated or government environments, with strong understanding of compliance, auditability, and data protection standards
- Experience integrating heterogeneous data sources, including legacy systems, APIs, telemetry/IoT systems, and relational databases
- Demonstrated ability to design highly available, observable, production-grade data systems
- Experience enabling machine learning and advanced analytics through robust data infrastructure and feature pipelines
- Strong proficiency in Python, SQL, and ideally dbt, with a track record of writing clean, production-quality code
- Experience deploying and operating solutions in AWS, Azure, or GCP, including CI/CD and infrastructure-as-code, is beneficial
- Ability to operate effectively in complex, multi-stakeholder environments
- Strong systems-thinking mindset with a focus on scalability, modularity, and long-term platform maintainability
- Experience designing data platforms in U.S. public sector or highly regulated environments, with working knowledge of applicable federal and state data privacy and security requirements (e.g., HIPAA, CJIS, FERPA, state-level privacy acts), and the ability to embed compliance, auditability, and data governance principles into architectural design
Company Overview