[Remote] Senior Site Reliability Engineer

Work from home Full-time role Hiring

Note: This is a remote role open to candidates in the USA. reputed company is a technology company that empowers organizations to deliver scalable, impactful digital services. As a Senior Site Reliability Engineer, you will be responsible for the availability, performance, and reliability of a large federal reputed company reputed company platform, shaping the platform's reliability strategy and ensuring operational efficiency.

Responsibilities

  • Defining and maintaining service level objectives (SLOs), service level indicators, and error budgets, and driving the platform toward them
  • Designing and operating observability across metrics, logging, tracing, and alerting
  • Leading incident response and on-call practices, including escalation, mitigation, and time-to-recovery improvements
  • Driving blameless postmortems and systemic reliability improvements
  • Engineering automation to eliminate toil and improve operational efficiency
  • Independently designing reliable cloud infrastructure (AWS) and Kubernetes (Amazon EKS), including weighing tradeoffs between cost, reliability, and efficiency
  • Building reusable modules and mentoring engineers on reliability practices
  • Presenting design documents and system diagrams to stakeholders
  • Participating in technical depth interviews with new candidates

Skills

  • Bachelor's degree and 7+ years of experience; relevant experience may be substituted for education
  • Demonstrated experience owning reliability (SLOs, observability, incident response) for production systems
  • Expert-level knowledge of at least one infrastructure-as-code tool (Terraform preferred)
  • Deep command of cloud infrastructure, containerization, and networking
  • Must be able to obtain and maintain a U.S. Public Trust / suitability determination
  • Prior experience with the Department of Veterans Affairs
  • Kubernetes (Amazon EKS) and AWS at scale
  • Familiarity with FedRAMP, NIST 800-53, and zero-trust architecture
  • Relevant certifications (e.g., AWS, CKA/CKS)

Benefits

  • Company-subsidized health, dental, and vision insurance
  • Flexible PTO
  • 401K with employer match
  • Paid parental leave after one year of service
  • Employee Assistance Program

Company Overview

  • reputed company is a digital services company that helps the federal government better serve people. It was founded in 2014 and is headquartered in Washington, District of Columbia, USA, with a workforce of 501-1000 employees. Its website is https://adhoc.team/.