
Senior Site Reliability Engineer - Sunnyvale, CA, US

Work from home Full-time role Hiring

About the Role

Senior Site Reliability Engineer (Payments Infrastructure)

A reputed company is seeking a Senior Site Reliability Engineer to ensure the reliability, availability, scalability, and operational excellence of our global payment platform. You will own production observability, incident response, service-level management, and cloud infrastructure reliability across mission-critical payment processing systems operating in Europe, Asia, and North America.

Responsibilities

  • Participate in a follow-the-sun production on-call rotation as a primary incident responder.
  • Diagnose, triage, mitigate, and coordinate resolution of production incidents across payment services, Kubernetes platforms, databases, messaging systems, and cloud infrastructure.
  • Define and maintain SLOs, SLIs, error budgets, alerting standards, and operational readiness processes.
  • Drive reliability improvements through automation, observability, capacity planning, performance optimization, and post-incident reviews.
  • Partner with engineering teams to improve operational maturity in PCI DSS-regulated environments.
  • Lead incident management during SEV1/SEV2 events and improve response effectiveness and MTTR.

Requirements

  • 5+ years of experience in Site Reliability Engineering, Platform Engineering, DevOps, or Cloud Infrastructure roles supporting mission-critical production systems.
  • Strong hands-on experience with AWS, Kubernetes (EKS), Terraform, PostgreSQL, Kafka, Linux, networking, and modern observability platforms.
  • Deep understanding of distributed systems, high availability, disaster recovery, capacity planning, and performance optimization.
  • Proven experience operating payment, banking, fintech, or other highly regulated systems with stringent security, compliance, and uptime requirements.
  • Strong knowledge of SRE principles, including SLOs, SLIs, error budgets, incident management, alert governance, and operational excellence.

Leadership & Operational Excellence

  • Demonstrates strong ownership and accountability, taking end-to-end responsibility for service reliability and customer impact.
  • Possesses a strong sense of urgency during production incidents while maintaining sound judgment and structured decision-making under pressure.
  • Applies a systematic and methodical approach to troubleshooting, root-cause analysis, and incident resolution in distributed environments.
  • Data-driven, with the ability to use metrics, telemetry, trends, and service-level indicators to prioritize reliability investments and operational improvements.
  • Continuously drives engineering excellence through iterative improvement, automation, standardization, and elimination of operational toil.
  • Proven ability to lead cross-functional incident response efforts, coordinate stakeholders, and communicate effectively during high-severity production events.
  • Champions a culture of operational readiness, continuous learning, post-incident improvement, and blameless accountability.
  • Demonstrates strong mentoring and technical leadership skills, influencing engineering teams to build reliable, scalable, and resilient systems by design.
  • Join a dynamic and innovative team in a rapidly growing company.
  • Competitive package.
  • Collaborative, inclusive environment where your contributions are recognized and valued.

