[Remote] Senior Site Reliability Engineer

Work from home Full-time role Hiring

Note: This is a remote role open to candidates in the USA. You.com is building the AI Search Infrastructure that powers modern AI systems. As a Site Reliability Engineer, you will own parts of the reliability, observability, and incident response posture for You.com’s production services, ensuring uptime and developing tools for incident management.

Responsibilities

  • Instrument services end-to-end using OpenTelemetry metrics and structured logging to ensure every critical path is measurable
  • Develop and maintain SRE standards and patterns (instrumentation guidelines, incident playbooks, service templates) that engineering teams adopt by default in new and existing services
  • Build internal tooling and automation in Python, Bash and Terraform to improve deployment safety, reliability, and operational efficiency
  • Design and maintain actionable dashboards that surface real user impact, not vanity metrics, for service owners and leadership
  • Tune alerting rules continuously to maximize signal-to-noise ratio; tie alerts to SLO-based error-budget burn rates rather than arbitrary thresholds
  • Own reliability incident response end-to-end: detection, triage, communication, escalation, resolution, and stakeholder updates
  • Track and run blameless postmortems that focus on systemic contributing factors, not individual fault, producing actionable remediation items with owners and deadlines
  • Track remediation follow-through as a first-class metric. Ensure postmortem action items are completed, not just documented
  • Continuously improve MTTD and MTTR by feeding incident learnings back into monitoring, runbooks, and automation
  • Collaborate with Customer Success to feed incident learnings back into monitoring, runbooks, and automation
  • Define meaningful SLOs for all production services grounded in critical user journeys, historical performance data, and business requirements
  • Eliminate alert fatigue by auditing, categorizing, and deprecating noisy or non-actionable alerts on a regular cadence
  • Help maintain incident management processes and playbooks

Skills

  • 2+ years of full-time experience in an SRE or similar role
  • 3+ years of experience working in AWS with EKS, GitHub Actions (GHA), and CI/CD
  • Strong hands-on experience with Git, Python, and Bash. Comfortable building production-grade automation and tooling
  • Experience establishing SRE practices across multiple teams (SLO definitions, alert hygiene, postmortem culture)
  • Experience building or maintaining Prometheus-based monitoring with Grafana dashboards
  • Demonstrated experience scoping and delivering infrastructure projects from proposal through production deployment
  • Demonstrated experience managing incidents and responding to service outages
  • Hands-on experience integrating AI with SRE efforts to improve reliability and development velocity
  • Demonstrated track record of collaborating with teams to define SLOs, instrument services against measurable SLIs, and operationalize error-budget burn-rate alerting that teams use independently to balance risk and delivery speed

Benefits

  • Hubs in San Francisco and New York City offering regular in-person gatherings and co-working sessions
  • Flexible PTO with U.S. holidays observed and a one-week shutdown in December to rest and recharge*
  • A competitive health insurance plan that covers 100% for the policyholder and 75% for dependents*
  • 12 weeks of paid parental leave in the US*
  • 401k program, 3% match - vested immediately!*
  • $500 work-from-home stipend to be used within a year of your start date*
  • $600 technology stipend to support a portion of our hybrid/remote team's cell phone and internet expenses*
  • $1,200 per year Health & Wellness Allowance to support your personal goals*
  • *Certain perks and benefits are limited to full-time employees only

Company Overview

  • You.com is a personalized AI search engine that delivers customized recommendations and allows natural conversation with its AI chatbot. It was founded in 2020, and is headquartered in Palo Alto, California, USA, with a workforce of 51-200 employees. Its website is https://you.com.