Incident Response Engineer, Senior

Work from home Full-time role Hiring

The Incident Response Engineer, Senior provides senior‑level technical leadership for resolving complex IT incidents that affect mission‑critical services in a federal enterprise environment. The role leads deep end‑to‑end investigations, using advanced observability, telemetry analysis, and cross-layer dependency mapping to isolate root causes and validate durable fixes. This position partners closely with incident managers, senior coordinators, engineering, and problem/change management teams to coordinate major events, shape incident response strategy, and elevate diagnostic practices across the operations organization. The senior engineer also drives continuous improvement by refining runbooks, tuning detection and alerting, and mentoring other responders to improve resilience and reduce time to restore.

Key Responsibilities

  • Technical Lead (under Major Incident Management direction): Lead complex investigations from scoping through closure; drive hypothesis-based troubleshooting; validate permanent fixes across distributed systems.
  • Observability & Diagnostics: Use modern monitoring, SIEM, and observability platforms to correlate metrics, traces, and logs; distinguish symptoms from root causes; map impacts across infrastructure, application, network, and identity layers.
  • Runbooks & Automation: Design/refine technical runbooks; implement scripts/orchestration to standardize responses and reduce manual effort; codify remediation/verification checks.
  • SRE & Architecture Integration: Translate incident insights into capacity planning, reliability metrics, and service design changes; partner with platform/reliability engineering teams.
  • Technical PIRs & Coaching: Produce high-quality technical PIRs for engineers/executives; mentor responders in tools, diagnostics, documentation discipline, and IM practice adherence.
  • Cyber IR Interface: Coordinate with SOC/cyber responders when security indicators emerge; align IT ops IR and cyber IR workflows without compromising restoration velocity/safety. 
  • Technical Mentoring: Coach incident responders and operations staff, raising the bar on diagnostic techniques, tool usage, documentation discipline, and adherence to incident management practices.

Required Qualifications

  • Bachelor’s degree in Information Technology, Computer Science, Business Administration, or related field, or equivalent relevant work experience. 
  • Minimum of 8 years of experience in incident management, IT operations, reliability engineering, or related IT roles, including frequent responsibility for leading complex, multi‑system incident resolution. 
  • Strong mastery of ITIL‑aligned incident management principles and best practices, with demonstrated experience coordinating major incidents in a large enterprise or federal IT environment. 
  • Advanced proficiency with incident management tools and modern monitoring/observability platforms used for log analysis, performance monitoring, and alerting. 
  • Proven ability to manage multiple complex incidents concurrently, synthesize technical information quickly, and communicate clearly and confidently with both technical teams and leadership. 
  • Active or obtainable SECRET clearance and U.S. citizenship, with the ability to satisfy all applicable federal suitability and security requirements. 

Preferred Qualifications

  • Background leading incident response in large‑scale, cloud‑centric, or hybrid environments, including ownership of cross‑team technical coordination and complex investigations. 
  • Advanced incident response, cybersecurity, or IT service management certifications (such as higher‑level ITIL, incident‑response‑oriented, or security certifications). 
  • Experience embedding incident insights into site reliability engineering practices, including error budgeting, reliability metrics, and capacity planning. 
  • Demonstrated success building and refining automation for common remediation actions and verification checks. 