Lead Site Reliability Engineer - Remote
What We’re About
At CentralSquare, we don’t just build software - we power public servants and uplift communities with Hero-Grade Technology. Every line of code, every feature we deliver helps heroes across North America protect, serve, and save lives. When you join us, you become part of a mission-driven team creating technology that makes communities safer and stronger.

Your Growth Matters. We believe heroes deserve opportunities to rise. That’s why we invest in your career with mentorship, learning programs, and clear paths for advancement. If you’re motivated, there’s no limit to how far you can go.

Your Commitment Deserves Reward. We offer competitive compensation and a benefits package designed to support your life inside and outside of work: tuition reimbursement, parental leave, paid volunteer hours, and unlimited PTO. Plus, our flexible work environment gives you the freedom to balance your heroic work with personal well-being, whether you’re in the office or remote.

Join us and help build the tools that power real-life heroes. Together, we make a difference.

The Opportunity
We are seeking a highly skilled Senior Cloud / DevOps Engineer with a strong background in AWS, automation, infrastructure as code, and networking to support and modernize our cloud environments. This role is hands-on and will partner closely with Cloud Operations, SREs, Networking, and Application teams to improve scalability, reliability, security, and operational efficiency across mission‑critical systems. The ideal candidate is comfortable operating at both the infrastructure and application layers, has strong troubleshooting skills, and can automate repeatable operational tasks while supporting high‑availability production workloads.
Key Responsibilities
Cloud & DevOps Engineering
Design, build, and maintain AWS-based infrastructure supporting production and non-production environments
Implement and maintain Infrastructure as Code (IaC) using tools such as Terraform, CloudFormation, or equivalent
Develop and support CI/CD pipelines for infrastructure and application deployments
Partner with application teams to improve deployment reliability and performance

Automation & Reliability
Create and maintain automation scripts and tooling (Python, Bash, PowerShell, etc.) to reduce manual operations
Improve system reliability through self-healing mechanisms, monitoring, and alerting
Support SRE-style practices including incident response, root cause analysis, and continuous improvement

Networking & Security
Design and support cloud networking (VPCs, subnets, routing, VPNs, security groups, NACLs)
Troubleshoot complex network, connectivity, and performance issues across hybrid environments
Implement security best practices aligned with AWS Well-Architected Framework

Operations & Collaboration
Participate in on-call rotations supporting critical production systems
Provide operational support, troubleshooting, and resolution for cloud-related incidents
Collaborate across CloudOps, Networking, DBAs, and Application teams
Document architectures, runbooks, and operational procedures

What Success Looks Like in This Role
Reduced manual operational work through automation
Improved deployment reliability and production stability
Faster recovery and clearer root cause analysis during incidents
Strong partnership with CloudOps, Networking, and Application teams