SRE – OPENSTACK / PRIVATE CLOUD OPERATIONS

Work from home
Full-time role
Hiring

Time Zone: EST / North America
Full-time / Long-term Contract

Role Overview

  • Looking for an SRE with strong experience in OpenStack and private cloud environments
  • Role focuses on production support, troubleshooting, and platform reliability
  • Requires hands-on expertise in Linux, networking, and storage
  • Involves close collaboration with engineering teams and customer interaction

Key Responsibilities

  • Troubleshoot complex issues in OpenStack and Linux environments
  • Manage and support OpenStack services including Nova, Neutron, Cinder, and Keystone
  • Perform root cause analysis (RCA) and drive long-term fixes
  • Participate in incident management and on-call rotations
  • Monitor system performance, availability, and reliability
  • Collaborate with engineering teams on fixes and improvements
  • Communicate effectively with customers via calls and written channels
  • Perform system optimization and performance tuning

Must-Have Skills

Linux, Networking, and Storage Fundamentals

  • Strong understanding of Linux internals and system performance
  • Experience with kernel tuning and troubleshooting
  • Hands-on experience with filesystems and disk management
  • Knowledge of partitions and system-level troubleshooting
  • Experience with LVM and SCSI multipath
  • Basic understanding of Ceph
  • Ability to troubleshoot I/O and performance issues
  • Knowledge of DHCP, DNS, VLANs, and network bonding
  • Understanding of basic routing concepts

OpenStack Operations and Troubleshooting

  • Hands-on experience with OpenStack services such as Nova, Neutron, Cinder, and Keystone
  • Experience managing production environments
  • Strong troubleshooting and debugging skills
  • Ability to handle customer-facing technical issues
  • Experience performing root cause analysis

Good To Have Skills

  • Basic understanding of Kubernetes concepts
  • Experience with monitoring tools like Prometheus and Grafana
  • Knowledge of metrics, logging, and alerting systems
  • Basic scripting skills in Python or Go
  • Exposure to automation and observability practices

Soft Skills

  • Strong problem-solving and analytical thinking
  • Ability to work in high-pressure production environments
  • Clear and effective communication skills
  • Proactive mindset toward issue prevention
  • Comfortable working in remote, distributed teams
