VP of Cloud Engineering, Operations & Delivery
Job Type: Full-Time | Executive Leadership
Location: Remote
Salary: $200k - $225k base salary, plus bonus and equity
About the Role
We are seeking an experienced and well-rounded VP of Cloud Engineering, Operations & Delivery to lead our cloud practice across a diverse portfolio of industry verticals. This role sits at the intersection of technical authority, executive leadership, and forward-thinking innovation. It calls for someone who brings genuine cloud engineering depth while also driving strategy, client relationships, and organizational growth.
You will lead high-performing teams delivering complex, multi-cloud solutions across AWS, Azure, and Google Cloud Platform, setting the technical bar while ensuring the business delivers on its commitments. Critically, you will help shape and lead our evolution into an agentic AI-powered future, identifying opportunities to transform how our teams and our clients design, deploy, operate, and optimize cloud infrastructure using AI agents and intelligent automation.
The ideal candidate is a natural communicator who can shift seamlessly from an architecture discussion with engineers to a strategic briefing with a client's executive team, and be credible in both rooms. They are also someone who looks at today's manual, repetitive, or complex processes and asks: "How do we let intelligent agents handle this?"
Key Responsibilities
Technical Leadership
- Serve as the senior technical authority for cloud architecture and infrastructure decisions across AWS, Azure, and GCP
- Advance and mature our Infrastructure as Code (IaC) practices (GitHub, Jenkins, Terraform, Qualys, SonarQube, etc.), ensuring consistency, security, and scalability across client environments
- Provide meaningful technical guidance and architectural direction to engineering teams — going beyond high-level oversight to engage substantively on design decisions, standards, and delivery quality
- Guide adoption of cloud-native patterns including Kubernetes (EKS/AKS/GKE), serverless, CI/CD automation, and event-driven architecture
- Lead architecture reviews and serve as the escalation point for complex technical challenges
- Ensure security and compliance are embedded into infrastructure from the ground up — spanning IAM design, network segmentation, secrets management, and frameworks such as SOC 2, NIST, CIS, HIPAA, and PCI-DSS
Agentic AI Strategy & Transformation
- Champion the adoption of AI agents and multi-agent systems to transform how cloud infrastructure is built, operated, and optimized, moving teams from reactive, manual workflows to intelligent, autonomous execution
- Identify high-value opportunities to introduce agentic workflows into engineering operations — including infrastructure provisioning, incident detection and remediation, cost optimization, compliance monitoring, security response, and deployment pipelines
- Lead the evaluation and adoption of agentic AI frameworks and platforms (e.g., LangGraph, AutoGen, Amazon Bedrock Agents, Azure AI Agent Service, Vertex AI Agent Builder) to build purpose-built agents that extend the capabilities of our engineering teams
- Define governance, guardrails, and human-in-the-loop checkpoints for agentic systems operating in cloud environments — ensuring autonomous actions are safe, auditable, and aligned with client expectations
- Collaborate with engineering and solutions teams to design agentic delivery pipelines in which AI agents assist in code generation, IaC validation, drift detection, security scanning, and release orchestration
- Work with peer technology teams to identify process transformation opportunities, helping envision, roadmap, and execute an agentic future state for cloud operations and engineering workflows
- Stay ahead of the rapidly evolving AI agent ecosystem and bring informed, practical perspectives on what is production-ready versus experimental
Operations & Reliability
- Own the operational health of cloud environments across the client portfolio — including availability, performance, security posture, and cost efficiency
- Mature SRE practices across the organization: SLOs, error budgets, incident management, and blameless postmortems
- Drive FinOps discipline — optimizing cloud spend through right-sizing, commitment strategies, tagging governance, and anomaly detection — increasingly augmented by AI-driven insights and autonomous recommendations
- Define and enforce observability standards across logging, metrics, and