
Senior Platform Engineer

Work from home Full-time role Hiring

Lob was founded in 2013 by technical co-founders with a vision to connect the world one mailbox at a time. Today, we're transforming the way businesses use direct mail and bringing the power of technology to a traditionally manual channel. Our modern logistics and fulfillment engine helps businesses to build and scale high-quality, personalized direct mail programs without the operational burden. As we grow to meet the evolving needs of our customers and expand our product offerings, we’re building a team to shape the future of direct mail.

About The Role

We are looking for a Senior Platform Engineer to help scale and improve the reliability, observability, performance, and cost efficiency of our platform infrastructure. This role is focused on observability engineering and infrastructure optimization across AWS environments. The ideal candidate has deep hands-on experience with Datadog, OpenTelemetry, and HashiCorp Nomad, and understands how to build highly visible, scalable, and operationally efficient systems while actively reducing unnecessary infrastructure spend.

You will work closely with engineering teams to improve telemetry, monitoring, performance testing, platform reliability, and cloud infrastructure efficiency across a fast-moving distributed environment, including leveraging modern AI-driven tooling and operational workflows where appropriate.

What You’ll Work On

Building and improving observability across distributed systems and services
Designing dashboards, alerting, metrics, tracing, and telemetry pipelines
Improving operational visibility using Datadog and OpenTelemetry
Helping evolve and mature the organization’s observability strategy and tooling
Supporting and improving HashiCorp Nomad orchestration environments
Identifying and implementing AWS cost-saving opportunities across compute, storage, and platform infrastructure
Improving infrastructure utilization and operational efficiency across Nomad workloads
Optimizing S3 storage utilization, lifecycle management, and storage costs
Designing and maintaining performance testing environments and tooling
Running load and performance tests to identify bottlenecks and scalability issues
Managing and tuning Elasticsearch/OpenSearch environments
Troubleshooting production performance issues across services, infrastructure, and databases
Partnering with engineering teams to improve platform reliability, scalability, and infrastructure efficiency

Responsibilities

Lead observability initiatives across infrastructure and applications
Design and maintain monitoring, telemetry, dashboards, tracing, and alerting systems
Build actionable visibility into platform health, reliability, and performance
Improve incident detection, troubleshooting, and operational response capabilities
Define observability standards and best practices across engineering teams
Drive infrastructure cost optimization initiatives across AWS services and platform environments
Analyze infrastructure utilization and recommend performance and cost efficiency improvements
Maintain and improve infrastructure-as-code standards and workflows
Design, build, and maintain scalable performance testing environments and tooling
Execute and analyze load/performance testing initiatives
Support and improve Nomad-based orchestration environments
Troubleshoot complex production and infrastructure issues across distributed systems
Collaborate closely with engineering teams to improve scalability, reliability, operational visibility, and infrastructure efficiency
Create and maintain operational documentation and platform best practices

Qualifications

7+ years of experience in platform engineering, infrastructure engineering, or site reliability engineering
Strong hands-on experience with HashiCorp Nomad
Deep expertise with Datadog
Strong experience implementing and operating observability platforms using OpenTelemetry and modern monitoring tooling
Experience with Grafana or similar visualization and observability platforms
Strong understanding of distributed tracing, metrics, logging, and monitoring best practices
Experience building dashboards, alerts, telemetry pipelines, and operational visibility tooling
Strong experience identifying and implementing AWS cost optimization strategies in production environments
Strong knowledge of S3 optimization, lifecycle management, and storage cost reduction
Experience building and running performance/load testing environments
Strong troubleshooting and performance analysis skills across distributed systems
Strong experience operating infrastructure in AWS environments
Strong experience with Terraform and infrastructure-as-code practices
Experience balancing platform reliability, observability, and infrastructure cost efficiency at scale
Experience working with distributed and event-driven architectures using technologies such as Redis, SQS, or Temporal
Experience managing and tuning Elasticsearch or OpenSearch clusters
Experience working in fast-paced engineering environments
Strong communication and collaboration skills

Nice to Have

Exposure to PostgreSQL RDS to Aurora migrations
Experience with Kubernetes
Experience with CI/CD systems and deployment automation
Experience with Go, Python, or TypeScript

Since great engineers come from a variety of backgrounds, it doesn’t particularly matter if you have a specific degree; we want to hear about your contributions in a real-world setting.

Compensation information

The compensation for this role consists of a base salary + additional RSUs.
Annual Base Salary: $160,000 - $177,500
