[Remote] Principal AI/ML Platform Engineer (US)
Note: This is a remote position open to candidates in the USA. PointClickCare is focused on delivering innovative AI solutions and is seeking a Principal AI/ML Platform Engineer to build the infrastructure that connects AI systems with existing products. This role involves designing and maintaining the core infrastructure for GenAI products, ensuring seamless integration and delivery of AI-generated insights.
Responsibilities
- Design, build, and maintain the core infrastructure layer supporting GenAI products, including model gateways, prompt/versioning stores, vector databases, and LLM evaluation tools
- Implement secure access controls and authentication mechanisms integrated by default into the AI platform components
- Develop and manage observability, monitoring, and logging solutions for GenAI workloads and infrastructure
- Collaborate closely with product and engineering teams to integrate GenAI infrastructure with agent frameworks and downstream applications
- Optimize infrastructure for scalability, high availability, and cost efficiency for production workloads
Skills
- Extensive experience building and maintaining AI platform infrastructure, with Kubernetes and container security
- Demonstrated expertise in observability and monitoring frameworks, with a focus on real-time performance (e.g., experience with OpenTelemetry, MLflow)
- Experience with AI infrastructure components such as vector databases, prompt/versioning stores, and AI IDEs
- Familiarity with vLLM, SGLang, or a similar framework for hosting LLM inference workloads
- Experience with CI/CD pipelines and automation for AI model deployment and platform operations
- Strong knowledge of authentication and authorization frameworks integrated into AI platforms
Benefits
- Bonus
- Benefits