
[Remote] AI Research Engineer (Multi-Modal Reinforcement Learning) - 100% Remote Worldwide

Work from home Full-time role Hiring

Note: This is a remote role open to candidates in the USA. The company is pioneering a global financial revolution through its innovative digital finance solutions. It is seeking an AI Research Engineer to drive innovation in multi-modal reinforcement learning, focusing on optimizing decision-making and adaptive behavior across various data modalities to enhance AI performance in real-world challenges.

Responsibilities

  • Conduct research on reinforcement learning algorithms for multimodal models, including diffusion-based approaches for images, autoregressive models for multimodal understanding, and frameworks that integrate multiple modalities
  • Design and build reinforcement learning infrastructure that supports scalable, distributed training across multimodal systems while maintaining efficiency and reliability
  • Develop and refine reward modeling strategies that improve training stability, align model behavior with desired outcomes, and mitigate reward hacking and other failure modes
  • Create and curate multimodal simulation environments and datasets to support robust training, evaluation, and benchmarking of reinforcement learning systems
  • Design and conduct rigorous benchmarking and evaluation protocols to measure model performance, track progress against baselines, and validate improvements across multimodal tasks
  • Analyze and optimize policy performance across modalities by identifying bottlenecks in training, credit assignment, and cross-modal alignment
  • Investigate and develop new reinforcement learning paradigms that learn more effectively from environment feedback, with the goal of achieving state-of-the-art (SOTA) performance
  • Publish research findings in top-tier conferences such as ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, etc.

Skills

  • A Master's degree in Computer Science or a related field is required
  • Proven experience running large-scale reinforcement learning experiments in multimodal systems, including online RL settings, with demonstrated impact on domain-specific decision-making and measurable improvements in policy performance
  • Deep understanding of reinforcement learning algorithms and optimization methods applied to multimodal learning problems, with a focus on improving policy stability, exploration, and sample efficiency in high-dimensional environments involving images, video, and other modalities
  • Strong proficiency in PyTorch and deep learning frameworks for multimodal AI, with hands-on experience building end-to-end RL pipelines covering simulation, training, evaluation, and deployment in production-grade systems
  • Demonstrated ability to apply research to solve core RL challenges in multimodal tasks, such as sample inefficiency, exploration-exploitation tradeoffs, and training instability, along with experience designing robust evaluation frameworks and iterating on algorithmic improvements to advance agent performance
  • Proven track record of research publications in top-tier conferences such as ICML, NeurIPS, ICLR, CVPR, ICCV, ECCV, etc.
  • A PhD in Machine Learning, NLP, Computer Vision, or a closely related discipline is preferred, along with a strong track record of AI research and publications in top-tier conferences

Company Overview

  • The company has evolved to meet global needs with agility. It was founded in 2014 and is headquartered in Seattle, Washington, USA, with a workforce of 201-500 employees. Its website is https://reputed company.