AXIBO

Reinforcement Learning Engineer (Full-Time) - Humanoid Robot

Reposted 4 Days Ago

In-Office

Cambridge, ON

Mid level

As a Reinforcement Learning Engineer, you will develop and deploy machine learning systems for humanoid robots, focusing on control tasks like locomotion and manipulation, utilizing deep learning techniques.

About AXIBO

AXIBO is a robotics company pioneering the design, prototyping, and manufacturing of advanced robotic systems—all under one roof. We build everything in-house and take pride in delivering robust, reliable products that power automation across industries. Our fast-paced environment demands high levels of precision, organization, and execution—not just in engineering, but across all functions.

Position Overview

As a Reinforcement Learning Engineer, you will develop and deploy machine learning systems that enable intelligent behaviors in our humanoid and legged robots. You'll work at the intersection of control theory, deep learning, and robotics—helping close the loop between simulation and reality to bring adaptive behaviors into real-world machines.

Key Responsibilities

Develop reinforcement learning agents for robotic control tasks such as locomotion, manipulation, and dynamic balance
Implement learning architectures using policy gradient methods, actor-critic frameworks, and off-policy algorithms (e.g., PPO, SAC, TD3)
Build reward functions, curriculum learning strategies, and simulation environments tailored for real-world transfer
Design multi-agent training pipelines, including distributed rollouts, experience replay, and adaptive difficulty scaling
Interface with Isaac Gym, MuJoCo, Brax, and custom physics simulators to run large-scale experiments
Work with hardware and firmware teams to deploy trained policies to embedded or real-time environments
Design diagnostic tools and visualization dashboards to monitor training progress and system behavior
Apply domain randomization, sim2real techniques, and sensor noise modeling to enhance policy robustness
Maintain code quality through version control, testing, and modular design
Stay current with academic literature and integrate novel RL methods as appropriate

Required Skills and Qualifications

Bachelor's or Master's degree in Computer Science, Engineering, Robotics, or a related field
2+ years of hands-on experience applying deep reinforcement learning to simulation or robotic control tasks
Strong grasp of machine learning fundamentals and control theory
Proficiency with PyTorch, JAX, or TensorFlow
Programming experience in Python and C++
Deep understanding of policy optimization, generalization, and environment design
Experience working in Linux development environments and with GPU-based training pipelines
Excellent debugging skills across ML, software, and hardware stacks
Ability to independently manage experiments and rapidly iterate on model architectures

Preferred Experience (Bonus)

Deployment of RL systems to real-world robots, especially legged or humanoid platforms
Contributions to open-source RL frameworks or robotics middleware (e.g., ROS, Isaac ROS)
Experience with imitation learning, behavior cloning, or inverse reinforcement learning
Prior research/publications in reinforcement learning, multi-agent systems, or robotic control
Familiarity with low-level robot interfaces, sensor fusion, or control loop tuning
Knowledge of real-time systems, embedded software, or custom actuator control

Job Details

Location: Cambridge, Ontario
Work Environment: In-person (on-site at our Waterloo facility)
Type: Full-time
Compensation: Competitive salary (based on experience)
Health Insurance: Provided
Growth: Regular performance evaluations with potential for salary increases and stock option participation

Top Skills

C++

JAX

Python

PyTorch

TensorFlow

Guelph, Ontario, Canada
