Senior Principal Researcher & Technical Leader – Agentic RL for Distributed Computing

Huawei Canada

Reposted 12 Days Ago

In-Office

Markham, ON, CAN

Senior level

Lead architecture design and technology selection for foundation model applications, driving innovation and efficiency for AI developers, and connecting business and academic resources.

The summary above was generated by AI

Huawei Canada has an immediate permanent opening for a Senior Principal Engineer.

About the team:

The Distributed Data Storage and Management Lab leads research in distributed data systems, aiming to develop next-generation cloud serverless products that encompass core infrastructure and databases. This lab addresses various data challenges, including cloud-native disaggregated databases, pay-by-query user models, and optimizing low-level data transfers via RDMA. Teams within this lab create advanced cloud serverless data infrastructure and implement cutting-edge networking technologies for Huawei's global AI infrastructure.

Join Huawei’s Distributed Computing Lab – where we’re redefining AI innovation in Canada

We are a distributed computing team dedicated to building scalable, high-performance systems and robust tools for the global open-source community. Our work focuses on advancing infrastructure for AI and data-intensive workloads, with strong emphasis on production-grade reliability and efficiency.

We are the creators of openYuanrong (https://www.openeuler.org/en/projects/yuanrong/) and vLLM-omni (https://github.com/vllm-project/vllm-omni), helping shape the ecosystem for large-scale LLM serving and multi-modal inference. Operating at the intersection of distributed systems, AI infrastructure, and high-performance computing, we tackle challenges such as large-scale data movement, heterogeneous resource scheduling, and efficient multi-agent execution, delivering impactful, widely adopted open technologies.

About the job:

As a Senior Principal leader at Huawei Canada Research Center, you will be the primary technical authority and strategic visionary for our Distributed AI Infrastructure. This is a legacy-free leadership role where you will move beyond current Python-heavy stack limitations to define a high-performance C++ native substrate. Your mission is to architect the "chassis" for the next era of intelligence: Agentic Reinforcement Learning (RL) and Multi-Agent Systems (MAS). We believe Agentic RL will define the next generation of AI systems, where models are not just predictors, but decision-making entities that interact, collaborate, and evolve.
Lead technical innovation and exploration in distributed AI infrastructure for reinforcement learning and multi-agent systems. Drive multi-tier research and multi-scenario applications to build core technical competitiveness and support commercial success.
Understand corporate/product line strategies and industry trends. Continuously identify business issues and core challenges, absorb the latest research from academia and industry, and solve the pain points and difficult problems of AI distributed systems.
Act as a regional ecosystem builder, connecting academic resources in the North American AI distributed systems field, promoting academic cooperation, and building academic influence.

About the ideal candidate:

Lead and define the architectural evolution of Agentic RL and MARL for cooperative, competitive, and mixed-agent environments, including CTDE, decentralized learning, and hierarchical systems.
Lead the construction of simulation and training substrates for ultra-large-scale agent systems, establishing global standards for self-play, population-based training (PBT), and curriculum learning.
Drive the optimization of agent learning performance on distributed clusters, resolving critical challenges in sample efficiency, credit assignment, and communication learning at scale.
Decide on frontier research pathways for multi-agent intelligence, including communication protocols, game-theoretic learning dynamics, and meta-learning application prototypes.
Direct the translation of cutting-edge Agentic AI and MARL research into production-grade distributed systems, ensuring robust deployment in complex simulated or real-world environments.
Establish industry-leading benchmarking standards and evaluation frameworks for agent coordination, robustness, scalability, and safety.
Bridge research, infrastructure, and product teams to lead the deployment and application of large-scale agentic learning systems in real-world business scenarios.
Champion the team’s technical leadership in global academia and industry through high-impact publications, patents, and open-source contributions.
Continuously track and evaluate technical advancements in Agentic AI and large-scale distributed systems to inform long-term corporate technology strategy.

Professional attributes:

MS or PhD in Computer Science, Electrical Engineering, or a related field, with a focus on Reinforcement Learning, Multi-Agent Systems, Agentic AI, or Distributed AI.
Strong expertise in reinforcement learning algorithms, particularly in multi-agent settings (e.g., policy gradients, value-based methods, CTDE, credit assignment, and coordination in non-stationary environments).
Solid foundations in optimization, probability, and game theory, with the ability to design and analyze complex learning systems.
Experience building scalable RL training infrastructure, including distributed rollouts, large-scale simulation, and experiment pipelines.
Strong programming skills in Python and/or C++, with experience developing high-performance or distributed ML systems.
Demonstrated impact through research publications at top-tier venues (e.g., NeurIPS, ICML, OSDI, SOSP), open-source contributions, patents, or production ML systems in reinforcement learning, multi-agent learning, or large-scale AI systems.

19 Allstate Pky, Markham, Ontario, Canada, L3R 5A4
