Huawei Canada

Distinguished Engineer - AI Computing System

Posted 3 Days Ago
In-Office
Markham, ON, CAN
Senior level
Lead the development of AI training cluster frameworks and optimization technologies, enhancing efficiency and performance for large model training scenarios.

Huawei Canada has an immediate permanent opening for a Distinguished Engineer - AI Computing System.

About the team:

The Advanced Computing and Storage Lab, currently part of the Vancouver Research Centre, explores adaptive computing system architectures to address the challenges posed by flexible, variable application loads. It helps ensure the stability and quality of training clusters, builds solvers for dynamic cluster configuration strategies, and establishes precision control systems to create stable and efficient computing clusters. One of the lab's goals is to focus on key industry AI application scenarios such as large model training and inference. Building on key technologies such as low-precision training, multi-modal training, and reinforcement learning, the lab is responsible for bottleneck analysis and for designing and developing optimization solutions that improve training and inference performance as well as usability.

About the job:

  • As a leading industry expert in training cluster software frameworks and technologies, track where the industry's large model training frameworks and their key features are heading. Plan and lay out AI frameworks and software features for scenarios such as large model pre-training, post-training, and integrated training and inference, building key capabilities for the company's training cluster software framework.

  • In the company's large model training optimization field, lead the team in building key technologies such as low-precision training, parallel strategy tuning, and training resource optimization, and drive the commercial deployment of large model perception optimization technologies.

  • For the company's training servers, super nodes, and other products, lead the team in building large model AI training frameworks, operator libraries, acceleration libraries, and other software frameworks and acceleration features, fully leveraging systems engineering and software-hardware collaboration to improve AI cluster computing efficiency.

  • Identify high-quality academic resources in large model training, collaborate with domain experts and scholars on projects, lay out related standards and patents, support the company's continuous innovation in the training cluster field, and build long-term competitiveness in AI training clusters.

  • Cultivate a team of technical experts and core technical staff in AI training cluster frameworks and software optimization.

The base salary for this position ranges from $172,000 to $230,000 depending on education, experience and demonstrated expertise.

About the ideal candidate:

  • A major in artificial intelligence, computer science, software, automation, physics, mathematics, electronics, microelectronics, information technology, or a related field, with more than 5 years of R&D experience in large model training and optimization.

  • Proficient in the common model structures of large models such as DeepSeek and Llama, with deep technical expertise in training and inference optimization for LLM, MoE, and multimodal models.

  • Familiar with the hardware architecture and programming systems of AI accelerators such as GPUs and NPUs, with experience in optimizing AI systems through software-hardware collaboration.

  • Familiar with cluster computing and cloud computing, with experience in software architecture design for cluster scheduling.

  • Enjoys research, with strong learning, communication, and teamwork skills.

Top Skills

AI Frameworks
Cloud Computing
Cluster Computing
GPU
Low-Precision Training
NPU
Reinforcement Learning
Training Cluster Software Frameworks

Huawei Canada Markham, Ontario, CAN Office

19 Allstate Pky, Markham, Ontario, Canada, L3R 5A4


