Manulife

Lead Platform Reliability Engineer, Global AI Platform & Solutions

In-Office
Toronto, ON, CAN
Senior level
The Lead Platform Reliability Engineer ensures platform stability, performance, and reliability through SRE practices, automation, and collaboration, while managing cloud-based infrastructure and incident responses.
The summary above was generated by AI

The Lead Platform Reliability Engineer (PRE) ensures the stability, performance, and scalability of the shared platform that supports internal AI solution development. The role combines software engineering, SRE practices, and operations to keep the platform reliable and developer-friendly.

Position Responsibilities:  

  • Reliability and performance: Define SLOs/SLIs, track operations budgets, reduce MTTR, plan capacity, and tune autoscaling.
  • Observability: Build and maintain logging, metrics, tracing, and alerting; instrument platform components; create runbooks and dashboards.
  • Incident response: Serve on-call for platform incidents; triage, mitigate, and root-cause issues; drive postmortems and corrective actions.
  • Automation and tooling: Develop self-service capabilities, AIOps/MLOps/GitOps/CICD pipelines, and operational automations (provisioning, upgrades, backups).
  • Infrastructure as code: Manage clusters, networks, storage, and policies via Terraform/Ansible; prevent configuration drift.
  • Security and compliance: Enforce identity/RBAC, secrets management, supply chain security, and regulatory controls; collaborate with risk and audit.
  • Scalability and cost: Optimize resource usage, plan capacity, control spend (rightsizing, autoscaling, reservations/spot).
  • Change management: Safe rollouts, progressive delivery, and policy-as-code guardrails.
  • Platform productization: Treat the platform as a product; define operations SLAs in alignment with the product roadmap, service catalog, and developer experience.
  • Collaborate with global engineering, security, and AI governance teams to ensure compliance with cross-geo regulations and Asia’s data residency requirements.
  • Operate scalable backend services supporting high-traffic agent interactions, retrieval operations, and real-time execution flows.
  • Maintain AI services runbooks, playbooks, and enablement for GOCC.

Required Qualifications: 

  • Bachelor’s in Computer Science/Engineering or equivalent experience (not strictly required if skills are demonstrated).
  • 5-8 years of experience in DevOps/Platform Engineering or Production Operations.
  • Proven track record operating large-scale distributed systems and running on-call.
  • Operational experience with cloud-native development: Azure, Kubernetes, containers, CI/CD, and observability stacks.
  • Knowledge of Python and/or Java/Scala/TypeScript for building backend services and automation.
  • Understanding of AI solutions, LLM systems, retrieval architectures, embeddings, vector stores, prompt/tool orchestration, and agent workflow fundamentals.
  • Knowledge of API design, asynchronous workflows, concurrency, reliability engineering (SLOs, error budgets), and performance tuning.
  • Familiarity with security, governance, and compliance for AI/data systems (authN/authZ, data protection, audit logging, model governance).
  • Ability to collaborate across global teams and translate business requirements into platform capabilities and operational SLAs.

Preferred Qualifications:

  • ITIL & ITSM certification
  • Azure Administrator/DevOps certificate (nice to have)
  • Kubernetes: CKA/CKS certificate (nice to have)
  • HashiCorp Terraform Associate certificate (nice to have)

When you join our team: 

  • We’ll empower you to learn and grow the career you want.  
  • We’ll recognize and support you in a flexible environment where well-being and inclusion are more than just words.  
  • As part of our global team, we’ll support you in shaping the future you want to see.  

 

The role being advertised is an existing vacancy.

About Manulife and John Hancock

Manulife Financial Corporation is a leading international financial services provider, helping people make their decisions easier and lives better. To learn more about us, visit https://www.manulife.com/en/about/our-story.html.

Manulife is an Equal Opportunity Employer

At Manulife/John Hancock, we embrace our diversity. We strive to attract, develop and retain a workforce that is as diverse as the customers we serve and to foster an inclusive work environment that embraces the strength of cultures and individuals. We are committed to fair recruitment, retention, advancement and compensation, and we administer all of our practices and programs without discrimination on the basis of race, ancestry, place of origin, colour, ethnic origin, citizenship, religion or religious beliefs, creed, sex (including pregnancy and pregnancy-related conditions), sexual orientation, genetic characteristics, veteran status, gender identity, gender expression, age, marital status, family status, disability, or any other ground protected by applicable law.

It is our priority to remove barriers to provide equal access to employment. A Human Resources representative will work with applicants who request a reasonable accommodation during the application process. All information shared during the accommodation request process will be stored and used in a manner that is consistent with applicable laws and Manulife/John Hancock policies. To request a reasonable accommodation in the application process, contact [email protected].

Referenced Salary Location

Toronto, Ontario

Working Arrangement

Hybrid

Salary range is expected to be between

$113,260.00 CAD - $210,340.00 CAD

Employees also have the opportunity to participate in incentive programs and earn incentive compensation tied to business and individual performance. The actual salary will vary depending on local market conditions, geography and relevant job-related factors such as knowledge, skills, qualifications, experience, and education/training. If you are applying for this role outside of the primary location, please contact [email protected] for the salary range for your location.

Manulife offers eligible employees a wide array of customizable benefits, including health, dental, mental health, vision, short- and long-term disability, life and AD&D insurance coverage, adoption/surrogacy and wellness benefits, and employee/family assistance plans. We also offer eligible employees various retirement savings plans (including pension and a global share ownership plan with employer matching contributions) and financial education and counseling resources. Our generous paid time off program in Canada includes holidays, vacation, personal, and sick days, and we offer the full range of statutory leaves of absence. If you are applying for this role in the U.S., please contact [email protected] for more information about U.S.-specific paid time off provisions.

We use data and analytics technologies, such as artificial intelligence (AI), and automated processing tools, to analyze and process the information you provide to us or third parties in the application process. For more information, please refer to our personal information collection statement.

Top Skills

Ansible
Azure
Java
Kubernetes
Python
Scala
Terraform
TypeScript
HQ

Manulife Toronto, Ontario, CAN Office

250 Bloor St E, Toronto, Ontario, Canada, M4W 1E6

Manulife Kitchener, Ontario, CAN Office

25 Water St S, Kitchener, ON, Canada, N2G 4Z4

Manulife Waterloo, Ontario, CAN Office

500 King St N, Waterloo, ON, Canada, N2L
