Citi

Site Reliability Engineer

Posted 6 Days Ago
In-Office
Mississauga, ON, CAN
Senior level
The role involves supporting AI and DevOps platforms, enhancing operational reliability, coordinating with teams for incident resolution, and improving service levels through various operational practices.
The summary above was generated by AI
Description 
Engineer the future of global finance. 
At Citi, our Tech team doesn’t just support finance; we are helping to redefine it. Every day, $5 trillion crosses through our network. We do business in 180+ countries, operating at a scale few can match. From deploying advanced AI to helping shape global markets, we build systems that matter. Look to join a team where your work helps influence economies, your ideas can drive innovation and outcomes, and your growth is backed by mentorship, continuous learning, and flexibility with potential hybrid work opportunities. Help solve real-world challenges that touch millions and get the opportunity to build the future of finance with Citi Tech. 
We are seeking a motivated team member to support our AI and DevOps Platform Support team in North America. This role is responsible for assisting in the stability, reliability, and performance of our critical AI and DevOps platforms. The team supports a wide range of services, including multiple AI applications, developer tools, and CI/CD pipeline technologies used across the organization. The ideal candidate will work closely with SRE and Support engineers to resolve incidents, address platform issues, and collaborate with engineering and development teams to enhance platform supportability. The role includes coordinating daily operational activities and contributing to short-term planning. 
 
Responsibilities 
• Understand how application support functions within the broader technology organization and contributes to business objectives. 
• Assist with vendor coordination and day-to-day interactions with offshore managed services. 
• Support efforts to improve service levels, including participating in incident management, problem management, and knowledge sharing initiatives. 
• Partner with development and engineering teams to support application stability and operational readiness. 
• Assist in collecting capacity, performance, and latency data to support platform planning efforts. 
• Support application onboarding activities using established guidelines and standards. 
• Contribute to fostering a collaborative and supportive team environment that encourages skill development. 
• Participate in cost-efficiency initiatives such as Root Cause Analysis reviews, knowledge management, and performance tuning. 
• Assist in preparing materials for business review meetings and help align technology activities with business needs. 
• Follow established support processes and tool standards and provide input on improvement opportunities. 
• Perform other duties and functions as assigned. 
• Contribute to platform enhancement initiatives in partnership with engineering and support leads. 
• Assist in resilience-related activities, including incident simulations, disaster recovery exercises, and platform readiness testing. 
• Support automation efforts to reduce manual tasks and improve operational efficiency. 
• Help maintain observability practices, including monitoring, logging, tracing, and alerting. 
• Maintain practical understanding of platform components to support troubleshooting and incident response activities. 
• Assist in tracking the operational health of production platforms (including OpenShift, ECS, CI/CD) and support SLA adherence. 
• Participate in monitoring and observability efforts to support proactive issue identification and analysis. 
 
Qualifications 
• 5-8 years of relevant experience in technical support, platform operations, or engineering. 
• Exposure to architecture concepts with the ability to contribute to technical discussions and understand design decisions. 
• Experience working with business partners, engineering teams, or technology stakeholders. 
• Demonstrated experience supporting IT services, platform operations, or infrastructure components. 
• Strong verbal and written communication skills, with the ability to document technical issues clearly. 
• Experience supporting operational workstreams or participating in platform improvement initiatives. 
• Participation in resilience-related or stability-focused activities preferred. 
• Ability to collaborate effectively with cross-functional teams. 
• Strong organizational skills and ability to manage daily workload and task priorities. 
• Working knowledge of Generative AI concepts preferred. 
• Experience with CI/CD or configuration management tools preferred. 
• Experience with Red Hat OpenShift or similar Kubernetes technologies preferred. 
• Experience working with databases such as Postgres, Oracle, MongoDB, or Redis preferred. 
• Experience with scripting or coding in Java, Python, Go, or similar languages preferred. 
• Familiarity with modern observability and monitoring tools (e.g., Prometheus, Grafana, Splunk, ELK) preferred. 
 
Education 
• Bachelor’s/University degree required. 

------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Applications Support

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Primary Location Full Time Salary Range:

$94,300.00 - $141,500.00

------------------------------------------------------

Most Relevant Skills

Please see the requirements listed above.

------------------------------------------------------

Other Relevant Skills

For complementary skills, please see above and/or contact the recruiter.

------------------------------------------------------

Automated Processing and AI

We use automated processing, including artificial intelligence, for our legitimate business interests (or our reasonable and appropriate business purposes) to identify and align the candidate's skills and abilities with a specific job opening. Additionally, if you so choose, or consent, we can match your skills and abilities to other suitable roles at Citi.

Importantly, all our hiring processes and decisions, including determining your suitability for a role, are conducted, checked, and decided by individuals. Our automated processing and AI do not involve relying on automatic or autonomous decision-making. Please refer to any Jurisdictional Considerations, with specific provisions for your country (where relevant) for further details.

------------------------------------------------------

This job opening is for an existing job vacancy.

------------------------------------------------------

Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.

 

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.

Top Skills

CI/CD
ECS
ELK
Go
Grafana
Java
MongoDB
OpenShift
Oracle
Postgres
Prometheus
Python
Redis
Splunk
