Senior Cloud Engineer

Posted Yesterday
Ottawa, ON
Senior level
Fintech • Payments • Software
The Role
The Senior Cloud Engineer at Zafin is responsible for automating platform provisioning, configuration, and management using tools such as Terraform and Azure. They will design and implement a scalable Cloud Infrastructure on Azure, developing CI/CD pipelines and GitOps workflows, enhancing platform resilience, performance, and reliability while driving continuous improvement and collaboration with cross-functional teams.
Summary Generated by Built In

Who we are

Founded in 2002, Zafin offers a SaaS product and pricing platform that simplifies core modernization for top banks worldwide. Our platform enables business users to work collaboratively to design and manage pricing, products, and packages, while technologists streamline core banking systems. 

With Zafin, banks accelerate time to market for new products and offers while lowering the cost of change and achieving tangible business and risk outcomes. The Zafin platform increases business agility while enabling personalized pricing and dynamic responses to evolving customer and market needs. 

Zafin is headquartered in Vancouver, Canada, with offices and customers around the globe including ING, CIBC, HSBC, Wells Fargo, PNC, and ANZ. Zafin is proud to be recognized as a top employer and certified Great Place to Work® in Canada, India and the UK.  

Job Mandate

Reporting to the Head of Platform Engineering, the Senior Cloud Engineer is responsible for driving the complete automation of our platform provisioning, configuration, and management. You will leverage tools such as Terraform, Azure, AKS, Kubernetes, Kustomize, Helm, Argo CD, and GitOps to spearhead the design, build, and automation of a highly available, scalable, secure, and reliable Cloud Infrastructure Platform based on Azure and Azure Kubernetes Service (AKS). Your role will support our mission to enable development teams to deliver high-quality software faster, more reliably, and securely. You will be a force multiplier in our team’s mandate to continuously improve and enhance the platform’s capabilities, ensuring it remains cutting-edge and highly efficient, continuously evolving to meet the needs of the organization and its users. This role requires operational expertise, customer focus, and the ability to work collaboratively with cross-functional teams to drive customer satisfaction.

Major Responsibilities

  • Lead the design and implementation of a scalable, reliable, secure, and highly available Cloud Infrastructure Platform based on Azure and Azure Kubernetes Service (AKS).
  • Drive the complete automation of platform provisioning, configuration, and management using tools like Terraform and Argo CD.
  • Automate Infrastructure Provisioning with Terraform: Design and manage cloud infrastructure using Terraform to implement infrastructure as code (IaC), utilizing Terraform modules to ensure modular, reusable, and maintainable configurations for consistent, repeatable deployments.
  • Develop automated workflows with CI/CD pipelines to streamline and accelerate software delivery, eliminating any manual interventions.
  • Design and maintain GitOps workflows using tools such as Argo CD to automate the deployment and management of infrastructure and applications, ensuring seamless integration with Kubernetes clusters. Use GitOps principles to automatically detect and correct configuration drifts, ensuring that the actual state of the system always matches the desired state.
  • Implement self-healing and auto-scaling mechanisms to enhance platform resilience and performance.
  • Drive Continuous Improvement and Enhancement: Lead initiatives to continuously improve and enhance the platform by identifying inefficiencies, implementing automation, eliminating any manual toil, adopting new technologies, and optimizing existing processes to ensure higher reliability, performance, and scalability.
  • Collaborate closely with the Cloud Operations team to facilitate a seamless handover of support and maintenance tasks. Deliver documentation, conduct necessary knowledge transfer sessions, and provide ongoing mentorship to enable the cloud operations team to take over operational tasks successfully.
  • Implement Observability for Applications and Infrastructure: Utilize Azure's suite of observability tools, including Azure Monitor, Application Insights, Log Analytics, and Azure Network Watcher, to monitor and alert on the performance and health of applications and infrastructure. Ensure comprehensive visibility into application health and performance, enabling proactive detection and resolution of issues. Set up alerts and dashboards to provide real-time insights and proactive notifications for infrastructure anomalies and performance degradation.
  • Deliver Self-service Documentation to ensure that development and operations teams can easily consume and support the platform independently. This will reduce dependency on the platform engineering team and facilitate the handover of operational responsibilities to the cloud operations team, equipping them with the necessary knowledge and tools to manage day-to-day operations effectively.
  • Provide L3 and L4 support to aid in the resolution of Cloud Platform-related incidents.

Required Technical Skills and Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
  • 8+ years of experience in DevOps, Cloud Infrastructure, and Platform Engineering.
  • Extensive experience and strong expertise with Terraform for infrastructure as code (IaC): Proficient in designing, writing, and maintaining Terraform configurations, utilizing modules for modular and reusable code.
  • Extensive experience and strong expertise in Azure services, especially Azure Kubernetes Service (AKS): Deep understanding and extensive experience with Azure cloud services and hands-on experience managing AKS clusters.
  • Strong expertise in Kubernetes: Comprehensive knowledge of Kubernetes architecture, cluster setup, management, and troubleshooting. CKA certification is preferred.
  • Proficient in using Helm charts and Kustomize for Kubernetes resource management.
  • In-depth knowledge of Argo CD tool and GitOps principles: Experience in setting up and managing Argo CD for automated deployments and GitOps workflows.
  • Strong experience with CI/CD pipelines: Expert in developing, managing, and optimizing CI/CD pipelines using Azure Pipelines and other tools like Jenkins and GitHub Actions.
  • Solid understanding of networking, security, and infrastructure best practices: Expertise in implementing network configurations, security protocols, and maintaining compliance standards.
  • Hands-on experience with Azure suite of Observability tools such as Azure Monitor, Application Insights, Log Analytics, and Azure Network Watcher for monitoring and alerting on application and infrastructure health and performance.
  • Proficiency in scripting languages (Python, Bash): Capable of writing scripts to automate tasks and manage configurations.
  • Experience with configuration management tools (Ansible, Chef, Puppet): Knowledgeable in using these tools for automating system configurations.
  • Excellent problem-solving skills and the ability to independently troubleshoot complex issues: Proven track record of resolving infrastructure and application-related problems efficiently.
  • Strong communication and collaboration skills: Able to work effectively with cross-functional teams and provide mentorship.

Key Performance Indicators (KPIs)

The following KPIs will help measure the effectiveness of the Cloud Engineer in maintaining and improving the platform, ensuring it meets the needs of the organization and its users.

  • Uptime and Availability: Ensure the platform maintains a high uptime percentage, targeting 99.9% or higher availability.
  • Deployment Frequency: Increase the frequency of successful deployments, aiming for multiple deployments per day.
  • Mean Time to Recovery (MTTR): Reduce the average time it takes to recover from failures or incidents.
  • Automated Test Coverage: Increase the percentage of code and infrastructure covered by automated tests.
  • Time to Production: Measure and optimize the time taken to move code changes from commit to production.
  • Security Compliance: Ensure the platform meets all security compliance requirements and passes regular security audits.
  • User Satisfaction: Improve user satisfaction scores based on feedback from development and operations teams.
  • Documentation Quality: Ensure high-quality, comprehensive documentation is available, regularly updated, and easily accessible to all stakeholders.

What’s in it for you

Joining our team means being part of a culture that values diversity, teamwork, and high-quality work. We offer competitive salaries, annual bonus potential, generous paid time off, paid volunteering days, wellness benefits, and robust opportunities for professional growth and career advancement. Want to learn more about what you can look forward to during your career with us? Visit our careers site to see our openings: zafin.com/careers

Zafin welcomes and encourages applications from people with disabilities. Accommodations are available on request for candidates taking part in all aspects of the selection process. 

Zafin is committed to protecting the privacy and security of the personal information collected from all applicants throughout the recruitment process. The methods by which Zafin collects, uses, stores, handles, retains, or discloses applicant information can be accessed by reviewing Zafin’s privacy policy at https://zafin.com/privacy-notice/. By submitting a job application, you confirm that you agree to the processing of your personal data by Zafin as described in the candidate privacy notice.

Top Skills

AKS
Azure
Kubernetes
Terraform
The Company
Palo Alto, CA
450 Employees
On-site Workplace
Year Founded: 2002

What We Do

Zafin, the global leader in SaaS cloud-native product and pricing solutions, is a trusted partner to the world’s most customer-centric financial institutions. Zafin’s product and pricing platform empowers banks of all sizes to center their customers, grow relationships and drive revenues.

The Zafin platform separates product and pricing from core processing to accelerate progressive modernization, enable digital transformation and deliver personalization at the relationship level.

A typical Zafin installation integrates easily with most back-end systems and customer-facing channels to increase product and pricing efficiency and agility, drive interest and non-interest income, and deliver a positive ROI—often in one year or less. 
