Nylas

Senior Site Reliability Engineer

Posted 10 Days Ago

Remote

Hiring Remotely in Canada

Senior level

As a Senior Site Reliability Engineer, you'll ensure the reliability and efficiency of our products. This includes scaling legacy AWS systems, managing infrastructure in GCP, configuring alerts, improving CI/CD pipelines, and participating in on-call duties.

The summary above was generated by AI

The Company

At Nylas, we make it easier for developers to add email, calendar, and contact management features to their applications. Our APIs streamline the integration of these features while keeping them secure and effective. This enables better, safer, and more reliable communication within apps.

Supporting over 100,000 developers and collaborating with more than 900 companies globally, Nylas plays a pivotal role in how digital communication tools are built and utilized. Our technology spans various sectors, from healthcare to education, simplifying the complex process of app development related to communications. By reducing the barriers in communication technology, we empower developers to innovate and enhance user interaction across platforms.

The Role

Our SRE team is responsible for ensuring our products run reliably and efficiently. We manage infrastructure at large scale, serving billions of API calls every day. We are responsible for our overall SLA uptime and for Cost of Goods Sold (COGS) relative to cloud compute spend.

What You’ll Do

Support our engineering team with best practices and provision new infrastructure as necessary.
Maintain and scale a legacy system in AWS with Ansible, Python, MySQL, and Terraform.
Maintain our new infrastructure in GCP with Kubernetes, Helm, ArgoCD, Terraform, GoLang, OpenSearch, Spanner, and Redis.
Configure and adjust alerts and dashboards in New Relic and Coralogix, leveraging Fluent Bit and OpenTelemetry.
Manage and improve our CI/CD pipelines using ArgoCD and Helm.
Take part in an on-call rotation and assist in debugging and resolving incidents.

What You Must Bring

Experience: Minimum of 5 years in production engineering, with hands-on experience in managing and scaling Linux-based production servers.
Communication and Empathy: Exceptional communication skills and a strong empathetic approach, understanding that effective teamwork and problem-solving require more than just technical skills.
Linux Proficiency: Advanced proficiency in navigating the Linux command line.
Logging and Observability: Demonstrated experience with platforms like New Relic, Coralogix, Grafana, and Prometheus. Candidates with expertise in tuning alerts, synthetics, and creating comprehensive health dashboards and reports will be preferred.
Configuration Management: Experience in automating systems using modern tools such as Chef, Ansible, or Puppet.
Containerization and Orchestration: Proven track record of deploying and managing services using Kubernetes and Docker.
Cloud Services: Practical experience with major cloud services like AWS, GCP, or Azure, focusing on deploying and maintaining scalable applications.
Programming Skills: Capability to write reliable code in at least one programming language such as Python, GoLang, or JavaScript. Note: While coding is part of the role, it will not be the central focus of our interview process.
Learning Agility: Ability to rapidly learn and adapt to new technologies and frameworks.
Automation and Infrastructure: Passion for building modern, scalable infrastructure and automating routine tasks to improve efficiency and reliability.

Perks/Benefits

Healthcare: Extended healthcare coverage for you and your family
Unlimited Paid Time Off (PTO): We take this very seriously as we care about the well-being of our employees
RRSP with 3% employer contribution
Education Stipend: $1,250 annual education & development benefit
Cell Phone: $60 per month stipend towards cell phone reimbursement
Fully Paid Parental Leave: 12 weeks parental leave (maternity & paternity)

Interview Process

Round 1: 30-minute phone call with the Recruiter.
Round 2: 60-minute Google Meet discussion with the Hiring Manager.
Round 3: Three (3) Google Meet discussions with various Nylas leaders, including a live coding assignment with a team member (max 3 hours).

During the various discussions, candidates selected to meet with us are strongly encouraged to not only discuss their knowledge, skills, experience, and abilities but also to showcase examples of their current or previous work. We expect you to clearly outline the "what," "why," and "how" behind your contributions.

The estimated base salary range for this position is $125,000 to $150,000. Actual compensation will be determined based on individual qualifications, which are objectively assessed during the interview process. Factors influencing salary include knowledge, skills, experience, and abilities.

Top Skills

Python
