Okta

Site Reliability Engineer, Kubernetes

Posted Yesterday
Toronto, ON
Senior level
The Site Reliability Engineer at Okta will design, build, and scale the production Kubernetes platform, respond to production incidents, and improve system reliability through proactive monitoring and incident management. The role demands collaboration with engineering teams and a focus on security best practices while supporting a 24x7 environment.
The summary above was generated by AI

Get to know Okta
Okta is The World’s Identity Company. We free everyone to safely use any technology—anywhere, on any device or app. Our Workforce and Customer Identity Clouds enable secure yet flexible access, authentication, and automation that transforms how people move through the digital world, putting Identity at the heart of business security and growth. 
At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experiences. 
Join our team! We’re building a world where Identity belongs to you.

Okta Workforce Identity Cloud (WIC) provides easy, secure access for your workforce so you can focus on other strategic priorities—like reducing costs, and doing more for your customers.

If you like to be challenged and have a passion for solving large-scale automation, testing, and tuning problems, we would love to hear from you. The ideal candidate lives by the ethos “If you have to do something more than once, automate it” and can rapidly self-educate on new concepts and tools.

What you’ll be doing 

  • Designing, building, and scaling Okta's production Kubernetes platform
  • Evangelizing security best practices and leading initiatives and projects to strengthen our security posture for critical infrastructure
  • Responding to production incidents and determining how we can prevent them in the future
  • Triaging and troubleshooting complex production issues to ensure reliability and performance
  • Continuously evolving our monitoring tools and platform
  • Developing and maintaining technical documentation, runbooks, and procedures
  • Supporting a 24x7 online environment as part of an on-call rotation

What you’ll bring to the role

  • A willingness to go the extra mile: see a problem, fix the problem.
  • A passion for encouraging the development of engineering peers and leading by example.
  • A proven track record of successful SRE engagements and collaborating closely with engineering teams.
  • Knowledge and experience with deploying microservices and utilizing CI/CD pipelines.
  • A security mindset that prioritizes protecting assets from risks and vulnerabilities. 

Required Skills:

  • 6+ years of experience with AWS and Terraform
  • 3+ years of experience provisioning and managing Kubernetes clusters, with a solid understanding of containers, Kubernetes infrastructure, and Helm charts
  • 3+ years of developer experience with Python or Golang
  • Strong Linux understanding and experience

Preferred Skills:

  • Experience with Istio service mesh and network policies
  • Familiarity with Spinnaker
  • Experience with monitoring and alerting in a Kubernetes ecosystem
  • Certified Kubernetes Administrator (CKA) or Certified Kubernetes Application Developer (CKAD) certification

Below is the annual salary range for candidates located in Canada. Your actual salary will depend on factors such as your skills, qualifications, and experience. In addition, Okta offers equity (where applicable), bonus, and benefits, including health, dental, and vision insurance, RRSP with a match, healthcare spending, telemedicine, and paid leave (including PTO and parental leave) in accordance with our applicable plans and policies. To learn more about our Total Rewards program, please visit: https://rewards.okta.com/can.

The annual base salary range for this position for candidates located in Canada is between:

$135,000 and $203,000 CAD

What you can look forward to as a full-time Okta employee!

  • Amazing Benefits
  • Making Social Impact
  • Fostering Diversity, Equity, Inclusion and Belonging at Okta 

Okta cultivates a dynamic work environment, providing the best tools, technology, and benefits to empower our employees to work productively in a setting that best and uniquely suits their needs. Each organization is unique in the degree of flexibility and mobility with which it works, so that all employees can be the most creative and successful versions of themselves, regardless of where they live. Find your place at Okta today! https://www.okta.com/company/careers/.

Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws. If reasonable accommodation is needed to participate in the job application, interview process, or onboarding, please use this Form to request an accommodation.

Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Privacy Policy at https://www.okta.com/privacy-policy/. 

Top Skills

Go
Python

Okta Toronto, Ontario, CAN Office

401 Bay St., Toronto, ON, Canada, M5H 2Y4
