Wisedocs

Site Reliability Engineer (SRE)

Posted Yesterday
Toronto, ON
Junior
As a Site Reliability Engineer at Wisedocs AI, you will design, implement, and maintain cloud infrastructure while ensuring optimal performance and reliability. Your role involves automation, troubleshooting incidents, optimizing systems, and collaborating with development teams to integrate SRE practices.
The summary above was generated by AI

Description

Wisedocs is on a mission to make it easy and accessible for any company in the insurance, legal and medical space to understand medical documents quickly using AI (Artificial Intelligence). Every week, we process hundreds of thousands of pages of documents, saving our customers hours and hours of manual processing time, and helping them process medical claims much more quickly.

Join Wisedocs AI as a Site Reliability Engineer, where you will be responsible for designing, implementing, and maintaining our cloud-based infrastructure and operational processes. You will work closely with our development and operations teams to build reliable, automated systems that support our rapidly growing user base and mission-critical applications. This role combines software engineering, system administration, and operational expertise to optimize our service reliability and performance.

This is a hybrid position requiring on-site presence 2-3 days per week in downtown Toronto.

Responsibilities

As a member of our Engineering team, your primary responsibilities will include:

Infrastructure Reliability & Monitoring:

  • Design, build, and maintain scalable, resilient, and secure cloud infrastructure on AWS.
  • Implement robust monitoring, alerting, and logging systems to ensure high availability and performance.
  • Develop and maintain automated processes for infrastructure deployment, updates, and recovery.

Incident Response & Troubleshooting:

  • Serve as the first line of defence during incidents; the majority of your time will be spent on initial incident response, including rapid detection, diagnosis, and remediation of outages or performance issues.
  • Lead escalation procedures as needed, conduct root cause analyses, and implement long-term fixes to prevent recurrence of incidents.

Performance & Capacity Planning:

  • Continuously evaluate system performance, identify bottlenecks, and proactively plan for future growth.
  • Develop and maintain tools to measure and optimize system performance.

Automation & DevOps:

  • Collaborate with software development teams to integrate SRE best practices into the development lifecycle.
  • Automate repetitive tasks and implement CI/CD pipelines to streamline deployments and operational workflows.

Security & Compliance:

  • Ensure systems are secure, compliant with industry standards, and follow best practices for data protection and privacy.
  • Work with cross-functional teams to address security vulnerabilities and maintain system integrity.

Documentation & Collaboration:

  • Create clear, detailed documentation for infrastructure, processes, and operational procedures.
  • Serve as a key resource for reliability and performance insights across the organization.
  • Other duties and responsibilities will be assigned as projects develop, adjust and mature.


What to expect from our Recruitment Process:

  • Round #1 - HR (Quick Prescreen)
    Duration: 20-30 minutes
    Focus: High-level get-to-know-you
  • Round #2 - Technical Assessment
    Duration: 30-45 minutes
    Focus: Practical coding evaluation
  • Round #3 - Hiring Manager Interview
    Duration: 1-1.5 hours
    Focus: Experience, technical skills (conceptual assessment), team integration
  • Round #4 - Meet with our CTO!
    Duration: 45 minutes
    Focus: Culture fit, strategic alignment

Requirements

Technical Expertise:

  • Proven experience in a Site Reliability Engineer, DevOps, or similar role in a cloud environment (AWS preferred).
  • Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) and container orchestration (e.g., Kubernetes).
  • Solid programming/scripting skills in languages such as Python or Bash.

Operational Skills:

  • Experience with monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions.
  • Familiarity with CI/CD pipelines and version control systems (e.g., Git).
  • Knowledge of networking concepts, load balancing, and high availability architectures.

Soft Skills:

  • Excellent problem-solving skills and the ability to work under pressure during incident resolution.
  • Strong communication skills with the ability to collaborate effectively across technical teams.
  • A proactive, detail-oriented mindset with a passion for continuous improvement.

Preferred Qualifications:

  • Experience working in a SaaS or high-growth startup environment.
  • Familiarity with agile methodologies and collaborative cross-functional team environments.
  • Familiarity with browser developer tools, including the developer console.
  • Relevant certifications in cloud technologies (e.g., AWS Certified DevOps Engineer, Certified Kubernetes Administrator).
  • Experience with ticketing systems.

Benefits

What We Offer 

  • A hybrid work model
  • Modern employee benefits, including health and dental coverage
  • Competitive compensation, with valuable stock options, as we’re still a young company growing very quickly.  
  • An opportunity to develop very rapidly in your career. We offer a highly immersive learning environment, and if you thrive there, you will have the chance to grow into a senior practitioner or management role, as you choose.
  • Access to a learning and professional development fund to help you level up your career while you’re working with us. We hope to be an incredible step up for your career if you decide to come and work with us.  
  • Company events  
  • Generous Paid Time Off  
  • Paid Sick Days  
  • Casual Dress code  
  • Employee Referral Bonus  
  • Tuition Assistance  
  • Plus many other Recognition Programs!  

Join our team and be part of a company committed to making a positive impact on the InsureTech and HealthTech industries. 

*Wisedocs AI is an equal opportunity employer and is committed to providing employment accommodation in accordance with AODA. If you require an accommodation, please notify us and we will work with you to meet your needs.

Top Skills

Bash
Python
