
Thomson Reuters

Staff Software Engineer — Search Platform, API & Infrastructure

Posted 18 Hours Ago
Hybrid
Toronto, ON, CAN
Senior level
This posting is for proactive recruitment purposes and may be used to fill current openings or future vacancies within our organization.

Overview of the Role:
Advanced Content Engineering (ACE) is seeking a Staff Software Engineer to lead the design and delivery of the search platform’s control-plane API and cloud infrastructure. The platform’s core promise is self-service: internal client teams must be able to create a search system, configure an ingestion topology, promote a new index to production, and monitor system health — entirely through APIs — without requiring direct involvement from the platform team. Building, operating, and continuously improving that self-service experience is the heart of this role.
This is a high-ownership, high-leverage position at the intersection of platform engineering, API design, and cloud infrastructure. Staff Engineers on this team define, build, test, deploy, scale, and operate what they ship — full-stack ownership is the baseline, not a bonus. Delivery friction is treated as an urgent engineering problem: the team ships to production constantly, AI-assisted development is the norm, and removing obstacles to fast, safe delivery is everyone’s responsibility. The successful candidate brings enterprise-grade security instincts, deep AWS expertise, and a product-minded approach to developer experience — treating the platform’s API as a product in its own right.

About the Role
In this position, you will focus on:
Platform Control-Plane API
• Plan, design, develop, and own the platform’s management API — the self-service interface through which client teams create and configure search systems, manage ingestion topologies, register reusable components, promote index versions, and monitor system health — resolving problems of diverse scope with innovative thinking and little or no precedent to guide solutions
• Architect the platform’s multi-tenant access model: implement strict data isolation between client tenants, integrate with enterprise identity providers, establish role-based access control across all API endpoints, and define the governance framework that ensures the platform can make credible security commitments to enterprise customers
• Establish API strategy and cross-system integration patterns (designing versioned, backward-compatible interfaces with clear contracts, comprehensive documentation, and developer-experience patterns drawn from best-in-class search platform providers) and set governance standards that the team follows for all future API surface
• Design and expose the API surface required to support the platform’s evaluation and experimentation workflows — including endpoints that enable the search grading tool to consume experiment run outputs, query/result pairs, and relevance judgments, and that allow client teams to configure and trigger A/B search experiments through self-service interfaces
• Design the configuration data model and persistence layer (DynamoDB and related services) that stores search system definitions, component registry entries, index lifecycle state, and audit logs — applying architectural patterns that scale to the platform’s multi-tenant and multi-region ambitions
• Break down complex business requirements into functional and technical requirements with consideration for security, ethical AI implementation, and operational efficiency; contribute to recommendations where technology transformation can spark business growth

Cloud Infrastructure & DevOps
• Own the platform’s AWS infrastructure as code — defining, provisioning, and maintaining ECS services, MSK clusters, OpenSearch/Vespa deployments, DynamoDB tables, networking (VPC, security groups, NAT), and IAM roles using Terraform or AWS CDK — establishing infrastructure governance standards and a cloud strategy for multi-environment and eventual multi-region operation
• Design and own the CI/CD pipeline for platform services — establishing DevOps culture and toolchain strategy for the team, with a clear mandate to eliminate delivery friction: the team ships to production constantly, and any obstacle to doing so safely is an engineering problem to be solved, not a process to be accepted
• Drive adoption of AI-assisted development practices across the team’s infrastructure and API work — establishing the tooling, patterns, and norms that enable engineers to leverage AI to move faster while maintaining the quality and reliability bar the platform demands
• Own infrastructure cost management: monitor AWS spend across platform components, evaluate architectural trade-offs at the system level, and implement an enterprise performance and optimization framework that keeps the platform’s economics sustainable as it scales — including compute cost governance for inference workloads as custom model serving is introduced
• Implement and operate customer-managed key (CMK) support, applying security strategy, risk assessment frameworks, and security governance to give enterprise clients control over their encryption keys while preserving multi-tenant reliability

Reliability Engineering
• Define and own platform-level SLOs covering API availability, query latency, ingestion throughput, and end-to-end document freshness, and build the monitoring infrastructure (CloudWatch, distributed tracing, alerting) that makes SLO compliance continuously visible to the team and to client teams
• Design the observability infrastructure for agentic retrieval paths — where standard request/response logging is insufficient: implement trace-level instrumentation that captures tool invocation sequences, per-hop latency, and retrieval inputs, enabling reliable diagnosis of failures and quality regressions in non-deterministic agent workflows
• Take full operational responsibility for platform API and infrastructure — you built it, you own it, you run it: triage and resolve incidents, write thorough post-mortems, and drive systematic improvements that prevent recurrence
• Design enterprise performance strategy for the platform’s API layer: load testing, capacity planning, performance profiling, and system-level optimization — ensuring the platform can handle planned growth in tenants, content volumes, and query traffic
• Embed security architecture throughout the platform’s infrastructure: least-privilege IAM, secrets management, encryption at rest and in transit, audit logging, and compliance implementation aligned with TR’s enterprise security requirements

Technical Leadership
• Establish architectural principles and cross-system design patterns for the platform’s control plane and infrastructure — functioning as the technical authority that other engineers and teams turn to for API and infrastructure guidance
• Lead significant projects and business initiatives that span multiple engineers and interact with partner teams; determine work priorities and make adjustments to short-term priorities while maintaining strategic focus; provide specialist advice to senior management on complex infrastructure and security issues
• Mentor and develop Senior and mid-level engineers — providing coaching, technical direction, and educational opportunities in cloud infrastructure, platform API design, reliability engineering, and AI-assisted development practices
• Engage with client teams as a technical partner — understanding their integration experience and pain points, feeding structured requirements back into the platform API roadmap, and proactively reducing time-to-value for new platform adopters
• Deliver effective presentations on complex infrastructure and security concepts to technical and non-technical stakeholders; champion ethical AI practices and responsible technology deployment across the team’s work

About You
You’re an ideal fit if you have:
Required Experience:
• Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
• 8+ years of software engineering experience, with demonstrated progression to staff-level or equivalent technical leadership — including ownership of a functional area and leadership of significant cross-functional projects
• Deep expertise in cloud-native platform and infrastructure engineering on AWS: VPC architecture, IAM, ECS, Lambda, DynamoDB, MSK, and related managed services — with hands-on infrastructure-as-code experience (Terraform and/or AWS CDK) and the ability to establish infrastructure governance frameworks
• Production experience with OpenSearch, Vespa, or Elasticsearch at an operational level — cluster sizing, backup and restore, index lifecycle management, and multi-tenant access controls
• Mastery of Python with strategic awareness of language selection and migration; strong software engineering fundamentals including testing architecture, security architecture, and system design
• Demonstrated enterprise security practice: security strategy, risk assessment frameworks, least-privilege IAM, secrets management, encryption at rest and in transit, and compliance implementation in production cloud environments
• Track record of establishing API governance frameworks, cross-system integration patterns, and documentation standards; experience designing multi-tenant SaaS-style platform APIs with versioning, access control, and first-class developer experience
• Demonstrated reliability engineering ownership: SLO definition, observability implementation, on-call leadership, and a track record of improving platform reliability through data-driven retrospectives — with a clear philosophy that shipping frequently and operating reliably are complementary, not in tension
• Comfort and fluency with AI-assisted development tools; you use them to move faster and produce higher-quality infrastructure and API code, and you actively help the team do the same

Preferred Experience:
• Experience operating Kafka (MSK) or other distributed messaging infrastructure in production, including partition management, consumer group monitoring, and schema registry governance
• Background in Kubernetes or ECS container orchestration, including service mesh, autoscaling, and health check patterns
• Experience building developer-facing internal platforms where API quality and documentation are treated as first-class product concerns
• Knowledge of enterprise encryption patterns, including customer-managed keys (AWS KMS) and their architectural implications for multi-tenant systems
• Familiarity with distributed tracing infrastructure for non-deterministic or agentic workflows — where trace design must capture tool call sequences and per-hop context, not just request/response pairs
• Familiarity with AI service architecture: evaluating AI vendors, performing cost-benefit analysis, and integrating AI API services with fallback strategies into production platform infrastructure

What Success Looks Like
In the first 90 days:
• Build a thorough understanding of the platform’s current infrastructure, API surface, and operational posture — including known gaps in reliability, security, and developer experience
• Establish relationships with key client teams to understand their integration experience and pain points with the current platform
• Take on-call ownership for your functional area and identify and begin delivering the highest-leverage near-term improvements to platform API or infrastructure reliability

In the first year:
• Deliver a materially improved self-service platform API — with strong multi-tenant isolation, documented governance standards, and measurably better developer experience for client teams
• Establish end-to-end SLO coverage across platform services, with automated alerting, clear on-call runbooks, documented architectural decision records, and a track record of fast, high-quality incident resolution
• Own and deliver a major infrastructure initiative — CMK support, multi-environment maturity, agentic observability infrastructure, or a comparable project — from architectural design through production, establishing the principles and patterns that guide the platform’s infrastructure evolution
• Become the recognized technical authority for platform API and infrastructure: shaping team standards, influencing platform architecture, and providing specialist guidance to leadership on complex infrastructure and security challenges


What’s in it For You?

  • Hybrid Work Model: We’ve adopted a flexible hybrid working environment (2-3 days a week in the office depending on the role) for our office-based roles while delivering a seamless experience that is digitally and physically connected.

  • Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks per year, empowering employees to achieve a better work-life balance.

  • Career Development and Growth: By fostering a culture of continuous learning and skill development, we prepare our talent to tackle tomorrow’s challenges and deliver real-world solutions. Our Grow My Way programming and skills-first approach ensures you have the tools and knowledge to grow, lead, and thrive in an AI-enabled future.

  • Industry Competitive Benefits: We offer comprehensive benefit plans that include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.

  • Culture: Globally recognized, award-winning reputation for inclusion and belonging, flexibility, work-life balance, and more. We live by our values: Obsess over our Customers, Compete to Win, Challenge (Y)our Thinking, Act Fast / Learn Fast, and Stronger Together.

  • Social Impact: Make an impact in your community with our Social Impact Institute. We offer employees two paid volunteer days off annually and opportunities to get involved with pro-bono consulting projects and Environmental, Social, and Governance (ESG) initiatives.

  • Making a Real-World Impact: We are one of the few companies globally that helps its customers pursue justice, truth, and transparency. Together, with the professionals and institutions we serve, we help uphold the rule of law, turn the wheels of commerce, catch bad actors, report the facts, and provide trusted, unbiased information to people all over the world.

Our use of AI within the recruitment process

Thomson Reuters utilizes Artificial Intelligence (AI) to support parts of our global recruitment process. Unless you opt out, our AI system will assess the information you provide, compare it to the requirements listed for the role, and present the result to our recruitment personnel for further review. The AI system acts as a supporting tool, but a human always decides whether you will be considered for the role.

In the United States, Thomson Reuters offers a comprehensive benefits package to our employees. Our benefit package includes market-competitive health, dental, vision, disability, and life insurance programs, as well as a competitive 401k plan with company match. In addition, Thomson Reuters offers market-leading work-life benefits with competitive vacation, sick and safe paid time off, paid holidays (including two company mental health days off), parental leave, and sabbatical leave. These benefits meet or exceed the requirements of paid time off in accordance with any applicable state or municipal laws. Finally, Thomson Reuters offers the following additional benefits: optional hospital, accident and sickness insurance paid 100% by the employee; optional life and AD&D insurance paid 100% by the employee; Flexible Spending and Health Savings Accounts; fitness reimbursement; access to Employee Assistance Program; Group Legal Identity Theft Protection benefit paid 100% by employee; access to 529 Plan; commuter benefits; Adoption & Surrogacy Assistance; Tuition Reimbursement; and access to Employee Stock Purchase Plan.

Thomson Reuters complies with local laws that require upfront disclosure of the expected pay range for a position. The base compensation range varies across locations. Eligible office location(s) for this role include one or more of the following: New York City, San Francisco, Los Angeles, and/or Irvine, CA; McLean, VA; Washington, DC. The base compensation range for the role in any of those locations is $136,000 USD - $253,000 USD. For any eligible US locations, unless otherwise noted, the base compensation range for this role is $118,400 USD - $219,800 USD. For Ontario, Canada, the base compensation range for this role is $140,600 CAD - $190,600 CAD. Base pay is positioned within the range based on several factors including an individual’s knowledge, skills and experience with consideration given to internal equity. Base pay is one part of a comprehensive Total Reward program which also includes flexible and supportive benefits and other wellbeing programs. This role may also be eligible for an Annual Bonus based on a combination of enterprise and individual performance.

About Us

Thomson Reuters informs the way forward by bringing together the trusted content and technology that people and organizations need to make the right decisions. We serve professionals across legal, tax, accounting, compliance, government, and media. Our products combine highly specialized software and insights to empower professionals with the data, intelligence, and solutions needed to make informed decisions, and to help institutions in their pursuit of justice, truth, and transparency. Reuters, part of Thomson Reuters, is a world-leading provider of trusted journalism and news.

We are powered by the talents of 26,000 employees across more than 70 countries, where everyone has a chance to contribute and grow professionally in flexible work environments. At a time when objectivity, accuracy, fairness, and transparency are under attack, we consider it our duty to pursue them. Sound exciting? Join us and help shape the industries that move society forward.

As a global business, we rely on the unique backgrounds, perspectives, and experiences of all employees to deliver on our business goals. To ensure we can do that, we seek talented, qualified employees in all our operations around the world regardless of race, color, sex/gender, including pregnancy, gender identity and expression, national origin, religion, sexual orientation, disability, age, marital status, citizen status, veteran status, or any other protected classification under applicable law. Thomson Reuters is proud to be an Equal Employment Opportunity Employer providing a drug-free workplace.

Thomson Reuters makes reasonable accommodations for applicants with disabilities, including veterans with disabilities, and for sincerely held religious beliefs in accordance with applicable law. If you reside in the United States and require an accommodation in the recruiting process, you may contact our Human Resources Department at [email protected]. Disability accommodations in the recruiting process may include things like a sign language interpreter, making interview rooms accessible, providing assistive technology, or other relevant accommodations. Please note this email is not intended for general recruitment questions and we will promptly respond to inquiries regarding accommodations. More information on requesting an accommodation here.

Learn more on how to protect yourself from fraudulent job postings here.

More information about Thomson Reuters can be found on thomsonreuters.com

Top Skills

AWS
Elasticsearch
OpenSearch
Python
Terraform
Vespa

Thomson Reuters Toronto, Ontario, CAN Office

19 Duncan Street, Toronto, Ontario, Canada, M5H 3H1


