Applied Scientist, AI Risk

Armilla AI

Reposted 2 Days Ago

In-Office

Toronto, ON, CAN

Mid level

Conduct research into AI model behavior and failure modes; design and build models, agents, and automated evaluation pipelines to assess AI risk; develop production-grade tooling for continuous monitoring and model testing; create novel risk metrics; collaborate with underwriting and actuarial teams to translate findings into pricing signals; prototype and deploy robust, well-tested systems while communicating results to technical and non-technical stakeholders.

The summary above was generated by AI

About Armilla AI

Armilla AI is a cutting-edge startup at the intersection of artificial intelligence and insurance. Based in Toronto, Ontario, Canada, we're building innovative solutions to manage, underwrite, and insure the rapidly evolving risks associated with AI systems. We're a dynamic team passionate about pioneering the future of AI risk management.

The Role

We're seeking an exceptional Applied Scientist who bridges the worlds of deep AI research and practical, production-grade applications. As our Applied Scientist, AI Risk, you'll be instrumental in both advancing our understanding of AI systems and translating that knowledge into robust risk assessment and evaluation frameworks. This isn't just about building models—it's about understanding their failure modes, vulnerabilities, and real-world reliability. You'll develop AI systems that evaluate other AI systems, creating the next generation of automated risk assessment tools. You'll be shaping how the insurance industry evaluates and prices AI risk.

Role Responsibilities

Conduct deep technical research into AI model behavior, failure modes, and edge cases, with a focus on practical risk assessment.
Design and develop AI systems—including specialized models, agents, and automated evaluation pipelines—that assess the safety, reliability, and risk profiles of other AI systems.
Build production-grade tooling and platforms for automated AI risk assessment, model testing, and continuous monitoring.
Evaluate diverse AI systems—from traditional ML models to Large Language Models and multimodal systems—identifying insurable risks, vulnerabilities, and potential failure scenarios.
Develop novel evaluation methodologies and metrics that capture AI-specific risks such as adversarial vulnerabilities, distribution shift, hallucinations, and behavioral misalignment.
Collaborate closely with our underwriting and actuarial teams to translate technical findings into actionable risk insights and pricing signals.
Stay at the forefront of AI research and safety literature, rapidly prototyping and validating new risk assessment techniques.
Write clean, maintainable, and well-tested code that scales from research prototypes to production systems.
Communicate complex technical concepts clearly to diverse stakeholders, from engineers to underwriters to executive leadership.

What We're Looking For

Advanced degree (Master's or PhD) in Computer Science, Machine Learning, Statistics, or a related field, with demonstrated research experience in AI/ML.
Strong track record of applied research—you've published, contributed to open source, or shipped ML products that had real-world impact beyond academic settings.
Deep technical expertise in machine learning fundamentals, including both classical ML and modern deep learning approaches.
Experience building AI/ML systems from conception to deployment, not just running evaluations on existing models.
Hands-on experience with model evaluation, testing, and validation—you think critically about where models fail, not just where they succeed.
Solid software engineering skills with expertise in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX) and scientific computing libraries (NumPy, Pandas, Scikit-learn).
Experience with LLMs and generative AI, including familiarity with their unique risks, evaluation challenges, and safety considerations.
Background in AI safety, robustness, interpretability, or adversarial ML is a significant asset.
Ability to work both independently on deep technical problems and collaboratively in a fast-paced startup environment.
Intellectual curiosity about the "other side"—not just building AI, but understanding its risks, limitations, and societal implications.
Strong problem-solving abilities and attention to detail, with a pragmatic approach to balancing research rigor with business needs.

What's In It For You

Pioneering a New Frontier: You'll be at the forefront of an emerging field, defining how AI systems are evaluated, understood, and insured at scale.
Dual-Sided Impact: Work on both the technical frontier of AI evaluation and the practical challenge of real-world risk assessment—a rare combination.
Meta-AI Challenge: Tackle the fascinating problem of building AI systems that understand and evaluate other AI systems.
Impactful Work: Your research and tooling will directly shape how AI risk is quantified and managed across industries.
Startup Agility: Enjoy the fast-paced, innovative, and collaborative culture of a growing startup where your ideas can quickly become reality.
Professional Growth: Unparalleled opportunities to develop expertise at the intersection of AI research, risk management, and insurance, working alongside seasoned AI and industry experts.
Technical Freedom: Latitude to pursue novel research directions and evaluation approaches that advance both the field and our business.
