Armilla AI

Applied Scientist, AI Risk

Reposted 8 Days Ago
In-Office
Toronto, ON, CAN
Mid level
Conduct research into AI model behavior and failure modes; design and build models, agents, and automated evaluation pipelines to assess AI risk; develop production-grade tooling for continuous monitoring and model testing; create novel risk metrics; collaborate with underwriting and actuarial teams to translate findings into pricing signals; prototype and deploy robust, well-tested systems while communicating results to technical and non-technical stakeholders.
The summary above was generated by AI

About Armilla AI

Armilla AI is a cutting-edge startup at the intersection of artificial intelligence and insurance. Based in Toronto, Ontario, Canada, we're building innovative solutions to manage, underwrite, and insure the rapidly evolving risks associated with AI systems. We're a dynamic team passionate about pioneering the future of AI risk management.

The Role

We're seeking an exceptional Applied Scientist who bridges the worlds of deep AI research and practical, production-grade applications. As our Applied Scientist, AI Risk, you'll be instrumental in both advancing our understanding of AI systems and translating that knowledge into robust risk assessment and evaluation frameworks. This isn't just about building models—it's about understanding their failure modes, vulnerabilities, and real-world reliability. You'll develop AI systems that evaluate other AI systems, creating the next generation of automated risk assessment tools. You'll be shaping how the insurance industry evaluates and prices AI risk.


Role Responsibilities

  • Conduct deep technical research into AI model behavior, failure modes, and edge cases, with a focus on practical risk assessment.
  • Design and develop AI systems—including specialized models, agents, and automated evaluation pipelines—that assess the safety, reliability, and risk profiles of other AI systems.
  • Build production-grade tooling and platforms for automated AI risk assessment, model testing, and continuous monitoring.
  • Evaluate diverse AI systems—from traditional ML models to Large Language Models and multimodal systems—identifying insurable risks, vulnerabilities, and potential failure scenarios.
  • Develop novel evaluation methodologies and metrics that capture AI-specific risks such as adversarial vulnerabilities, distribution shift, hallucinations, and behavioral misalignment.
  • Collaborate closely with our underwriting and actuarial teams to translate technical findings into actionable risk insights and pricing signals.
  • Stay at the forefront of AI research and safety literature, rapidly prototyping and validating new risk assessment techniques.
  • Write clean, maintainable, and well-tested code that scales from research prototypes to production systems.
  • Communicate complex technical concepts clearly to diverse stakeholders, from engineers to underwriters to executive leadership.

What We're Looking For

  • Advanced degree (Master's or PhD) in Computer Science, Machine Learning, Statistics, or a related field, with demonstrated research experience in AI/ML.
  • Strong track record of applied research—you've published, contributed to open source, or shipped ML products that had real-world impact beyond academic settings.
  • Deep technical expertise in machine learning fundamentals, including both classical ML and modern deep learning approaches.
  • Experience building AI/ML systems from conception to deployment, not just running evaluations on existing models.
  • Hands-on experience with model evaluation, testing, and validation—you think critically about where models fail, not just where they succeed.
  • Solid software engineering skills with expertise in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX) and scientific computing libraries (NumPy, Pandas, Scikit-learn).
  • Experience with LLMs and generative AI, including familiarity with their unique risks, evaluation challenges, and safety considerations.
  • Background in AI safety, robustness, interpretability, or adversarial ML is a significant asset.
  • Ability to work both independently on deep technical problems and collaboratively in a fast-paced startup environment.
  • Intellectual curiosity about the "other side"—not just building AI, but understanding its risks, limitations, and societal implications.
  • Strong problem-solving abilities and attention to detail, with a pragmatic approach to balancing research rigor with business needs.

What's In It For You

  • Pioneering a New Frontier: You'll be at the forefront of an emerging field, defining how AI systems are evaluated, understood, and insured at scale.
  • Dual-Sided Impact: Work on both the technical frontier of AI evaluation and the practical challenge of real-world risk assessment—a rare combination.
  • Meta-AI Challenge: Tackle the fascinating problem of building AI systems that understand and evaluate other AI systems.
  • Impactful Work: Your research and tooling will directly shape how AI risk is quantified and managed across industries.
  • Startup Agility: Enjoy the fast-paced, innovative, and collaborative culture of a growing startup where your ideas can quickly become reality.
  • Professional Growth: Unparalleled opportunities to develop expertise at the intersection of AI research, risk management, and insurance alongside deeply experienced AI and industry experts.
  • Technical Freedom: Latitude to pursue novel research directions and evaluation approaches that advance both the field and our business.

Top Skills

Generative AI
JAX
LLMs
NumPy
Pandas
Python
PyTorch
Scikit-learn
TensorFlow
