Armilla AI

Applied Scientist, AI Risk

Posted 14 Days Ago
In-Office
Toronto, ON
Mid level

About Armilla AI

Armilla AI is a cutting-edge startup at the intersection of artificial intelligence and insurance. Based in Toronto, Ontario, Canada, we're building innovative solutions to manage, underwrite, and insure the rapidly evolving risks associated with AI systems. We're a dynamic team passionate about pioneering the future of AI risk management.

The Role

We're seeking an exceptional Applied Scientist who bridges the worlds of deep AI research and practical, production-grade applications. As our Applied Scientist, AI Risk, you'll be instrumental in both advancing our understanding of AI systems and translating that knowledge into robust risk assessment and evaluation frameworks. This isn't just about building models—it's about understanding their failure modes, vulnerabilities, and real-world reliability. You'll develop AI systems that evaluate other AI systems, creating the next generation of automated risk assessment tools. You'll be shaping how the insurance industry evaluates and prices AI risk.


Role Responsibilities

  • Conduct deep technical research into AI model behavior, failure modes, and edge cases, balancing fundamental investigation with practical risk assessment.
  • Design and develop AI systems—including specialized models, agents, and automated evaluation pipelines—that assess the safety, reliability, and risk profiles of other AI systems.
  • Build production-grade tooling and platforms for automated AI risk assessment, model testing, and continuous monitoring.
  • Evaluate diverse AI systems—from traditional ML models to Large Language Models and multimodal systems—identifying insurable risks, vulnerabilities, and potential failure scenarios.
  • Develop novel evaluation methodologies and metrics that capture AI-specific risks such as adversarial vulnerabilities, distribution shift, hallucinations, and behavioral misalignment.
  • Collaborate closely with our underwriting and actuarial teams to translate technical findings into actionable risk insights and pricing signals.
  • Stay at the forefront of AI research and safety literature, rapidly prototyping and validating new risk assessment techniques.
  • Write clean, maintainable, and well-tested code that scales from research prototypes to production systems.
  • Communicate complex technical concepts clearly to diverse stakeholders, from engineers to underwriters to executive leadership.

What We're Looking For

  • Advanced degree (Master's or PhD) in Computer Science, Machine Learning, Statistics, or a related field, with demonstrated research experience in AI/ML.
  • Strong track record of applied research—you've published, contributed to open source, or shipped ML products that had real-world impact beyond academic settings.
  • Deep technical expertise in machine learning fundamentals, including both classical ML and modern deep learning approaches.
  • Experience building AI/ML systems from conception to deployment, not just running evaluations on existing models.
  • Hands-on experience with model evaluation, testing, and validation—you think critically about where models fail, not just where they succeed.
  • Solid software engineering skills with expertise in Python and experience with ML frameworks (PyTorch, TensorFlow, JAX) and scientific computing libraries (NumPy, Pandas, Scikit-learn).
  • Experience with LLMs and generative AI, including familiarity with their unique risks, evaluation challenges, and safety considerations.
  • Background in AI safety, robustness, interpretability, or adversarial ML is a significant asset.
  • Ability to work both independently on deep technical problems and collaboratively in a fast-paced startup environment.
  • Intellectual curiosity about the "other side"—not just building AI, but understanding its risks, limitations, and societal implications.
  • Strong problem-solving abilities and attention to detail, with a pragmatic approach to balancing research rigor with business needs.

What's In It For You

  • Pioneering a New Frontier: You'll be at the forefront of an emerging field, defining how AI systems are evaluated, understood, and insured at scale.
  • Dual-Sided Impact: Work on both the technical frontier of AI evaluation and the practical challenge of real-world risk assessment—a rare combination.
  • Meta-AI Challenge: Tackle the fascinating problem of building AI systems that understand and evaluate other AI systems.
  • Impactful Work: Your research and tooling will directly shape how AI risk is quantified and managed across industries.
  • Startup Agility: Enjoy the fast-paced, innovative, and collaborative culture of a growing startup where your ideas can quickly become reality.
  • Professional Growth: Unparalleled opportunities to develop expertise at the intersection of AI research, risk management, and insurance alongside deeply experienced AI and industry experts.
  • Technical Freedom: Latitude to pursue novel research directions and evaluation approaches that advance both the field and our business.

Top Skills

Python, PyTorch, TensorFlow, JAX, NumPy, Pandas, Scikit-learn, LLMs, Generative AI

Armilla AI Toronto, Ontario, CAN Office

Toronto, Ontario, Canada
