Midpage

RL Deep Learning Engineer (Remote)

Posted 3 Days Ago
In-Office or Remote
Hiring Remotely in Canada
Mid level
Build and scale RL environments and evaluation harnesses for long-horizon legal reasoning. Own pipelines converting court filings into contamination-free benchmarks and RL tasks, integrate with partner model APIs, and collaborate with attorneys to create scorable task formats.
The summary above was generated by AI

About Midpage

Midpage is the search engine for legal data used by AI labs. We cover all US court data - 20M records. Over 300 law firms use our platform directly, 200k+ visitors read cases on our site every month, and five multibillion-dollar companies including Perplexity trust us as their legal data supplier. We're a team of 7 in Bowery, lower Manhattan. Our ARR has grown from $400k to $2M in the last 4 months.

The role

We're seeking an engineering generalist to build the first RL environments and benchmarks purpose-built for long-horizon legal reasoning: tasks where AI agents must search, read, analyze, and draft across real case filings, the same work that still takes teams of lawyers days to weeks. Frontier labs will use these environments to make future models more legally capable, and we need an engineer to own the infrastructure that makes it all work.

You'll design and scale the systems that turn millions of real court filings into verifiable evaluation environments and RL training tasks. You'll work directly with our attorneys, our data pipeline, and our partners at frontier AI labs.

What you'll do

- Build and maintain the evaluation harness and RL environment infrastructure—task runners, sandboxed environments, and scoring logic that can scale to thousands of parallel agents

- Own the data pipeline that turns freshly collected court filings into benchmark and RL tasks before they reach any model's training set

- Integrate with partner harnesses and model APIs to run contamination-free evaluations

- Collaborate with attorneys to translate legal workflows like cite checks, motion drafting, and precedent research into structured, scorable task formats using the Harbor spec

What we're looking for

- Strong generalist software engineering fundamentals. You've built, scaled, and maintained diverse systems in production

- You’ve built entire systems yourself, don’t require detailed specs or product managers, and take full ownership over your projects

- Deep experience with Python, bonus for TypeScript. Most importantly, you can work on hard engineering problems

- You should be kind, self-managing, and a clear communicator

- You make effective use of Cursor/Claude Code/Codex and are capable of writing good code without them

Bonuses but not requirements

- Familiarity with LLM evaluation. You get what makes a good rubric and why benchmarks leak

- Comfort working with messy, real-world document data (legal filings, PDFs, long-form text)
