Support researchers and ML engineers in optimizing and scaling HPC workloads on CPU/GPU infrastructure. Analyze performance, diagnose bottlenecks across storage, networking, and schedulers, implement profiling and benchmarking, influence platform design, and define best practices for shared compute environments while ensuring security and compliance.
At IMC, technology is not a department; it's at the heart of everything we do. Developed in-house, our systems power world-class research and trading, enabling teams to make faster and better decisions every day. Our high-performance computing platforms sit at the core of this capability, supporting large-scale simulation, research, and machine learning workloads across the firm.
We're looking for an HPC Engineer to work closely with Researchers, Machine Learning Engineers, and Software Engineers to get the most out of our compute platforms. This role is about applying deep systems and parallel computing expertise to help users get the best possible performance from modern CPU and GPU-based infrastructure.
Your Core Responsibilities
As an HPC Engineer, you'll act as a technical partner to our compute users, helping translate computational problems into efficient, scalable workloads. You'll work across teams to improve performance, throughput, and reliability, while shaping best practices for how HPC resources are used across IMC.
- Partner with Researchers, Quants, and MLEs to analyze, optimize, and scale HPC workloads across CPU and GPU platforms
- Support and optimize ML and simulation workloads, with a focus on performance, resource efficiency, and scalability
- Apply deep understanding of storage, networking, and scheduling systems to improve end-to-end workload performance
- Define and promote best practices for running large-scale workloads on shared HPC infrastructure
- Work with platform teams to influence system design decisions based on real workload behavior and performance data
- Implement and use performance monitoring, profiling, and benchmarking tools to drive continuous improvement
- Ensure workloads and platforms meet internal security and compliance standards
Your Skills and Experience
- Experience working with HPC or large-scale compute environments, with a strong focus on workload performance and optimization
- A solid grasp of parallel computing concepts and how to tune performance on CPUs and GPUs
- Hands-on experience with GPU acceleration (e.g. CUDA) and an understanding of how GPU workloads really behave
- Solid systems knowledge across Linux, storage, and networking, enough to diagnose bottlenecks and guide users effectively
- Experience supporting or optimizing ML, simulation, or data-intensive workloads in shared compute environments
- Familiarity with containers and orchestration tools like Kubernetes is beneficial
- Programming experience is advantageous, particularly in Python and C++
- Strong communication skills and a genuine interest in helping others get better performance from complex systems
- A high degree of flexibility and adaptability: willing and able to deal with uncertainty and ambiguity in a rapidly evolving environment