Seeking an experienced GPU Architect/Designer to define next-generation GPU microarchitecture, optimize performance, and deliver high-efficiency parallel computing solutions.
GPU Architecture / Design (Shader / SIMT)
About the Company
Our client is a well-funded, venture-backed semiconductor startup developing next-generation GPU technology. The company is in a growth stage with significant capital backing and is building a world-class engineering team to design high-performance, scalable GPU architectures from the ground up. This is a rare opportunity to join at a foundational stage and directly shape the direction of cutting-edge silicon.
Job Summary
We are seeking an experienced GPU Architect/Designer with strong expertise in shader core architecture and SIMT (Single Instruction Multiple Threads) execution models. This role involves defining next-generation GPU microarchitecture, optimizing throughput and efficiency, and driving scalable, high-performance parallel compute solutions.
The ideal candidate will have deep knowledge of GPU shader pipelines, thread scheduling, memory hierarchy, and parallel execution models, with experience translating architectural concepts into high-quality RTL implementations.
Key Responsibilities
Architecture & Microarchitecture
- Define and evolve GPU shader core architecture, including SIMT execution units and pipeline design.
- Design warp/wavefront scheduling, thread dispatch, and execution models.
- Architect SIMT execution pipelines, including ALU pipelines, vector units, and control flow units.
- Define thread divergence handling, reconvergence strategies, and branch control mechanisms.
- Develop scalable shader architectures supporting high thread-level parallelism.
- Collaborate on ISA definitions related to shader and compute workloads.
- Analyze shader workloads and identify performance bottlenecks.
- Optimize GPU execution efficiency across diverse workloads including compute shaders, AI/ML kernels, and high-performance parallel workloads.
- Drive performance-per-watt and area efficiency improvements.
Memory & Interconnect
- Define GPU memory subsystem interactions including register files, shared/local memory, L1/L2 cache hierarchy, and memory coalescing mechanisms.
- Optimize memory access scheduling and bandwidth utilization.
- Collaborate on interconnect and memory fabric architecture.
RTL & Design
- Translate architectural specifications into microarchitecture definitions.
- Implement shader pipeline logic in SystemVerilog.
Verification & Validation
- Define architectural test plans and validation strategies.
- Develop directed tests, constrained-random tests, and performance validation frameworks.
- Analyze simulation and silicon results to drive design improvements.
Required Qualifications
Education: Bachelor's, Master's, or PhD in Computer Engineering, Electrical Engineering, or Computer Science.
10+ years of experience in GPU, CPU, or parallel processor architecture.
Strong experience with:
- SIMT / SIMD architectures
- Shader core design
- Thread scheduling
- Pipeline microarchitecture
- Memory hierarchy design
Proficiency in:
- SystemVerilog or Verilog
- Microarchitecture specification development
- Performance modeling tools
- RTL-level debugging
Deep understanding of:
- Parallel computing models
- GPU execution models
- Pipeline hazard handling
- Synchronization primitives
Compensation: The base pay range for this role is $175,000 – $250,000 USD per year, plus meaningful equity.