The role involves designing and operating large-scale batch and streaming data pipelines, ensuring data correctness and efficiency across high-volume datasets for reporting and real-time products.
At Index Exchange, we’re reinventing how digital advertising works—at scale. As a global advertising supply-side platform, we empower the world’s leading media owners and marketers to thrive in a programmatic, privacy-first ecosystem.
We’re a proud industry pioneer with over 20 years of experience accelerating the ad technology evolution. Our proprietary tech is trusted by some of the world’s largest brands and media owners and plays a crucial role in keeping the internet open, accessible, and largely free.
We process more than 550 billion real-time auctions every day (for comparison, Google processes 8.5 billion searches per day) with ultra-low latency. Our platform is vertically integrated from servers to networks and runs primarily on our own bare-metal and cloud infrastructure. This end-to-end infrastructure is designed to provide both stability and agility, enabling us to adapt quickly as the market evolves.
At the core of it all is our engineering-first culture. Our engineers tackle internet-scale problems across tight-knit, global teams. From moving petabytes of data and optimizing with AI to making real-time infrastructure decisions, Indexers have the agency and influence to shape the future of advertising. We move fast, build thoughtfully, and stay grounded in our core values.
We are hiring a Senior / Staff Data Engineer to build and evolve the data processing and pipeline layer that powers reporting, billing systems, and real-time data products at Index Exchange.
This role focuses on designing and operating large-scale batch and streaming data pipelines, enabling reliable, scalable, and efficient data transformation across the platform.
You will work on systems that transform raw, high-volume event data into clean, queryable, and production-grade datasets, supporting both API-driven data products and analytical workflows.
You will work on high-scale data systems that:
- Process billions of events per day across distributed pipelines
- Power core business datasets (reporting, billing, marketplace metrics)
- Operate across batch (Spark) and streaming (Kafka / Flink) architectures
- Require careful balancing of data correctness, processing efficiency, and latency vs. cost trade-offs
You will solve problems such as:
- Designing pipelines that scale without exploding compute costs
- Managing data correctness at scale (deduplication, late data, joins); see the sketch after this list
- Building systems that support both historical backfills and near real-time updates
- Evolving pipelines from centralized processing (Hadoop) toward more distributed and efficient patterns
- Building streaming pipelines and streaming data warehouses
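As an illustration of the deduplication and late-data handling mentioned above, here is a minimal PySpark sketch of one common pattern. It is not Index Exchange's actual pipeline: the input path, the event_id and event_time columns, and the one-day lateness cutoff are all hypothetical.

```python
# Illustrative sketch only; paths, column names, and the lateness cutoff
# are hypothetical, not Index Exchange's actual pipeline.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dedup-sketch").getOrCreate()
events = spark.read.parquet("s3://example-bucket/raw/auctions/")

# Deduplicate: keep the latest record per event_id so upstream retries
# and replays don't double-count auctions.
latest_first = Window.partitionBy("event_id").orderBy(F.col("event_time").desc())
deduped = (
    events.withColumn("rn", F.row_number().over(latest_first))
    .filter(F.col("rn") == 1)
    .drop("rn")
)

# Late data: route events older than one day into a reconciliation output
# instead of silently dropping them or corrupting finalized aggregates.
cutoff = F.current_timestamp() - F.expr("INTERVAL 1 DAY")
on_time = deduped.filter(F.col("event_time") >= cutoff)
late = deduped.filter(F.col("event_time") < cutoff)

on_time.write.mode("append").parquet("s3://example-bucket/silver/auctions/")
late.write.mode("append").parquet("s3://example-bucket/silver/auctions_late/")
```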
Qualifications
- Strong experience in data engineering at scale
- Deep expertise in Spark (required), SQL, and data modeling
- Experience with Airflow or comparable workflow orchestration, and with Kafka or comparable streaming systems
- Strong understanding of distributed data processing, data modeling for large-scale datasets, and performance optimization
- Ability to own pipelines end-to-end, debug complex data issues, and work in high-scale, evolving environments
Staff-Level Expectations (if applicable)
- Define data processing standards and patterns across teams
- Lead large-scale pipeline and platform initiatives
- Influence data architecture and modeling decisions
- Drive improvements across reliability, cost efficiency, and scalability
Data Pipelines (Batch + Streaming)
- Design and operate pipelines using Spark (primary) and Kafka / Flink (streaming)
- Transform raw event data into cleaned datasets (silver layer) and business-ready datasets (gold / reporting tables); a sketch of this flow follows below
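The raw-to-silver-to-gold flow described above might look roughly like the following PySpark sketch. The schema fields (publisher_id, revenue), paths, and cleaning rules are illustrative assumptions rather than the platform's actual definitions.

```python
# Illustrative sketch only; schema, paths, and metrics are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/auctions/")

# Silver: typed, deduplicated, validated records.
silver = (
    raw.dropDuplicates(["event_id"])
    .withColumn("event_date", F.to_date("event_time"))
    .filter(F.col("publisher_id").isNotNull())
)
silver.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/silver/auctions/"
)

# Gold: business-ready daily aggregates for reporting and billing.
gold = silver.groupBy("event_date", "publisher_id").agg(
    F.count("*").alias("auctions"),
    F.sum("revenue").alias("revenue"),
)
gold.write.mode("overwrite").parquet("s3://example-bucket/gold/auctions_daily/")
```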
Core Data Models & Datasets
- Build and maintain canonical datasets (aggregated datasets, reporting tables)
- Define data contracts and ensure consistency across pipelines
- Support evolving use cases: reporting, billing, and ML / experimentation
Workflow Orchestration
- Build and maintain Airflow DAGs for pipeline scheduling, dependency management, and backfills (a minimal example follows this section)
- Improve reliability and observability of workflows
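A minimal, backfill-friendly Airflow DAG of the kind described above might look like the following sketch. The DAG id, callables, and dates are hypothetical, and real tasks would typically submit Spark jobs rather than print.

```python
# Illustrative sketch only; DAG id, tasks, and dates are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def build_silver(**context):
    # A real task would submit a Spark job for the execution date.
    print("building silver for", context["ds"])

def build_gold(**context):
    print("building gold for", context["ds"])

with DAG(
    dag_id="auctions_daily_sketch",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=True,       # allows catchup and `airflow dags backfill` to fill history
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
) as dag:
    silver = PythonOperator(task_id="build_silver", python_callable=build_silver)
    gold = PythonOperator(task_id="build_gold", python_callable=build_gold)
    silver >> gold  # gold depends on silver
```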
Data Processing Optimization
- Optimize pipelines for performance (runtime, throughput), cost (compute efficiency), and scalability (data growth)
- Improve partitioning strategies, data layout, and job execution patterns; see the layout sketch below
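As one concrete example of the partitioning and layout levers mentioned above, this illustrative PySpark snippet pairs a repartition (to control file counts and avoid the small-files problem) with a partitioned write (so readers filtering on date or region can prune entire directories). Column names and paths are assumptions.

```python
# Illustrative sketch only; columns and paths are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("layout-sketch").getOrCreate()
silver = spark.read.parquet("s3://example-bucket/silver/auctions/")

(
    silver
    # Shuffle into one task per (event_date, region) slice to avoid
    # writing thousands of tiny files.
    .repartition("event_date", "region")
    .write.mode("overwrite")
    # Directory-level partitioning lets queries that filter on these
    # columns skip unrelated data entirely.
    .partitionBy("event_date", "region")
    .parquet("s3://example-bucket/silver/auctions_by_date_region/")
)
```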
Streaming & Near Real-Time Pipelines
- Build pipelines that support incremental updates, streaming transformations, and aggregation at scale (sketched below)
- Contribute to evolving patterns such as edge aggregation, streaming-to-batch convergence, and real-time data availability
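A near real-time aggregation of the sort described above could be sketched with Spark Structured Streaming reading from Kafka (Python is used here for consistency; in practice the streaming compute might run on Flink instead). The topic, schema, broker address, and console sink are all illustrative assumptions.

```python
# Illustrative sketch only; topic, schema, broker, and sink are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType, StringType, StructType, TimestampType

spark = SparkSession.builder.appName("streaming-sketch").getOrCreate()

schema = (
    StructType()
    .add("event_id", StringType())
    .add("publisher_id", StringType())
    .add("event_time", TimestampType())
    .add("revenue", DoubleType())
)

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "auctions")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Five-minute revenue per publisher; the watermark bounds how long the
# job waits for late events before finalizing each window.
per_publisher = (
    events.withWatermark("event_time", "10 minutes")
    .groupBy(F.window("event_time", "5 minutes"), "publisher_id")
    .agg(F.sum("revenue").alias("revenue"))
)

query = per_publisher.writeStream.outputMode("append").format("console").start()
query.awaitTermination()
```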
Platform & System Design Responsibilities
- Define and evolve data processing patterns: batch vs. streaming, aggregation strategies, and incremental vs. full recompute
- Work across Spark (core processing), Kafka (transport), Flink (streaming compute), and storage systems (Hadoop / Ceph)
- Contribute to data platform architecture decisions, pipeline standardization, and reusable data processing frameworks
- Influence trade-offs: latency vs. cost, correctness vs. performance, and compute vs. storage
You will work closely with:
- APIs & data products
- Data Systems / Platform teams
- ML and experimentation teams
- Application Engineering
Benefits
- Comprehensive health, dental, and vision plans for you and your dependents
- Paid time off, health days, and personal obligation days plus flexible work schedules
- Competitive retirement matching plans
- Equity packages
- Generous parental leave available to birthing, non-birthing, and adoptive parents
- Annual well-being allowance plus fitness discounts and group wellness activities
- Commuter benefits and discounts, where available
- Employee assistance program
- Mental health first aid program that provides an in-the-moment point of contact and reassurance
- One day of volunteer time off per year and a donation-matching program
- Bi-weekly town halls and regular community-led team events
- Multiple resources and programming to support continuous learning
- A workplace that supports a diverse, equitable, and inclusive environment
At Index Exchange, we believe that successful products are built by teams just as diverse as the audience who uses them. As such, we are committed to equal employment opportunities. We celebrate diversity of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or expression, or veteran status. Additionally, we realize that diversity is deeper than any status or classification—diversity is the human experience. For those who show grit, passion, and humility—Index will welcome you.
Index Exchange welcomes and encourages individuals with disabilities to apply to work with us.
If you require an accommodation, please share the details of your request, along with any information about how we can assist you, with the recruiter when they contact you. Index Exchange will make reasonable efforts to ensure accommodation requests are met throughout the recruitment process.
Our corporate headquarters are in Toronto, with major offices in New York, Montreal, Kitchener, London, San Francisco, and many other global cities. As a major global advertising exchange, we are committed to operating as a tightly knit global team and embracing and empowering talent wherever our colleagues may be.
Index Exchange Toronto, Ontario, CAN Office
8 Spadina Ave. Suite 2900, Toronto, Ontario, Canada, M5V 0S8
Index Exchange Kitchener, Ontario, CAN Office
305 King St. W, Suite 801, Kitchener, Ontario, Canada, N2G 1B9