Fusemachines

Lead Spark Data Engineer

Posted Yesterday

In-Office

Toronto, ON

Senior level

Lead the design and implementation of scalable data solutions for an IoT platform using Spark (batch) and Flink (stream). Build the grammar/parser, query validator, Spark translation layer, and graph relationship logic; ensure parity across batch and stream, optimize Databricks/Azure resources, enforce data governance, and participate in Agile processes.

The summary above was generated by AI

About Fusemachines

Fusemachines is a leading AI strategy, talent, and education services provider and a global provider of enterprise AI products and services, with a core mission of democratizing AI. It was founded in 2013 by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University. With a presence in 4 countries (Nepal, the United States, Canada, and the Dominican Republic) and more than 450 full-time employees, Fusemachines brings global AI expertise to transform companies worldwide. Leveraging its proprietary AI Studio and AI Engines, the company helps drive clients’ AI Enterprise Transformation, regardless of where they are in their Digital AI journeys. With offices in North America, Asia, and Latin America, Fusemachines provides a suite of enterprise AI offerings and specialty services that allow organizations of any size to implement and scale AI. Fusemachines serves companies in industries such as retail, manufacturing, and government.

Fusemachines continues to actively pursue the mission of democratizing AI for the masses by providing high-quality AI education in underserved communities and helping organizations achieve their full potential with AI.

Type: Full-time, Remote

Job Description:

We are looking for an experienced Lead Data Engineer to join our team and build the "Brain" of an IoT platform: a library that supports the definition of metrics, validates those definitions against a Virtual Schema, and generates optimized execution plans for both Spark (Batch) and Flink (Stream).

Qualification / Skill Set Requirement:

5+ years of hands-on data engineering experience with deep expertise in the Azure ecosystem.
Expert-level Java, Python and SQL.
Deep understanding of Apache Spark internals (Catalyst Optimizer, Logical Plans).
Experience with ANTLR v4 or writing custom DSLs/Parsers.
Experience with Databricks and Delta Lake optimization.
Experience constructing Abstract Syntax Trees (ASTs).
Strong understanding of SDLC and Agile methodologies with hands-on experience in Azure DevOps, GitHub, CI/CD, and artifact management.
Skilled in data modeling, data design, and data warehousing solutions on Azure Databricks.
Knowledge of data quality, governance, and security best practices within Azure (AD, NSG, encryption, compliance).
Certifications preferred (nice to have): Azure Fundamentals, Azure Data Engineer Associate, Databricks Certified Data Engineer Professional, and Azure Solutions Architect Expert.

Responsibilities

Architect, design, and implement scalable and efficient data solutions on Spark and Flink.
Implement the grammar for the IoT Query Language.
Build the Query Validator to enforce semantic constraints before a query is executed.
Develop a Spark Adapter: a translation layer that converts metric definitions into Spark code.
Implement relationship logic (traversing a Graph/Ontology) within the core to avoid database bottlenecks.
Ensure 100% logic parity between Spark (Batch) and Flink (Stream) implementations.
Manage and optimize Azure and Databricks resources for performance, reliability, and cost-efficiency.
Transform, clean, and prepare data using SQL, Python and Java.
Monitor and fine-tune workloads and pipelines for optimal performance and reliability.
Maintain clear documentation of solutions, configurations, and workflows.
Actively participate in Agile team activities and continuous improvement initiatives.
Promote and enforce data engineering best practices, including data governance, security, and data quality.

Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.

Top Skills

Abstract Syntax Trees

ANTLR v4

Apache Flink

Spark

Azure

Azure Active Directory (AD)

Azure Databricks

Azure DevOps

CI/CD

Databricks

Delta Lake

Git

Java

Network Security Groups (NSG)

Python

Spark Catalyst Optimizer

SQL
