McAfee

ML Data Engineer (Feature Pipeline & ETL)

Posted 2 Days Ago
Canada
Mid level
The ML Data Engineer will develop and maintain ETL pipelines for machine learning, focusing on feature engineering, data integration, model training, and compliance. Responsibilities include optimizing data quality and performance, supporting model lifecycle management, and deploying high-performance pipelines for real-time inference.

Role Overview:

McAfee is seeking a skilled ML Data Engineer to join our Consumer ML team, specializing in creating robust feature engineering ETL pipelines tailored for machine learning applications. This role requires hands-on experience with Databricks, a solid understanding of the medallion architecture, and expertise in developing, deploying, and managing scalable data pipelines for low-latency model serving.
The ideal candidate will also have experience supporting the end-to-end ML lifecycle, including model training and experiment tracking, with MLflow experience as a strong asset. As part of our AI and Machine Learning team, you will be instrumental in enabling advanced analytics and delivering personalized user experiences.
This is a remote position based in Canada. We will only consider candidates in Canada and are not offering relocation assistance at this time.

About the role:

  • Feature Engineering & Data Integration: Develop and maintain end-to-end ML feature engineering pipelines using Databricks, ensuring data is consistently structured to support ML models effectively.
  • Pipeline Development & Management: Integrate diverse data sources (clickstreams, user behaviour, demographic data, etc.) and tailor data integration processes to optimize data quality and performance.
  • Medallion Architecture Expertise: Build ETL/ELT pipelines that follow the bronze, silver, and gold layers of the medallion architecture, ensuring efficient data structuring for ML workflows.
  • Model Training & Experiment Tracking: Support ML model training and calibration through optimized data pipelines, using MLflow for experiment tracking, model versioning, and performance monitoring.
  • Query Optimization & Low Latency Pipelines: Design and implement optimized queries and low-latency data pipelines to support real-time and batch model inference in production.
  • CI/CD & Deployment: Apply CI/CD best practices to ensure smooth and efficient pipeline deployments, with automated testing for consistent performance.
  • Data Governance & Compliance: Ensure pipelines meet security and compliance standards, particularly for PII, and manage metadata and master data across the data catalogue.
  • Collaboration: Work closely with data scientists, data stewards, and other teams to align data ingestion and transformation efforts with business requirements.

About you:

  • Experience: Minimum 4 years in data engineering, focusing on ML feature engineering, ETL pipeline development, and data preparation for machine learning.
  • Databricks & Medallion Architecture: Proven expertise in managing ETL/ELT pipelines on Databricks, with a solid understanding of the medallion architecture.
  • ML Lifecycle & MLflow: Familiarity with the ML lifecycle and experience using MLflow for model training, calibration, and experiment tracking are highly desirable.
  • Spark & Big Data Technologies: Advanced skills in Apache Spark for big data processing and analytics.
  • Programming & Querying: Strong skills in Python for data manipulation and in SQL for query optimization and performance tuning.
  • Low Latency Data Pipelines: Experience in building and optimizing pipelines for low-latency model inference and serving in production environments.
  • CI/CD & System Integration: Familiarity with continuous integration and deployment practices for ETL/ELT pipeline development.
  • Data Pipeline Management: Expertise in managing data pipelines, ensuring adherence to security, compliance, and best practices.
  • Metadata & Master Data Management: Competency in managing metadata and master data within a technical data catalogue.
  • You are a detail-oriented ML Data Engineer passionate about building scalable, efficient data pipelines tailored for machine learning.
  • You thrive in a collaborative environment, working effectively with cross-functional teams to drive data-driven insights and personalized solutions.
  • You are proactive in troubleshooting, monitoring, and optimizing data pipelines to support high-performance ML models in production.


Company Overview

McAfee is a leader in personal security for consumers. Focused on protecting people, not just devices, McAfee consumer solutions adapt to users’ needs in an always-online world, empowering them to live securely through integrated, intuitive solutions that protect their families and communities with the right security at the right moment.

Company Benefits and Perks:

We work hard to embrace diversity and inclusion and encourage everyone at McAfee to bring their authentic selves to work every day. We offer a variety of social programs, flexible work hours and family-friendly benefits to all of our employees.

  • Bonus Program
  • Pension and Retirement Plans
  • Medical, Dental and Vision Coverage
  • Paid Time Off
  • Paid Parental Leave
  • Support for Community Involvement

We're serious about our commitment to diversity, which is why McAfee prohibits discrimination based on race, color, religion, gender, national origin, age, disability, veteran status, marital status, pregnancy, gender expression or identity, sexual orientation, or any other legally protected status.

Top Skills

Python
SQL
