Voltage Park

Platform Engineer

Posted 2 Days Ago
Remote
2 Locations
Senior level
As a Platform Engineer at Voltage Park, you will maintain systems vital for platform reliability, develop software for automation, conduct root cause analyses on downtimes, and write scripts to monitor server performance. You'll be integral to shaping the team's engineering practices and company culture.
The summary above was generated by AI

About Voltage Park On-Demand

Voltage Park’s mission is to make AI infrastructure accessible to all. Today, we own 24,000+ H100s and operate 7+ data centers across the US. We serve customers of all sizes, from small research labs to large enterprises. We’re in search of a Platform Engineer to join our On-Demand team, where you’ll help us build a platform that lets customers flexibly rent these GPUs for as short or as long a time as they want.

Our team is small, highly motivated, and focused on engineering excellence. We all operate with a founder mentality; several of us have founded and launched prior businesses. 

All team members are hands-on and contribute directly to our team’s mission.

If you join us, you’ll be an early team member and help us shape:

  • Our future company culture

  • Our engineering practices

  • People that we hire

  • The direction & focus of our products

Note: We are not able to provide sponsorship for this position.

What You’ll Do

  • Maintain servers & systems integral to our platform’s reliability 

  • Develop software — either for automation or for front/back end

  • Write automation for backend orchestration systems (MAAS, libvirt, pfSense)

  • Track downtimes and conduct root cause analyses (RCAs)

  • Write automation scripts to audit performance anomalies across our fleet of servers

Who You Are

  • 5+ years Linux administration (Ubuntu/Debian focus)

  • Strong experience with Libvirt (KVM) virtualization

  • Proficient in Python and Bash scripting

  • Experience with automation tools (preferably Ansible)

  • Solid networking knowledge

  • Experience with PostgreSQL

  • Familiarity with Ceph and NFS storage solutions

Ideal Experiences

  • Experience with GPU virtualization and PCIe passthrough

  • Knowledge of Proxmox VE, OpenStack, or OpenNebula

  • Experience with Docker and Kubernetes

  • Experience with bare metal automation (e.g., Ubuntu MAAS)

  • Monitoring experience (Prometheus, Grafana, ELK Stack)

  • Experience with infrastructure-as-code tools (e.g., Terraform)

  • Experience with Redis

  • Experience working with Python (backend), Postgres (database), and React + Tailwind (frontend)

  • Former technical founder: you’re sharp, business-minded, action-oriented, and can move quickly

Voltage Park On-Demand Team Culture

  • You are ambitious and always looking for ways to improve. We operate nearly $1B worth of assets, and the opportunity for impact is limitless. This role will give you the most responsibility you’ve ever had and hold you to higher standards than other companies you’ve worked at. Expect to do the best and most impactful work of your career at Voltage Park.

  • You’re focused on impact and don’t get lost in the weeds on details that don’t matter. You’re excited to work on whatever solves the biggest customer problems, not just the coolest technical challenges. You understand when making 80/20 tradeoffs is the right thing to do and never compromise on your high standards when making those tradeoffs.

  • You have a strong work ethic. As a startup, we are trying to change the world and take on many large, $B+ competitors. Raw hours make a huge difference when facing overwhelming odds. Having a strong work ethic is a competitive advantage.

  • You take ownership of your initiatives. When you say you'll do something, you get it done without anyone having to check in on you. You ship fully baked features end-to-end. You’re accountable for the deadlines you set, and you figure out a solution if something unexpected occurs.

  • You make tradeoffs when necessary and are open to new ideas. As a startup, we have to make decisions quickly and often with incomplete information. We also face problems that have no obvious solutions. Sometimes, the best ideas sound crazy at first. You don’t dismiss your teammates’ ideas and are open to being challenged by others.

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $120K - $180K


Top Skills

Ansible
Bash
Ceph
Debian
Docker
ELK Stack
Grafana
Kubernetes
Libvirt
Linux
MAAS
NFS
Postgres
Prometheus
Python
React
Redis
Tailwind
Terraform
Ubuntu
