Software Engineering Manager

Posted 3 Days Ago
San Francisco, CA
215K-265K Annually
7+ Years Experience
Artificial Intelligence • Cloud • Hardware • Machine Learning • Other • Software • Infrastructure as a Service (IaaS)
We build infrastructure for machine learning
The Role
Manage a team building and maintaining the on-demand portal and API for a cloud GPU provider. Lead architecture at scale, drive best practices, and ensure business goals are met with a focus on security and reliability. A bias toward action and strong communication skills are essential.
Summary Generated by Built In

Voltage Park is on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities to seed-stage startups and nonprofits. We operate a massive fleet of 24,000 fully-owned NVIDIA GPUs colocated across four top-tier data centers, and we are the only cloud provider offering a platform that shows all available GPUs with transparent, market-based pricing, in addition to long-term reserve contracts for our customers. 

To truly democratize AI infrastructure, we’re building a world-class web portal and API that offers customers instant, on-demand access to our compute clusters. We’re looking for an Engineering Manager with a bias to action to serve a critical role in making this a reality. 

This role is onsite in our San Francisco office.

What You’ll Do:

  • Manage a small team building out and maintaining Voltage Park’s on-demand portal and API, giving our customers instant access to our compute clusters

  • Take ownership of the product

  • Guide the team’s vision and day-to-day priorities, and provide career development plans and performance feedback

  • Build scalable, quality code that lasts generations

  • Lead the implementation of best practices for security, scalability, and reliability 

  • Provide clarity of decision making, direction, and progress for the team and the company

  • Proactively detect issues before they become big problems

  • Drive incident response, blameless postmortems, and continuous improvement

  • Coordinate complex technical project work with engineering teams across the stack

Qualifications:

  • Experience as a manager in a start-up environment, with a track record of building, launching and maintaining product-grade applications at scale

  • Hands-on experience leading architecture at scale

  • A passion for coding – you’re comfortable jumping in to code and bug fix alongside the team when needed

  • Strong communication skills and the ability to gain consensus across teams at all levels

  • Consistent focus on business goals, customer needs, and data-driven decision making

  • Eye for security and best practices in scalability and reliability

  • Bias toward action and a get-it-done attitude

Nice-to-haves:

  • Experience working in R&D or a skunkworks environment

  • Experience as a customer of Voltage Park, TensorDock, or another cloud GPU provider, with ideas on how to improve the service

Voltage Park is an equal opportunity employer and makes employment decisions on the basis of merit. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, protected veteran status, or any other characteristic protected under federal, state, or local law. If you require an accommodation during the job application process, please notify your recruiter.

Compensation Range: $215K - $265K

Top Skills

Python

The Company
HQ: Berkeley, CA
45 Employees
Remote Workplace
Year Founded: 2023

What We Do

The market for cutting-edge ML compute is broken. Startups, researchers and even big AI labs are scrambling to buy or rent access to the latest chips for ML training. But demand far outstrips supply, and what’s available is only accessible to the well-resourced, placing an artificial damper on innovation.

To solve this challenge, we've launched Voltage Park, and we’re on a mission to make machine learning infrastructure accessible to all, from large enterprises and research universities, to seed-stage startups and nonprofits.

With around 24,000 NVIDIA H100 GPUs, the Voltage Park cloud is one of the most powerful collections of cutting-edge ML compute in the world. Our clusters consist of 80GB H100 SXM5 GPUs fully interconnected with 3.2T InfiniBand. We currently offer bare-metal access for large-scale users that need peak performance. As we spin up our infrastructure, we will soon add support for short-term leases and hourly billing, along with support for familiar tools like Slurm, Kubernetes, and Mosaic for easy integration into existing training frameworks.

Why Work With Us

You’ll play a pivotal role as a member of the founding team that will change the face of machine learning infrastructure. As an early hire, you’ll have outsize influence in defining the company’s culture and ensuring mission success.

Voltage Park Offices

Remote Workspace

Employees work remotely.

Voltage Park is a 100% remote company.

Typical time on-site: None
HQ: Berkeley, CA
