Site Reliability Engineering Manager, Storage - Apple Cloud Services

London, Greater London, United Kingdom
Software and Services

Summary

Posted:
Role Number: 200557072
Are you a talented Site Reliability Engineering Manager with a passion for distributed storage systems? Ready to be part of a focused and lively team bringing distributed storage technologies to Apple's infrastructure? At Apple, scale is huge and impact is enormous. Join our team and be part of our mission, which is to power storage behind many of Apple's most popular services. Bring passion and dedication to your job and there's no limit to what you can achieve!

Description

The Storage SRE organization is seeking a strong engineering leader to manage Storage-focused SRE teams, working closely with peer SRE teams and development partners. You'll help build and optimize the Storage stack from the bare metal to the top of the application, helping design provisioning systems, code deployment, monitoring, alerting, and performance improvements. Together with the team, you'll help run the storage used by some of Apple's largest teams.

Minimum Qualifications

  • Proven experience in a leadership role within an SRE or DevOps team, specifically focused on distributed storage.
  • Strong background in distributed systems, storage architectures, and data management.
  • Deep knowledge of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts.
  • Experience leading initiatives to enhance the scalability and performance of distributed storage systems.
  • Experience collaborating with engineering teams to design and implement robust and scalable storage solutions.
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Preferred Qualifications

  • Experience with Kubernetes, Docker, and containerization.
  • Proficiency in at least one of these programming languages: Golang, Java, or Rust.
  • Knowledge of distributed storage (block storage) or similar large-scale distributed databases.
  • Familiarity with CI/CD pipelines and infrastructure as code (Terraform, Ansible).
  • Knowledge of security best practices and compliance requirements in storage systems.
  • Understanding of data durability, consistency models, and storage performance optimization techniques.
