Senior Site Reliability Engineer

Microsoft

Software Engineering

USD 119,800-234,700 / year

Posted on Jun 6, 2025

Redmond, Washington, United States

Date posted: Jun 05, 2025

Job number: 1827718

Work site: Up to 50% work from home

Travel: 0-25%

Role type: Individual Contributor

Profession: Software Engineering

Discipline: Site Reliability Engineering

Employment type: Full-Time

Overview

Halo Studios is building the future of the blockbuster Halo series of video games with Unreal Engine. As part of Xbox Game Studios, the Halo franchise encompasses games, novels, comics, licensed collectibles, apparel, and more with a shared vision of heroism, mystery, and wonder. With multiple projects in development, join our team as we forge the next generation of games and experiences in our award-winning sci-fi universe.

You will be a key member of the Halo Studios IT Engineering team, responsible for the studio's data, IT services, and infrastructure. You will contribute to the architecture and design of new on-prem and cloud infrastructure, while continuing to drive optimization, performance, security, and reliability with cutting-edge technologies and automation. You will empower artists, developers, and others in our studio by proactively designing technical solutions to maximize their efficiency. Along the way, you will be a trusted voice who shares your knowledge and expertise within our team and other teams in the studio. You will be joining a fast-paced team that constantly provides new opportunities to learn and grow. Roles at our studio are flexible, and you can work from home up to two days a week in this role.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Qualifications

Required qualifications:

6+ years technical experience in software engineering, network engineering, or systems administration

OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration

OR Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration.

5+ years of experience managing technical infrastructure, including 3+ years of hands-on Linux system administration involving troubleshooting, performance tuning, security configuration, and automation of core OS services.

3+ years of experience building and maintaining infrastructure automation using scripting languages (e.g., PowerShell, Python, Bash) and infrastructure-as-code tools (e.g., Docker, Kubernetes, Terraform, Azure Bicep), with a focus on deploying and managing containerized applications and services.

5+ years of experience owning and operating production-grade infrastructure systems at scale, including responsibility for reliability, performance tuning, monitoring/observability, and incident response across hybrid or cloud-native environments.

Other Requirements:

Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:

Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Preferred qualifications:

Experience with troubleshooting and designing solutions with core infrastructure technologies including on-prem and/or Azure networking, cloud technologies, Active Directory / Entra ID

Experience maintaining high-availability version control systems (e.g., Perforce) and CI/CD build infrastructure

Experience with infrastructure observability, incident response, and capacity planning for cloud and hybrid systems

Experience with Entra ID authentication (OAuth 2.0, OIDC, SAML) for Azure Resources and App Registrations

Site Reliability Engineering IC4 - The typical base pay range for this role across the U.S. is USD $119,800 - $234,700 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $158,400 - $258,000 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay

Microsoft will accept applications for the role until June 26, Year.

#gamingjobs #halo #halojobs

Responsibilities

Architect, implement, and optimize critical hybrid and cloud-based IT infrastructure utilizing Infrastructure as Code (IaC) technologies (e.g., Docker, Terraform, AKS) to ensure high availability, scalability, security, and operational efficiency.

Design, scale, and maintain Perforce, Swarm, and build farm infrastructure used by game development teams and automated build environments, to ensure robust, high-performance workflows across distributed game development environments.

Design and implement Azure Networking solutions including Site-to-Site tunnels, App Gateway, Private Endpoints/Private Link, DNS, and network security for Azure resources, ensuring secure and reliable connectivity.

Architect and deliver automation solutions to improve service health, manageability, reliability, telemetry, and alerting.

Implement data governance, storage, backup, and disaster recovery solutions for a multi-Petabyte Azure-based environment, ensuring data integrity, security, and performance.

Research, evaluate, and integrate emerging tools and methodologies into the technology roadmap, to continuously optimize efficiency, reliability, and scalability.

Produce and maintain clear and accurate technical documentation and design specifications that align with best practices.

Collaborate with software engineers, project management, and operations teams to improve and optimize infrastructure and evolve services, ensuring alignment with organizational goals.

Participate in on-call rotations, lead incident response, and conduct postmortems to identify root causes and implement preventative infrastructure improvements.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

Industry-leading healthcare

Educational resources

Discounts on products and services

Savings and investments

Maternity and paternity leave

Generous time away

Giving programs

Opportunities to network and connect

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.
