Senior Software Engineer - Site Reliability
Division: Bethesda Softworks | Department: Platform
Rockville, MD, US
The Bethesda.net team is seeking a Site Reliability Engineer (SRE) to help solve our toughest problems. Working on the platform’s technical foundations, you will help improve performance and stability. Working with our product teams, you will mentor engineers and guide future development to bring sustainable growth to our services.
Given the choice between fast and perfect, you seek the proper balance. Your experience brings an understanding of how past and present choices affect the future potential of systems and teams. When you see repetitive work or manual processes, you actively seek to reduce this wasted effort and increased risk.
- Review new and existing services for performance, reliability, and sustainable coding practices
- Understand and define infrastructure as code to support the systems being developed
- Write clean, maintainable code that is suitable for continuous integration and deployment (CI/CD), following best practices and software guidelines
- Design, engineer, and maintain common code libraries that can be used by engineers to leverage the platform in a consistent manner
- Work closely with engineers throughout the development process to ensure standards for infrastructure and managed services are understood and implemented correctly
- Understand diverse languages and technologies - Python, Go, Nginx, Redis, MySQL, AWS technologies, etc.
- Investigate, assess, and make recommendations for new technologies
- Work with tech leads and other engineering leaders to build resource utilization estimates
- Support systems in a 24x7 environment including troubleshooting, hot fixing, and root cause analysis
- Act as an agent of change and improvement by observing live systems and providing recommendations for continuous improvement for all areas of development
- Investigate and identify the root cause of issues in all stable and live environments
- Act as the subject matter expert on AWS cloud infrastructure and managed services
- Identify and implement automation for repeated and time-consuming tasks
- Participate in the on-call rotation with the rest of the engineering team to provide escalated support for Tier 1 and Tier 2
- Perform under minimal supervision on significantly complex assignments
- Other duties as assigned
- 4 years of experience as a software engineer
- Strong technical background and a good grasp of software engineering principles, with exceptional problem-solving, design, programming, and testing skills
- Experience developing and designing software solutions in an online environment
- Experience operating and deploying large-scale and complex systems in a cloud environment
- Experience with configuration management systems
- Experience engineering automated build/deploy systems that include continuous integration as well as infrastructure as code
- Understanding of, and hands-on experience implementing, Docker and other container-based systems
- Able to troubleshoot complex systems in a live environment quickly and effectively
- Familiarity with Linux system administration
- Familiarity with network engineering
How to Apply
No Recruiters or Agencies Please
Please Note: Individuals submitting resumes or otherwise responding to employment opportunities are NOT considered applicants until they apply for a particular position and have been invited to complete the company's employment application.