Site Reliability Engineer
The Site Reliability Engineer will help optimize existing systems and develop creative solutions to operations problems, and partner with other teams across the organization to create efficiencies through automation and thoughtful design.
Strong candidates have the appropriate technical experience, an interest in generating creative solutions, and a client- and customer-oriented approach to problem-solving and design.
This is a remote role that requires applicants to work EST business hours.
If you are interested and meet the qualifications below, apply with your resume today!
Responsibilities:
Collaborate with internal teams to deliver platform improvements for managing a hybrid cloud environment
Carry out architecture reviews, code reviews, capacity planning, and chaos testing as part of reliability practices
Automate deployment of AWS infrastructure and services
Ensure cloud architecture meets all guidelines and requirements for scalability and cost
Design, code, test, and deliver software to automate manual operational work
Troubleshoot and properly document incidents according to internal best practices
Conduct post-incident reviews to prevent problem recurrence and improve service reliability
Work with the development team to build software for reliability and scale
Support service-level objectives by leveraging application patterns and analytics
Design automated software and product upgrades, and resiliency patterns
Qualifications:
Bachelor's degree in Computer Science or equivalent practical coding experience
5+ years of IT infrastructure experience; 3+ years as on-call DevOps, SRE, or Cloud Operations Senior Engineer
3+ years of Linux administration experience and bein
1+ years of experience migrating and managing AWS workloads
Proven experience writing automation glue code and managing production infrastructure in AWS
AWS Certified Solutions Architect (CSA) certification is a huge bonus
Must have experience with Azure, OpenShift, and Terraform
Experience with MongoDB, Elasticsearch, and Redis
Strong automation capability using Bash, Python, PowerShell, and AWS CLI
Robust organizational skills; ability to keep track of and prioritize multiple projects and tasks effectively
Experience improving the developer experience through desktop tooling and scripts