Harrison Clarke have partnered with one of the largest media companies in the United States, a multi-billion-dollar company, and are currently searching for a Senior Director of Site Reliability Engineering. The Senior Director of SRE will be responsible for leading a team of talented DevOps engineers and SREs to scale the company’s infrastructure. You will bring your solid understanding of cloud architecture and deployments to help guide the company’s vision and strategy for production operations. You will use your knowledge and experience of the DevOps domain, including production support, CI/CD, and cloud service delivery, to set the team’s values and establish the production service KPIs that drive the company’s success. You are results-oriented and thrive in a results-driven culture that values initiative, leadership, and collaboration.
- Establish the service delivery culture for the business, building best-in-class service engineering capabilities in the DevOps team.
- Deliver highly performant and highly scalable datacentre and cloud services at an optimized cost point
- Develop and maintain the company’s platform services and tooling to provide event logging, metrics monitoring, CI/CD, and other backend services.
- Define SLAs and ensure the engineering teams are working towards them
- Maintain a high level of service availability. Perform quality reviews and manage operations and operational issues.
- Work across the engineering team to influence software development to meet operational needs, and influence product and cloud engineering to improve the manageability, quality, and supportability of the company’s production services.
- Establish a culture of high performance, transparency, and continuous improvement in the production support of the services and the streamlining of the development pipeline.
- 5+ years of leadership experience
- Experience managing both software development and production SaaS is a must
- Experience with Agile development techniques and methodologies, and with CI/CD
- Experience designing, building, and managing large-scale infrastructure in AWS, including experience leveraging one or more coding languages for automation.
- Prior start-up experience; managing small to mid-size teams (10-20 people).
- Direct, hands-on experience leveraging large repositories of data to track, drive, and measure operational effectiveness is required
- Experience using Hadoop/Hive, Impala, Spark, or Zeppelin is a plus.
- Solid understanding of networking, network routing, and client-server architecture.
- Proven experience leading positive change, empowering people, cultivating product technology visions and innovative solutions, and fostering effective engineering practices and culture.
- Working knowledge of several configuration management systems (Chef, Ansible, Puppet, Salt, Terraform, etc.) and several monitoring tools and frameworks (Splunk, ELK, Nagios, Zabbix, InfluxDB, Prometheus, etc.)
- Experience driving process improvements, with a strong focus on leveraging technology to establish fluid interactions and interfaces between teams
- Solid foundation in Linux / Unix
Package: $200,000+ sign-on bonus + equity