Kaunas, Lithuania | Job ID R81785 | Date posted Dec. 14, 2018
*Remote / Virtual Opportunity
Have you ever wondered what it takes to run a cloud infrastructure at scale? Do you enjoy a challenge and improving services that already operate at scale? Our team of engineers at Virtustream works to design, build, and operate an infrastructure-as-a-service cloud for some of the biggest companies in the world. The platform team is looking for individuals with diverse experience and skills to design, build, and operate the solutions needed for the exciting new Virtustream Cloud. As a platform SRE, you will develop extensible services and platforms that provide the service insight, automated remediation, and service management at scale needed to maintain high service reliability with low touch.
● Design, build, and operate a global compute platform and related services
● Develop solutions for service monitoring, automated remediation, measuring availability and reliability, performance, analytics and security
● Design services and libraries on top of traditional VMware environments.
● Maintain environment state with configuration tools and event-driven automation
● Participate in collaborative projects with software engineering teams
● Advise management on service onboarding strategies and execution
● Participate in troubleshooting, capacity planning and analysis, and performance analysis
● Serve as part of a 24x7 service watch rotation team
● Experience engineering, operating, troubleshooting, administering and scaling platform services with code
● Production experience using configuration management tools (e.g., Ansible, SaltStack, Puppet, Chef)
● Proficiency implementing and maintaining continuous integration and delivery workflows.
● Operational experience with datacenter storage platforms (e.g., vSAN, Ceph, Fibre Channel, iSCSI, NFS)
● Experience supporting and troubleshooting production virtualization environments at scale
● Experience managing Unix/Linux systems in production
● A tenacious ability to diagnose and fix performance and reliability problems
● Experience with VMware products, specifically cloud-related solutions such as vSphere, vCenter, ESXi, vSAN, and NSX, or with competing cloud solutions and products
● Understanding of Unix/Linux systems from kernel to shell and beyond, taking in system libraries, file systems, and client-server protocols
● Experience with backup and disaster recovery services such as VMware SRM
● 3+ years of experience as a DevOps engineer, operations engineer, or SRE (development for large online services)
● 3+ years of experience building and operating highly available and scalable infrastructure solutions
● Experience working in distributed, remote teams across multiple time zones is a plus
● Ability to travel for team meetings.