VMware Company Overview: At VMware, we believe that software has
the power to unlock new opportunities for people and our planet. We
look beyond the barriers of compromise to engineer new ways to make
technologies work together seamlessly. Our cloud, mobility, and
security software forms a flexible, consistent digital foundation
for securely delivering the apps, services and experiences that are
transforming business innovation around the globe. At the core of
what we do are our people who deeply value execution, passion,
integrity, customers, and community. Shape what’s possible today at
As part of the VMware global standards for integrity, you will be required to go through a pre-employment screening process before you join.
All job applications will be treated with strict confidentiality.
Why will you enjoy this new opportunity?
We are looking for a talented Site Reliability Engineer to join our team and help build VMware’s highly successful SaaS product, VMware Cloud. You’ll be improving our CI/CD and monitoring codebase, automating solutions to complex problems, and building new tools and integrations as we move to AWS and other hyperscalers. You will benefit from our well-established processes and pipelines, as we strive for excellence. As an SRE, you will solve increasingly complex problems in the areas of automation and observability. You will be the guard and supporter for various R&D teams, helping them meet our high standards for security and production-grade microservices. You must be strict, strive for perfection, and pay attention to detail when it comes to production readiness. In return, VMC offers great career opportunities, an interesting technology stack to work with, a place in a well-sized team with a supportive culture, and growth through challenging tasks, management attention, and guidance.
What will you do?
Within a month:
• You will have a learning buddy and start onboarding with our well-prepared documentation. You will create accounts and get access to various systems
• Start fixing existing issues in our Staging Systems and prepare for Prod Support
• Get familiar with PagerDuty, Jira Service Desk and our CI/CD pipelines
• Participate in our Scrum meetings
• Work with configuration management tools on Linux
• Start supporting our PROD System on-call rotation with the help of the team
• Deep dive into Python and our automation framework system
• Start automating things. Take tasks from our backlog
• Conduct post-mortems to analyze and prevent repeatable failures
• Handle seamless upgrades of infrastructure and services through automation
Within half a year:
• Architect and implement advanced cloud-based monitoring, alerting and reporting
• Work closely with software engineering teams to improve the availability of services and help with migrations to our new version of the platform
• Maintain 99.995% availability of VMware's global services platform
• Participate in roadmap planning and drive team initiatives in scope of automation and observability
• Identify, gather, analyse and automate responses to key performance metrics, logs, and alerts
• Ensure infrastructure compliance with security standards such as PCI, HIPAA and SOC2
What assignments and tasks will you be performing on a regular basis?
• Manage code using a preferred scripting language – Bash and/or Python
• Use Git and related workflows for code management
• Support an enterprise-level SaaS environment. You must be comfortable driving SRE work end to end
• Help R&D teams deliver their microservices to the new version of the platform
• Follow procedures for different compliance standards - IL6, PCI, SOC2 etc.
• Strengthen observability and performance, apply critical security recommendations
• Automate everything you see as a repeatable process, such as patching, upgrades and auto-remediation
• Help with chaos engineering