VMware Company Overview: At VMware, we believe that software has
the power to unlock new opportunities for people and our planet. We
look beyond the barriers of compromise to engineer new ways to make
technologies work together seamlessly. Our cloud, mobility, and
security software forms a flexible, consistent digital foundation
for securely delivering the apps, services, and experiences that are
transforming business innovation around the globe. At the core of
what we do are our people who deeply value execution, passion,
integrity, customers, and community. Shape what’s possible today at
http://careers.vmware.com.
As part of the VMware global standards for integrity, you will be
required to go through a pre-employment screening process before
you join.
All job applications will be treated with strict
confidentiality.
Why will you enjoy this new opportunity?
We are looking for a talented Site Reliability Engineer to join our
team and help build VMware’s highly successful SaaS product, VMware
Cloud. You’ll be improving our CI/CD and monitoring codebase,
automating solutions to complex problems, and building new tools and
integrations as we expand to AWS and other hyperscalers. You will
benefit from our well-established processes and pipelines, as we are
always striving for excellence. As an SRE you will be able to solve
increasingly complex problems in the areas of automation and
observability. You will be the guardian and supporter of various R&D
teams, helping them meet our high standards for security and
production-grade microservices. You must be strict, strive for
perfection, and pay attention to detail when it comes to production
readiness. In return, VMC offers great career opportunities, an
interesting technology stack to work with, a place in a well-sized
team with a supportive atmosphere and culture, growth through
challenging tasks, and management attention and guidance.
What will you do?
Within a month:
• You will have a learning buddy and start onboarding with our
well-prepared documentation. Create accounts and get access to
various systems
• Start fixing existing issues in our Staging Systems and prepare
for Prod Support
• Get familiar with PagerDuty, Jira Service Desk and our CI/CD
pipelines
• Participate in our Scrum meetings
• Work with configuration management tools on Linux
Second Month:
• Start supporting our PROD System on-call rotation with the help of
the team
• Deep dive into Python and our automation framework system
• Start automating things. Take tasks from our backlog
• Conduct post-mortems to analyze and prevent recurring
failures
• Handle seamless upgrades of infrastructure and services through
automation
Half a Year:
• Architect and implement advanced cloud-based monitoring, alerting
and reporting
• Work closely with software engineering teams to improve the
availability of services and help with migrations to the new version
of the platform
• Maintain 99.995% availability of VMware's global services
platform
• Participate in roadmap planning and drive team initiatives in
the scope of automation and observability
• Identify, gather, analyse and automate responses to key
performance metrics, logs, and alerts
• Ensure infrastructure security compliance with standards such as
PCI, HIPAA and SOC2
What assignments and tasks will you be performing on a regular
basis?
• Manage code using a preferred scripting language – Bash and/or
Python
• Use Git and related workflows for code management
• Support an enterprise-level SaaS environment. You must be
comfortable driving SRE work end to end
• Help R&D teams deliver their microservices to the new version of
the platform
• Follow procedures for different compliance standards - IL6, PCI,
SOC2 etc.
• Strengthen observability and performance, apply critical security
recommendations
• Automate every repeatable process you see, such as patching,
upgrades and auto-remediation
• Help with chaos engineering