Do you have primary responsibility for running Cassandra or Docker in production?
DevOps Engineer with AWS expertise needed. Paid relocation to South Florida, or work remotely from anywhere.
Join this cutting-edge, Google Ventures-funded software platform company based in South Florida.
We are looking for a skilled Cloud Systems Engineer who will be responsible for supporting our application infrastructure. The candidate will maintain strict production SLAs for a custom distributed application running on AWS Cassandra/Docker cloud infrastructure, and will be responsible for monitoring, alerting, incident management, SOPs, capacity planning, change management, security, and overall platform stability and improvement.
The ideal candidate will have a deep technical understanding of Linux operating systems and at least an intermediate understanding of Windows operating systems, strong experience with Amazon Web Services (AWS), advanced scripting skills, experience with a variety of monitoring solutions, a deep understanding of network communications, and excellent troubleshooting skills.
Finally, we are seeking someone who wants to be a major contributor in a small, dynamic work environment, loves a challenge, and has a strong balance of technical and people skills.
The Cloud Systems Engineer will:
• Maintain, monitor, and improve the performance and availability of the 24×7 production environment including networks, servers, databases, etc.
• Participate in on-call rotation and provide hands-on support during emergencies, outages, and service transitions.
• Define and monitor compliance with production environment SLAs.
• Participate in creating long-term and short-term strategies for scaling the production environment.
• Adhere to a comprehensive incident management program including problem management.
• Generate KPIs for service availability, uptime, and adherence to SOPs and SLAs.
Ideal qualifications for the Cloud Systems Engineer:
• Strong experience with Amazon Web Services (EC2, CloudFormation, CodeDeploy, etc.)
• Strong experience with Linux server administration
• At least basic Windows server administration skills
• Strong scripting skills (Bash, Perl, Python) with the ability to develop ad hoc tools at scale.
• Deep technical knowledge of several monitoring and analytics tools such as AWS CloudWatch, Sumo Logic, and Datadog
• Strong experience with distributed web application architectures
• Experience with NoSQL databases such as Cassandra, Redis, and Aerospike
• Experience with messaging technologies such as RabbitMQ
• Experience building and maintaining system monitoring and backup strategies
• Experience maintaining a secure production environment
• Experience with release/build/deployment management
• Deep understanding of 24×7 production operations and SLA development
• BS in Computer Science/Engineering
• 5+ years’ experience with 24×7 production operations
SherlockTalent loves to share $500 referral bonuses!