Senior DevOps Engineer, Cloud Monitoring and Performance
Sensity Systems’ product technology stack spans hardware, a modern software stack, APIs, public and private networks, and worldwide deployment. The needs for cloud operations, monitoring and security are broad and complex. We are seeking an experienced engineer with relevant work experience and, most importantly, a “can-do” attitude, who is ready to take on new challenges and quickly come up to speed to drive critical initiatives. This position is part of a newly formed Cloud Services group reporting to the Director of Cloud Services.
- Cloud Services: AWS, EC2, S3, Docker
- Linux (Ubuntu)
- Languages: Python, Go
- Technology Stack: ZeroMQ, Node4J, Cassandra, Neo4j, ELK, MQTT, etc.
- Tools: Git, Docker, CloudFormation, SaltStack
- Security: Internet security, SSL, HTTPS, TLS, DTLS etc.
- Collaborate extensively with engineering, QA, field performance and support teams to deliver a solid automated monitoring and alerting platform.
- Establish best practices around monitoring and define measurable metrics, including SLAs.
- Operationalize new customer onboarding and certify production readiness with alerts configured.
- Work closely with field performance and support teams to understand gaps in the types of issues found in the field and by support, and build monitoring and alerting to cover them.
- Actively participate in architecture and design discussions.
- Monitor, troubleshoot and resolve issues across applications, APIs, databases, etc.
- Take on new responsibilities, such as CI/CD and helping the DevOps team.
Skills & Ideal Experience:
- 8-10 years of related experience, ideally in a fast-paced, growing start-up company.
- BS in Computer Science, or equivalent experience.
- Good programming skills in Python or Go, and demonstrated ability to grasp new programming languages quickly.
- Experience with cloud configuration management, preferably SaltStack, and the ability to architect monitoring with it.
- Ability to define and lead 24x7 monitoring for IaaS, PaaS and SaaS (API), from low-level monitors to high-level service-level monitors (e.g. process/service, log, REST API and storage monitors).
- Experience with monitoring tools such as Nagios and Zabbix, and APM tools such as New Relic and AppDynamics.
- Extensive experience building actionable, intelligent and possibly self-serve alerts, along with the accompanying standard operating procedures and runbooks.
- Experience integrating with Zendesk, PagerDuty, JIRA, etc.
- Experience building monitoring metrics and analytics for both the API and the cloud platform.
- Ability to diagnose complex performance issues end to end and determine root cause.
- Experience building analytics from data collected by various monitors, e.g. top N APIs, top N customers.
- Ability to build a capacity planning map from historical data collected from production systems.
- Previous work in one of the public clouds: AWS (preferred), Rackspace, Google Cloud, etc.
- Good understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring and storage systems, and load balancing.
- Experience in the Linux environment and a good understanding of its fundamentals and internals: filesystems, security, networking etc.
- Working knowledge of internet security, SSL, HTTPS, TLS, DTLS etc.
- Solid understanding of software application builds and deployments, and the ability to drive CI/CD from development through to production.
- A demonstrated passion for automation.
- Results-oriented, collaborative professional with ability to work successfully in a matrixed organization.
- Clear communicator who works well in a team environment and helps lift team spirit.
- Grit, drive and a strong feeling of ownership.
- Innovative professional with a bias towards action rather than simply maintaining status quo.