
Site Reliability Engineer - Birmingham

$53,000 - $81,000
Ellofant
24th Street North 424, Birmingham
Company Size: <50
Company Type: Services
Exp Level: Junior
Job Type: Full-Time
Language: English
Visa sponsorship: No

Requirements

Must:
- 3-5 years of relevant experience in site reliability, infrastructure, or DevOps engineering
- Strong expertise in monitoring and observability tools (Dynatrace, Grafana, Prometheus, Splunk, or similar)
- Familiarity with incident management and event correlation platforms (BigPanda, ServiceNow, Moogsoft)
- Proficiency with Linux/Unix systems (RHEL) and Windows Server environments
- Hands-on experience with cloud platforms: AWS, Azure, or OpenShift
- Strong knowledge of containerization and orchestration: Kubernetes, Docker, OpenShift
- Experience with chaos engineering and fault injection frameworks (Litmus, Gremlin, AWS FIS, Azure Chaos Studio)
- Solid understanding of networking, database systems (Oracle, SQL), and distributed architectures
- Experience with event streaming platforms (Kafka) and service mesh technologies (Istio)
- Familiarity with mainframe systems and legacy infrastructure
- Experience with infrastructure as code and automation tools
- Knowledge of job scheduling systems (CA7 or similar) and middleware technologies
- Proficiency with Jira, Confluence, and ITSM tools
- Experience working in financial services or other highly regulated industries preferred
- Relevant certifications valued: AWS/Azure architecture, RHCE, VCP, Kubernetes (CKA/CKAD)
- Strong analytical thinking, problem-solving abilities, and troubleshooting skills
- Excellent written and verbal communication skills for cross-functional collaboration

Technologies

Lambda
Confluence
Datadog
Dynatrace
Istio
ITSM

Responsibilities

- Coordinate responses to critical events with application support teams and the Site Reliability Center
- Triage and respond to alerts generated through the BigPanda event correlation platform
- Assess cross-domain impacts and engage appropriate support teams or escalate as needed
- Participate in on-call rotations to provide 24x7 coverage for critical systems
- Conduct blameless post-mortems and root cause analysis to drive continuous improvement
- Design and implement automated monitoring and alerting systems using Dynatrace, Grafana, Logscale, CrowdStrike, Prometheus, Splunk, Moogsoft, and Datadog
- Create robust dashboards and implement SLAs/SLOs through comprehensive monitoring
- Analyze metrics from operating systems and applications to assist in performance tuning and fault detection
- Develop and implement chaos engineering practices using Litmus, Gremlin, Azure Chaos Studio, and Chaos Mesh
- Design fault injection experiments to validate system resilience using AWS Resilience Hub
- Build self-healing capabilities and automated remediation workflows
- Implement health checks and autoscaling solutions using AWS Lambda, Kubernetes, OpenShift, and Istio service mesh
- Manage infrastructure across mainframe systems, Windows, RHEL, and cloud platforms (AWS, Azure, OpenShift)
- Work with containerized environments, event streaming platforms (Kafka), and database systems (Oracle, SQL)
- Maintain virtualization infrastructure (VMware) and storage systems (NAS)
- Leverage ServiceNow for incident management, Jira for issue tracking, and CA7 for job scheduling
- Identify opportunities to improve application stability and promote SRE best practices
- Maintain comprehensive knowledge bases and runbooks in Confluence
- Mentor junior team members on resiliency patterns and operational excellence

Description


At Ellofant, we are a progressive consulting firm dedicated to impactful work that drives meaningful change. Our mission is to assist organizations in navigating complexities and scaling effectively through strategic insight, reliable technology, and hands-on execution. We prioritize clarity, momentum, and the value of people in our processes. Located in Birmingham, we are situated in a burgeoning technology hub that offers a balanced lifestyle, with affordable living, vibrant culture, and outdoor recreational opportunities. We offer a competitive benefits package that includes medical and dental coverage, retirement plans, and support for professional development.

How many DevOps jobs are in the United States?

There are currently 775 DevOps openings. See also: Cloud jobs, AWS jobs, Azure jobs, GCP jobs, Kubernetes jobs, Docker jobs, and Terraform jobs, all with salary brackets.

Is the US a good place for DevOps?

The US is one of the best countries in which to work as a DevOps engineer. It has a vibrant startup community, growing tech hubs and, most importantly, lots of interesting jobs for people who work in tech.

Which companies are hiring for DevOps jobs in the United States?

D3 Security Management Systems, Nurse Next Door, Snaplii, LYNKED Inc., Clarence Farm Services Ltd., DataAnnotation, and Studio 3 Marketing, among others, are currently hiring for DevOps roles in the United States.

The company with the most openings is Peraton, which is hiring for 43 different DevOps jobs in the United States. It appears to be quite committed to finding good DevOps engineers.