805 IT & Software Developer jobs in the US

Site Reliability Engineer (SRE) - Strongsville

$53,000 - 81,000
Ellofant
Ellsworth Drive 19696, Strongsville
Company Size
<50
Company Type
Services
Exp Level
Junior
Job Type
Full-Time
Language
English
Visa sponsorship
No

Requirements

Must:
- 3-5 years of relevant experience in site reliability, infrastructure, or DevOps engineering
- Strong expertise in monitoring and observability tools (e.g., Dynatrace, Grafana, Prometheus, Splunk)
- Familiarity with incident management and event correlation platforms (e.g., BigPanda, ServiceNow, Moogsoft)
- Proficient in Linux/Unix systems (RHEL) and Windows Server environments
- Practical experience with cloud platforms: AWS, Azure, or OpenShift
- Solid understanding of containerization and orchestration (Kubernetes, Docker, OpenShift)
- Experience with chaos engineering and fault injection frameworks (Litmus, Gremlin, AWS FIS, Azure Chaos Studio)
- Comprehensive knowledge of networking, database systems (Oracle, SQL), and distributed architectures
- Exposure to event streaming platforms (Kafka) and service mesh technologies (Istio)
- Familiarity with mainframe systems and legacy infrastructure
- Experience in infrastructure as code and automation tools
- Knowledge of job scheduling systems (CA7 or similar) and middleware technologies
- Proficiency with Jira, Confluence, and ITSM tools
- Background in financial services or other highly regulated industries preferred
- Relevant certifications valued (AWS/Azure architecture, RHCE, VCP, Kubernetes CKA/CKAD)
- Strong analytical skills, problem-solving acumen, and troubleshooting expertise
- Excellent written and verbal communication skills for effective cross-functional collaboration

Technologies

Lambda
Confluence
Datadog
Dynatrace
Istio
ITSM

Responsibilities

- Coordinate responses to critical incidents with application support teams and the Site Reliability Center
- Triage and respond to alerts generated by the BigPanda event correlation platform
- Assess cross-domain impacts and engage appropriate support teams or escalate as necessary
- Participate in on-call rotations to ensure 24/7 coverage for critical systems
- Conduct blameless post-mortems and root cause analyses to foster continuous improvement
- Design and implement automated monitoring and alerting systems using tools like Dynatrace, Grafana, Logscale, and Datadog
- Create effective dashboards and implement SLAs/SLOs through thorough monitoring
- Analyze metrics from operating systems and applications for performance tuning and fault detection
- Develop and implement chaos engineering practices using tools such as Litmus and Gremlin
- Design fault injection experiments to validate system resilience using AWS Resilience Hub
- Build self-healing capabilities and automated remediation workflows
- Implement health checks and autoscaling solutions using AWS Lambda, Kubernetes, and OpenShift
- Manage infrastructure across mainframe systems, Windows, RHEL, and cloud platforms like AWS and Azure
- Work with containerized environments, event streaming platforms (Kafka), and databases (Oracle, SQL)
- Maintain virtualization infrastructure (VMware) and storage systems (NAS)
- Utilize ServiceNow for incident management, Jira for issue tracking, and CA7 for job scheduling
- Identify opportunities to enhance application stability and advocate for SRE best practices
- Maintain detailed knowledge bases and operational runbooks in Confluence
- Mentor junior team members on resiliency patterns and operational excellence

Description


At Ellofant, we are a forward-thinking consulting firm dedicated to facilitating impactful change for our clients. We navigate complexity and support businesses in scaling through strategic insight, trusted technology, and hands-on execution. Our team values clarity, momentum, and the human aspect of our work. We're seeking a seasoned Site Reliability Engineer to join our infrastructure resiliency team. This role offers an opportunity to ensure system stability and reliability across diverse technology environments. We foster an inclusive workplace and offer a competitive benefits package, making Ellofant a great place to grow professionally.

How many DevOps jobs are in the United States?

Currently, there are 805 DevOps openings. See also: Cloud jobs, AWS jobs, Azure jobs, GCP jobs, Kubernetes jobs, Docker jobs, and Terraform jobs, all with salary brackets.

Is the US a good place for DevOps?

The US is one of the best countries to work in as a DevOps engineer. It has a vibrant startup community, growing tech hubs and, most importantly, lots of interesting jobs for people who work in tech.

Which companies are hiring for DevOps jobs in the United States?

D3 Security Management Systems, Nurse Next Door, Snaplii, LYNKED Inc., Clarence Farm Services Ltd., DataAnnotation, and Studio 3 Marketing, among others, are currently hiring for DevOps roles in the United States.

The company with the most openings is Peraton, which is hiring for 43 different DevOps jobs in the United States. They are probably quite committed to finding good DevOps engineers.