About the Role:
We are looking for an experienced L2 Operations Manager to lead cloud platform operations and ensure seamless, high-availability payment processing in a regulated, fast-paced environment. This role requires deep expertise in payment technologies (ACH, Fedwire, EPP, SWIFT, ISO 20022, TCH), cloud infrastructure (AWS, Azure, GCP), and strong command of SRE principles, observability, and incident management. You’ll be accountable not only for driving operational excellence and reliability but also for fostering a culture of continuous improvement, cost efficiency, and team capability growth.

Key Responsibilities:
Lead L2 Operations to ensure high availability, rapid incident resolution, and thorough RCA, aligned with SRE principles.
Define and manage SLIs/SLOs, drive automation, and implement resilience measures like auto-scaling and self-healing.
Oversee cloud infrastructure (AWS, Azure, GCP) for scalability, cost efficiency, and performance optimization in partnership with FinOps.
Manage observability platforms (Datadog, Prometheus, Grafana, Splunk, ELK), building custom dashboards, alerts, and log parsers.
Automate operational tasks and health checks using Python, Bash, Terraform, or Ansible.
Lead Post-Incident Reviews, maintain runbooks/playbooks, and promote a blameless, learning-focused culture.
Own 24x7 support coverage, on-call schedules, and DR readiness, and ensure SLA compliance.
Collaborate with DevOps, Product, and Engineering on system performance, chaos testing, and risk mitigation strategies.
Drive team capability through mentoring, training, and continuous improvement initiatives.
Provide leadership in payment infrastructure upgrades, transaction automation, and adherence to evolving standards (ISO 20022, SWIFT, TCH).

Required Skills and Qualifications:
15+ years of experience in IT Operations, Cloud Infrastructure, and SRE, with 5+ years in leadership roles.
In-depth understanding of Enterprise Payments Platform (EPP), ACH, Fedwire, and ISO 8583/20022 protocols.
Strong hands-on experience with AWS, Azure, or GCP for cloud operations and infrastructure deployment.
Expertise in observability platforms (Datadog, Prometheus, Grafana, Splunk, ELK).
Familiarity with incident resolution, RCA processes, and ITIL frameworks.
Experience with automation tools and scripting languages (Python, Bash, Terraform, PowerShell, Ansible).
Sound knowledge of Docker, Kubernetes, microservices, and CI/CD tools like Git, Jenkins, or GitLab.
Proven track record in cost governance, collaboration with FinOps, and managing tool spend.
Experience working in Agile, Scrum, or ITSM environments.
Excellent communication skills with the ability to convey complex technical concepts to non-technical stakeholders.

Job Title
L2 Ops Manager