IN EmploymentAlert | K&K social resources and development GmbH | Site Reliability Engineer
Job Title


K&K social resources and development GmbH | Site Reliability Engineer


Company : K&K social resources and development GmbH


Location : Dindigul, Tamil Nadu


Created : 2025-01-07


Job Type : Full Time


Job Description

K&K Social Resources & Development GmbH is an international recruiting agency that has been providing technical resources in the European region since 1993. This position is with one of our clients in India who is actively hiring candidates to expand their teams.

Title: Site Reliability Engineer
Location: India - Remote
Employment Type: Permanent
Notice Period: Immediate or 1 week

Responsibilities:
- Manage, monitor, and optimize cloud infrastructure, preferably on Azure or Google Cloud Platform (GCP), ensuring high availability and performance.
- Design, deploy, and maintain containerized applications using Kubernetes and related tooling.
- Implement infrastructure as code (IaC) using tools like Terraform and Ansible to automate environment provisioning, configuration, and scaling.
- Build and maintain CI/CD pipelines using Jenkins, Git, and GitOps principles to ensure smooth deployment and integration processes.
- Apply SRE best practices to improve system reliability, availability, and performance through monitoring, alerting, and automation.
- Work closely with development, operations, and QA teams to streamline processes and promote a culture of continuous improvement.
- Play a key role in diagnosing and resolving production issues.
- Maintain comprehensive documentation for systems, procedures, and processes.

Required Skills:
- Strong hands-on experience in Azure or GCP cloud environments.
- Proficiency in Kubernetes, Ansible, Terraform, and Git.
- Solid understanding of CI/CD pipelines and related tools such as Jenkins and GitOps.
- Familiarity with DevOps practices, including automation, continuous integration, and continuous deployment.
- Knowledge of software development and its intersection with infrastructure and operations.
- Experience with SRE principles such as monitoring, alerting, reliability metrics, and incident management.
- Experience with scripting languages such as Python and shell scripting.
- Certifications in Azure, GCP, or Kubernetes are a plus.
- Experience with monitoring and logging tools like Prometheus, Grafana, or the ELK stack.
- Excellent problem-solving skills and a proactive attitude.
- Strong communication and collaboration skills.