K&K Social Resources & Development GmbH is an international recruiting agency that has been providing technical resources in the European region since 1993. This position is with one of our clients in India who is actively hiring candidates to expand their teams.

Title: Site Reliability Engineer
Location: India - Remote
Employment Type: Permanent
Notice Period: Immediate to 1 week

We are looking for a Site Reliability Engineer (SRE) with a strong infrastructure background to ensure the reliability, scalability, and performance of our systems. This role will focus on managing cloud infrastructure, automating processes, and enhancing operational efficiency.

Key Responsibilities
- Develop and maintain infrastructure as code using Terraform to provision and manage resources.
- Design, deploy, and manage containerized applications using Kubernetes (K8s).
- Automate operational tasks through scripting in Python, Bash, or another scripting language.
- Build and enhance CI/CD pipelines for seamless deployment and system updates.
- Manage and optimize systems on any major cloud platform (AWS, Azure, or GCP).
- Monitor system performance, troubleshoot issues, and implement improvements to ensure high availability.
- Establish robust monitoring, logging, and alerting systems to maintain operational excellence.
- Collaborate with cross-functional teams to design resilient and scalable systems.

Required Skills and Qualifications
- 3+ years of experience in Site Reliability Engineering or Infrastructure Engineering.
- Strong expertise in Terraform for infrastructure automation.
- Hands-on experience with Kubernetes (K8s) and container orchestration.
- Proficiency in at least one scripting language (e.g., Python, Shell, or Ruby).
- Experience with CI/CD tools like Jenkins, GitLab, or GitHub Actions.
- Familiarity with cloud platforms such as AWS, Azure, or GCP.
- Solid understanding of infrastructure components, networking, and system security.
- Knowledge of monitoring tools like Prometheus, Grafana, or similar.

Preferred Skills
- Experience with system performance optimization and scalability strategies.
- Familiarity with disaster recovery planning and implementation.
- Knowledge of configuration management tools (e.g., Ansible, Puppet, or Chef).