IN EmploymentAlert | Hyperion AI | Senior Java Architect - India

Job Title


Hyperion AI | Senior Java Architect - India


Company : Hyperion AI


Location : Bareilly, Uttar Pradesh


Created : 2025-01-07


Job Type : Full Time


Job Description

Experience Level: Senior/Lead Java Architect (10+ years of experience)

Job Description:

We are seeking an experienced Senior Lead Java Architect with deep expertise in Apache Spark, Azure Data Services (ADLS, Azure Blob Storage, Azure Database), and Redpanda. The ideal candidate will play a pivotal role in designing, architecting, and leading the development of scalable data processing pipelines using Java and Spark. You will work closely with cross-functional teams to build data-intensive applications and drive best practices in cloud-based architecture. If you're passionate about data engineering, distributed systems, and cloud technologies, this role is for you.

Key Responsibilities:

Design and Architect Distributed Data Processing Systems:
- Lead the design and implementation of large-scale Apache Spark jobs in Java for processing vast datasets from various sources, including blockchain data streams and financial transactions.
- Architect robust, high-performance data pipelines integrated with Azure Data Lake Storage (ADLS), Azure Blob Storage, and Azure Databases for real-time and batch processing.

Development and Optimization of Spark Jobs:
- Develop, optimize, and manage Spark jobs primarily in Java (JVM-based), ensuring high scalability, fault tolerance, and performance.
- Implement data validation, schema management, and Parquet file handling in Spark jobs for seamless integration with ADLS and Iceberg tables.

Integration with Redpanda:
- Lead the integration of Redpanda (or Kafka-like message brokers) for real-time data ingestion, ensuring reliable, low-latency communication between the data ingestion services and downstream processing components.
- Collaborate with the engineering team to build Redpanda consumers for real-time streaming data and ensure that Spark jobs consume this data efficiently.

Azure Cloud Services:
- Design cloud-native solutions leveraging Azure services such as Azure Data Lake Storage (ADLS), Azure Blob Storage, and Azure SQL/NoSQL Databases.
- Collaborate with the DevOps team to ensure the seamless integration of Spark jobs with Azure services, including data ingestion and storage management.
- Manage and optimize storage costs and performance for large datasets on Azure services.

Lead and Mentor Engineering Teams:
- Provide technical leadership, mentoring, and guidance to a team of developers, ensuring best practices in Java development, Spark optimization, and cloud-native architecture.
- Conduct regular code reviews, architecture assessments, and performance tuning exercises to ensure code quality and adherence to industry standards.

End-to-End Data Pipeline Management:
- Manage the full lifecycle of data pipelines, from initial data ingestion in Redpanda to transforming, storing, and serving data using Azure Data Services and Spark.
- Ensure that data pipelines are monitored, secure, and highly available by integrating with tools like Prometheus, Azure Monitor, and Grafana.

Performance Tuning and Resource Optimization:
- Lead efforts in performance tuning and resource optimization for Spark jobs, ensuring minimal shuffling and efficient memory usage for large-scale data processing.
- Develop caching mechanisms, partitioning, and task scheduling strategies to maximize throughput and reduce processing time.

Required Qualifications:
- 13+ years of experience in software development with strong expertise in Java and JVM-based technologies.
- 5+ years of experience working with Apache Spark for data processing in Java.
- Extensive experience with Azure Data Services, including Azure Data Lake Storage (ADLS), Azure Blob Storage, and Azure Databases.
- Strong expertise in distributed data processing and streaming frameworks, with hands-on experience integrating message brokers like Redpanda or Kafka.
- Proven track record in cloud-native architecture design with a focus on Azure.
- Hands-on experience with data storage optimization, partitioning, and Spark performance tuning.
- Leadership and mentoring experience with the ability to guide and develop high-performing engineering teams.
- Experience with CI/CD pipelines, containerization (Docker, Kubernetes), and DevOps practices.
- Strong understanding of security best practices for cloud-based data pipelines.

Preferred Qualifications:
- Experience with Iceberg tables and Parquet file formats in a distributed processing environment.
- Knowledge of Prometheus, Grafana, or Azure Monitor for observability and monitoring.
- Experience with streaming data processing using Redpanda and integration with Spark jobs for real-time data flows.
- Familiarity with microservices architecture using Spring Boot and cloud-based deployment strategies.