Who We Are

We are looking for a Software or Data Engineer experienced in high-performance Python data processing libraries (often referred to as the Composable Data Stack). You will collaborate directly with our CTO and be part of the core product team.

dlt is an open-source library that automatically creates datasets from messy, unstructured data sources. You can use it to move data from almost anywhere into the most well-known SQL and vector stores, data lakes, storage buckets, or local engines such as DuckDB, Arrow, or delta-rs. The library automates many cumbersome data engineering tasks and can be used by anyone who knows Python.

dltHub is based in Berlin and New York City and was founded by data and machine learning veterans. We are backed by Foundation Capital, Dig Ventures, and many technical founders from companies such as Datadog, Instana, Hugging Face, MotherDuck, Mesosphere, Matillion, Miro, and Rasa.

Your Tasks and Responsibilities

- You design and implement OSS features that make dlt a gateway to the composable data stack: integrating query engines, transformation frameworks, and table formats with our library.
- You listen to our users, always paying attention to what they need to go to production with dlt.
- You work with our customers on commercial projects where dlt is combined with existing modern data stack infrastructure.
- You maintain the open-source project with the team (e.g., review PRs, resolve issues, talk with community contributors).

Who You Are

If you are fascinated by the emerging ecosystem of data libraries in Python, you'll enjoy working with us.

- You know what DuckDB, Arrow, DataFusion, LanceDB, delta-rs, Ibis, PyIceberg, SQLGlot, Kedro, Hamilton, and similar Python libraries do, and you know when to apply them.
- You have experience building data apps or products on the composable data stack.
- You have contributed code to any of the projects above (or similar ones).
- You know what the Modern Data Stack is and appreciate certain aspects of it (e.g., its maturity and fit into enterprise workflows). You are interested in combining both worlds.
- You really like Python and are fluent in writing Python code (e.g., Python typing, unit testing, writing docstrings).
- You have a degree in computer science or data science, or equivalent experience.
- You are familiar with GitHub workflows (e.g., pull requests, code reviews, CI/CD services).

Nice to Have

- You are based in Berlin and willing to work in our office regularly.
- You have a hacker nature and love to optimize things.
- Experience with DevOps (e.g., CI systems such as GitHub Actions, Docker, Kubernetes, AWS/GCP/DigitalOcean).
- Experience with machine learning (e.g., the toolset, the workflows, practical applications).

What Do We Offer

- In our work culture, we value each other's autonomy and efficiency. We have set hours for communication and deep work.
- We like automation, so we automate our own work before we automate the work of others.
- We are an office-first company but give you plenty of opportunities for deep work and working from home.
- Dedicated "no meeting days" help the team focus on their most impactful work.
- Since we often work from the Berlin office, we cover your public transportation ticket.
- We are deeply committed to your personal and professional growth, so we have an annual budget for learning and development.
- We offer regular subsidized team lunches and an Urban Sports Club membership.
- We also have an ESOP plan for employees, depending on their role and dedication, with an option to increase your ESOP if you grow with us.
Job Title: Composable Data Stack Python Engineer