About the role:
Data engineers at DGlobal build data pipelines that synthesize data from millions of customers and convert it into actionable insight for trend and behavior analysis. The role requires a blend of technical competence, a bias for action, and extreme ownership of the code base and the task at hand. Our platform is built on cutting-edge technology and offers ample opportunity for budding and experienced data engineers alike to learn and grow professionally.
Must-have:
- Bachelor’s and/or Master’s degree in Computer Science, Computer Engineering, or related technical discipline
- 5+ years of experience with Spark, Hadoop, Hive, Presto, and Kafka
- Relational (SQL) and NoSQL databases, including Postgres and MongoDB
- Stream-processing systems such as Spark Streaming and Storm
- Data storage formats such as Parquet
- Functional and object-oriented programming languages, including Scala, C++, Java, and Python
Good-to-have:
- Experience with machine learning frameworks such as TensorFlow and Keras, and libraries such as scikit-learn
- Familiarity with configuring CI/CD pipelines in GitHub, Bitbucket, or Jenkins for building application binaries
- Workflow management and pipeline tools such as Airflow, Luigi, and Kubeflow
- Columnar stores such as Amazon Redshift