About the role
Data engineers at DGlobal build data pipelines that synthesize data from millions of customers and convert it into useful information for trend and behavior analysis. The role requires a blend of technical competence, a bias to action, and extreme ownership of the code base and the task at hand. DGlobal’s platform is built on cutting-edge technology and offers ample opportunity for both budding and experienced data engineers to learn and grow professionally.
Must-have
- Bachelor’s and/or Master’s degree in Computer Science, Computer Engineering, or related technical discipline
- 8+ years of experience with Spark, Hadoop, Hive, Presto, and Kafka
- Current, hands-on experience building big data pipelines
- Relational (SQL) and NoSQL databases, including Postgres and MongoDB
- Stream-processing systems like Spark Streaming and Storm
- Data storage formats like Parquet
- Functional and object-oriented programming languages, including Scala, C++, Java, and Python
- Familiarity with configuring CI/CD pipelines in GitHub, Bitbucket, or Jenkins to build application binaries
- Workflow management and pipeline tools such as Airflow, Luigi, and Kubeflow
Good-to-have
- Experience with machine learning frameworks such as TensorFlow and Keras, and libraries such as scikit-learn
- Columnar stores such as Amazon Redshift