About the role:

Data engineers at DGlobal build data pipelines that synthesize data from millions of customers and convert it into actionable information for trend and behavior analysis. The role requires a blend of technical competence, a bias for action, and extreme ownership of the codebase or the task at hand. Our platform is built on cutting-edge technology and provides ample opportunity for budding and experienced data engineers alike to learn and grow professionally.

Must-have:

  • Bachelor’s and/or Master’s degree in Computer Science, Computer Engineering, or related technical discipline
  • 5+ years of experience with Spark, Hadoop, Hive, Presto, and Kafka
  • Experience with relational (SQL) and NoSQL databases, including Postgres and MongoDB
  • Experience with stream-processing systems such as Spark Streaming and Storm
  • Experience with data storage formats such as Parquet
  • Experience with functional and object-oriented programming languages, including Scala, C++, Java, and Python
Good-to-have:

  • Experience with machine learning frameworks such as TensorFlow and Keras, and libraries such as scikit-learn
  • Familiarity with configuring CI/CD pipelines in GitHub, Bitbucket, or Jenkins for building application binaries
  • Experience with workflow management and pipeline tools such as Airflow, Luigi, and Kubeflow
  • Experience with columnar stores such as AWS Redshift