The Data Engineer will be responsible for designing data pipeline architecture and implementing it across cross-functional teams and the analytics team. The ideal candidate is passionate about building data pipelines from scratch and implementing them with best practices. The Data Engineer will work with software developers, database architects, data analysts, and data scientists on data initiatives and will ensure that an optimal data delivery architecture remains consistent across ongoing projects.
Create processes to support data transformation, data structures, metadata, dependency and workload management.
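As a minimal sketch of the dependency-management aspect of this work, pipeline task ordering can be expressed with Python's standard-library `graphlib` (the task names here are hypothetical, not part of the role's actual stack):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks mapped to their upstream dependencies.
dependencies = {
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}

def run_order(deps):
    """Return tasks in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())

print(run_order(dependencies))  # ['extract', 'transform', 'load', 'report']
```

In practice a workflow manager such as Airflow handles this ordering, but the underlying idea is the same topological sort over task dependencies.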
Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
Strong skills in root-cause analysis and in debugging databases and queries.
Strong analytic skills for working with structured and unstructured data.
Strong project management, organizational and communication skills.
Experience supporting and working with cross-functional teams in a dynamic environment.
Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, primarily MySQL.
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift
Work with cross-functional teams to assemble large, complex data sets.
Create and maintain optimal data pipeline architecture.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
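As an illustration of the extract-transform-load pattern this responsibility describes, here is a minimal sketch using Python's standard-library `sqlite3` as a stand-in for a production warehouse such as Redshift; the table and column names are hypothetical:

```python
import sqlite3

def etl(conn):
    """Extract raw rows, transform by aggregation, load into a reporting table."""
    # Extract: stage raw source data (hard-coded here for the sketch).
    conn.execute("CREATE TABLE raw_events (user_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO raw_events VALUES (?, ?)",
        [(1, 10.0), (1, 5.5), (2, 3.0)],
    )
    # Transform and load: aggregate per user into a reporting table.
    conn.execute(
        """CREATE TABLE user_totals AS
           SELECT user_id, SUM(amount) AS total
           FROM raw_events
           GROUP BY user_id"""
    )
    return conn.execute(
        "SELECT user_id, total FROM user_totals ORDER BY user_id"
    ).fetchall()

rows = etl(sqlite3.connect(":memory:"))
print(rows)  # [(1, 15.5), (2, 3.0)]
```

At production scale the same shape holds, with the extraction step reading from real source systems and the load step targeting AWS 'big data' services.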
Create and maintain highly available systems for data extraction and storage.
Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
Work with database administrators to optimize queries to meet data pipeline needs.
Work with data and analytics experts to strive for greater functionality in our data systems.