If you are a critical thinker with a solid track record of developing data solutions and solving complex problems, we want you to join our team! You will play a vital role in designing and developing our next-generation data pipelines and data platform, and in prototyping new internal and external data products.
Requirements
- Strong SQL and Python skills, including knowledge of Python libraries and frameworks
- Comfortable working directly with data analysts to bridge business requirements and data engineering
- Experience with AWS services, including EMR, Athena, S3, Kinesis, API Gateway, Lambda, etc.
- Excellent troubleshooting and problem-solving skills
- Experience with workflow management tools (Airflow, Oozie, Azkaban, Luigi, etc.); a minimal Airflow sketch follows this list
- Ability to operate in an agile, entrepreneurial start-up environment and to prioritize effectively
- Excellent communication and teamwork skills, and a passion for learning
- Experience with data integration technologies (Pentaho, Talend, Informatica, Glue, etc.)
- Experience with Snowflake, Redshift, or other MPP databases is a plus
- Familiarity with distributed computing platforms (e.g., Hadoop, Spark, Storm)
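
For candidates unfamiliar with the workflow management tools listed above, here is a minimal sketch of an Airflow pipeline definition (Airflow 2.x style). The DAG id, schedule, and task bodies are hypothetical placeholders for illustration, not part of this role's actual codebase.

```python
# Minimal Airflow DAG sketch; all names and logic are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Placeholder: pull records from an upstream API or file drop.
    print("extracting source data")


def load():
    # Placeholder: write the extracted records to the warehouse.
    print("loading into the warehouse")


with DAG(
    dag_id="example_ingest",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",     # run once per day
    catchup=False,                  # skip backfilling past runs
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task       # extract must finish before load
```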
What You Will Be Doing
- Build and maintain multiple data pipelines to ingest new data sources (APIs, files, streaming, databases, email, etc.) and support products used by both external users and internal teams
- Optimize infrastructure and pipelines by building DataOps tools to evaluate and automatically monitor data quality, auto-scale serverless infrastructure, and develop data-driven pipelines (a sketch of one such quality check follows this list)
- Work with our data science and product management teams to design, rapidly prototype, and productize new internal and external data product ideas and capabilities
- Work with the data engineering team to migrate our existing Pentaho-based ETL pipeline to an enhanced Python-based system, and develop a serverless cloud data lake to augment our existing Snowflake data warehouse
- Conquer complex problems with simple, efficient approaches, focusing on the reliability, scalability, quality, and cost of our platforms
- Build processes supporting data transformation, data structures, metadata, and workload management
- Collaborate with the team to perform root cause analysis and audit internal and external data and processes to help answer specific business questions
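
To illustrate the DataOps responsibilities above, here is a minimal sketch of an automated data quality check in Python. The column names, thresholds, and check logic are hypothetical illustrations, not a prescribed design.

```python
# Minimal data quality check sketch; all names and thresholds are hypothetical.
from dataclasses import dataclass


@dataclass
class QualityResult:
    check: str
    passed: bool
    detail: str


def check_batch(rows, required, max_null_rate=0.05):
    """Run basic quality checks on a batch of ingested records."""
    results = []

    # Check 1: the batch must not be empty.
    results.append(QualityResult("non_empty", len(rows) > 0, f"{len(rows)} rows"))

    # Check 2: each required column must stay under the allowed null rate.
    for col in required:
        nulls = sum(1 for row in rows if row.get(col) is None)
        rate = nulls / len(rows) if rows else 1.0
        results.append(QualityResult(f"nulls:{col}", rate <= max_null_rate, f"{rate:.1%} null"))

    return results


if __name__ == "__main__":
    batch = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
    for result in check_batch(batch, required=["id", "email"]):
        print(result)
```

In practice, a check like this would run as a pipeline task and halt the load or raise an alert on failure.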