Must-have skills:
- 5+ years of experience developing data applications using big data technologies such as Hadoop, Spark, Flink, and Dataflow
- Experience with workflow orchestration tools such as Airflow, Luigi, or Azkaban
- Experience with programming languages such as Python, Java, or Scala
- Experience with at least one major cloud platform (AWS, GCP, or Azure)
- Hands-on experience and advanced knowledge of SQL, data modeling, ETL development, and data warehousing
- Experience with scalable, configurable, parameterized, and modular programming practices for data engineering (see the sketch after this list)
- Knowledge of and experience with data management and data storage best practices
- Exposure to large databases, BI applications, data quality, and performance tuning
- Good to have: an understanding of job management and resiliency
- Good to have: prior experience with graph and time-series databases
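To make the orchestration and parameterization expectations concrete, here is a minimal sketch of a parameterized, table-driven Airflow DAG (assuming Airflow 2.4+; the table list, staging paths, and extract/load bodies are hypothetical placeholders, not a prescribed implementation):

```python
# Minimal parameterized Airflow DAG sketch (assumes Airflow 2.4+).
# Table names and extract/load logic are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task

TABLES = ["orders", "customers"]  # hypothetical source tables

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def ingest_tables():
    # One extract/load pair is generated per table, so adding a table
    # to TABLES extends the pipeline without new code.
    for tbl in TABLES:
        @task(task_id=f"extract_{tbl}")
        def extract(table: str = tbl) -> str:
            # Placeholder extract: stage rows for `table`, return the path.
            return f"s3://staging/{table}"

        @task(task_id=f"load_{tbl}")
        def load(path: str) -> None:
            # Placeholder load: copy the staged file into the warehouse.
            print(f"loading {path}")

        load(extract())

ingest_tables()
```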
Role and Responsibilities:
- Architect metadata-driven data pipelines with algorithms for data deduplication, data harmonization, fuzzy matching, and identity resolution (a toy sketch follows this list)
- Design relational, time-series, and graph databases to support OLAP queries
- Design and develop SDKs and APIs that enable configurable data consumption patterns
- Build tools to monitor the health of data pipelines and data infrastructure
- Develop and lead the Data Engineering team
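As a toy illustration of the fuzzy-matching and deduplication work described above, the sketch below compares record names using only Python's standard-library difflib; the records and threshold are made up, and a production identity-resolution pipeline would add normalization, blocking, and purpose-built libraries:

```python
# Toy fuzzy-match deduplication sketch. The records and the 0.7
# threshold are hypothetical; the pairwise O(n^2) scan is for
# illustration only and would not scale to large datasets.
from difflib import SequenceMatcher

records = [
    {"id": 1, "name": "Acme Corp", "city": "Austin"},
    {"id": 2, "name": "ACME Corporation", "city": "Austin"},
    {"id": 3, "name": "Globex Inc", "city": "Boston"},
]

def similarity(a: str, b: str) -> float:
    # Normalized edit-similarity in [0, 1], case-insensitive.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

THRESHOLD = 0.7  # hypothetical cutoff; tune against labeled pairs

for i, left in enumerate(records):
    for right in records[i + 1:]:
        score = similarity(left["name"], right["name"])
        # Require a matching city as a crude blocking/confirmation rule.
        if score >= THRESHOLD and left["city"] == right["city"]:
            print(f"possible duplicate: {left['id']} ~ {right['id']} "
                  f"(score={score:.2f})")
```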