– Design and develop data feeds from an on-premise environment into a
datalake in an AWS cloud environment
– Establish the functional and non-functional requirements for the
feeds
– Work with the Business Unit owners and integration teams to design the processes for managing and monitoring the feeds to client standards
– Work with the integration team to build and test the feed components
– Design and develop programmatic transformations of the data to partition
and format it correctly, and to validate or correct its quality
– Establish the functional and non-functional requirements for
formatting and validating the data feeds
– Design processes appropriate to high volume data feeds for managing
and monitoring the feeds to client standards
– Build and test the formatting and validation transformation
components
– Design and develop programmatic transformations, combinations and
calculations to populate complex datamarts based on feeds from the
datalake
– Establish requirements that a datamart should support
– Design the target data model, the transformations and the feeds,
appropriate to high volume data flows, required to populate the datamarts
– Build and test the target data model, the transformations and the feeds
required to populate the datamarts
– Provide operational support to datafeeds and datamarts
– Identify and perform maintenance on the feeds as appropriate
– Work with the front-line support team and operations to support the feeds
in production
– Design infrastructure required to develop and operate datalake data feeds
– Specify infrastructure requirements for the feeds and work with the operations team to implement those requirements and deploy the solution and future updates
– Design infrastructure required to develop and operate datamarts, their user
interfaces and the feeds required to populate them
– Specify infrastructure required to develop and operate datamarts
– Specify the infrastructure, in terms of front-end tools, required to exploit the
datamarts for end users, and work with the front-end team to deploy a complete solution for the user

Skills:

– Talend
– AWS: EMR, EC2, S3
– Python
– PySpark or Spark
– Business Intelligence data modelling
– SQL
– NoSQL
