Be responsible for defining and implementing the Data Strategy, considering the integrity, governance and curation of the data.
The primary focus will be on applying data mining techniques for statistical analysis and on building high-quality prediction systems integrated with our products.
Leads and contributes to the delivery of processes that transform and load data from disparate sources into a form consumable by analytics processes, for projects of moderate complexity, applying strong technical capabilities.
Designs, develops and produces a library of reusable code that fulfils the pre-processing, model training and model-building requirements of the data science lifecycle.
Designs, develops and produces data models of relatively high complexity, leveraging a sound understanding of data modelling standards to ensure high quality
Builds networks with other departments across the business to help define and deliver business value, and may interface and communicate with program teams, management and stakeholders as required to deliver small to medium-sized projects
Your key responsibilities
Provide technical guidance to team members and ensure compliance with the data strategy and with the data management standards and procedures defined in the Data Management Framework
Leading the production of high-quality data engineering deliverables, helping to ensure project timelines are met, and providing informal mentoring / training to junior members of the team
Leading the delivery of data quality reviews including data cleansing where required to ensure integrity and quality
Leading the delivery of data models, data storage models and data migration to manage data within the organization, for a small to medium-sized project
Resolving escalated design and implementation issues with moderate to high complexity
Analyzing the latest industry trends such as cloud computing and distributed processing and beginning to infer risks and benefits of their use in business
Providing technical expertise to maximize value from current applications, solutions, infrastructure and emerging technologies, and seeking to continuously improve internal processes
Developing working relationships with peers across other engineering teams and beginning to collaborate to develop leading data engineering solutions
Driving adherence to the relevant data engineering and data modelling processes, procedures and standards
Skills and attributes for success
To qualify for the role you must have
Batch Processing - Capability to design an efficient way of processing high volumes of data where a group of transactions is collected over a period of time
Data Integration (Sourcing, Storage and Migration) - Capability to design and implement models, capabilities and solutions to manage data within the enterprise (structured and unstructured data, data archiving principles, data warehousing, data sourcing, etc.). This includes the data models, storage requirements and migration of data from one system to another
Data Quality, Profiling and Cleansing - Capability to review (profile) a data set to establish its quality against a defined set of parameters and to highlight data where corrective action (cleansing) is required to remediate the data
Stream Systems - Capability to discover, integrate, and ingest all available data from the machines that produce it, as fast as it’s produced, in any format, and at any quality
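As a minimal sketch of the profiling-and-cleansing capability described above, the following uses pandas; the column names, quality thresholds and remediation rules are illustrative assumptions, not part of the role definition:

```python
import pandas as pd

def profile(df: pd.DataFrame, max_null_rate: float = 0.1) -> dict:
    """Profile a data set against a defined set of quality parameters."""
    return {
        "rows": len(df),
        "duplicate_rows": int(df.duplicated().sum()),
        "null_rate": {c: float(df[c].isna().mean()) for c in df.columns},
        # Columns whose missing-value rate exceeds the threshold need cleansing.
        "columns_needing_cleansing": [
            c for c in df.columns if df[c].isna().mean() > max_null_rate
        ],
    }

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    """Remediate: drop exact duplicates and rows with missing values."""
    return df.drop_duplicates().dropna()

# Toy data set with one duplicate row and one missing email.
customers = pd.DataFrame(
    {"id": [1, 1, 2, 3], "email": ["a@x.com", "a@x.com", None, "c@x.com"]}
)
report = profile(customers)
clean = cleanse(customers)
```

Here `profile` surfaces the issues and `cleanse` applies the corrective action, mirroring the review/remediate split in the capability description.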
Ideally, you'll also have experience with, or exposure to, a few of the technologies in each family:
Business Information Glossaries - Azure Data Catalogue, Collibra
Cloud Computing Services - Azure
Distributed Systems - Azure ADLS Gen 2, Databricks, Hadoop, HDFS, Kafka, MapReduce / Hive, Spark, Storm, Zookeeper
ETL Tools - Alteryx, Azure Data Factory, Power BI Dataflows, Power Query, SAP Data Services, SSIS, Talend, Trifacta
Graph Databases - Azure Cosmos DB
NoSQL Columnar Stores - Accumulo, HBase
Programming Languages - .Net, C#, C++, CSS, Hive, HTML5, Java, MATLAB, Node.js, Pig, PowerShell, Python, R, Ruby, Sass, Scala, Shell Scripting, SQL, Visual Studio, XML
Relational SMP Databases - Azure SQL PaaS, Oracle, SQL Server
Understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc.
Experience with common data science toolkits
Experience with data visualisation tools such as Power BI
Proficiency in using query languages such as SQL, Hive, Pig, etc.
Experience with NoSQL databases, such as MongoDB, Cassandra, HBase, etc
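The machine learning and toolkit experience listed above can be sketched with scikit-learn; the data set, split and hyperparameters below are illustrative choices, not requirements of the role:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Illustrative only: train two of the listed algorithm families (k-NN, SVM)
# on a toy data set and compare held-out accuracy.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
svm = SVC(kernel="rbf").fit(X_train, y_train)

knn_acc = knn.score(X_test, y_test)
svm_acc = svm.score(X_test, y_test)
```

The same fit/score pattern applies to the other algorithms named above (Naive Bayes, Decision Forests), which scikit-learn exposes behind the same estimator interface.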
What we look for
Hunger to learn new technologies and to coach others
Hands on experience.
Must work as a developer: hands-on experience with Python, R, Java, Scala, C++ and Spark; ML libraries such as H2O, Spark MLlib, ML pipelines, Scikit-learn, Keras and TensorFlow; .Net, C#, Logic Apps, Web Services and J3.ds, with strong coding capability in one or more languages, especially PowerShell, and strong integration knowledge based on authentication and authorization in the cloud.
Speed to deliver.
What working at EY offers
We offer a competitive remuneration package where you'll be rewarded for your individual and team performance. Our comprehensive Total Rewards package includes support for flexible working and career development, and with FlexEY you can select benefits that suit your needs, covering holidays, health and well-being, insurance, savings and a wide range of discounts, offers and promotions. Plus, we offer:
Support, coaching and feedback from some of the most engaging colleagues around
Opportunities to develop new skills and progress your career
The freedom and flexibility to handle your role in a way that’s right for you
EY is committed to being an inclusive employer and we are happy to consider flexible working arrangements. We strive to achieve the right balance for our people, enabling us to deliver excellent client service whilst allowing you to build your career without sacrificing your personal priorities.
While our client-facing professionals can be required to travel regularly, and at times be based at client sites, our flexible working arrangements can help you to achieve a lifestyle balance.