A career at our company is an ongoing journey of discovery: our approximately 56,000 people are shaping how the world lives, works, and plays through next-generation advancements in healthcare, life science, and performance materials. For more than 350 years and across the world, we have passionately pursued our curiosity to find novel and vibrant ways of enhancing the lives of others.
Our IT R&D application team is currently seeking a Big Data Engineer based in Shanghai, China. As a core member of the project team, this role will work collaboratively with our internal project development team, business-facing IT team, and business data team to bring big data and ETL capabilities into the delivery and sustained operation of R&D-related big data / AI and scientific analytics applications, supporting both global and China teams.
ESSENTIAL JOB FUNCTIONS
- Contribute to the core design of data architecture, data models and schemas, and implementation.
- Design, build, and maintain an optimal data pipeline architecture for the extraction, transformation, and loading (ETL) of data from a wide variety of sources, including external APIs, data streams, and data stores.
- Design, create, and maintain the foundation for ingesting data, providing frameworks and services for operating on that data.
- Design, create, and maintain the foundation for real-time streaming analytics and big data analytics capabilities.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing the foundation for greater scalability, etc.
- Work with stakeholders, including the business-facing and business data teams, to assist with data-related technical issues and support their data foundation needs.
- Work with the security team to implement data privacy and data security requirements, ensuring solutions remain compliant with security standards and frameworks.
- Manage database governance, support the team in improving database management, troubleshoot database-related issues, and provide technology solutions accordingly.
- Contribute to a culture of teamwork.
QUALIFICATIONS
- Bachelor's degree or above in computer science, or equivalent relevant experience.
- 8+ years of industry experience required.
Experience and Skills:
- Proficient with ANSI SQL relational databases (Oracle, SQL Server, PostgreSQL, MySQL).
- Proficient with NoSQL databases (MongoDB, Cassandra).
- Proficient in ETL development standards, data modeling and schema design, data governance, metadata design, and OLAP-related analytics capabilities.
- Proficient in SQL optimization and performance tuning for big data workloads; strong capability with stored procedures and ETL-related open-source technologies and tools.
- Proficient in Python for data pre-processing, using Python to work with a variety of data sources (text, CSV, Excel, web pages, etc.).
- Experience with Java programming is a plus.
- Passion and aspiration to learn new skills, with strong self-learning capability.
- An agile mindset and good teamwork spirit.
- Good English reading and writing skills.
The following big data-related experience is a plus:
- Experience with a variety of big data frameworks and tools (Hadoop, HBase, Hive, Pig).
- Spark: working with the RDD and DataFrame/Dataset APIs (with emphasis on DataFrames) to query and manipulate data.
- Spark Structured Streaming.
- Experience building large-scale Spark applications, ideally with batch and/or streaming processing.
- Experience with Spark SQL.
- Experience with microservices and containerized (Docker, ECS) or serverless (Lambda) deployment.
What we offer: With us, there are always opportunities to break new ground. We empower you to fulfil your ambitions, and our diverse businesses offer various career moves to seek new horizons. We trust you with responsibility early on and support you to draw your own career map that is responsive to your aspirations and priorities in life. Join us and bring your curiosity to life!
Curious? Apply and find more information at https://jobs.vibrantm.com