Job Description
Play a key role in building next-generation data warehousing systems that power impactful business decisions for large corporations. Design and implement scalable data warehouse architecture, including ETL pipelines and workflow orchestration tools on leading cloud platforms, to enable data science and analytics programs.
Key Responsibilities:
- Create data architecture that is flexible, scalable, consistent for cross-functional use, and aligned to stakeholder requirements
- Leverage cloud services and build cloud-deployable solutions, preferably in a serverless environment
- Design, develop, and maintain scalable and resilient ETL/ELT pipelines for handling large volumes of complex data
- Deploy a real-time data governance strategy (organize, transform, activate) using workflow orchestration tools to comply with data modeling, integrity, and privacy principles
- Ensure that the data pipeline infrastructure meets the analysis, reporting, and data science needs of the organization
- Collaborate with stakeholders such as data analysts, data scientists, and IT infrastructure/DevOps teams
Knowledge
Qualification
- Bachelor's / Master's in Computer Science, Engineering, Statistics, Information Systems, or other quantitative fields
Experience
- 7+ years of industry experience in data engineering, data science, or related field with a good understanding of cloud services
- Experience working on warehousing systems, and an ability to contribute towards implementing end-to-end, loosely coupled/decoupled technology solutions for data ingestion and processing, data storage, data access, and integration with business-user-centric analytics/BI frameworks
Skills
Technical/Domain
Must have skills:
- Hands-on experience with Python and advanced SQL or NoSQL (Google Bigtable, MongoDB, or similar)
- At least one data warehousing system (Redshift, BigQuery, Snowflake, or similar) and one ETL tool (Talend, SSIS, Informatica PowerCenter, or similar)
- At least one cloud environment and related services (AWS, GCP, Azure, Dataproc, or similar), and one or more CI/CD and DevOps tools (Git, Bitbucket, Jenkins, Docker, or similar)
Nice to have skills:
- Hands-on experience with workflow orchestration tools, setting up CI/CD pipelines across environments (dev, stage, prod), infrastructure-as-code (IaC) provisioning tools, and data visualization tools (Tableau, Power BI, or similar)
- Hands-on experience with other programming languages and frameworks such as PySpark, Scala, Node.js, or similar
Mindset and Competencies
Behavioral
- Client-centric and results-oriented, with a learning mindset, attention to detail, and the ability to prioritize one's work
Other Information
Travel
- Travel (national/international) only if required
- Should be willing to take short/long term transfers across different geographies and/or roles
Time Zone
- Working in client-specific time zones if required (PST, CST, EST, GMT)
Location
- Mumbai / Bengaluru / Coimbatore / Pune (remote for now)
Reporting relationship:
Reporting to
Sr. Manager / Asst. Vice President