Our team is young, creative, and passionate. We have more than 50 employees with diverse experience working in start-up companies.
Description:
About your responsibilities for the role:
- Perform data exploration, data cleaning, data imputation, and feature engineering on unstructured and structured data.
- Build the infrastructure for optimal extraction, transformation, and loading (ETL) of data from a wide variety of data sources.
- Develop and maintain optimal data pipeline architecture for training statistical and machine learning models such as regression and classification.
- Develop and maintain evaluations to measure the effectiveness of training data. This includes measuring the capabilities of models on a variety of tasks and domains.
- Collaborate with data scientists and machine learning engineers to develop a comprehensive data science/machine learning solution pipeline.
Requirements:
What you need to have (Minimum Qualifications):
- Bachelor’s degree in computer science or a related field, or equivalent software engineering experience
- Proficiency in the Python programming language
- Experience in dataset processing and feature engineering using tools such as NumPy, Pandas, and scikit-learn
- Visualization skills using tools such as Matplotlib, Seaborn, and Bokeh
- Understanding of deep learning frameworks such as PyTorch and TensorFlow
- Understanding of SQL and NoSQL
- Understanding of Hadoop / Spark / Kafka / Hive / Presto
- Proficiency in source control with Git
Preferred:
What would make you stand out from the crowd (Preferred Qualifications):
- Deep understanding of Object-Oriented Programming (OOP) concepts such as inheritance, delegation, and abstract classes
- Understanding of cloud platforms such as AWS, GCP, and Azure
- Experience in using Docker
- Experience in using AWS services such as S3, EC2, Glue, and SageMaker
- Experience in AWS Step Functions and/or AWS Lambda is even better
- Proficiency in the Scala and Java programming languages
- Enjoyment of iterating quickly with research prototypes and learning new technologies