Key Responsibilities:
Design, implement, and maintain data- and machine-learning-based components and applications for the website.
Work with large, complex data sets. Conduct end-to-end data analysis covering data gathering, requirements specification, processing, analysis, visualization, and delivery of the required output.
Build and prototype analysis pipelines iteratively to provide insights at scale.
Understand the data structures used across the organization's various products.
Make business recommendations for the effective use of data across various business functions.
Day-to-day work involves NLP tasks such as information extraction, keyword extraction, and keyword scoring, as well as recommendation and personalization across various channels by building user profiles.
The data models built must also be tested for accuracy in real-world environments.
Acceptable computational performance of the models on various devices must be achieved before delivering them for deployment. Articulate business questions and use mathematical techniques to answer them from the available data.
Experience with data science tools in Python: Pandas, Matplotlib, scikit-learn, NLTK, NumPy, Keras, TensorFlow, OpenCV, fastText, etc.
Should have actively built and deployed working data models as RESTful APIs.
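A minimal sketch of what "deploying a model as a RESTful API" can look like, using Flask and a toy scikit-learn classifier; the route name, payload shape, and in-process training are illustrative assumptions (a real service would load a persisted model).

```python
# Toy model served over a RESTful endpoint (illustrative sketch only).
from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression
import numpy as np

app = Flask(__name__)

# Train a trivial 1-feature classifier at startup; in practice, load a
# serialized model artifact instead.
model = LogisticRegression()
model.fit(np.array([[0.0], [1.0], [2.0], [3.0]]), np.array([0, 0, 1, 1]))

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"features": [2.5]}
    features = request.get_json()["features"]
    pred = model.predict(np.array([features]))
    return jsonify({"prediction": int(pred[0])})

# To serve locally: app.run(port=5000)
```

The same pattern extends to any of the model types listed here; only the artifact loading and the feature schema change.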
Exposure to deep learning using TensorFlow/CNTK.
Command of programming in Python, Scala, or R, though we generally like people who are programming-language agnostic.
Exposure to working with audio (speech), image, video, or other non-textual data would be a huge plus.
Working knowledge of a query language such as SQL/MySQL.
Strong grasp of data science fundamentals, including basic computer science, from linear regression through the various RNN architectures.
Should have applied experience with machine learning on large datasets.
Must be able to identify and remove noise from data, and to select the right statistical tools for a given data analysis problem.
A penchant for keeping an eye on the latest developments in data science and engineering around the world.
Create automated anomaly detection systems and constantly track their performance.
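To make the last point concrete, here is a minimal sketch of automated anomaly detection using a rolling z-score; the window size, the 3-sigma threshold, and the sample series are illustrative assumptions.

```python
# Rolling z-score anomaly detector (illustrative sketch only).
from collections import deque
from statistics import mean, stdev

def detect_anomalies(series, window=5, threshold=3.0):
    """Flag indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` points."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, x in enumerate(series):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(x - mu) > threshold * sigma:
                anomalies.append(i)
        recent.append(x)
    return anomalies

data = [10, 11, 10, 12, 11, 10, 95, 11, 10, 12]
print(detect_anomalies(data))
```

Tracking the detector's own performance (precision/recall against labeled incidents, alert volume over time) is what keeps such a system trustworthy in production.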