How do you evaluate and select the right Data Science and Machine Learning (DSML) platform?

  1. Who will be the primary user of the ML platform? The Data Science team, application developers, or the BI and analytics team?
  2. What are the skill level and data science expertise of the primary users? Are they expert data scientists with several years of experience, or are they just starting out?
  3. Which programming language is most used and preferred by the intended users — Python, Scala, R, or something else?
Data Science workflow and the role of automation
  1. Data Ingestion and Preparation: How much manipulation of data must be performed before it is ready for ingestion by the DSML platform? Can you upload data to the platform without having to write additional SQL code?
  2. Feature Engineering Automation: How much manual work is involved in feature engineering? Does the platform support automated feature engineering? Can the AI engine automatically explore the available database entity relationships, then discover and evaluate features based on the available columns and relationships?
  3. Machine Learning: Does the system support automated machine learning and state-of-the-art ML libraries and frameworks such as scikit-learn, XGBoost, LightGBM, TensorFlow, and PyTorch? Can users run an automated hyperparameter search over the ML algorithms?
  4. ML Operationalization: How easy is it to deploy ML models in a production environment? Can you monitor models, discover model drift, and quickly retrain models if production data changes over time?
  5. Platform Integration, Ease of Use, and Deployment Flexibility: Can all steps of the data science process be executed seamlessly within a single platform without the need for moving between systems and applications?
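
To make the operationalization question about model drift more concrete, here is a minimal sketch of one common drift check, the Population Stability Index (PSI), which compares the distribution of a feature at training time with its distribution in production. It is not taken from any particular platform: the function name `psi`, the bin count, and the synthetic data are illustrative assumptions, and the 0.1 and 0.25 cut-offs mentioned in the comments are only a widely quoted rule of thumb.

```python
import math
import random
from typing import Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline sample and a newer sample.

    Hypothetical helper for illustration; bucket edges are equal-width
    and derived from the baseline sample only.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def bucket_shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp out-of-range values into the first or last bucket.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data (an assumption for the demo): a training-time baseline,
# a fresh sample from the same distribution, and a sample whose mean moved.
rng = random.Random(0)
baseline = [rng.gauss(0, 1) for _ in range(5000)]
same = [rng.gauss(0, 1) for _ in range(5000)]
shifted = [rng.gauss(1, 1) for _ in range(5000)]

# Rule of thumb (assumed, not universal): below 0.1 is stable,
# above 0.25 suggests the model may need retraining.
print("PSI, same distribution:", round(psi(baseline, same), 4))
print("PSI, shifted distribution:", round(psi(baseline, shifted), 4))
```

When you evaluate a platform, the useful question is whether a check of this kind runs automatically on production data and can trigger retraining, or whether your team would have to build and schedule it themselves.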
