REVIEW NOTES: DATA SCIENCE FOR BUSINESS BY PROVOST & FAWCETT: CHAPTER 7
I enjoyed reading this chapter. It’s insightful and well explained, with detailed examples, diagrams and graphics, and it covers a few data science topics that correspond directly to conventional scientific research in computer science. That makes me happy, because these are crucial points, yet they are rarely the focus of Kaggle competitions, books on machine learning or statistics, or the latest and greatest in TensorFlow, PyTorch and AutoML libraries (etc.), and they are too infrequently discussed in DL/AI/ML social posts and blogs. Below I have written about the points that are well worth taking home. Broadly, these topics are:
Careful consideration of what is desired from data science results.
Expected value as a key evaluation framework.
Consideration of appropriate comparative baselines for machine learning models.
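To make the expected value and baseline points above concrete, here is a minimal sketch in Python. It assumes a binary targeting problem in which each cell of a confusion matrix is paired with a cost or benefit. The function name `expected_value`, the counts and the dollar values are all hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
def expected_value(confusion_counts, cost_benefit):
    """Expected value per instance: sum over cells of p(cell) * value(cell).

    Both arguments are 2x2 nested lists with the same layout:
    rows = predicted (yes, no), columns = actual (positive, negative).
    """
    total = sum(sum(row) for row in confusion_counts)
    return sum(
        (count / total) * value
        for count_row, value_row in zip(confusion_counts, cost_benefit)
        for count, value in zip(count_row, value_row)
    )


# Hypothetical cost-benefit matrix: targeting a true positive earns 99,
# targeting a true negative costs 1, not targeting anyone earns nothing.
cost_benefit = [[99, -1],
                [0, 0]]

# Hypothetical confusion counts for a trained model on 110 test instances.
model_counts = [[56, 7],
                [5, 42]]

# Two simple baselines evaluated on the same 110 instances
# (61 actual positives, 49 actual negatives).
always_no_counts = [[0, 0],
                    [61, 49]]
always_yes_counts = [[61, 49],
                     [0, 0]]

for name, counts in [("model", model_counts),
                     ("always no", always_no_counts),
                     ("always yes", always_yes_counts)]:
    print(f"{name}: {expected_value(counts, cost_benefit):.2f}")
```

With these made-up numbers, the "always yes" baseline scores higher than the model, because a missed positive is so much more costly than a wasted offer. That is the sort of result that only shows up when a model is compared against sensible baselines on the quantity the business actually cares about.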
In news from April 10th 2019, Google released its collaborative AI platform for data science teams, for execution in the cloud or on premises. Google’s platform joins Alibaba’s similar platform, PAI 2.0, announced on March 29th 2017. While comprehensive information on Alibaba’s platform is sparse in languages other than Chinese, the Google AI Platform does give samples and tutorials. Two others, ClusterOne for the DevOps of data science and DeterminedAI for collaboration, each had funding announcements earlier this year. Google’s and Alibaba’s platforms give a clear separation of team roles so that members can collaborate at each stage of the process (as is also indicated for the two yet-to-be-released others). These platforms are well worth a mention because they are collaborative frameworks pushing forward the methodologies of data science, engineering and, in essence, social intelligence.