6. Data science is not science
Calling data science ‘science’ does not make it so. It is not science or even a type of science, if a ‘type’ of science were indeed possible (there are fields and subfields of science, but ‘data’ is not one of them). Something is either science or it is not, and data science is not. It is a tool of science, a powerful tool indeed, but nothing more than that. Yes, it is cool and fun to call yourself a scientist when your friends and family ask what you do for a living, but don’t kid yourself: you are not a scientist. All science is data science in the sense that every field uses some of the tools that fall under the rubric of “data science” to analyze the data generated by its experiments or, in the case of purely theoretical fields, to analyze data sets that are relevant to the theory. Statistics and mathematics have always been useful tools for the scientist and always will be; calling them data science does not make them science. More recently, powerful computers and clever algorithms have greatly expanded the insights to be gained from even the most mundane of data sets. Moreover, the size of the data sets that can be analyzed has grown, allowing for ever more powerful and increasingly accurate predictions (in some cases, though not nearly as many as once thought likely). That said, using computers running algorithms to analyze data (no matter how mundane and small or how interesting and large) is not science; computers are still only tools of science and thus not science. A good, though imperfect, analogy is the calculator in mathematics. It is a very useful tool for doing math, yet there is no field of “calculator mathematics.” Yes, the analogy fails in many respects, but the point of the example is clear: a tool of a thing is not the thing itself and never can be.