A pragmatic view on the role of statistics and statisticians in modern data analytics

Ron S. Kenett (KPA Ltd., Raanana, Samuel Neaman Institute, Technion, Haifa and Institute for Drug Development, The Hebrew University, Jerusalem)


Note: This is an updated version of an older post put on Professor Kenett’s LinkedIn.

I have been following and contributing to two interesting blogs in the past year or so discussing various aspects of statistical inference and statistical analysis. These are:

Based on this, I am developing a somewhat pessimistic but pragmatic view on the role of statistics in modern data analytics. Below are 10 points summarizing this perspective:

  1. The background for my views is my JRSS(A) paper and Wiley book on information quality with Galit Shmueli, and several dozen applications in a variety of areas such as official statistics, chemical engineering, the publications reviewing process and GDPR EU regulations.
  2. A follow-up to this has been work and publications in the area of data integration and generalization of findings, with applications to clinical research and translational medicine.
  3. This led to interesting discussions with Ron Wasserstein, who invited me to the 2017 ASA SSI event, which I reported on in the ISBIS blog.
  4. First red light: At that point in time, the journals in psychology were rethinking hypothesis testing, banning p-values and confidence intervals and any inferential statistical procedures.
  5. At JSM 2018, I organized a panel on a life cycle perspective of statistics, which was very well attended and attracted a lot of interest.
  6. This was also the topic of my keynote speech at the 2014 Stu Hunter conference in Phoenix, where I proposed a life cycle view of statistics.
  7. To me, both a broad perspective on statistical analysis and an emphasis on information quality and its 8 dimensions are how statistical applications should be considered. For more, see my Box award lecture in Nancy at ENBIS 2018. An interview where I explain (in Italian) why communication and operationalization of findings (the 7th and 8th information quality dimensions) are important to statistical work can be watched here.
  8. Second red light: What started with BASP in psychology has now moved to journals on information systems.
  9. So, basically, it seems that statistics is losing the battle for a leadership position in the data analytics space. ASA I and ASA II, as labeled by Mayo, have added to the chaos. Narrowly focused authors, pushing for a point they strongly believe in, have led their audience to lose the big picture that you have when you work with customers and companies. I made a proposal for addressing this in a note started 10 years ago, which basically got no attention.
  10. Much of the above is discussed in the two video-recorded presentations you can access through the ISBIS blog and the interview published here.

Overall, the idea is that, for survival, statistics needs to contribute to the data science movement by looking hard at how it can push forward unique selling points (USPs). The comments above map possible directions for statistics to take in order to keep playing an important role in modern data analytics.

Will it?
