By Ankur Gupta, Machine Learning Scientist, Premise Data, San Francisco. Increased internet connectivity has allowed large numbers of people to work towards a single goal in a distributed fashion. This practice is called crowdsourcing, and we see successful examples of it everywhere. The most famous example is perhaps Wikipedia, which allows anyone in the world … Continue reading Using statistics and data science to build a crowdsourcing data platform

David Steinberg. Most of us come across new ideas and interesting research by attending conferences and reading journals. Naturally, we begin with those meetings and journals that are in our own field. However, many articles with interesting statistical content appear in other journals and meetings. This should not be surprising: statistics is a part of … Continue reading Highlighting Interesting Articles that are NOT in the Statistics Literature

David Steinberg. This paper presents a new convolutional neural network-based time-series model. Typical convolutional neural network (CNN) architectures rely on max-pooling operators between layers, which reduces resolution at the top layers. Instead, this work considers a fully convolutional network (FCN) architecture that uses causal filtering operations, and allows for … Continue reading Time-series modeling with undecimated fully convolutional neural networks

David Steinberg. Quality by Design (QbD) in the pharmaceutical industry is a systematic approach to the development of drug products and drug manufacturing processes that begins with predefined objectives, emphasizes product and process understanding, and sets up process control based on sound science and quality risk management. The FDA and similar regulatory groups elsewhere are actively … Continue reading A quality by design (QbD) case study on liposomes containing hydrophilic API

David Steinberg. The principle of peer review is central to the evaluation of research, as it aims to ensure that only high-quality items are funded or published. But peer review has also received criticism, as the selection of reviewers may introduce biases in the system. In 2014, the organizers of the Neural Information Processing Systems conference conducted an … Continue reading Arbitrariness of peer review: A Bayesian analysis of the NIPS experiment

David Steinberg. Azmak et al. introduce the Kavli HUMAN Project (KHP), a unique and ambitious attempt to exploit big-data health analytics to study factors that contribute to good health. The KHP differs dramatically from typical large-scale health studies in the depth of data that will be collected. Most such programs focus on very specific … Continue reading Using Big Data to Understand the Human Condition: The Kavli Human Project

David Steinberg. Dew and Ansari look at how to automate customer analytics. This can be a crucial activity for companies that manage distinct customer bases. In these data-rich and dynamic settings, visualization is essential for understanding events of interest. The value of visualization has led to the popularity of analytics dashboards. Although popular in practice, … Continue reading A Bayesian Semiparametric Framework for Understanding and Predicting Customer Base Dynamics

David Steinberg. Wytock and Kolter take on a challenging problem in time series analysis: segmentation. The goal is to partition the time series into subsets of observations that can be regarded as generated by the same distribution. A particular example is the popular hidden Markov model, in which the partition is given by the … Continue reading Probabilistic Segmentation via Total Variation Regularization

David Steinberg. Kalman filters are a popular and influential approach for modeling time-varying phenomena. They admit an intuitive probabilistic interpretation, have a simple functional form, and have been successfully applied in a wide variety of disciplines. The classic Kalman filter is a generative dynamic model in which the state of the system evolves over time … Continue reading Deep Kalman Filters

David Steinberg. I want to devote this issue to some articles both in and outside the statistical literature discussing the use of p-values and related questions of selective inference. The p-value (and in fact much of the standard statistical paradigm for science) has been a focal point for controversy in recent years. The article that … Continue reading Articles about reproducible research and p-values