Wallenius Naïve Bayes

David Steinberg. One of the simplest methods for two-group classification is naïve Bayes, in which predictors are treated as though they provide independent information. Traditional event models underlying naïve Bayes classifiers assume probability distributions that are not appropriate for binary data generated by human behavior. This paper develops a new event model, based on a …

FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference

David Steinberg. This paper addresses a classical problem in causal inference: matching, where treatment units need to be matched to control units. Some of the main challenges in developing matching methods arise from the tension among (i) inclusion of as many covariates as possible in defining the matched groups, (ii) having matched groups with enough …

Mind the Gap: Accounting for Measurement Error and Misclassification in Variables Generated via Data Mining

David Steinberg. Yang et al. consider the application of predictive data mining techniques in Information Systems research. Their focus is on the impact of data errors and misclassification on the subsequent data analysis by econometric models. Typically, data mining methods are first used to generate new variables (e.g., text sentiment), which are added into subsequent …

The Surprising Power of Online Experiments

David Steinberg. One of the hot topics in internet commerce is A/B testing – the use of designed experiments to maximize revenue from web sites. The fact that experimental design is a great way to test ideas should not be a surprise to readers of this column. And many businesses have caught on to the …

Highlighting Interesting Articles that are NOT in the Statistics Literature

David Steinberg. Most of us come across new ideas and interesting research by attending conferences and reading journals. Naturally, we begin with those meetings and journals that are in our own field. However, many articles with interesting statistical content appear in other journals and meetings. This should not be surprising: statistics is a part of …

Time-series modeling with undecimated fully convolutional neural networks

David Steinberg. This paper presents a new convolutional neural network-based time-series model. Typical convolutional neural network (CNN) architectures rely on the use of max-pooling operators in between layers, which leads to reduced resolution at the top layers. Instead, this work considers a fully convolutional network (FCN) architecture that uses causal filtering operations, and allows for …

A quality by design (QbD) case study on liposomes containing hydrophilic API

David Steinberg. Quality by Design (QbD) in the pharmaceutical industry is a systematic approach to development of drug products and drug manufacturing processes that begins with predefined objectives, emphasizes product and process understanding and sets up process control based on sound science and quality risk management. The FDA and similar regulatory groups elsewhere are actively …

Arbitrariness of peer review: A Bayesian analysis of the NIPS experiment

David Steinberg. The principle of peer review is central to the evaluation of research, by ensuring that only high-quality items are funded or published. But peer review has also received criticism, as the selection of reviewers may introduce biases in the system. In 2014, the organizers of the Neural Information Processing Systems conference conducted an …

Using Big Data to Understand the Human Condition: The Kavli Human Project

David Steinberg. Azmak et al. introduce the Kavli HUMAN Project (KHP), a unique and ambitious attempt to exploit big-data health analytics to study factors that contribute to good health. The KHP differs dramatically from typical large-scale health studies in the depth of data that will be collected. Most such programs focus on very specific …

A Bayesian Semiparametric Framework for Understanding and Predicting Customer Base Dynamics

David Steinberg. Dew and Ansari look at how to automate customer analytics. This can be a crucial activity for companies that manage distinct customer bases. In these data-rich and dynamic settings, visualization is essential for understanding events of interest. The value of visualization has led to the popularity of analytics dashboards. Although popular in practice, …