Data Science at Alibaba

Hongxia Yang, Director & Senior Staff Data Scientist, Alibaba Group. What is Data Science? And what is Data Science in a leading global company? Let me share a few thoughts based on my industry experience over the past few years and my current job at Alibaba Inc. We have all heard of this company as …

“To p or not to p” – my thoughts on the ASA Symposium on Statistical Inference.

Ron S. Kenett, KPA Group, Samuel Neaman Institute, Technion and Institute for Drug Research, Hebrew University, Jerusalem, Israel (ron@kpa-group.com). An eminent statistician labeled the American Statistical Association (ASA) statement on p-values with the title of this blog. This post is about the ASA Symposium on Statistical Inference (SSI) held in Bethesda, Maryland, on October 11-12, 2017, and my own …

Using statistics and data science to build a crowdsourcing data platform

By Ankur Gupta, Machine Learning Scientist, Premise Data, San Francisco. Increased internet connectivity has allowed large numbers of people to work towards a single goal in a distributed fashion. This practice is called crowdsourcing and we see successful examples of crowdsourcing everywhere. The most famous example is perhaps Wikipedia, which allows anyone in the world …

Highlighting Interesting Articles that are NOT in the Statistics Literature

David Steinberg. Most of us come across new ideas and interesting research by attending conferences and reading journals. Naturally, we begin with those meetings and journals that are in our own field. However, many articles with interesting statistical content appear in other journals and meetings. This should not be surprising: statistics is a part of …

50 Years of Data Science

By David Steinberg. Data Science has become a rallying cry for universities, research organizations, and many commercial and industrial companies. We are surrounded by ever increasing amounts of data and by myriad methods and algorithms to take advantage of them. Rallying cry aside, no one seems to be very clear about just what IS data …

Time-series modeling with undecimated fully convolutional neural networks

David Steinberg. This paper presents a new convolutional neural network-based time-series model. Typical convolutional neural network (CNN) architectures rely on the use of max-pooling operators in between layers, which leads to reduced resolution at the top layers. Instead, this work considers a fully convolutional network (FCN) architecture that uses causal filtering operations, and allows for …

A quality by design (QbD) case study on liposomes containing hydrophilic API

David Steinberg. Quality by Design (QbD) in the pharmaceutical industry is a systematic approach to development of drug products and drug manufacturing processes that begins with predefined objectives, emphasizes product and process understanding and sets up process control based on sound science and quality risk management. The FDA and similar regulatory groups elsewhere are actively …

Arbitrariness of peer review: A Bayesian analysis of the NIPS experiment

David Steinberg. The principle of peer review is central to the evaluation of research, by ensuring that only high-quality items are funded or published. But peer review has also received criticism, as the selection of reviewers may introduce biases in the system. In 2014, the organizers of the Neural Information Processing Systems conference conducted an …

Using Big Data to Understand the Human Condition: The Kavli Human Project

David Steinberg. Azmak et al. introduce the Kavli HUMAN Project (KHP), a unique and ambitious attempt to exploit big-data health analytics to study factors that contribute to good health. The KHP differs dramatically from typical large scale health studies in the depth of data that will be collected. Most such programs focus on very specific …

A Bayesian Semiparametric Framework for Understanding and Predicting Customer Base Dynamics

David Steinberg. Dew and Ansari look at how to automate customer analytics. This can be a crucial activity for companies that manage distinct customer bases. In these data-rich and dynamic settings, visualization is essential for understanding events of interest. The value of visualization has led to the popularity of analytics dashboards. Although popular in practice, …

Probabilistic Segmentation via Total Variation Regularization

David Steinberg. Wytock and Kolter take on a challenging problem in time series analysis – segmentation. The goal is to partition the time series into subsets of observations which can be regarded as generated by the same distribution. A particular example is the popular hidden Markov model, in which the partition is given by the …