Alex Deng, Jiannan Lu, Jonathan Litz. WSDM 2017
A/B tests (or randomized controlled experiments) play an integral role in the research and development cycles of technology companies. As in classic randomized experiments (e.g., clinical trials), the underlying statistical analysis of A/B tests assumes that the randomization units are independent and identically distributed (i.i.d.). However, the randomization mechanisms used in online A/B tests can be quite complex and may render this assumption invalid. Analysis that unjustifiably relies on this assumption can yield untrustworthy results and lead to incorrect conclusions. Motivated by challenging problems arising from actual online experiments, we propose a new method of variance estimation that relies only on practically plausible assumptions, is directly applicable to a wide range of randomization mechanisms, and can be implemented easily. We examine its performance and illustrate its advantages over two commonly used methods of variance estimation on both simulated and empirical datasets. Our results lead to a deeper understanding of the conditions under which the randomization unit can be treated as i.i.d. In particular, we show that for purposes of variance estimation, the randomization unit can be approximated as i.i.d. when the individual treatment effect variation is small; however, this approximation can lead to variance under-estimation when the individual treatment effect variation is large.
Alex Deng, Jiannan Lu, Shouyuan Chen. DSAA 2016
A/B testing is one of the most successful applications of statistical theory in the Internet age. A crucial problem of Null Hypothesis Statistical Testing (NHST), the backbone of A/B testing methodology, is that experimenters are not allowed to continuously monitor the results and make decisions in real time. Many people see this restriction as a setback against the trend in technology toward real-time data analytics. Recently, Bayesian Hypothesis Testing, which intuitively is more suitable for real-time decision making, has attracted growing interest as a viable alternative to NHST. While corrections of NHST for the continuous monitoring setting are well established in the existing literature and known in the A/B testing community, whether continuous monitoring is a proper practice in Bayesian testing is still debated among both academic researchers and general practitioners. In this paper, we formally prove the validity of Bayesian testing under proper stopping rules, and illustrate the theoretical results with concrete simulations. We point out common bad practices where stopping rules are not proper, and discuss how priors can be learned objectively. General guidelines for researchers and practitioners are also provided.
Parameter conditions for the sequence \{(x_{n}, y_{n})\} defined by ( 4.2 ):
(i) 0<\alpha\leq\alpha_{n}\leq\beta<1 for some \alpha, \beta\in (0,1);
(ii) \liminf_{n\rightarrow\infty}r_{n}>0 and \lim_{n\rightarrow\infty}|r_{n+1}-r_{n}|=0.
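As an illustration of the two parameter conditions above, one admissible choice of the sequences \{\alpha_{n}\} and \{r_{n}\} is sketched below. The specific sequences are assumptions chosen for the example and do not appear in the source.

```latex
% Illustrative (assumed) parameter choices, for n >= 1:
\[
  \alpha_{n} = \frac{1}{2} - \frac{1}{4n}, \qquad r_{n} = 1 + \frac{1}{n}.
\]
% First condition: take \alpha = 1/4 and \beta = 1/2, since
\[
  0 < \frac{1}{4} \leq \alpha_{n} < \frac{1}{2} < 1 \quad \text{for all } n \geq 1 .
\]
% Second condition: r_n decreases to 1, and consecutive terms get arbitrarily close.
\[
  \liminf_{n\rightarrow\infty} r_{n} = 1 > 0, \qquad
  |r_{n+1} - r_{n}| = \frac{1}{n} - \frac{1}{n+1} = \frac{1}{n(n+1)} \rightarrow 0 .
\]
```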
Taking B=I and H_{2}=H_{3} in Theorem 4.1, we obtain the following convergence theorem for the split mixed variational inequality problem SMVIP(\phi,\varphi).
Parameter conditions for the sequence \{(x_{n}, y_{n})\} defined by ( 4.3 ):
(i) 0<\alpha\leq\alpha_{n}\leq\beta<1 for some \alpha, \beta\in (0,1);
(ii) \liminf_{n\rightarrow\infty}r_{n}>0 and \lim_{n\rightarrow\infty}|r_{n+1}-r_{n}|=0.
It is easy to see that the split equality mixed equilibrium problem ( 1.12 ) reduces to the split equality convex minimization problem ( 1.13 ) when F=0 and G=0. Therefore, Theorem 3.1 can be used to solve the split equality convex minimization problem ( 1.13 ), and the following result can be deduced directly from Theorem 3.1.
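The reduction can be sketched as follows. Problems ( 1.12 ) and ( 1.13 ) are not reproduced in this section, so the form below is an assumption based on the usual statement of such problems; the sets C and Q, the operators A and B, and the functions \phi and \varphi are hypothetical placeholders for whatever the original definitions use.

```latex
% Assumed form of the split equality mixed equilibrium problem:
% find x^* \in C and y^* \in Q such that
\[
  F(x^{*}, x) + \phi(x) - \phi(x^{*}) \geq 0 \quad \forall x \in C, \qquad
  G(y^{*}, y) + \varphi(y) - \varphi(y^{*}) \geq 0 \quad \forall y \in Q, \qquad
  A x^{*} = B y^{*} .
\]
% Setting F = 0 and G = 0 removes the bifunction terms:
\[
  \phi(x) \geq \phi(x^{*}) \quad \forall x \in C, \qquad
  \varphi(y) \geq \varphi(y^{*}) \quad \forall y \in Q, \qquad
  A x^{*} = B y^{*} ,
\]
% i.e. x^* minimizes \phi over C and y^* minimizes \varphi over Q,
% subject to the coupling constraint A x^* = B y^*.
```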
