Ball Divergence (BD) is a nonparametric two-sample statistic that quantifies the discrepancy between two probability measures $\mu$ and $\nu$ on a metric space $(V, \rho)$.[1] It is defined by integrating the squared difference of the measures over all closed balls in $V$. Let $\bar B(u, r) = \{x \in V : \rho(x, u) \le r\}$ be the closed ball of radius $r$ centered at $u$. Equivalently, one may set $\bar B(u, v) = \bar B(u, \rho(u, v))$ and write $\mu(\bar B(u, v))$ for the mass that $\mu$ assigns to it. The Ball divergence is then defined by

$$D(\mu, \nu) = \iint_{V \times V} \bigl[\mu - \nu\bigr]^2\!\bigl(\bar B(u, v)\bigr) \, \bigl(\mu(du)\,\mu(dv) + \nu(du)\,\nu(dv)\bigr).$$
This measure can be seen as an integral of Harald Cramér's distance over all possible pairs of points. By summing squared differences of $\mu$ and $\nu$ over balls of all scales, BD captures both global and local discrepancies between the distributions, yielding a robust, scale-sensitive comparison. Moreover, since BD is defined as the integral of a squared measure difference, it is always non-negative, and $D(\mu, \nu) = 0$ if and only if $\mu = \nu$.
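For intuition, the defining double integral can be evaluated exactly when both measures are discrete with a common finite support. The following sketch (the function name and setup are illustrative, not from the original paper) computes $D(\mu, \nu)$ in that special case:

```python
import numpy as np

def ball_divergence_discrete(points, p, q):
    """Exact BD for discrete measures p, q on a common finite support.

    points: (N, dim) array of support points; p, q: probability vectors.
    """
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # inside[i, j, k] is True when point k lies in the closed ball
    # centered at point i with radius d(point i, point j)
    inside = dist[:, None, :] <= dist[:, :, None]
    mu_ball = inside.astype(float) @ p   # mu(B(u_i, d(u_i, u_j)))
    nu_ball = inside.astype(float) @ q   # nu(B(u_i, d(u_i, u_j)))
    diff_sq = (mu_ball - nu_ball) ** 2
    # integrate over all pairs (u, v) against mu x mu + nu x nu
    return np.sum(diff_sq * (np.outer(p, p) + np.outer(q, q)))

# Identical measures give D = 0; moving mass makes D strictly positive.
pts = np.array([[0.0], [1.0], [2.0]])
print(ball_divergence_discrete(pts, np.array([0.5, 0.3, 0.2]),
                               np.array([0.5, 0.3, 0.2])))  # 0.0
print(ball_divergence_discrete(pts, np.array([0.8, 0.1, 0.1]),
                               np.array([0.1, 0.1, 0.8])))  # > 0
```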
Testing for equal distributions
We next give a sample version of Ball Divergence. For convenience, the Ball Divergence can be decomposed into two parts:

$$A(\mu, \nu) = \iint_{V \times V} \bigl[\mu - \nu\bigr]^2\!\bigl(\bar B(u, v)\bigr) \, \mu(du)\,\mu(dv)$$

and

$$C(\mu, \nu) = \iint_{V \times V} \bigl[\mu - \nu\bigr]^2\!\bigl(\bar B(u, v)\bigr) \, \nu(du)\,\nu(dv).$$

Thus

$$D(\mu, \nu) = A(\mu, \nu) + C(\mu, \nu).$$
Let

$$\delta(x, y, z) = I\bigl(z \in \bar B(x, \rho(x, y))\bigr)$$

indicate whether the point $z$ lies in the closed ball $\bar B(x, \rho(x, y))$, where $I(\cdot)$ denotes the indicator function. Given two independent samples $\{X_1, \ldots, X_n\}$ from $\mu$ and $\{Y_1, \ldots, Y_m\}$ from $\nu$, define

$$A^X_{ij} = \frac{1}{n} \sum_{u=1}^{n} \delta(X_i, X_j, X_u), \qquad A^Y_{ij} = \frac{1}{m} \sum_{v=1}^{m} \delta(X_i, X_j, Y_v),$$

where $A^X_{ij}$ is the proportion of the sample from the probability measure $\mu$ located in the ball $\bar B(X_i, \rho(X_i, X_j))$ and $A^Y_{ij}$ is the proportion of the sample from the probability measure $\nu$ located in the same ball. Meanwhile,

$$C^X_{kl} = \frac{1}{n} \sum_{u=1}^{n} \delta(Y_k, Y_l, X_u), \qquad C^Y_{kl} = \frac{1}{m} \sum_{v=1}^{m} \delta(Y_k, Y_l, Y_v)$$

are the proportions of the samples from the probability measures $\mu$ and $\nu$, respectively, located in the ball $\bar B(Y_k, \rho(Y_k, Y_l))$. The sample versions of $A(\mu, \nu)$ and $C(\mu, \nu)$ are as follows:

$$A_{n,m} = \frac{1}{n^2} \sum_{i,j=1}^{n} \bigl(A^X_{ij} - A^Y_{ij}\bigr)^2, \qquad C_{n,m} = \frac{1}{m^2} \sum_{k,l=1}^{m} \bigl(C^X_{kl} - C^Y_{kl}\bigr)^2.$$

Finally, the sample ball divergence is

$$D_{n,m} = A_{n,m} + C_{n,m}.$$
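These formulas translate directly into vectorized NumPy code. The following sketch (the function name is illustrative, not an API of any published package) builds the four proportion matrices with broadcasting and averages the squared differences:

```python
import numpy as np
from scipy.spatial.distance import cdist

def sample_ball_divergence(X, Y):
    """Sample ball divergence D_{n,m} for samples X (n, dim) and Y (m, dim)."""
    dXX = cdist(X, X)   # dXX[i, j] = rho(X_i, X_j)
    dXY = cdist(X, Y)   # dXY[i, v] = rho(X_i, Y_v)
    dYY = cdist(Y, Y)   # dYY[k, l] = rho(Y_k, Y_l)

    # A^X_{ij}, A^Y_{ij}: proportions of each sample falling in the
    # closed ball centered at X_i with radius rho(X_i, X_j)
    AX = (dXX[:, None, :] <= dXX[:, :, None]).mean(axis=2)
    AY = (dXY[:, None, :] <= dXX[:, :, None]).mean(axis=2)
    # C^X_{kl}, C^Y_{kl}: proportions falling in the closed ball
    # centered at Y_k with radius rho(Y_k, Y_l)
    CX = (dXY.T[:, None, :] <= dYY[:, :, None]).mean(axis=2)
    CY = (dYY[:, None, :] <= dYY[:, :, None]).mean(axis=2)

    A_nm = ((AX - AY) ** 2).mean()   # average over all pairs (i, j)
    C_nm = ((CX - CY) ** 2).mean()   # average over all pairs (k, l)
    return A_nm + C_nm
```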
It can be proved that $D_{n,m}$ is a consistent estimator of $D(\mu, \nu)$. Moreover, if $n/(n+m) \to \tau$ for some $\tau \in (0, 1)$, then under the null hypothesis $H_0: \mu = \nu$ the statistic $\tfrac{nm}{n+m} D_{n,m}$ converges in distribution to a mixture of chi-squared distributions, whereas under the alternative hypothesis $\sqrt{\tfrac{nm}{n+m}}\bigl(D_{n,m} - D(\mu, \nu)\bigr)$ converges in distribution to a normal distribution.[1]
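Because the weights of the limiting chi-squared mixture depend on the unknown underlying distribution, the null distribution is commonly approximated by permutation in practice. A minimal sketch, assuming the sample_ball_divergence function from the previous snippet:

```python
import numpy as np

def bd_permutation_test(X, Y, n_perm=199, seed=0):
    """Two-sample permutation p-value based on the sample ball divergence."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([X, Y])
    n = len(X)
    observed = sample_ball_divergence(X, Y)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))  # shuffle the group labels
        if sample_ball_divergence(pooled[idx[:n]], pooled[idx[n:]]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)      # permutation p-value
```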
Properties
1. The square root of Ball Divergence is a symmetric divergence but not a metric, because it does not satisfy the triangle inequality.
2. It can be shown that Ball divergence, the energy distance[2], and the maximum mean discrepancy (MMD)[3] are unified within the variogram framework; for details, see Remark 2.4 in [1].
Homogeneity test
Ball divergence admits a straightforward extension to the K-sample setting. Suppose $\mu_1, \ldots, \mu_K$ are $K$ probability measures on a Banach space $V$. Define the K-sample BD by

$$D(\mu_1, \ldots, \mu_K) = \sum_{1 \le k < l \le K} D(\mu_k, \mu_l).$$

It then follows from Theorems 1 and 2 of Pan et al.[1] that $D(\mu_1, \ldots, \mu_K) = 0$ if and only if $\mu_1 = \mu_2 = \cdots = \mu_K$.
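A minimal sketch of the K-sample statistic, again assuming the sample_ball_divergence function from the earlier snippet:

```python
from itertools import combinations

def k_sample_ball_divergence(samples):
    """K-sample BD: sum of the pairwise sample ball divergences."""
    return sum(sample_ball_divergence(Xk, Xl)
               for Xk, Xl in combinations(samples, 2))
```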
By employing closed balls to define a metric distribution function, one obtains an alternative homogeneity measure.[4]
Given a probability measure $\mu$ on a metric space $(M, d)$, its metric distribution function (MDF) is defined by

$$F_\mu(u, r) = \mu\bigl(\bar B(u, r)\bigr),$$

where $\bar B(u, r) = \{x \in M : d(u, x) \le r\}$ is the closed ball of radius $r$ centered at $u$, and

$$F_\mu(u, r) = \mathbb{E}\bigl[I\bigl(d(u, X) \le r\bigr)\bigr], \qquad X \sim \mu.$$

If $X_1, \ldots, X_n$ are i.i.d. draws from $\mu$, the empirical version is

$$\hat F_{\mu, n}(u, r) = \frac{1}{n} \sum_{i=1}^{n} I\bigl(d(u, X_i) \le r\bigr).$$
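The empirical MDF is a one-line computation for samples in Euclidean space (an illustrative helper; any metric could replace the Euclidean norm):

```python
import numpy as np

def empirical_mdf(sample, u, r):
    """Fraction of `sample` (n, dim) falling in the closed ball B(u, r)."""
    return np.mean(np.linalg.norm(sample - u, axis=1) <= r)
```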
Based on these, the homogeneity measure based on the MDF, also called the metric Cramér-von Mises (MCVM) statistic, is

$$\mathrm{MCVM}(\mu_k, \bar\mu) = \int_M \bigl[F_{\mu_k}(u, r) - F_{\bar\mu}(u, r)\bigr]^2 \, \bar\mu(du),$$

where $\bar\mu = \sum_{k=1}^{K} \pi_k \mu_k$ is the mixture of $\mu_1, \ldots, \mu_K$ with weights $\pi_k > 0$, and $\sum_{k=1}^{K} \pi_k = 1$.

The overall MCVM is then

$$\mathrm{MCVM}(\mu_1, \ldots, \mu_K) = \sum_{k=1}^{K} \pi_k \, \mathrm{MCVM}(\mu_k, \bar\mu).$$

The empirical MCVM is given by

$$\widehat{\mathrm{MCVM}}_n = \sum_{k=1}^{K} \hat\pi_k \, \frac{1}{n} \sum_{j=1}^{n} \bigl[\hat F_{\mu_k, n_k}(Z_j, r) - \hat F_{\bar\mu, n}(Z_j, r)\bigr]^2,$$

where $\{X_{k,1}, \ldots, X_{k,n_k}\}$ is an i.i.d. sample from $\mu_k$, $Z_1, \ldots, Z_n$ denotes the pooled sample of size $n = \sum_k n_k$, $\hat\pi_k = n_k / n$, and $\hat F_{\bar\mu, n}$ is the empirical MDF of the pooled sample. A practical choice for the radius $r$ is the median of the squared pairwise distances within the pooled sample.
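The following sketch combines these pieces into an empirical MCVM statistic at a single radius. It assumes the fixed-radius form given above and the empirical_mdf helper from the previous snippet; taking the square root of the median squared distance to obtain a radius is an interpretive choice, not prescribed by the source:

```python
import numpy as np
from scipy.spatial.distance import pdist

def empirical_mcvm(samples, r=None):
    """Empirical MCVM over a list of (n_k, dim) samples at one radius r."""
    pooled = np.vstack(samples)
    n = len(pooled)
    if r is None:
        # radius derived from the median-of-squared-distances heuristic
        r = np.sqrt(np.median(pdist(pooled) ** 2))
    weights = [len(s) / n for s in samples]   # pi_k = n_k / n
    # MDF of the pooled sample approximates the mixture MDF
    F_mix = np.array([empirical_mdf(pooled, u, r) for u in pooled])
    stat = 0.0
    for pi_k, Xk in zip(weights, samples):
        F_k = np.array([empirical_mdf(Xk, u, r) for u in pooled])
        stat += pi_k * np.mean((F_k - F_mix) ** 2)
    return stat
```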
References
- ^ a b Pan, Wenliang; Tian, Yuan; Wang, Xueqin; Zhang, Heping (2018-06-01). "Ball Divergence: Nonparametric two sample test". The Annals of Statistics. 46 (3): 1109–1137. doi:10.1214/17-AOS1579. ISSN 0090-5364. PMC 6192286. PMID 30344356.
- ^ Székely, Gábor J.; Rizzo, Maria L. (August 2013). "Energy statistics: A class of statistics based on distances". Journal of Statistical Planning and Inference. 143 (8): 1249–1272. doi:10.1016/j.jspi.2013.03.018. ISSN 0378-3758.
- ^ Gretton, Arthur; Borgwardt, Karsten M.; Rasch, Malte; Schölkopf, Bernhard; Smola, Alexander J. (2007-09-07), "A Kernel Method for the Two-Sample-Problem", Advances in Neural Information Processing Systems 19, The MIT Press, pp. 513–520, doi:10.7551/mitpress/7503.003.0069, hdl:1885/37327, ISBN 978-0-262-25691-9, retrieved 2024-06-28
- ^ Wang, X.; Zhu, J.; Pan, W.; Zhu, J.; Zhang, H. (2023). "Nonparametric Statistical Inference via Metric Distribution Function in Metric Spaces". Journal of the American Statistical Association. 119 (548): 2772–2784. doi:10.1080/01621459.2023.2277417.