License: Creative Commons Attribution-NoDerivs 3.0 Unported license (CC BY-ND 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.STACS.2010.2449
URN: urn:nbn:de:0030-drops-24496
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2010/2449/


Braverman, Vladimir ; Chung, Kai-Min ; Liu, Zhenming ; Mitzenmacher, Michael ; Ostrovsky, Rafail

AMS Without 4-Wise Independence on Product Domains



Abstract

In their seminal work, Alon, Matias, and Szegedy introduced several sketching techniques and showed, among other results, that $4$-wise independence is sufficient to obtain good approximations of the second frequency moment. In this work, we show that their sketching technique can be extended to product domains $[n]^k$ by using the product of $4$-wise independent functions on $[n]$.
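As an illustration of the construction described above, the following minimal Python sketch maintains AMS-style counters over $[n]^k$ in which the sign of an item is the product of $k$ independently drawn sign functions on $[n]$. It is a sketch under stated assumptions, not the authors' implementation: the names `P`, `make_sign_hash`, `ProductSketch`, and `reps` are hypothetical, each sign function is taken from a random cubic polynomial over a prime field (a standard way to obtain $4$-wise independent values, with the low bit used as the sign, which is only approximately unbiased), and the estimate is a plain mean over `reps` counters with no median step.

```python
import random

# Assumption: a Mersenne prime larger than the coordinate range n.
P = 2_147_483_647


def make_sign_hash(rng):
    """Draw a sign function [n] -> {-1, +1} from a random cubic polynomial mod P."""
    coeffs = [rng.randrange(P) for _ in range(4)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner evaluation of the degree-3 polynomial
            acc = (acc * x + c) % P
        # Low bit as sign; slightly biased because P is odd (illustrative only).
        return 1 if acc & 1 else -1

    return h


class ProductSketch:
    """AMS-style counters whose sign on [n]^k is a product of k sign functions."""

    def __init__(self, k, reps, seed=0):
        rng = random.Random(seed)
        # One independent group of k sign functions per repetition.
        self.hashes = [[make_sign_hash(rng) for _ in range(k)] for _ in range(reps)]
        self.counters = [0] * reps

    def update(self, item, delta=1):
        """Process one stream item (a k-tuple) with frequency change delta."""
        for r, hs in enumerate(self.hashes):
            sign = 1
            for h, coord in zip(hs, item):
                sign *= h(coord)  # product of per-coordinate signs
            self.counters[r] += sign * delta

    def estimate_f2(self):
        """Mean of squared counters: an estimate of the second frequency moment."""
        return sum(z * z for z in self.counters) / len(self.counters)
```

For a stream containing a single distinct item with frequency $c$, every counter equals $\pm c$, so the estimate is exactly $c^2$; for general streams the estimate is random and its accuracy depends on the number of repetitions chosen.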

Our work extends that of Indyk and McGregor, who showed the result for $k = 2$. Their primary motivation was the problem of identifying correlations in data streams. In their model, a stream of pairs $(i,j) \in [n]^2$ arrives, giving a joint distribution $(X,Y)$, and they find approximation algorithms for how close the joint distribution is to the product of the marginal distributions under various metrics. This closeness naturally corresponds to how close $X$ and $Y$ are to being independent. By using our technique, we obtain a new result for the problem of approximating, in a single pass, the $\ell_2$ distance between the joint distribution and the product of the marginal distributions for $k$-ary vectors instead of just pairs. Our analysis gives a randomized algorithm that is a $(1\pm \epsilon)$ approximation (with probability $1-\delta$) and requires space logarithmic in $n$ and $m$ and proportional to $3^k$.
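To make the target quantity concrete, the following Python snippet computes the $\ell_2$ distance between the empirical joint distribution of a stream of $k$-tuples and the product of its empirical marginals. This is an exact, offline reference computation for small inputs, not the single-pass streaming algorithm of the paper; the function name `l2_independence_distance` is hypothetical, and the stream is assumed to be a non-empty list of equal-length tuples.

```python
from collections import Counter
from itertools import product
from math import prod, sqrt


def l2_independence_distance(stream):
    """Exact l2 distance between the joint distribution and the product of marginals."""
    m = len(stream)
    k = len(stream[0])
    joint = Counter(stream)
    # Empirical marginal counts for each of the k coordinates.
    marginals = [Counter(item[j] for item in stream) for j in range(k)]
    supports = [sorted(marg) for marg in marginals]
    total = 0.0
    # Cells outside the product of marginal supports contribute zero.
    for cell in product(*supports):
        p_joint = joint.get(cell, 0) / m
        p_prod = prod(marginals[j][cell[j]] / m for j in range(k))
        total += (p_joint - p_prod) ** 2
    return sqrt(total)
```

For example, the stream `[(0, 0), (0, 1), (1, 0), (1, 1)]` has independent coordinates and distance 0, while the perfectly correlated stream `[(0, 0), (1, 1)]` has distance 0.5.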

BibTeX - Entry

@InProceedings{braverman_et_al:LIPIcs:2010:2449,
  author =	{Vladimir Braverman and Kai-Min Chung and Zhenming Liu and Michael Mitzenmacher and Rafail Ostrovsky},
  title =	{{AMS Without 4-Wise Independence on Product Domains}},
  booktitle =	{27th International Symposium on Theoretical Aspects of Computer Science},
  pages =	{119--130},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-939897-16-3},
  ISSN =	{1868-8969},
  year =	{2010},
  volume =	{5},
  editor =	{Jean-Yves Marion and Thomas Schwentick},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2010/2449},
  URN =		{urn:nbn:de:0030-drops-24496},
  doi =		{10.4230/LIPIcs.STACS.2010.2449},
  annote =	{Keywords: Data Streams, Randomized Algorithms, Streaming Algorithms, Independence, Sketches}
}

Keywords: Data Streams, Randomized Algorithms, Streaming Algorithms, Independence, Sketches
Collection: 27th International Symposium on Theoretical Aspects of Computer Science
Issue Date: 2010
Date of publication: 09.03.2010

