License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.APPROX-RANDOM.2019.29
URN: urn:nbn:de:0030-drops-112443
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2019/11244/
Go to the corresponding LIPIcs Volume Portal


Jayaram, Rajesh ; Woodruff, David P.

Towards Optimal Moment Estimation in Streaming and Distributed Models

pdf-format:
LIPIcs-APPROX-RANDOM-2019-29.pdf (0.6 MB)


Abstract

One of the oldest problems in the data stream model is to approximate the p-th moment ||X||_p^p = sum_{i=1}^n X_i^p of an underlying non-negative vector X in R^n, which is presented as a sequence of poly(n) updates to its coordinates. Of particular interest is when p in (0,2]. Although a tight space bound of Theta(epsilon^-2 log n) bits is known for this problem when both positive and negative updates are allowed, surprisingly there is still a gap in the space complexity of this problem when all updates are positive. Specifically, the upper bound is O(epsilon^-2 log n) bits, while the lower bound is only Omega(epsilon^-2 + log n) bits. Recently, an upper bound of O~(epsilon^-2 + log n) bits was obtained under the assumption that the updates arrive in a random order.
We show that for p in (0, 1], the random order assumption is not needed. Namely, we give an upper bound for worst-case streams of O~(epsilon^-2 + log n) bits for estimating |X |_p^p. Our techniques also give new upper bounds for estimating the empirical entropy in a stream. On the other hand, we show that for p in (1,2], in the natural coordinator and blackboard distributed communication topologies, there is an O~(epsilon^-2) bit max-communication upper bound based on a randomized rounding scheme. Our protocols also give rise to protocols for heavy hitters and approximate matrix product. We generalize our results to arbitrary communication topologies G, obtaining an O~(epsilon^2 log d) max-communication upper bound, where d is the diameter of G. Interestingly, our upper bound rules out natural communication complexity-based approaches for proving an Omega(epsilon^-2 log n) bit lower bound for p in (1,2] for streaming algorithms. In particular, any such lower bound must come from a topology with large diameter.

BibTeX - Entry

@InProceedings{jayaram_et_al:LIPIcs:2019:11244,
  author =	{Rajesh Jayaram and David P. Woodruff},
  title =	{{Towards Optimal Moment Estimation in Streaming and Distributed Models}},
  booktitle =	{Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)},
  pages =	{29:1--29:21},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-125-2},
  ISSN =	{1868-8969},
  year =	{2019},
  volume =	{145},
  editor =	{Dimitris Achlioptas and L{\'a}szl{\'o} A. V{\'e}gh},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2019/11244},
  URN =		{urn:nbn:de:0030-drops-112443},
  doi =		{10.4230/LIPIcs.APPROX-RANDOM.2019.29},
  annote =	{Keywords: Streaming, Sketching, Message Passing, Moment Estimation}
}

Keywords: Streaming, Sketching, Message Passing, Moment Estimation
Collection: Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)
Issue Date: 2019
Date of publication: 17.09.2019


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI