License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.FSTTCS.2021.27
URN: urn:nbn:de:0030-drops-155381
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/15538/


Li, Xin ; Zheng, Yu

Lower Bounds and Improved Algorithms for Asymmetric Streaming Edit Distance and Longest Common Subsequence



Abstract

In this paper, we study edit distance (ED) and longest common subsequence (LCS) in the asymmetric streaming model, introduced by Saks and Seshadhri [Saks and Seshadhri, 2013]. As an intermediate model between the random access model and the streaming model, this model allows streaming access to one string and random access to the other. ED and LCS are both fundamental problems that are often studied on large strings, so the (asymmetric) streaming model is well suited to studying them.
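To make the access pattern of the asymmetric streaming model concrete, the following is a minimal sketch, not the paper's algorithm. It computes exact edit distance with the textbook dynamic program, reading one string once as an iterator and indexing the other freely. The function name `edit_distance_asymmetric` is an assumption made for illustration, and the working space is linear in the length of the random-access string, far more than the sublinear space the paper targets.

```python
from typing import Iterable, Sequence


def edit_distance_asymmetric(stream: Iterable[str], y: Sequence[str]) -> int:
    """Exact edit distance between a streamed string and a random-access string.

    Hypothetical baseline: `stream` is consumed exactly once, `y` may be
    indexed at will. Working space is O(len(y)) integers.
    """
    m = len(y)
    # prev[j] = edit distance between the prefix of the stream read so far
    # and y[:j]; before reading anything this is just j insertions.
    prev = list(range(m + 1))
    for i, ch in enumerate(stream, start=1):
        cur = [i] + [0] * m  # i deletions turn the stream prefix into ""
        for j in range(1, m + 1):
            substitution = prev[j - 1] + (0 if ch == y[j - 1] else 1)
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, substitution)
        prev = cur
    return prev[m]


if __name__ == "__main__":
    # The iterator stands in for the one-pass stream.
    print(edit_distance_asymmetric(iter("kitten"), "sitting"))
```

Only one row of the table is kept per streamed symbol, which is why a single pass over the stream suffices in this baseline.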
Our first main contribution is a systematic study of space lower bounds for ED and LCS in the asymmetric streaming model. Previously, there were no explicitly stated results in this context, although some lower bounds for LCS can be inferred from the lower bounds for longest increasing subsequence (LIS) in [Sun and Woodruff, 2007; Gál and Gopalan, 2010; Ergun and Jowhari, 2008]. Yet these bounds only hold for large alphabets. In this paper, we develop several new techniques to handle ED in general and LCS over small alphabets, thus establishing strong lower bounds for both problems. In particular, our lower bound for ED provides an exponential separation between edit distance and Hamming distance in the asymmetric streaming model. Our lower bounds also extend to LIS and longest non-decreasing subsequence (LNS) in the standard streaming model. Together with previous results, our bounds provide an almost complete picture for these two problems.
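For context on the Hamming distance side of that separation, here is a small sketch of the easy upper bound: with random access to one string, Hamming distance can be computed exactly in one pass over the other using only a position counter and a mismatch counter, that is, O(log n) bits. The function name is an assumption for illustration, and the sketch says nothing about the paper's lower bound constructions.

```python
from typing import Iterable, Sequence


def hamming_distance_asymmetric(stream: Iterable[str], y: Sequence[str]) -> int:
    """Exact Hamming distance using two counters.

    Hypothetical helper: `stream` is read once, `y` is indexed by position.
    Raises ValueError if the two strings differ in length.
    """
    position = 0
    mismatches = 0
    for ch in stream:
        if position >= len(y):
            raise ValueError("stream is longer than y")
        if ch != y[position]:
            mismatches += 1
        position += 1
    if position != len(y):
        raise ValueError("stream is shorter than y")
    return mismatches


if __name__ == "__main__":
    print(hamming_distance_asymmetric(iter("karolin"), "kathrin"))
```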
As our second main contribution, we give improved algorithms for ED and LCS in the asymmetric streaming model. For ED, we improve the space complexity of the constant factor approximation algorithms in [Farhadi et al., 2020; Cheng et al., 2020] from Õ(n^δ/δ) to O((d^δ/δ) · polylog(n)), where n is the length of each string and d is the edit distance between the two strings. For LCS, we give the first (1/2+ε)-approximation algorithm with space n^δ for any constant δ > 0, over a binary alphabet. Our work leaves a plethora of intriguing open questions, including establishing lower bounds and designing algorithms for a natural generalization of LIS and LNS, which we call longest non-decreasing subsequence with threshold (LNST).
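The 1/2 threshold in the LCS result is meaningful because a 1/2-approximation over a binary alphabet is already available from symbol counts alone. The sketch below shows that folklore baseline, not the paper's algorithm: a common subsequence made of a single repeated symbol has length min(count in x, count in y), and the true LCS is at most the sum of these two minima, so the larger one is at least half the optimum. The function name and the use of the characters "0" and "1" are assumptions for illustration.

```python
from typing import Iterable, Sequence


def lcs_half_approx_binary(stream: Iterable[str], y: Sequence[str]) -> int:
    """Return a value v with LCS/2 <= v <= LCS for binary strings.

    Hypothetical baseline using O(log n) bits: count zeros and ones in the
    streamed string, count them in `y`, and take the best single-symbol
    common subsequence. Assumes `y` contains only "0" and "1".
    """
    x_zeros = 0
    x_ones = 0
    for ch in stream:
        if ch == "0":
            x_zeros += 1
        elif ch == "1":
            x_ones += 1
        else:
            raise ValueError("expected a binary string")
    y_zeros = sum(1 for ch in y if ch == "0")
    y_ones = len(y) - y_zeros
    return max(min(x_zeros, y_zeros), min(x_ones, y_ones))


if __name__ == "__main__":
    # The true LCS of "0101" and "1010" is 3; the estimate is 2.
    print(lcs_half_approx_binary(iter("0101"), "1010"))
```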

BibTeX - Entry

@InProceedings{li_et_al:LIPIcs.FSTTCS.2021.27,
  author =	{Li, Xin and Zheng, Yu},
  title =	{{Lower Bounds and Improved Algorithms for Asymmetric Streaming Edit Distance and Longest Common Subsequence}},
  booktitle =	{41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)},
  pages =	{27:1--27:23},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-215-0},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{213},
  editor =	{Boja\'{n}czyk, Miko{\l}aj and Chekuri, Chandra},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/15538},
  URN =		{urn:nbn:de:0030-drops-155381},
  doi =		{10.4230/LIPIcs.FSTTCS.2021.27},
  annote =	{Keywords: Asymmetric Streaming Model, Edit Distance, Longest Common Subsequence, Space Lower Bound}
}

Keywords: Asymmetric Streaming Model, Edit Distance, Longest Common Subsequence, Space Lower Bound
Collection: 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)
Issue Date: 2021
Date of publication: 29.11.2021

