English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
 
 
DownloadE-Mail
  The Space Complexity of Pass-Efficient Algorithms for Clustering

Chang, K. L., & Kannan, R. (2006). The Space Complexity of Pass-Efficient Algorithms for Clustering. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'06 (pp. 1157-1166). New York, USA: ACM / Siam.

Item is

Files

show Files
hide Files
:
census-soda.ps (Any fulltext), 424KB
 
File Permalink:
-
Name:
census-soda.ps
Description:
-
OA-Status:
Visibility:
Private
MIME-Type / Checksum:
application/postscript
Technical Metadata:
Copyright Date:
-
Copyright Info:
Copyright 2006 ACM. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
License:
-

Locators

show

Creators

show
hide
 Creators:
Chang, Kevin L.1, Author           
Kannan, Ravi, Author
Affiliations:
1Algorithms and Complexity, MPI for Informatics, Max Planck Society, ou_24019              

Content

show
hide
Free keywords: -
 Abstract: We present multiple pass streaming algorithms for a basic clustering problem for massive data sets. If our algorithm is allotted 2l passes, it will produce an approximation with error at most ε using Õ(k3/ε2/l) bits of memory, the most critical resource for streaming computation. We demonstrate that this tradeoff between passes and memory allotted is intrinsic to the problem and model of computation by proving lower bounds on the memory requirements of any l pass randomized algorithm that are nearly matched by our upper bounds. To the best of our knowledge, this is the first time nearly matching bounds have been proved for such an exponential tradeoff for randomized computation.In this problem, we are given a set of n points drawn randomly according to a mixture of k uniform distributions and wish to approximate the density function of the mixture. The points are placed in a datastream (possibly in adversarial order), which may only be read sequentially by the algorithm. We argue that this models, among others, the datastream produced by a national census of the incomes of all citizens.

Details

show
hide
Language(s): eng - English
 Dates: 2007-04-162006
 Publication Status: Issued
 Pages: -
 Publishing info: New York, USA : ACM / Siam
 Table of Contents: -
 Rev. Type: -
 Identifiers: eDoc: 314430
Other: Local-ID: C1256428004B93B8-42E27E9B111F2537C1257289003D2F0F-Chang2006
 Degree: -

Event

show
hide
Title: Untitled Event
Place of Event: Miami, USA
Start-/End Date: 2007-02-21

Legal Case

show

Project information

show

Source 1

show
hide
Title: Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA'06
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: New York, USA : ACM / Siam
Pages: - Volume / Issue: - Sequence Number: - Start / End Page: 1157 - 1166 Identifier: ISBN: 0-89871-605-5