Universitätsverlag Potsdam

          Details of the selected publication

Abedjan, Ziawasch:
Advancing the discovery of unique column combinations
/ Ziawasch Abedjan ; Felix Naumann. - Potsdam : Universitätsverlag Potsdam, 2011. - 25 pp. : illustrations
(Technische Berichte des Hasso-Plattner-Instituts für Softwaresystemtechnik an der Universität Potsdam ; 51)
ISSN (print) 1613-5652
ISSN (online) 2191-1665
ISBN 978-3-86956-148-6
Price: free

On the university's publication server at:


Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem with many applications in data management and knowledge discovery. Existing discovery algorithms either rely on brute force or have a high memory load, and can thus be applied only to small datasets or samples. In this paper, the well-known GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution, HCAGORDIAN, combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations.
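
To make the notion of a unique column combination concrete, here is a minimal Python sketch of a naive level-wise (bottom-up) search for minimal uniques. It is not HCA, GORDIAN, or HCAGORDIAN, and it uses none of the candidate generation or statistics-based pruning described in the report; the function names `is_unique` and `minimal_uniques` and the sample table are assumptions made purely for illustration.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if no two rows agree on every column in `cols`."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def minimal_uniques(rows, num_cols):
    """Naive level-wise search for minimal unique column combinations.

    Illustrative only: candidates are enumerated by size, and any
    superset of an already found unique is skipped, because it would
    be unique but not minimal.
    """
    found = []
    for size in range(1, num_cols + 1):
        for cols in combinations(range(num_cols), size):
            if any(set(u) <= set(cols) for u in found):
                continue
            if is_unique(rows, cols):
                found.append(cols)
    return found


# Hypothetical sample table: (first name, last name, city).
rows = [
    ("Ann", "Lee", "Berlin"),
    ("Ann", "Kim", "Potsdam"),
    ("Bob", "Lee", "Potsdam"),
    ("Bob", "Kim", "Berlin"),
]

result = minimal_uniques(rows, 3)
print(result)  # [(0, 1), (0, 2), (1, 2)]
```

In this sample no single column is unique, but every pair of columns is. The search visits every candidate combination at each level, so its cost grows exponentially with the number of columns, which is the kind of limitation the abstract attributes to brute-force approaches.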

