myHHH 0.1

As part of his diploma thesis, Peter Fricke is currently implementing a tool called myHHH. It will compute multi-dimensional Hierarchical Heavy Hitters from data streams in the domain of system call analysis.

Hierarchical Heavy Hitters

Hierarchical Heavy Hitters summarize a data stream in condensed form. The basic underlying assumption is that the data stream can be described by frequent sets of stream elements. Algorithms that compute Hierarchical Heavy Hitters take the hierarchical structure of the stream elements into account, in contrast to flat methods such as Lossy Counting.
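For comparison, here is a minimal sketch of a flat method, Lossy Counting as commonly described in the literature. It is not taken from the myHHH sources; the class name `LossyCounter` and its methods are assumptions made for illustration. Each item is counted independently, so any hierarchy among the items is ignored.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative Lossy Counting sketch; names are hypothetical, not part of myHHH. */
class LossyCounter {
    private final double epsilon;   // maximum relative counting error
    private final int bucketWidth;  // w = ceil(1 / epsilon)
    private long n = 0;             // number of items seen so far
    // item -> {observed count, maximum possible undercount (delta)}
    private final Map<String, long[]> entries = new HashMap<>();

    LossyCounter(double epsilon) {
        this.epsilon = epsilon;
        this.bucketWidth = (int) Math.ceil(1.0 / epsilon);
    }

    void add(String item) {
        n++;
        long bucket = (n + bucketWidth - 1) / bucketWidth; // current bucket id, ceil(n / w)
        long[] entry = entries.get(item);
        if (entry != null) {
            entry[0]++;
        } else {
            entries.put(item, new long[] {1, bucket - 1});
        }
        // At each bucket boundary, drop entries that cannot be frequent.
        if (n % bucketWidth == 0) {
            entries.values().removeIf(v -> v[0] + v[1] <= bucket);
        }
    }

    /** Items whose observed count is at least (support - epsilon) * n. */
    Map<String, Long> frequent(double support) {
        Map<String, Long> result = new HashMap<>();
        double threshold = (support - epsilon) * n;
        for (Map.Entry<String, long[]> e : entries.entrySet()) {
            if (e.getValue()[0] >= threshold) {
                result.put(e.getKey(), e.getValue()[0]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        LossyCounter counter = new LossyCounter(0.25);
        for (String item : "a b a c a d a e".split(" ")) {
            counter.add(item);
        }
        System.out.println(counter.frequent(0.5)); // prints {a=4}
    }
}
```

In this run only `a` survives the pruning steps. Infrequent items are discarded at each bucket boundary, which keeps memory bounded at the cost of undercounting by at most epsilon times the stream length.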

A simplified example: IP address spaces are structured hierarchically. Each address identifies a network node, while prefixes of an address identify a subnet. In a stream of IP addresses one can therefore find not only single frequent addresses but also whole subnets (prefixes) that account for a large share of the stream. The Hierarchical Heavy Hitters are such frequent prefixes.
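The IP example can be sketched in a few lines of Java. This is an exact, offline, one-dimensional illustration and not the streaming algorithm implemented in myHHH. The class `PrefixHeavyHitters`, the octet-wise prefix levels, and the threshold parameter `phi` are assumptions made for this sketch. Counts are aggregated from full addresses towards shorter prefixes, and the count of a prefix that has already been reported is not passed on to its parents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative exact computation of heavy prefixes; names are hypothetical, not part of myHHH. */
class PrefixHeavyHitters {

    /** Returns the prefixes whose discounted count reaches phi * stream length. */
    static List<String> find(List<String> addresses, double phi) {
        double threshold = phi * addresses.size();
        List<String> result = new ArrayList<>();

        // Most specific level: count full addresses (4 octets).
        Map<String, Integer> counts = new TreeMap<>();
        for (String address : addresses) {
            counts.merge(address, 1, Integer::sum);
        }

        // Walk from 4-octet addresses up to 1-octet prefixes.
        for (int octets = 4; octets >= 1; octets--) {
            Map<String, Integer> parents = new TreeMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                if (e.getValue() >= threshold) {
                    // Heavy at this level: report it and do not pass its count upwards.
                    result.add(octets == 4 ? e.getKey() : e.getKey() + ".*");
                } else if (octets > 1) {
                    // Not heavy: its count contributes to the next shorter prefix.
                    String key = e.getKey();
                    String parent = key.substring(0, key.lastIndexOf('.'));
                    parents.merge(parent, e.getValue(), Integer::sum);
                }
            }
            counts = parents;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> stream = List.of(
            "10.0.0.1", "10.0.0.1", "10.0.0.1", "10.0.0.1",
            "10.0.1.1", "10.0.1.2", "10.0.1.3",
            "10.1.2.3", "10.2.3.4",
            "192.168.0.1", "192.168.0.2", "172.16.0.1");
        // 12 addresses and phi = 0.25 give a threshold of 3 occurrences.
        System.out.println(find(stream, 0.25)); // prints [10.0.0.1, 10.0.1.*]
    }
}
```

In this sample, no single address in the subnet 10.0.1.* is frequent, yet the subnet as a whole reaches the threshold. A flat method that counts addresses independently would report only 10.0.0.1.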

The tool myHHH is based on the algorithm for the multi-dimensional case introduced by Cormode et al. (2008). The tool also includes the exact algorithm (overlap case) introduced by Cormode et al. (2004) and similarity measures for sets of objects containing hierarchical information (Ganesan et al. 2003).

References

  • Cormode, G.; Korn, F.; Muthukrishnan, S.; Srivastava, D.: Finding Hierarchical Heavy Hitters in Streaming Data. In: ACM Trans. Knowl. Discov. Data 1 (2008), no. 4, pp. 1-48.
  • Cormode, G.; Korn, F.; Muthukrishnan, S.; Srivastava, D.: Diamond in the Rough: Finding Hierarchical Heavy Hitters in Multi-dimensional Data. In: Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, 2004, pp. 155-166.
  • Ganesan, P.; Garcia-Molina, H.; Widom, J.: Exploiting Hierarchical Domain Structure to Compute Similarity. In: ACM Transactions on Information Systems 21 (2003), no. 1, pp. 64-93.

Download and Installation

Warning: myHHH 0.1, which you can download below, is only a pre-release consisting mainly of Java sources. The interfaces may change as development continues. Further instructions (currently available only in German) can be found in the README file of the package. The most important source file, which contains an application example, is Beispiel.java.