Use the following table to compare the capabilities of the different sketch families.
All sketches provide methods that return a posteriori error bounds.
| Sketch | Languages | Set Operations | System Integrations | Misc. | ||||||||||||||
| Type | Class Name | Java | C++ | Python⁷ | Union | Intersection | Difference | Jaccard | Hive | Pig | Druid¹ | Spark² | PostgreSQL (C++) | Concurrent | Compact | Off Java Heap | ||
| Major Sketches | ||||||||||||||||||
| Cardinality/CPC | CpcSketch | Y | Y | Y | Y | Y | Y | Y | Y | |||||||||
| Cardinality/HLL | HllSketch | Y | Y | Y | Y | Y | Y | Y | Y | Y | ||||||||
| Cardinality/Theta | Sketch | Y | Y | Y | Y | Y | Y | Y⁴ | Y | Y | Y | Y | Y | Y | Y | Y | ||
| Cardinality/Tuple | Sketch<S> | Y | Y | Y | Y | Y | Y | Y | ||||||||||
| Quantiles/Cormode | DoublesSketch | Y | Y | Y | Y | Y | Y | Y | ||||||||||
| Quantiles/Cormode | ItemsSketch<T> | Y | Y | Y | Y | |||||||||||||
| Quantiles/KLL | KllDoublesSketch | Y | Y | Y⁶ | Y | Y | Y | Y | Y | |||||||||
| Quantiles/KLL | KllFloatsSketch | Y | Y | Y⁶ | Y | Y | Y | Y | Y | Y | ||||||||
| Quantiles/KLL | KLLSketch<T> | Y | Y | |||||||||||||||
| Quantiles/REQ | FloatsSketch | Y | Y | Y⁶ | ||||||||||||||
| Frequencies | LongsSketch | Y | Y | Y | Y | |||||||||||||
| Frequencies | ItemsSketch<T> | Y | Y | Y | Y | Y | Y | Y⁵ | ||||||||||
| Sampling/Reservoir | ReservoirLongsSketch | Y | Y | |||||||||||||||
| Sampling/Reservoir | ReservoirItemsSketch<T> | Y | Y | Y | ||||||||||||||
| Sampling/VarOpt | VarOptItemsSketch<T> | Y | Y | Y | Y | Y | ||||||||||||
| Specialty Sketches | ||||||||||||||||||
| Cardinality/FM85 | UniqueCountMap | Y | ||||||||||||||||
| Cardinality/Tuple | FdtSketch | Y | Y | Y | Y | |||||||||||||
| Cardinality/Tuple | ArrayOfDoublesSketch | Y | Y | Y | Y | Y | Y | Y | Y | Y | ||||||||
| Cardinality/Tuple | DoubleSketch | Y | Y | Y | Y | |||||||||||||
| Cardinality/Tuple | IntegerSketch | Y | Y | Y | Y | |||||||||||||
| Cardinality/Tuple | ArrayOfStringsSketch | Y | Y | Y | Y | |||||||||||||
| Cardinality/Tuple | EngagementTest³ | Y | Y | Y | Y | |||||||||||||
¹ Integrated into Druid.
² Spark example code is available on the website. Theta Sketch is the only sketch we have tried in Spark; this does not mean that other sketches cannot be used.
³ Tuple Sketch: example code in test/…/tuple/aninteger.
⁴ Theta Sketch: C++/Python has no implementation of Jaccard yet.
⁵ Frequent Items Sketch: PostgreSQL is implemented for Strings only.
⁶ KLL & REQ Sketch: Python is implemented for floats and ints only.
⁷ See Python Install Instructions.
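
To make the idea of a posteriori error bounds more concrete, the following is a minimal, self-contained Python sketch of a toy K-Minimum-Values distinct counter. It is not the DataSketches API: the class `ToyKmvSketch` and its methods `update`, `estimate`, and `bounds` are hypothetical names chosen for illustration. The bounds assume a simple normal approximation with relative standard error `1 / sqrt(k - 2)`, which is cruder than what the library's sketches actually compute.

```python
import hashlib
import heapq
import math


class ToyKmvSketch:
    """Toy K-Minimum-Values distinct-count sketch (illustration only)."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # max-heap (negated values) of the k smallest hashes
        self._members = set()  # the same hashes, used for duplicate detection

    @staticmethod
    def _hash01(item):
        # Map an item to a pseudo-uniform float between 0 and 1.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2.0**64

    def update(self, item):
        h = self._hash01(item)
        if h in self._members:
            return  # duplicate of a retained item
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest retained one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def is_estimation_mode(self):
        return len(self._heap) >= self.k

    def estimate(self):
        if not self.is_estimation_mode():
            return float(len(self._heap))  # exact: every distinct hash was kept
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest

    def bounds(self, num_std_dev=2):
        # Assumption: normal approximation, relative std. error 1/sqrt(k - 2).
        est = self.estimate()
        if not self.is_estimation_mode():
            return est, est
        rse = 1.0 / math.sqrt(self.k - 2)
        lower = max(0.0, est * (1 - num_std_dev * rse))
        upper = est * (1 + num_std_dev * rse)
        return lower, upper


sketch = ToyKmvSketch(k=1024)
for i in range(50_000):
    sketch.update(f"user-{i}")
for i in range(10_000):
    sketch.update(f"user-{i}")  # repeats of items already seen

estimate = sketch.estimate()
lower, upper = sketch.bounds(num_std_dev=2)
print(f"estimate={estimate:.0f}, bounds=({lower:.0f}, {upper:.0f})")
```

The bounds are "a posteriori" in the sense that they are computed from the state of the sketch after the data has been seen, rather than fixed in advance. While fewer than `k` distinct items have been retained, the toy sketch is exact and both bounds equal the estimate.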
See Research/References for references in […]