Class | Description |
---|---|
ArrayOfTuplesSerDe | This ArrayOfItemsSerDe implementation takes advantage of the Pig methods used in Pig's own BinStorage to serialize arbitrary Tuple data. |
DataToVarOptSketch | Creates a binary version of a VarOpt sampling over input tuples. |
GetVarOptSamples | This UDF extracts samples from the binary image of a VarOpt<Tuple> sketch. |
ReservoirSampling | This is a Pig UDF that applies reservoir sampling to input tuples. |
ReservoirSampling.Initial | |
ReservoirSampling.IntermediateFinal | |
ReservoirUnion | This is a Pig UDF that unions reservoir samples. |
VarOptSampling | Applies VarOpt sampling to input tuples. |
VarOptSampling.Final | |
VarOptUnion | Accepts binary VarOpt sketch images and unions them into a single binary output sketch. |
This package is dedicated to streaming algorithms that enable fixed-size, uniform sampling of unweighted items from a stream.
These sketches are mergeable but do not serialize to a compact form.
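As a rough illustration of what fixed-size, uniform sampling from a stream means, the following is a minimal sketch of classic reservoir sampling (often called Algorithm R). It is not the code used by the UDFs in this package: the class name `ReservoirSketch`, its methods, and the seed parameter are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; NOT part of the DataSketches or Pig APIs.
class ReservoirSketch<T> {
    private final int k;            // maximum number of samples retained
    private final List<T> samples;  // current reservoir contents
    private final Random random;    // seeded so that runs are repeatable
    private long itemsSeen = 0;     // total number of items offered so far

    ReservoirSketch(int k, long seed) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        this.k = k;
        this.samples = new ArrayList<>(k);
        this.random = new Random(seed);
    }

    // Offers one stream item to the reservoir.
    void update(T item) {
        itemsSeen++;
        if (samples.size() < k) {
            // Fill phase: the first k items are always kept.
            samples.add(item);
        } else {
            // Draw a slot index j uniformly from [0, itemsSeen).
            // The new item replaces slot j only when j < k, which
            // happens with probability k / itemsSeen.
            long j = (long) (random.nextDouble() * itemsSeen);
            if (j < k) {
                samples.set((int) j, item);
            }
        }
    }

    // Returns a read-only copy of the current samples.
    List<T> getSamples() {
        return Collections.unmodifiableList(new ArrayList<>(samples));
    }

    long getItemsSeen() {
        return itemsSeen;
    }
}
```

After any number of updates the reservoir holds at most `k` items, and each item seen so far has the same chance of being among them. Tracking `itemsSeen` alongside the samples is what lets two reservoirs be combined with the correct weighting, which is the role a union operation plays.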
Copyright © 2015–2019 The Apache Software Foundation. All rights reserved.