U - Update type
S - Summary type
public abstract class DataToSketch<U,S extends org.apache.datasketches.tuple.UpdatableSummary<U>>
extends org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
implements org.apache.pig.Accumulator<org.apache.pig.data.Tuple>
Constructor and Description |
---|
DataToSketch(int sketchSize, float samplingProbability, org.apache.datasketches.tuple.SummaryFactory<S> summaryFactory) - Constructs a function given a sketch size, a sampling probability and a summary factory. |
DataToSketch(int sketchSize, org.apache.datasketches.tuple.SummaryFactory<S> summaryFactory) - Constructs a function given a sketch size, a summary factory and the default sampling probability of 1. |
DataToSketch(org.apache.datasketches.tuple.SummaryFactory<S> summaryFactory) - Constructs a function given a summary factory, the default sketch size and the default sampling probability of 1. |
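The way the three constructors fill in their defaults can be sketched with a small stand-in class. `SketchConfig` and its fields are hypothetical names used only for illustration, and the default sketch size of 4096 is an assumption, since this page says only "default sketch size". The default sampling probability of 1 comes from the summary above.

```java
// Hypothetical stand-in for the three constructor overloads listed above.
// This is not the library's code; it only illustrates how defaults are filled in.
final class SketchConfig {
    // Assumed value: the page says only "default sketch size".
    static final int DEFAULT_SKETCH_SIZE = 4096;
    // Stated on the page: the default sampling probability is 1.
    static final float DEFAULT_SAMPLING_PROBABILITY = 1f;

    final int sketchSize;
    final float samplingProbability;

    // Mirrors DataToSketch(summaryFactory): both defaults apply.
    SketchConfig() {
        this(DEFAULT_SKETCH_SIZE);
    }

    // Mirrors DataToSketch(sketchSize, summaryFactory): default probability applies.
    SketchConfig(int sketchSize) {
        this(sketchSize, DEFAULT_SAMPLING_PROBABILITY);
    }

    // Mirrors DataToSketch(sketchSize, samplingProbability, summaryFactory).
    SketchConfig(int sketchSize, float samplingProbability) {
        // Written as a negated range check so that NaN is rejected as well.
        if (!(samplingProbability >= 0f && samplingProbability <= 1f)) {
            throw new IllegalArgumentException(
                "samplingProbability must be between 0 and 1 inclusive");
        }
        this.sketchSize = sketchSize;
        this.samplingProbability = samplingProbability;
    }
}
```

Chaining each shorter constructor to the longer one keeps the validation and the defaults in a single place. Whether the real class is organized this way is not stated on this page.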
Modifier and Type | Method and Description |
---|---|
void | accumulate(org.apache.pig.data.Tuple inputTuple) |
void | cleanup() |
org.apache.pig.data.Tuple | exec(org.apache.pig.data.Tuple inputTuple) |
org.apache.pig.data.Tuple | getValue() |
Methods inherited from class org.apache.pig.EvalFunc: allowCompileTimeCalculation, finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLoadCaster, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, getShipFiles, isAsynchronous, needEndOfAllInputProcessing, outputSchema, progress, setEndOfAllInput, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn
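The constructor details on this page say that `sketchSize` is forced to a power of 2. A minimal sketch of that kind of rounding follows. `SketchSizes.ceilingPowerOf2` is a hypothetical helper, not a call into the library, and it assumes that a value which is already a power of 2 is left unchanged, which this page does not state.

```java
// Hypothetical helper illustrating power-of-2 rounding of a requested sketch size.
final class SketchSizes {
    // Returns the smallest power of 2 that is >= n, for 1 <= n <= 2^30.
    // Assumption: an exact power of 2 is returned as-is.
    static int ceilingPowerOf2(int n) {
        if (n < 1 || n > (1 << 30)) {
            throw new IllegalArgumentException("n must be between 1 and 2^30");
        }
        int highest = Integer.highestOneBit(n); // largest power of 2 that is <= n
        return (highest == n) ? n : highest << 1;
    }
}
```

Under these assumptions a requested size of 1000 would become 1024, so the sketch may hold somewhat more entries than the number asked for.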
public DataToSketch(org.apache.datasketches.tuple.SummaryFactory<S> summaryFactory)
Parameters:
summaryFactory - an instance of SummaryFactory
public DataToSketch(int sketchSize, org.apache.datasketches.tuple.SummaryFactory<S> summaryFactory)
Parameters:
sketchSize - parameter controlling the size and accuracy of the sketch. It represents the nominal number of entries in the sketch and is forced to the nearest power of 2 greater than the given value.
summaryFactory - an instance of SummaryFactory
public DataToSketch(int sketchSize, float samplingProbability, org.apache.datasketches.tuple.SummaryFactory<S> summaryFactory)
Parameters:
sketchSize - parameter controlling the size and accuracy of the sketch. It represents the nominal number of entries in the sketch and is forced to the nearest power of 2 greater than the given value.
samplingProbability - sampling probability, from 0 to 1 inclusive
summaryFactory - an instance of SummaryFactory
public void accumulate(org.apache.pig.data.Tuple inputTuple) throws IOException
Specified by: accumulate in interface org.apache.pig.Accumulator<org.apache.pig.data.Tuple>
Throws: IOException
public void cleanup()
Specified by: cleanup in interface org.apache.pig.Accumulator<org.apache.pig.data.Tuple>
public org.apache.pig.data.Tuple getValue()
Specified by: getValue in interface org.apache.pig.Accumulator<org.apache.pig.data.Tuple>
public org.apache.pig.data.Tuple exec(org.apache.pig.data.Tuple inputTuple) throws IOException
Specified by: exec in class org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
Throws: IOException
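The relationship between the four methods above can be illustrated without Pig or DataSketches on the classpath. In the sketch below, `DistinctAccumulator` is a hypothetical stand-in in which a `HashSet` plays the role of the sketch and lists of strings play the role of Pig tuples. It assumes the usual Accumulator call pattern: `accumulate` is called once per chunk of input, `getValue` is called after the last chunk, and `cleanup` resets the state before the next group.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: a HashSet replaces the sketch so the example is self-contained.
final class DistinctAccumulator {
    private Set<String> seen = new HashSet<>();

    // Mirrors accumulate(): called repeatedly, each time with one chunk of the input.
    void accumulate(List<String> chunk) {
        seen.addAll(chunk);
    }

    // Mirrors getValue(): reports the result for everything accumulated so far.
    int getValue() {
        return seen.size();
    }

    // Mirrors cleanup(): drops the state so the instance can serve the next group.
    void cleanup() {
        seen = new HashSet<>();
    }

    // Mirrors exec(): the whole input arrives in a single call.
    static int exec(List<String> all) {
        return new HashSet<>(all).size();
    }
}
```

Feeding the same input in chunks through `accumulate` or all at once through `exec` gives the same result here. The chunked path simply avoids holding the whole input in memory at once.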
Copyright © 2015–2019 The Apache Software Foundation. All rights reserved.