Type Parameters:
T - type of item

public abstract class DataToItemsSketch<T>
extends org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
implements org.apache.pig.Accumulator<org.apache.pig.data.Tuple>, org.apache.pig.Algebraic
| Modifier and Type | Class and Description | 
|---|---|
| static class  | DataToItemsSketch.DataToItemsSketchInitial - Class used to calculate the initial pass of an Algebraic sketch operation. | 
| static class  | DataToItemsSketch.DataToItemsSketchIntermediateFinal<T> - Class used to calculate the intermediate or final combiner pass of an Algebraic sketch operation. | 
| Constructor and Description | 
|---|
| DataToItemsSketch(int k, Comparator<T> comparator, org.apache.datasketches.ArrayOfItemsSerDe<T> serDe) - Base constructor. | 
| Modifier and Type | Method and Description | 
|---|---|
| void | accumulate(org.apache.pig.data.Tuple inputTuple) - An Accumulator version of the standard exec() method. | 
| void | cleanup() - Cleans up the UDF state after being called using the Accumulator interface. | 
| org.apache.pig.data.Tuple | exec(org.apache.pig.data.Tuple inputTuple) - Top-level exec function. | 
| protected T | extractValue(Object object) - Override this if it takes more than a cast to convert from Pig type to type T. | 
| org.apache.pig.data.Tuple | getValue() - Returns the result of the Union that has been built up by multiple calls to accumulate(org.apache.pig.data.Tuple). | 
| org.apache.pig.impl.logicalLayer.schema.Schema | outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input) | 
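The extractValue hook listed in the table above exists for Pig values that need more than a cast to become a T. The sketch below illustrates that idea with hypothetical stand-in classes: ItemExtractor and Utf8StringExtractor are not part of Pig or DataSketches, and the raw byte[] input is only an assumed example of a value that must be decoded. It mirrors the shape of the override and is not the library's implementation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a base class with a conversion hook.
// It is NOT part of Pig or DataSketches; it only mimics the extractValue pattern.
abstract class ItemExtractor<T> {
    // Default behavior: a plain cast, enough when the runtime object already is a T.
    @SuppressWarnings("unchecked")
    protected T extractValue(Object object) {
        return (T) object;
    }
}

// Override for a case where a cast is not enough:
// raw bytes (an assumed input type) are decoded into a String.
class Utf8StringExtractor extends ItemExtractor<String> {
    @Override
    protected String extractValue(Object object) {
        if (object instanceof byte[]) {
            return new String((byte[]) object, StandardCharsets.UTF_8);
        }
        // Fall back to the object's string form for any other input type.
        return String.valueOf(object);
    }
}
```

Keeping the conversion in one overridable method means the rest of the class can treat every input as a T, regardless of how the value arrived.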
Methods inherited from class org.apache.pig.EvalFunc:
allowCompileTimeCalculation, finish, getArgToFuncMapping, getCacheFiles, getInputSchema, getLoadCaster, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, getSchemaType, getShipFiles, isAsynchronous, needEndOfAllInputProcessing, progress, setEndOfAllInput, setInputSchema, setPigLogger, setReporter, setUDFContextSignature, warn

public DataToItemsSketch(int k,
                         Comparator<T> comparator,
                         org.apache.datasketches.ArrayOfItemsSerDe<T> serDe)
Parameters:
k - parameter that determines the accuracy and size of the sketch. A value of 0 selects the default k defined by the sketches-core library.
comparator - for items of type T
serDe - an instance of ArrayOfItemsSerDe for type T

public org.apache.pig.data.Tuple exec(org.apache.pig.data.Tuple inputTuple)
                               throws IOException
If a large number of calls is anticipated, leveraging either the Algebraic or Accumulator interfaces is recommended. Pig normally handles this automatically.
Internally, this method presents the inner Datum Tuples to a new Union, which is returned as a Sketch Tuple.
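As a rough illustration of why the Accumulator route suits many calls, the following sketch contrasts a one-shot exec-style call with repeated accumulate calls followed by getValue and cleanup. SortedItemsAccumulator is a hypothetical stand-in written only for this example: it uses plain Java lists instead of Pig Tuples and keeps every item in memory, whereas a real sketch would hold a bounded summary.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in that mirrors the accumulate / getValue / cleanup call order.
// It is NOT the Pig Accumulator interface and NOT a quantiles sketch.
class SortedItemsAccumulator<T> {
    private final Comparator<T> comparator;
    private List<T> items = new ArrayList<>();

    SortedItemsAccumulator(Comparator<T> comparator) {
        this.comparator = comparator;
    }

    // May be called many times, each time with another batch of items.
    void accumulate(List<T> batch) {
        items.addAll(batch);
    }

    // Returns a sorted copy of everything accumulated so far.
    List<T> getValue() {
        List<T> result = new ArrayList<>(items);
        result.sort(comparator);
        return result;
    }

    // Drops the accumulated state so the instance can be reused.
    void cleanup() {
        items = new ArrayList<>();
    }

    // One-shot variant: builds fresh state for this call only and returns the result,
    // leaving anything previously accumulated on this instance untouched.
    List<T> exec(List<T> all) {
        SortedItemsAccumulator<T> fresh = new SortedItemsAccumulator<>(comparator);
        fresh.accumulate(all);
        return fresh.getValue();
    }
}
```

In this stand-in, the accumulate path reuses one piece of state across batches and produces a result only when getValue is called, while exec rebuilds its state on every call.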
Types below are in the form: Java data type: Pig DataType
Input Tuple
Overrides:
exec in class org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>
Parameters:
inputTuple - A tuple containing a single bag, containing Datum Tuples.
Throws:
IOException - from Pig.

public org.apache.pig.impl.logicalLayer.schema.Schema outputSchema(org.apache.pig.impl.logicalLayer.schema.Schema input)
Overrides:
outputSchema in class org.apache.pig.EvalFunc<org.apache.pig.data.Tuple>

public void accumulate(org.apache.pig.data.Tuple inputTuple)
                throws IOException
Specified by:
accumulate in interface org.apache.pig.Accumulator<org.apache.pig.data.Tuple>
Parameters:
inputTuple - A tuple containing a single bag, containing Datum Tuples.
Throws:
IOException - by Pig
See Also:
exec(org.apache.pig.data.Tuple), "org.apache.pig.Accumulator.accumulate(org.apache.pig.data.Tuple)"

public org.apache.pig.data.Tuple getValue()
Returns the result of the Union that has been built up by multiple calls to accumulate(org.apache.pig.data.Tuple).
Specified by:
getValue in interface org.apache.pig.Accumulator<org.apache.pig.data.Tuple>
Returns:
(see exec(org.apache.pig.data.Tuple) for return tuple format)

public void cleanup()
Cleans up the UDF state after being called using the Accumulator interface.
Specified by:
cleanup in interface org.apache.pig.Accumulator<org.apache.pig.data.Tuple>

Copyright © 2015–2019 The Apache Software Foundation. All rights reserved.