ColumnFeatureFunction (Lens 2.1.2-inm API)

org.apache.lens.ml.spark
Class ColumnFeatureFunction

java.lang.Object
  org.apache.lens.ml.spark.FeatureFunction
      org.apache.lens.ml.spark.ColumnFeatureFunction

All Implemented Interfaces: Serializable, org.apache.spark.api.java.function.Function<scala.Tuple2<org.apache.hadoop.io.WritableComparable,org.apache.hive.hcatalog.data.HCatRecord>,org.apache.spark.mllib.regression.LabeledPoint>

public class ColumnFeatureFunction
extends FeatureFunction

A feature function that maps an HCatRecord directly to a feature vector. Each column becomes one feature in the vector, and the value of that feature is obtained by applying the value mapper for that column.

See Also: Serialized Form

Field Summary
`static org.apache.log4j.Logger`	`LOG` The Constant LOG.

Constructor Summary
`ColumnFeatureFunction(int[] featurePositions, FeatureValueMapper[] valueMappers, int labelColumnPos, int numFeatures, double defaultLabel)` Feature positions and value mappers are parallel arrays.

Method Summary
`org.apache.spark.mllib.regression.LabeledPoint`	`call(scala.Tuple2<org.apache.hadoop.io.WritableComparable,org.apache.hive.hcatalog.data.HCatRecord> tuple)`

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

LOG

public static final org.apache.log4j.Logger LOG

The Constant LOG.

Constructor Detail

ColumnFeatureFunction

public ColumnFeatureFunction(int[] featurePositions,
                             FeatureValueMapper[] valueMappers,
                             int labelColumnPos,
                             int numFeatures,
                             double defaultLabel)

Feature positions and value mappers are parallel arrays. featurePositions[i] gives the position of the i-th feature in the HCatRecord, and valueMappers[i] gives the value mapper used to map that feature to a Double value.

Parameters:
  featurePositions - position of each feature column in the HCatRecord
  valueMappers - value mapper for each feature column, in the same order as featurePositions
  labelColumnPos - position of the label column
  numFeatures - number of features in the feature vector
  defaultLabel - default label to be used for null records
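
The parallel-array contract described above can be illustrated with a small standalone sketch. It does not use the Lens, Spark, or HCatalog classes: the record is modelled as a plain `List<Object>`, and `ValueMapper` and `toFeatures` are hypothetical names introduced only for this illustration, not part of the Lens API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class ParallelArraySketch {

    /** Hypothetical stand-in for FeatureValueMapper: maps one column value to a Double. */
    interface ValueMapper extends Function<Object, Double> {}

    /**
     * Builds a feature array from a record using parallel arrays:
     * featurePositions[i] selects the column, valueMappers[i] converts it.
     */
    static double[] toFeatures(List<Object> record,
                               int[] featurePositions,
                               ValueMapper[] valueMappers) {
        if (featurePositions.length != valueMappers.length) {
            throw new IllegalArgumentException("featurePositions and valueMappers must have equal length");
        }
        double[] features = new double[featurePositions.length];
        for (int i = 0; i < featurePositions.length; i++) {
            Object raw = record.get(featurePositions[i]);   // column picked by position
            features[i] = valueMappers[i].apply(raw);       // converted by the mapper at the same index
        }
        return features;
    }

    public static void main(String[] args) {
        // Assumed example record: columns 1 and 2 are used as features.
        List<Object> record = Arrays.asList("user-7", 42, "3.5", 1.0);
        int[] featurePositions = {1, 2};
        ValueMapper asNumber = v -> ((Number) v).doubleValue();
        ValueMapper parseText = v -> Double.parseDouble(v.toString());

        double[] features = toFeatures(record, featurePositions,
                new ValueMapper[] {asNumber, parseText});
        System.out.println(Arrays.toString(features)); // prints [42.0, 3.5]
    }
}
```

Keeping the two arrays index-aligned means each column can use a different conversion without a lookup table; the length check guards against the arrays drifting out of step.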

Method Detail

call

public org.apache.spark.mllib.regression.LabeledPoint call(scala.Tuple2<org.apache.hadoop.io.WritableComparable,org.apache.hive.hcatalog.data.HCatRecord> tuple)
                                                    throws Exception

Specified by: call in interface org.apache.spark.api.java.function.Function<scala.Tuple2<org.apache.hadoop.io.WritableComparable,org.apache.hive.hcatalog.data.HCatRecord>,org.apache.spark.mllib.regression.LabeledPoint>
Specified by: call in class FeatureFunction

Throws: Exception
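
The constructor documents defaultLabel as the label used for null records. The following is a minimal sketch of that fallback, written without the Spark or HCatalog classes. `Labeled` and `labelOrDefault` are hypothetical names; the all-zero feature vector returned for a null record and the use of the non-label columns as features are assumptions made for the example, not documented behaviour of `call`.

```java
import java.util.Arrays;
import java.util.List;

class NullRecordSketch {

    /** Hypothetical stand-in for LabeledPoint: a label plus a dense feature array. */
    static final class Labeled {
        final double label;
        final double[] features;

        Labeled(double label, double[] features) {
            this.label = label;
            this.features = features;
        }
    }

    /** Returns a labeled point for the record, or a default-labeled one when the record is null. */
    static Labeled labelOrDefault(List<Double> record,
                                  int labelColumnPos,
                                  int numFeatures,
                                  double defaultLabel) {
        if (record == null) {
            // Assumption: a null record yields the default label with an all-zero feature vector.
            return new Labeled(defaultLabel, new double[numFeatures]);
        }
        // Assumption for this sketch: every non-label column is a feature, in column order.
        double[] features = new double[numFeatures];
        int next = 0;
        for (int pos = 0; pos < record.size() && next < numFeatures; pos++) {
            if (pos != labelColumnPos) {
                features[next++] = record.get(pos);
            }
        }
        return new Labeled(record.get(labelColumnPos), features);
    }

    public static void main(String[] args) {
        Labeled fallback = labelOrDefault(null, 0, 2, -1.0);
        System.out.println(fallback.label); // prints -1.0

        Labeled point = labelOrDefault(Arrays.asList(1.0, 0.5, 2.5), 0, 2, -1.0);
        System.out.println(point.label + " " + Arrays.toString(point.features));
    }
}
```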