This layer provides options for condensing data into a categorical encoding when the total number of tokens are known in advance. It accepts integer values as inputs, and it outputs a dense or sparse representation of those inputs. For integer inputs where the total number of tokens is not known, use layer_integer_lookup() instead.

layer_category_encoding(
  object,
  num_tokens = NULL,
  output_mode = "multi_hot",
  sparse = FALSE,
  ...
)

Arguments

object

What to call the new Layer instance with. Typically a keras Model, another Layer, or a tf.Tensor/KerasTensor. If object is missing, the Layer instance is returned, otherwise, layer(object) is returned.

num_tokens

The total number of tokens the layer should support. All inputs to the layer must integers in the range 0 <= value < num_tokens, or an error will be thrown.

output_mode

Specification for the output of the layer. Defaults to "multi_hot". Values can be "one_hot", "multi_hot" or "count", configuring the layer as follows:

  • "one_hot": Encodes each individual element in the input into an array of num_tokens size, containing a 1 at the element index. If the last dimension is size 1, will encode on that dimension. If the last dimension is not size 1, will append a new dimension for the encoded output.

  • "multi_hot": Encodes each sample in the input into a single array of num_tokens size, containing a 1 for each vocabulary term present in the sample. Treats the last dimension as the sample dimension, if input shape is (..., sample_length), output shape will be (..., num_tokens).

  • "count": Like "multi_hot", but the int array contains a count of the number of times the token at that index appeared in the sample.

For all output modes, currently only output up to rank 2 is supported.

sparse

Boolean. If TRUE, returns a SparseTensor instead of a dense Tensor. Defaults to FALSE.

...

standard layer arguments.

See also