machine-learning deep-learning deeplearning4j dl4j

How can I concatenate mixed type input into multi layer network with deeplearning4j?


I have a dataset where some features are numerical, some are categorical, and some are strings (e.g. a description). To give an example, let's say I have three features:

| Number | Type | Comment                               |
---------------------------------------------------------
| 1.23   | 1    | Some comment, up to 10000 characters  |
| 2.34   | 2    | Different comment many words          |
... 

Can I use all of them as inputs to a multi-layer network in DL4J, where the numerical and categorical features are regular input features, but the string comment feature is first processed as a word sequence by a simple RNN (e.g. Embedding -> LSTM)? In other words, the architecture should look something like this:

"Number"  "Type"  "Comment"
  |         |         |
  |         |      Embedding
  |         |         |
  |         |       LSTM
  |         |         |
 Main Multi-Layer Network
          | 
        Dense
          |
         ...
          |
       Output

I think in Keras this can be achieved with a Concatenate layer. Is there something like this in DL4J?


Solution

  • DL4J has about 99% Keras import coverage, and we have concatenate layers as well. Take a look at the various vertices: whatever you can do in Keras should be doable in DL4J, save for very specific cases. More here: https://deeplearning4j.org/docs/latest/deeplearning4j-nn-computationgraph You want a MergeVertex.
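To make the MergeVertex suggestion concrete, here is a minimal sketch of the architecture from the question as a ComputationGraph. It is not taken from the answer: the class name, the input names ("tabular", "comment"), the layer sizes, the vocabulary size and the loss function are all illustrative assumptions. It also assumes a DL4J 1.0.0-beta or later release (for EmbeddingSequenceLayer), that "Number" and "Type" have already been encoded into one feature vector, and that each comment has been tokenized into integer word indices outside the network.

```java
import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.graph.MergeVertex;
import org.deeplearning4j.nn.conf.graph.rnn.LastTimeStepVertex;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.EmbeddingSequenceLayer;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Hypothetical class name; all sizes below are placeholders, not recommendations.
public class MixedInputGraphSketch {

    private static final int EMBEDDING_SIZE = 64; // assumed word-vector width
    private static final int LSTM_SIZE = 32;      // assumed LSTM hidden size
    private static final int DENSE_SIZE = 16;     // assumed dense layer width

    public static ComputationGraphConfiguration buildConfig(int nTabular, int vocabSize, int nClasses) {
        return new NeuralNetConfiguration.Builder()
                .updater(new Adam(1e-3))
                .graphBuilder()
                // Two named inputs: the numeric/categorical vector and the word-index sequence.
                .addInputs("tabular", "comment")
                // Word indices, shape [batch, seqLength] -> sequence of embedding vectors.
                .addLayer("embedding",
                        new EmbeddingSequenceLayer.Builder().nIn(vocabSize).nOut(EMBEDDING_SIZE).build(),
                        "comment")
                .addLayer("lstm",
                        new LSTM.Builder().nIn(EMBEDDING_SIZE).nOut(LSTM_SIZE)
                                .activation(Activation.TANH).build(),
                        "embedding")
                // Reduce the LSTM sequence output to one vector per example (last time step;
                // the argument names the input whose mask array, if any, should be used).
                .addVertex("lastStep", new LastTimeStepVertex("comment"), "lstm")
                // Concatenate along the feature dimension: [batch, nTabular + LSTM_SIZE].
                .addVertex("merge", new MergeVertex(), "tabular", "lastStep")
                .addLayer("dense",
                        new DenseLayer.Builder().nIn(nTabular + LSTM_SIZE).nOut(DENSE_SIZE)
                                .activation(Activation.RELU).build(),
                        "merge")
                // Assumes a classification target; swap the loss/activation for regression.
                .addLayer("out",
                        new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                                .nIn(DENSE_SIZE).nOut(nClasses)
                                .activation(Activation.SOFTMAX).build(),
                        "dense")
                .setOutputs("out")
                .build();
    }

    public static void main(String[] args) {
        // 2 tabular features, a 10000-word vocabulary and 3 classes are made-up example values.
        ComputationGraph net = new ComputationGraph(buildConfig(2, 10000, 3));
        net.init();
        System.out.println(net.summary());
    }
}
```

When training such a graph, the features are passed as two arrays in the same order as addInputs (for example through a MultiDataSet), one of shape [batch, nTabular] and one holding the word indices. The LastTimeStepVertex is one way to turn the LSTM output into a fixed-size vector before merging; a pooling layer over time would be another option.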