MultiHeadAttention
Description
Sets up and adds the multi-head attention layer to the model during the graph definition step. Type : polymorphic.
Input parameters
Graphs in : array, model architecture. Must contain query, value and key (key is optional).
parameters : layer parameters.
num_heads : integer, number of attention heads.
key_dim : integer, size of each attention head for query and key.
value_dim : integer, size of each attention head for value.
use_bias? : boolean, whether the dense layers use bias vectors/matrices.
Default value “True”.
kernel_initializer : enum, initializer for dense layer kernels.
Default value “GlorotUniform”.
bias_initializer : enum, initializer for dense layer biases.
Default value “Zeros”.
optimizer :
algorithm : enum, name of the optimizer used for the optimizer instance.
Default value “adam”.
learning_rate : float, learning rate to use.
Default value “0.001”.
beta_1 : float, exponential decay rate for the 1st moment estimates.
Default value “0.9”.
beta_2 : float, exponential decay rate for the 2nd moment estimates.
Default value “0.999”.
training? : boolean, whether the layer is in training mode (in which case it can store data for the backward pass).
Default value “True”.
store? : boolean, whether the layer stores the last iteration gradient (accessible via the “get_gradients” function).
Default value “False”.
update? : boolean, whether the layer’s variables are updated during the backward pass. Setting it to “False” is equivalent to freezing the layer.
Default value “True”.
lda_coeff : float, coefficient by which the loss derivative is multiplied before being sent to the previous layer during the backward pass.
Default value “1”.
output_behavior : enum, sets whether the layer is an output layer.
Default value “Not Output”.
name (optional) : string, name of the layer.
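The optimizer defaults listed above (algorithm “adam”, learning_rate 0.001, beta_1 0.9, beta_2 0.999) match the usual Adam update rule. The Python sketch below shows one such step for a single scalar weight. It is only an illustration of the textbook formula, not HAIBAL's implementation: the function name adam_step and the epsilon term are assumptions that do not appear in the parameter list.

```python
import math

def adam_step(param, grad, m, v, t,
              learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """One textbook Adam update for a scalar weight; returns (param, m, v).

    epsilon is an assumed stabilizer, not a documented HAIBAL parameter.
    """
    m = beta_1 * m + (1.0 - beta_1) * grad          # 1st moment estimate
    v = beta_2 * v + (1.0 - beta_2) * grad * grad   # 2nd moment estimate
    m_hat = m / (1.0 - beta_1 ** t)                 # bias correction, step t >= 1
    v_hat = v / (1.0 - beta_2 ** t)
    param = param - learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return param, m, v

# First step on a weight of 1.0 with a gradient of 0.5.
weight, m, v = 1.0, 0.0, 0.0
weight, m, v = adam_step(weight, grad=0.5, m=m, v=v, t=1)
print(weight)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the weight moves by roughly learning_rate regardless of the gradient's magnitude.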
Output parameters
Graph out : model architecture.
Dimension
Input shape
List of the following tensors:
- query : Query Tensor of shape [batch_size, Tq, dim].
- value : Value Tensor of shape [batch_size, Tv, dim].
- key : Optional key Tensor of shape [batch_size, Tv, dim]. If not given, value is used for both key and value, which is the most common case.
Output shape
Attention outputs of shape [batch_size, Tq, dim].
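The shape rules above can be summarised as a small check. The Python helper below is a hypothetical illustration (attention_output_shape is not a HAIBAL function): it assumes batch_size and dim must agree across the inputs, that value and key share Tv, and that the output keeps the query's Tq.

```python
def attention_output_shape(query_shape, value_shape, key_shape=None):
    """Return the output shape (batch_size, Tq, dim) or raise ValueError."""
    if key_shape is None:
        key_shape = value_shape  # key defaults to value
    batch, tq, dim = query_shape
    if value_shape[0] != batch or key_shape[0] != batch:
        raise ValueError("batch_size must match across query, value and key")
    if value_shape[1] != key_shape[1]:
        raise ValueError("value and key must share the same Tv")
    if value_shape[2] != dim or key_shape[2] != dim:
        raise ValueError("dim must match across query, value and key")
    return (batch, tq, dim)

print(attention_output_shape((10, 7, 15), (10, 7, 15)))  # (10, 7, 15)
print(attention_output_shape((10, 7, 15), (10, 3, 15)))  # (10, 7, 15)
```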
Example
All these examples are PNG snippets; you can drop them onto the block diagram to add the depicted code to your VI (do not forget to install the HAIBAL library to run it).
MultiHeadAttention layer with two identical input layer shapes
1 – Generate a set of data
We generate two arrays of single-precision data, each of shape [batch_size = 10, Tq = Tv = 7, dim = 15] (same input shape).
2 – Define graph
We first define two input layers named “query_input” and “value_input”. These layers are set up as input arrays shaped [Tq = 7, dim = 15] and [Tv = 7, dim = 15].
Finally, we build an array from the two generated graphs and wire it to the input of MultiHeadAttention.
3 – Summarize graph
Returns the summary of the model as a text file.
4 – Run graph
We call the forward method and retrieve the result with the “Prediction 3D” method.
This method returns two variables. The first is the layer information (a cluster composed of the layer name, the graph index and the shape of the output layer); the second is the prediction, with a shape of [batch_size, Tq, dim].
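For readers who want to see what the forward pass computes, here is a minimal Python sketch of a multi-head attention forward pass on data of the same shape as this example. It is not the HAIBAL implementation: the helper names, the random stand-in weights, the omission of bias terms and the choice of num_heads = 2, key_dim = 4 and value_dim = 4 are all assumptions made for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply a (n x k) by b (k x m); both are lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(s - peak) for s in row]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained weights: small uniform values."""
    limit = math.sqrt(6.0 / (rows + cols))
    return [[rng.uniform(-limit, limit) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(query, value, key=None,
                         num_heads=2, key_dim=4, value_dim=4, seed=0):
    """Forward pass only, no bias; returns a [batch_size, Tq, dim] nested list."""
    key = value if key is None else key
    dim = len(query[0][0])
    rng = random.Random(seed)
    heads = [(random_matrix(dim, key_dim, rng),     # query projection
              random_matrix(dim, key_dim, rng),     # key projection
              random_matrix(dim, value_dim, rng))   # value projection
             for _ in range(num_heads)]
    w_out = random_matrix(num_heads * value_dim, dim, rng)
    scale = 1.0 / math.sqrt(key_dim)
    outputs = []
    for q, k, v in zip(query, key, value):          # one sample at a time
        head_outputs = []
        for w_q, w_k, w_v in heads:
            q_h = matmul(q, w_q)                    # [Tq, key_dim]
            k_h = matmul(k, w_k)                    # [Tv, key_dim]
            v_h = matmul(v, w_v)                    # [Tv, value_dim]
            k_t = [list(col) for col in zip(*k_h)]  # [key_dim, Tv]
            scores = matmul(q_h, k_t)               # [Tq, Tv]
            weights = [softmax([s * scale for s in row]) for row in scores]
            head_outputs.append(matmul(weights, v_h))  # [Tq, value_dim]
        # Concatenate the heads along the feature axis, then project back to dim.
        concat = [[x for head in head_outputs for x in head[t]]
                  for t in range(len(q))]
        outputs.append(matmul(concat, w_out))       # [Tq, dim]
    return outputs

data_rng = random.Random(42)

def random_tensor(batch, steps, dim):
    """Random data of shape [batch, steps, dim]."""
    return [[[data_rng.random() for _ in range(dim)] for _ in range(steps)]
            for _ in range(batch)]

query = random_tensor(10, 7, 15)
value = random_tensor(10, 7, 15)
prediction = multi_head_attention(query, value)
print(len(prediction), len(prediction[0]), len(prediction[0][0]))  # 10 7 15
```

The final projection maps the concatenated heads back to dim, which is why the prediction has the shape [batch_size, Tq, dim] whatever key_dim and value_dim are.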
MultiHeadAttention layer with two different input layer shapes
1 – Generate a set of data
We generate two arrays of single-precision data, the first of shape [batch_size = 10, Tq = 7, dim = 15] and the second of shape [batch_size = 10, Tv = 3, dim = 15] (different input shapes).
Only the first dimension of the input layers (Tq or Tv) can differ, because the layer does not accept a different dim between query, value and key.
2 – Define graph
We first define two input layers named “query_input” and “value_input”. These layers are set up as input arrays shaped [Tq = 7, dim = 15] and [Tv = 3, dim = 15].
Finally, we build an array from the two generated graphs and wire it to the input of MultiHeadAttention.
3 – Summarize graph
Returns the summary of the model as a text file.
4 – Run graph
We call the forward method and retrieve the result with the “Prediction 3D” method.
This method returns two variables. The first is the layer information (a cluster composed of the layer name, the graph index and the shape of the output layer); the second is the prediction, with a shape of [batch_size, Tq, dim].
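The output keeps Tq even though Tv differs because the attention weights form a [Tq, Tv] matrix: each query step receives a weighted average of the Tv value rows. The Python sketch below shows this for one sample with a single head and no learned projections, which is a simplification of the layer and not the HAIBAL code; attention_weights is a hypothetical helper name.

```python
import math
import random

def attention_weights(query, key):
    """Softmax of scaled dot products; returns a [Tq, Tv] list of rows."""
    scale = 1.0 / math.sqrt(len(query[0]))
    rows = []
    for q in query:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in key]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

rng = random.Random(0)
query = [[rng.random() for _ in range(15)] for _ in range(7)]  # [Tq = 7, dim = 15]
value = [[rng.random() for _ in range(15)] for _ in range(3)]  # [Tv = 3, dim = 15]

weights = attention_weights(query, value)  # key defaults to value
# Each output row is a weighted average of the Tv value rows.
output = [[sum(w * v[d] for w, v in zip(row, value)) for d in range(15)]
          for row in weights]
print(len(weights), len(weights[0]))  # 7 3
print(len(output), len(output[0]))    # 7 15
```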