Embedding layer #3999
base: master
Conversation
Thanks for adapting this layer @kumarutkarsh1248! I have a bunch of comments---let me know what you think. As implemented, does this work in the mlpack-onnx networks you have put it into?
 * @file methods/ann/layer/lookup.hpp
 * @author Marcus Edel
 *
 * Definition of the Lookup (embedding) layer class.
Do you think we can just rename the layer Embedding, because that's the more commonly used name?
@@ -0,0 +1,151 @@
/**
 * @file methods/ann/layer/lookup.hpp
If you adapted this out of not_adapted/, can you remove those files? Or, alternately, use git mv to move the old files to this directory and update them (that will preserve the git history for the files).
@@ -0,0 +1,151 @@
/**
 * @file methods/ann/layer/lookup.hpp
 * @author Marcus Edel
Also feel free to add your name here :)
    const MatType& gy,
    MatType& g)
{
  Log::Fatal << "Lookup cannot be used as an intermediate layer." << std::endl;
Suggested change:
- Log::Fatal << "Lookup cannot be used as an intermediate layer." << std::endl;
+ Log::Fatal << "Lookup cannot be used as an intermediate layer;"
+     << " it must be the first layer in a network!" << std::endl;
Just a little clarification; this makes the error message a bit more actionable for the user (who may not immediately realize the meaning of the term "intermediate layer").
  MakeAlias(const_cast<CubeType&>(errorTemp), error, seqLength, embeddingSize,
      batchSize, 0, false);

  gradient.set_size(arma::size(weights));
Suggested change (remove this line):
- gradient.set_size(arma::size(weights));
The gradient should already be set to the right size.
  module.Parameters().randu();

  // Test the Forward function.
  input = arma::zeros(seqLength, batchSize);
Shouldn't we test input with different values? You could use arma::randi<>. It would also be good to add a test ensuring that an exception is thrown if an input has a value greater than the vocabulary size.
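For illustration, here is a rough sketch of what both checks could look like, reusing the test's module, seqLength, batchSize, embeddingSize, and vocabSize variables; the output shape and the std::logic_error exception type are assumptions, not something this PR confirms:

  // Sketch only: exercise Forward() with varied in-range indices
  // (using the zero-indexed convention suggested further down).
  arma::mat input = arma::conv_to<arma::mat>::from(
      arma::randi<arma::umat>(seqLength, batchSize,
          arma::distr_param(0, (int) vocabSize - 1)));
  arma::mat output(seqLength * embeddingSize, batchSize);
  module.Forward(input, output);

  // An index at or above vocabSize should be rejected rather than silently
  // reading past the end of the embedding matrix (exception type assumed).
  input(0, 0) = vocabSize;
  REQUIRE_THROWS_AS(module.Forward(input, output), std::logic_error);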
  // } function;

  // REQUIRE(CheckGradient(function) <= 1e-6);
  // }
We should definitely uncomment this function before merge; does it not currently work?
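In case it helps, these gradient tests usually have roughly the following shape. This is only a sketch: the FFN/loss choices and the ResetData/EvaluateWithGradient calls mirror the pattern used by other layer tests, and the input/target setup for this particular layer is an assumption.

  // Sketch of a typical gradient-check test; not the PR's actual code.
  struct GradientFunction
  {
    GradientFunction() :
        input(arma::zeros(seqLength, batchSize)),
        target(arma::ones(1, batchSize))
    {
      model = new FFN<NegativeLogLikelihood, RandomInitialization>();
      model->ResetData(input, target);
      model->Add<Lookup>(vocabSize, embeddingSize);
      model->Add<LogSoftMax>();
    }

    ~GradientFunction() { delete model; }

    double Gradient(arma::mat& gradient) const
    {
      return model->EvaluateWithGradient(model->Parameters(), 0, gradient,
          batchSize);
    }

    arma::mat& Parameters() { return model->Parameters(); }

    FFN<NegativeLogLikelihood, RandomInitialization>* model;
    arma::mat input, target;
  } function;

  REQUIRE(CheckGradient(function) <= 1e-6);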
  // input.set_size(seqLength, batchSize);
  // for (size_t i = 0; i < input.n_elem; ++i)
  // {
  //   input(i) = math::RandInt(1, vocabSize);
So it looks here like the semantic is that the inputs should take values in [1, vocabSize]. But everything else in mlpack (and in C++) is zero-indexed---so I would expect that the index should take values in [0, vocabSize). Can you make that change? That would also remove the need for the - 1 that you are doing in the code, which could cause underflows if the user passes a 0 (which is actually what the test above is doing).
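Concretely, the test-side part of that change might look like this (a sketch only, keeping the surrounding variable names):

  // Fill the test input with zero-based indices in [0, vocabSize).
  input.set_size(seqLength, batchSize);
  for (size_t i = 0; i < input.n_elem; ++i)
    input(i) = math::RandInt(0, vocabSize);  // Upper bound is exclusive.

  // The layer can then use input(i) to index the embedding matrix directly,
  // with no "- 1" adjustment and no underflow when input(i) == 0.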
@@ -44,6 +44,7 @@
 #include <mlpack/methods/ann/layer/linear_recurrent.hpp>
 #include <mlpack/methods/ann/layer/linear3d.hpp>
 #include <mlpack/methods/ann/layer/log_softmax.hpp>
+#include <mlpack/methods/ann/layer/lookup.hpp>
Don't forget to add the layer type to the list of serializable layers in serialization.hpp too. 👍
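If it helps, the addition would presumably be a single registration line inside serialization.hpp. This is a hypothetical sketch assuming the layer keeps a regularizer template parameter and follows the same registration pattern as the other layers; adjust the class-template name to whatever the PR ends up using (LookupType, Embedding, etc.):

  // Hypothetical line in src/mlpack/methods/ann/layer/serialization.hpp:
  CEREAL_REGISTER_TYPE(mlpack::LookupType<__VA_ARGS__, mlpack::NoRegularizer>); \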
   * regularizer).
   */
  Lookup(const size_t vocabSize = 0, const size_t embeddingSize = 0,
         RegularizerType regularizer = RegularizerType());
Is it ever valid to have a vocab size of 0 or an embedding size of 0? I think it might be better to remove those default parameters, and also remove the default constructor, so that the user must specify both the vocab size and the embedding dimension.
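In other words, something along these lines (a sketch only; whether a default constructor still needs to be kept around for serialization is worth double-checking):

  // Require the user to specify both sizes; no zero-size defaults.
  Lookup(const size_t vocabSize,
         const size_t embeddingSize,
         RegularizerType regularizer = RegularizerType());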