-
Separating the model into "trunk" and "embedder" components can be helpful conceptually (see discussion #270). Other than that, there's nothing special about the two terms. The convention in this library is that the trunk computes features, and the embedder maps those features into the embedding space.
Usually the last layer of these pretrained models is for classification, so before training you should replace the last layer with an identity operation (e.g. `torch.nn.Identity()`) so that the model outputs features instead of class logits.
Good question. Yes, you can provide a custom tester. If you're thinking of taking the 5 crops and computing the average embedding, then you might want to write a custom tester that overrides `get_embeddings_for_eval`. For example, something like:

```python
from pytorch_metric_learning.testers import GlobalEmbeddingSpaceTester
from pytorch_metric_learning.utils import common_functions as c_f


class CustomTester(GlobalEmbeddingSpaceTester):
    def get_embeddings_for_eval(self, trunk_model, embedder_model, input_imgs):
        input_imgs = c_f.to_device(
            input_imgs, device=self.data_device, dtype=self.dtype
        )
        # From https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.FiveCrop
        bs, ncrops, c, h, w = input_imgs.size()
        # Fuse the batch and crop dimensions before running the models.
        result = embedder_model(trunk_model(input_imgs.view(-1, c, h, w)))
        # Average the embeddings over the crops.
        result_avg = result.view(bs, ncrops, -1).mean(1)
        return result_avg
```
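The reshape-and-average step can be checked in isolation with dummy tensors (shapes and the `fake_embed` stand-in for the trunk/embedder pipeline are illustrative):

```python
import torch

bs, ncrops, c, h, w = 4, 5, 3, 32, 32
input_imgs = torch.randn(bs, ncrops, c, h, w)


def fake_embed(x):
    # Stand-in for embedder(trunk(x)): maps (N, c, h, w) -> (N, 8).
    return x.flatten(1)[:, :8]


# Fuse batch and crop dims: (bs * ncrops, c, h, w) -> (20, 8) embeddings.
result = fake_embed(input_imgs.view(-1, c, h, w))
# Un-fuse and average over the 5 crops: (4, 5, 8) -> (4, 8).
result_avg = result.view(bs, ncrops, -1).mean(1)
print(result_avg.shape)  # torch.Size([4, 8])
```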
I found this related work: https://hal.archives-ouvertes.fr/hal-03260782/document
-
Hi there,
First of all, thanks for the excellent library! Since the documentation is so lovely, it's a pleasure to work with it.
For my project I want to distinguish between papyrus fragments. For evaluation I use nearest-neighbor accuracy; to set this up, I adapted the MNIST example.
I would also be very thankful if you could suggest literature, tutorials, or similar.
Hard facts about my project: