
How to use a pretrained VGG19 for transfer learning

I'm working on a VQA model, and I need some help as I'm new to this.

I want to use transfer learning from the VGG19 network before training, so that when training starts I already have the image features extracted ahead of time (I'm trying to solve a performance issue).

Is it possible to do so? If so, can someone please share an example in PyTorch?

Below is the relevant code:

import torch.nn as nn
from torchvision import models

class img_CNN(nn.Module):
    def __init__(self, img_size):
        super(img_CNN, self).__init__()
        self.model = models.vgg19(pretrained=True)
        self.in_features = self.model.classifier[-1].in_features
        self.model.classifier = nn.Sequential(*list(self.model.classifier.children())[:-1])  # remove the last VGG19 layer
        self.fc = nn.Linear(self.in_features, img_size)

    def forward(self, image):
        #with torch.no_grad():
        img_feature = self.model(image)  # (batch, 4096) after the truncated classifier
        img_feature = self.fc(img_feature)
        return img_feature

class vqamodel(nn.Module):
    def __init__(self, output_dim, input_dim, emb_dim, hid_dim, n_layers, dropout, answer_len, que_size, img_size, model_vgg, in_features):
        super(vqamodel, self).__init__()
        self.image = img_CNN(img_size)
        self.question = question_lstm(input_dim, emb_dim, hid_dim, n_layers, dropout, output_dim, que_size)
        self.tanh = nn.Tanh()
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(dropout)
        self.fc1 = nn.Linear(que_size, answer_len)  # the input to the linear layer is the combined vector
        self.softmax = nn.Softmax(dim=1)

    def forward(self, image, question):
        image_emb = self.image(image)
        question_emb = self.question(question)
        combine = question_emb * image_emb  # element-wise fusion of the two embeddings
        out_feature = self.fc1(combine)
        out_feature = self.relu(out_feature)
        return out_feature

How can I take models.vgg19(pretrained=True), run it over the image dataloader before training, and save the image representations in a NumPy array?

Thank you!



1 Answer


Yes, you can use a pretrained VGG model to extract embedding vectors from images. Here is a possible implementation, using torchvision.models.vgg*.

  1. First, retrieve the pretrained model:

    model = torchvision.models.vgg19(pretrained=True)
    

    Its classifier is:

    >>> model.classifier
    Sequential(
        (0): Linear(in_features=25088, out_features=4096, bias=True)
        (1): ReLU(inplace=True)
        (2): Dropout(p=0.5, inplace=False)
        (3): Linear(in_features=4096, out_features=4096, bias=True)
        (4): ReLU(inplace=True)
        (5): Dropout(p=0.5, inplace=False)
        (6): Linear(in_features=4096, out_features=1000, bias=True)
    )
    
  2. Depending on your finetuning strategy, you can either truncate it to keep some of the trained dense layers:

    model.classifier = nn.Sequential(*[model.classifier[i] for i in range(4)])
    

    Or replace it altogether with a different set of dense layers wrapped in a nn.Sequential:

    model.classifier = nn.Sequential(
        nn.Linear(25088, 4096),
        nn.ReLU(True),
        nn.Dropout(0.5),
        nn.Linear(4096, 2048))
    
  2. Additionally, you can freeze the convolutional backbone of the model (the feature extractor, model.features):

    for param in model.features.parameters():
        param.requires_grad = False
    
  3. Then you will be able to use that model to extract image embeddings and backpropagate to finetune your classifier (a fuller sketch of precomputing and saving the embeddings follows after these steps):

    >>> model(img) # shape (batch_size, 2048)
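
To address the precomputation part of the question directly: below is a minimal sketch of how you could run the truncated VGG19 once over your image dataloader before training and save the embeddings as a NumPy array. The image_loader name, the (images, labels) batch structure, and the output file name are assumptions; adapt them to your dataset. Calling eval() and wrapping the loop in torch.no_grad() disables dropout and gradient tracking during extraction.

    import numpy as np
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.vgg19(pretrained=True)
    # on torchvision >= 0.13 the weights argument is preferred:
    # model = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
    model.classifier = nn.Sequential(*[model.classifier[i] for i in range(4)])  # keep layers 0-3, output dim 4096
    for p in model.parameters():
        p.requires_grad_(False)  # pure feature extraction, no finetuning here
    model.eval()                 # disable dropout so the features are deterministic

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    all_features = []
    with torch.no_grad():
        for images, _ in image_loader:            # image_loader: your existing DataLoader (assumed)
            features = model(images.to(device))   # shape (batch_size, 4096)
            all_features.append(features.cpu().numpy())

    features_np = np.concatenate(all_features, axis=0)  # shape (num_images, 4096)
    np.save("vgg19_features.npy", features_np)

At training time you can then load the array once with np.load("vgg19_features.npy") and feed the precomputed vectors to the rest of the VQA model instead of recomputing the CNN forward pass every epoch.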
    
