I can't work out the correct number of parameters of AlexNet or VGG Net.

For example, to calculate the number of parameters of a conv3-256 layer of VGG Net, the answer is 0.59M = (3*3)*(256*256), that is, (kernel size) * (product of the numbers of channels in the two adjoining layers). However, counting that way, I can't get to the 138M parameters.

If you are referring to the 16-layer VGG Net (table 1, column D), then 138M is the total number of parameters of the network, i.e. including all the convolutional layers but also the fully connected ones.

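Looking at the 3rd convolutional stage, composed of 3 x conv3-256 layers: the first layer takes the 128 channels output by the previous stage and produces 256, while the other two take 256 channels and produce 256. The convolution kernel is 3×3 for each of these layers. In terms of parameters this gives 3×3×128×256 (weights) + 256 (biases) = 295,168 for the first layer, and 3×3×256×256 (weights) + 256 (biases) = 590,080 for each of the other two.

As explained above, you have to do that for all layers, including the fully connected ones, and sum these values to obtain the final 138M number.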
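The per-layer count for this stage can be checked with a few lines of Python. This is only a sketch: `conv_params` is a hypothetical helper written for this answer, not something from the paper or from any library, and it assumes one bias per output channel.

```python
def conv_params(kernel, in_channels, out_channels, bias=True):
    """Parameters of a square conv layer: kernel*kernel*in*out weights,
    plus one bias per output channel if bias is True."""
    weights = kernel * kernel * in_channels * out_channels
    return weights + (out_channels if bias else 0)

# Third stage of the 16-layer VGG Net (column D): three conv3-256 layers.
# The first one receives the 128 channels produced by the previous stage.
stage3 = [
    conv_params(3, 128, 256),
    conv_params(3, 256, 256),
    conv_params(3, 256, 256),
]

print(stage3)       # [295168, 590080, 590080]
print(sum(stage3))  # 1475328
```

Calling `conv_params(3, 256, 256, bias=False)` gives 589,824, which is the 0.59M figure from the question: that figure is right for a single layer, it just is not the whole network.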

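UPDATE: the breakdown among layers gives about 14.7M parameters for the convolutional layers and about 123.6M for the fully connected ones, so the fully connected layers account for most of the total.

In particular, for the fully connected layers (fc):

first fc layer: 7×7×512×4096 (weights) + 4096 (biases) = 102,764,544
second fc layer: 4096×4096 (weights) + 4096 (biases) = 16,781,312
third fc layer: 4096×1000 (weights) + 1000 (biases) = 4,097,000

The spatial resolution right before the fully connected layers is 7×7 pixels: with a 224×224 input, each of the five 2×2 max-pooling stages halves the resolution (224 / 2^5 = 7), while the convolutions leave it unchanged. This is because this VGG Net uses spatial padding before convolutions, as detailed in section 2.1 of the paper:

[…] the spatial padding of conv. layer input is such that the spatial resolution is preserved after convolution, i.e. the padding is 1 pixel for 3×3 conv.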
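To double-check the total, here is a rough sketch that walks through column D layer by layer, tracking the spatial resolution along the way. The names `CFG_D` and `count_vgg16` are made up for this example, and it assumes a 224×224 RGB input, 3×3 convolutions with stride 1 and padding 1, 2×2 max-pooling with stride 2, and a bias on every layer.

```python
# Column D of the VGG paper: numbers are conv output channels,
# "M" marks a 2x2 max-pooling layer with stride 2.
CFG_D = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]

def count_vgg16(input_size=224, in_channels=3, fc_sizes=(4096, 4096, 1000)):
    """Return (final spatial size, conv parameters, fc parameters)."""
    conv_total, size, channels = 0, input_size, in_channels
    for item in CFG_D:
        if item == "M":
            size //= 2  # max-pooling halves the resolution
        else:
            # 3x3 conv, stride 1, padding 1: resolution is preserved
            size = (size - 3 + 2 * 1) // 1 + 1
            conv_total += 3 * 3 * channels * item + item  # weights + biases
            channels = item
    fc_total, features = 0, size * size * channels
    for out_features in fc_sizes:
        fc_total += features * out_features + out_features  # weights + biases
        features = out_features
    return size, conv_total, fc_total

size, conv_total, fc_total = count_vgg16()
print(size)                   # 7
print(conv_total)             # 14714688
print(fc_total)               # 123642856
print(conv_total + fc_total)  # 138357544
```

The sum comes out at 138,357,544, i.e. the 138M figure, with the first fully connected layer alone contributing over 100M of it.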
