Figure 1 | Scientific Reports

Figure 1

From: Comparative analysis of vision transformers and convolutional neural networks in osteoporosis detection from X-ray images

Figure 1

An illustration of how a CNN block functions. Each CNN layer consists of several “filters”. Each filter can be represented as a n × n matrix of numbers which are learned during the training process. When the CNN layer operates on the image, at each step, which is called “stride” technically, that n × n matrix multiplied with the values of a n × n piece of the image and “summarize” that piece with a single number, as it can be seen in the figure. Then, the filter proceed one step and size of that further movement is equal to the “stride” parameter. The “parameter” makes sure that when the filter starts from the upper left corner of the image, the whole of it covers the image, namely, it simply add some pixels with the value of 0 to the borders of the image25.

Back to article page