How do Convolutional Neural Networks Work?

Many recent breakthroughs in deep learning have come from Convolutional Neural Networks (CNNs or ConvNets). They are a major driving force in the field of deep neural networks, and on some image recognition tasks they can even be more accurate than humans.
Published: Oct 06, 2022

What is a Convolutional Neural Network?

A Convolutional Neural Network is a feed-forward neural network whose artificial neurons respond only to units within a limited surrounding region, and it performs very well on large-scale image processing. A convolutional neural network consists of one or more convolutional layers, usually combined with pooling layers, topped by a fully connected layer, each with its associated weights. This structure lets convolutional neural networks exploit the two-dimensional structure of the input data. Compared to other deep learning architectures, convolutional neural networks often give better results in image and speech recognition. The model can be trained with the backpropagation algorithm. Compared to other deep, feed-forward neural networks, convolutional neural networks have fewer parameters to estimate, which makes them an attractive deep learning architecture.

Convolutional Neural Networks are powerful at image recognition, and many image recognition models are extensions of the CNN architecture. It is also worth mentioning that the CNN is a deep learning model inspired by the organization of the visual system in the human brain. Learning how CNNs work will also help you learn other deep learning models.

Feature:

A CNN compares images piece by piece, and the pieces it looks for are called features. By finding rough feature matches at roughly the same positions in two images, CNNs are better at judging whether the images are similar than approaches that compare whole images. Each feature is like a miniature image, that is, a small two-dimensional matrix of values, and features capture elements that are common across the images.

Convolution:

Whenever a CNN is presented with a new image, it does not know where the features above will appear, so it tries them at every possible position in the image. To calculate how well a feature matches across the whole image, we turn the feature into a filter. The mathematics behind this filter is called convolution, which is where the name CNN comes from.

The basic principle of convolution is to calculate how well a feature matches a patch of the image: multiply each pixel in the feature by the value of the corresponding pixel in the patch, add up the products, and divide the sum by the number of pixels in the feature. Assuming pixel values of 1 and -1, if every pixel of the two matches, the result is 1. Conversely, if every pixel differs, the result is -1.

Repeating this process for every possible position of the feature in the image completes the convolution. Arranging the results by position gives a new 2D matrix. This is the original image filtered by the feature, and it tells us where to find the feature in the original image. Values close to 1 indicate a strong match with the feature, values close to -1 indicate that the patch is nearly the opposite of the feature, and values close to 0 indicate almost no similarity at all.

The next step is to apply the same method with the other features, convolving each one over the whole image. The result is a set of filtered images, one for each feature. It is convenient to think of the entire convolution operation as a single processing step. In a CNN this step is called a convolutional layer, a name that hints that more layers will follow.
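The matching procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the 4x4 image, the 2x2 diagonal feature, and the function names `match_score` and `convolve` are hypothetical, pixels take the values 1 and -1, and the feature is slid one pixel at a time without padding.

```python
def match_score(patch, feature):
    """Average of the pixel-wise products of two equally sized 2D grids."""
    total = 0
    count = 0
    for patch_row, feature_row in zip(patch, feature):
        for p, f in zip(patch_row, feature_row):
            total += p * f
            count += 1
    return total / count


def convolve(image, feature):
    """Slide the feature over every position of the image and record each score."""
    fh, fw = len(feature), len(feature[0])
    out_h = len(image) - fh + 1
    out_w = len(image[0]) - fw + 1
    result = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Cut out the patch of the image that the feature currently covers.
            patch = [image_row[c:c + fw] for image_row in image[r:r + fh]]
            row.append(match_score(patch, feature))
        result.append(row)
    return result


# Hypothetical 4x4 image containing a diagonal stroke (1 = bright, -1 = dark).
image = [
    [ 1, -1, -1, -1],
    [-1,  1, -1, -1],
    [-1, -1,  1, -1],
    [-1, -1, -1,  1],
]
# Hypothetical 2x2 feature: a short piece of the same diagonal.
feature = [
    [ 1, -1],
    [-1,  1],
]

feature_map = convolve(image, feature)
for row in feature_map:
    print(row)
# [1.0, -0.5, 0.0]
# [-0.5, 1.0, -0.5]
# [0.0, -0.5, 1.0]
```

The values of 1.0 along the diagonal of the result mark the positions where the feature matches the image exactly.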

CNNs are computationally intensive. While we can explain how a CNN works on a single sheet of paper, the number of additions, multiplications, and divisions adds up quickly: it grows with the number of pixels in each image, the number of pixels in each feature, and the number of features. With so many factors multiplying the amount of computation, the problems that CNNs deal with can become very large with little effort, so it is no wonder that some chipmakers are designing and building specialized chips for the computational demands of CNNs.
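To get a feel for how quickly the work grows, the multiplications in one convolutional layer can be counted directly. The sketch below assumes a single-channel image, a stride of one pixel, and no padding; the function name and the example sizes are illustrative only.

```python
def conv_multiplications(image_h, image_w, feature_h, feature_w, num_features):
    """Count the multiplications needed to slide every feature over one image."""
    positions = (image_h - feature_h + 1) * (image_w - feature_w + 1)
    return positions * feature_h * feature_w * num_features


# A toy 9x9 image with three 3x3 features.
small = conv_multiplications(9, 9, 3, 3, 3)
# A 1000x1000 image with sixty-four 3x3 features.
large = conv_multiplications(1000, 1000, 3, 3, 64)

print(small)  # 1323
print(large)  # 573698304
```

Even this single layer goes from about a thousand multiplications to more than half a billion when the image and the number of features grow.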

Pooling:

Pooling is a method of shrinking images while retaining their most important information. The mathematics involved is at about a second-grade level. Pooling steps a small window across the image and takes the maximum value within the window at each step. In practice, a square window with a side length of two or three pixels and a stride of two pixels works well.

After the original image is pooled with a stride of two, it contains about a quarter as many pixels as before. Because the pooled image keeps the maximum value from each window, it still records how well the feature matched within each window. The pooled information therefore focuses on whether a matching feature is present in the image rather than exactly where it is. This helps the CNN determine whether a feature appears in an image without being distracted by the feature's precise location.
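A minimal sketch of max pooling, assuming a 2x2 window with a stride of two. The input values and the name `max_pool` are hypothetical, and windows that would hang over the edge of the image are simply skipped, which is one of several possible conventions.

```python
def max_pool(image, window=2, stride=2):
    """Keep the largest value from each window as it steps across the image."""
    pooled = []
    for r in range(0, len(image) - window + 1, stride):
        row = []
        for c in range(0, len(image[0]) - window + 1, stride):
            row.append(max(
                image[r + dr][c + dc]
                for dr in range(window)
                for dc in range(window)
            ))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 filtered image (match scores between -1 and 1).
filtered = [
    [ 1.0, -0.5,  0.0, -0.5],
    [-0.5,  1.0, -0.5,  0.0],
    [ 0.0, -0.5,  1.0, -0.5],
    [-0.5,  0.0, -0.5,  1.0],
]

pooled = max_pool(filtered)
print(pooled)  # [[1.0, 0.0], [0.0, 1.0]]
```

The 16 input values shrink to 4, a quarter of the original, while the strong matches (the 1.0 values) survive.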

A pooling layer applies pooling to one or more images, shrinking each of them. We end up with the same number of images, each with fewer pixels. This also helps with the computational cost mentioned above: reducing an 8-megapixel image to 2 megapixels beforehand makes all subsequent work easier.

Rectified linear unit:

Another small but important step in a CNN is the Rectified Linear Unit (ReLU), which converts every negative number in the image to 0. This helps keep the values learned by the CNN from getting stuck near 0 or growing toward infinity. The result after rectification has the same number of pixels as the input image, except that all negative values have been replaced with zeros.
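As a short illustration, rectification amounts to one comparison per pixel. The input values and the function name `relu` are hypothetical.

```python
def relu(image):
    """Replace every negative value with 0 and leave the rest unchanged."""
    return [[max(0.0, value) for value in row] for row in image]


# Hypothetical filtered image with positive and negative match scores.
filtered = [
    [ 1.0, -0.5,  0.0],
    [-0.5,  1.0, -0.5],
    [ 0.0, -0.5,  1.0],
]

rectified = relu(filtered)
print(rectified)  # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```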

Deep learning:

After being filtered, rectified, and pooled, the original image becomes a set of smaller images containing feature information. These images can then be filtered and shrunk again and again; with each pass the features become more complex and the images become smaller. As a result, the lower-level layers represent simple features such as corners or spots of light, while the higher-level layers represent more complex features such as shapes or patterns, and these higher-level features are usually easy to recognize.
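How fast the images shrink can be worked out with simple arithmetic. The sketch below assumes square images, convolution without padding, and non-overlapping pooling whose stride equals the window size; rectification does not change the size. The function name and the example numbers are illustrative.

```python
def stack_output_sizes(size, feature_size, pool_window, num_stages):
    """Track the side length of a square image through repeated stages."""
    sizes = [size]
    for _ in range(num_stages):
        size = size - feature_size + 1   # convolution without padding
        size = size // pool_window       # pooling, partial windows dropped
        sizes.append(size)
    return sizes


# A 32x32 image passed through three stages of 3x3 features and 2x2 pooling.
sizes = stack_output_sizes(32, 3, 2, 3)
print(sizes)  # [32, 15, 6, 2]
```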

Fully connected layer:

Fully connected layers take the filtered, high-level images and convert their feature information into votes. In traditional neural network architectures, the fully connected layer is the primary building block. When an image is fed into this layer, it treats all pixel values as a one-dimensional list rather than the two-dimensional matrix used so far. Each value in the list gets a vote on whether the symbol in the picture is, for example, a circle or a cross. Since some values are better at indicating crosses and others are better at indicating circles, those values get more voting power than the rest. The number of votes each value casts for each option is expressed as a weight, or connection strength. So, every time the CNN judges a new image, the image passes through the lower layers before reaching the fully connected layer. After the vote, the option with the most votes becomes the category for the image.
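The voting can be sketched as a weighted sum per category. Everything concrete here is an assumption made for illustration: the single 2x2 pooled image, the two categories, the weight values, and the function name `fully_connected`.

```python
def fully_connected(values, weights):
    """Each input value casts weighted votes for every category."""
    return {
        category: sum(v * w for v, w in zip(values, category_weights))
        for category, category_weights in weights.items()
    }


# Hypothetical output of the earlier layers: one 2x2 pooled image.
pooled_images = [[[1.0, 0.0], [0.0, 1.0]]]

# Flatten all images into a one-dimensional list of values.
values = [v for image in pooled_images for row in image for v in row]

# Hypothetical connection weights: how strongly each value votes per category.
weights = {
    "cross":  [0.9, 0.1, 0.1, 0.9],
    "circle": [0.1, 0.8, 0.8, 0.1],
}

votes = fully_connected(values, weights)
prediction = max(votes, key=votes.get)
print(prediction)  # cross
```

In a trained network the weights would come from training rather than being written by hand.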

Like the other layers, fully connected layers can be stacked, because their inputs (lists of values) and outputs (votes) have similar forms. In practice, several fully connected layers are often stacked, with the intermediate ones voting on virtual, hidden options. Each additional fully connected layer lets the network learn more complex combinations of features, which can help it make more accurate judgments.

Backpropagation:

Backpropagation is the machine learning technique that decides the features and weights for us. To use backpropagation, we need a collection of images for which we already know the answer, and an untrained CNN in which every pixel of every feature and every weight in every fully connected layer is set to a random value. We can then train this CNN with the labeled images, one at a time.

After the CNN processes an image, a vote determines its category. The difference between this vote and the labeled correct answer is the recognition error. The features and weights are then adjusted to reduce this error. Each value is nudged slightly higher or lower, the error is recalculated, and whichever adjustment reduces the error is kept. Once every feature pixel in the convolutional layers and every weight in the fully connected layers has been adjusted, we have a set of values that is slightly better at judging the current image. The same steps are then repeated for each of the other labeled images. During training, quirks that occur in a single image are quickly forgotten, but features and weights that are common to many images are retained. With enough labeled images, the features and weights settle into a stable state that is good at recognizing most images. Backpropagation is, however, another computationally expensive step.
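The adjust-and-keep idea described above can be sketched for a single set of voting weights. Note that this is a deliberately simplified nudge-and-keep search, not backpropagation itself: real backpropagation uses derivatives to work out the direction of adjustment for all values at once instead of trying each nudge separately. The two labeled examples, the starting weights, the step size, and the function names are all hypothetical.

```python
def vote(values, weights):
    """Weighted vote for a single category."""
    return sum(v * w for v, w in zip(values, weights))


def total_error(examples, weights):
    """Sum of squared differences between the votes and the labeled answers."""
    return sum((vote(values, weights) - label) ** 2 for values, label in examples)


def train(examples, weights, step=0.05, rounds=50):
    """Nudge each weight up or down and keep only the changes that reduce the error."""
    weights = list(weights)
    error = total_error(examples, weights)
    for _ in range(rounds):
        for i in range(len(weights)):
            for delta in (step, -step):
                trial = list(weights)
                trial[i] += delta
                trial_error = total_error(examples, trial)
                if trial_error < error:
                    weights, error = trial, trial_error  # keep the improvement
                    break
    return weights, error


# Hypothetical labeled examples: (flattened pixel values, correct vote).
examples = [
    ([1.0, 0.0, 0.0, 1.0], 1.0),  # labeled as a cross
    ([0.0, 1.0, 1.0, 0.0], 0.0),  # labeled as not a cross
]
start_weights = [0.5, 0.5, 0.5, 0.5]

initial_error = total_error(examples, start_weights)
trained_weights, final_error = train(examples, start_weights)
print(initial_error, final_error)
```

The weights attached to the second and third values are driven toward zero, so the second example stops collecting votes for the cross.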

Hyperparameters:

Not every aspect of a CNN is learned during training. The designer still has to answer questions such as:

  • How many features should be in each convolutional layer? How many pixels should be in each feature?
  • What is the window size in each pooling layer? What should the stride be?
  • How many hidden neurons (options) should each additional fully connected layer have?
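One way to keep these choices in one place is a small configuration object. The sketch below is hypothetical throughout: the class names, field names, and example numbers are invented for illustration. The pixel count assumes that each feature in a later stage spans all of the filtered images produced by the stage before it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConvStage:
    num_features: int   # features in the convolutional layer
    feature_size: int   # side length of each square feature, in pixels
    pool_window: int    # side length of the pooling window
    pool_stride: int    # step between pooling windows


@dataclass
class CnnConfig:
    stages: List[ConvStage]
    hidden_neurons: List[int]  # hidden neurons in each extra fully connected layer


def count_feature_pixels(config, input_channels=1):
    """Count the feature pixels that training would have to adjust."""
    total = 0
    channels = input_channels
    for stage in config.stages:
        total += stage.num_features * stage.feature_size ** 2 * channels
        channels = stage.num_features
    return total


config = CnnConfig(
    stages=[ConvStage(8, 3, 2, 2), ConvStage(16, 3, 2, 2)],
    hidden_neurons=[32],
)
print(count_feature_pixels(config))  # 1224
```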

Beyond these questions, there are higher-level architectural decisions to make, such as how many layers of each kind a CNN should include and in what order. Some deep neural networks have more than a thousand layers, so there are many design possibilities. With so many permutations, only a small fraction of the possible CNN configurations can be tested. As a result, CNN designs tend to evolve with the accumulated knowledge of the machine learning community, with occasional unexpected jumps in performance. Many further refinements have also been tested and found effective, such as new types of layers or more complex ways of connecting layers to one another.

Source: mcknote