Explore the foundations of artificial neural network modeling

Dive into Giuseppe Bonaccorso's recent book 'Mastering Machine Learning Algorithms' with a chapter excerpt on modeling neural networks.

Published: 31 Aug 2020

Deep learning neural networks are usually rife with challenges. For all their layered capabilities, the algorithms themselves are hard to create and even harder to manage. From the demand for millions of data points used in model training to the black box decision-making process, data scientists are fighting an uphill battle right from the start of neural network creation. However, with bigger risk comes bigger reward: Deep learning artificial neural networks can produce state-of-the-art performance in regression, image classification and business applications.

This anxiety around training methods and limitations of the algorithms is not lost on Giuseppe Bonaccorso, who recently wrote a 700-page how-to manual. For enterprises that want to take a dive into artificial neural network modeling, Bonaccorso, who is the global head of innovative data science at Bayer, offers his take on popular issues, training strategies and why building a model can work.

What are the benefits of creating an artificial neural network from scratch, especially when there are so many prepackaged vendor offerings?

Giuseppe Bonaccorso: The rationale behind the choice of a new model or an existing one should be rooted in the nature of the problem. For example, in image recognition, there are several high-performance networks that can be adapted to specific roles, but there are also problems that require more customized solutions. In these cases, building a model from scratch is likely to be the optimal strategy.

Neural networks are very flexible models. There are cases when existing architectures can simplify the work, as well as pretrained models where only some layers are retrained to meet specific requirements in transfer learning. Start with simple networks. If the results are poor, it's possible to increase complexity. However, the simplest model that guarantees both accuracy and generalization ability [is the right one].


Which toolkits are best suited to model and create a neural network?

Bonaccorso: My primary choice is TensorFlow 2, which now includes Keras as its high-level module. Using TensorFlow, the data scientist can easily start with Keras models based on predefined layer structures and, if necessary, she can switch to more advanced features. There are also other frameworks, like PyTorch. I believe there are no silver bullets, but it's important that once a framework is chosen, all its features are thoroughly studied and evaluated. Even if not immediately helpful, some features can, in fact, become essential to solving some problems in the most effective way.

Neural networks are famous for being difficult and hard to manage. What are the most common problems when modeling a deep learning network?

Bonaccorso: Deep neural networks are extremely complex models with tens of millions of parameters. Training them means finding the optimal set of parameters to achieve a predefined goal -- and the training can easily remain stuck in suboptimal solutions. To mitigate this problem, several optimization algorithms have been proposed. The role of a data scientist is to pick the most appropriate algorithm and tune its hyperparameters to maximize both the training speed and the final accuracy. Moreover, these kinds of models have an intrinsically large capacity; the more parameters you introduce, the more complex the system becomes and, consequently, the faster its ability to memorize the training set grows.
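To make the idea of training getting stuck in a suboptimal solution concrete, here is a minimal sketch in plain Python. It is not taken from the book: the one-dimensional `loss` function, the starting points and the learning rate are all illustrative assumptions, chosen so that the curve has one shallow valley and one deeper valley.

```python
# Toy 1-D "loss" with two valleys (an assumption for illustration only):
# a shallow one near x = 1.13 and a deeper, better one near x = -1.30.
def loss(x):
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytical derivative of loss(x).
    return 4 * x**3 - 6 * x + 1


def gradient_descent(start, learning_rate=0.01, steps=500):
    # Plain gradient descent: repeatedly step against the gradient.
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x


# Same algorithm, same hyperparameters, different starting points.
x_from_right = gradient_descent(start=2.0)
x_from_left = gradient_descent(start=-2.0)

print(f"start= 2.0 -> x={x_from_right:.3f}, loss={loss(x_from_right):.3f}")
print(f"start=-2.0 -> x={x_from_left:.3f}, loss={loss(x_from_left):.3f}")
```

The run that starts at 2.0 settles in the shallower valley and never reaches the deeper one, while the run that starts at -2.0 does. Real networks have millions of parameters rather than one, but the same sensitivity is part of why the choice of optimizer and its hyperparameters matters.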

When small data sets are employed, deep learning models can easily overfit and learn to associate each training input with the correct label but lose the ability to generalize. Generalizing is a key concept in learning because we'd like to model systems that can abstract from some examples to derive a generic 'concept' representing a specific class.

Unfortunately, when working with deep neural networks, overfitting is a very common issue. However, data scientists can employ regularization, dropout and batch normalization techniques to mitigate it.
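As a rough illustration of two of these techniques, the sketch below implements dropout and an L2 regularization penalty in plain Python. The function names, the 0.5 dropout rate and the sample values are assumptions made for this example; frameworks such as Keras ship their own, more complete versions.

```python
import random


def dropout(activations, rate, training, rng):
    # "Inverted" dropout: during training, zero each activation with
    # probability `rate` and rescale the survivors so the expected sum
    # is unchanged. At inference time the input passes through as-is.
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def l2_penalty(weights, strength):
    # L2 regularization term added to the loss: large weights cost more.
    return strength * sum(w * w for w in weights)


rng = random.Random(0)          # seeded so the run is repeatable
hidden = [0.5, 1.0, 1.5, 2.0]   # hypothetical layer activations

train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
infer_out = dropout(hidden, rate=0.5, training=False, rng=rng)
penalty = l2_penalty([1.0, -2.0], strength=0.1)

print("training: ", train_out)
print("inference:", infer_out)
print("L2 penalty:", penalty)
```

Both techniques make it harder for the network to memorize individual training samples: dropout by randomly removing units during training, the penalty by discouraging large weights.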

How can data scientists keep their models accurate, fast and optimized over time with a model that is hard to retrain?

Bonaccorso: Once a model is properly trained, it remains stable as long as its underlying data-generating process does. However, many models are based on training sets that represent time-changing processes.

In fact, one common problem when retraining networks is that they tend to forget past knowledge when new data is submitted. To avoid this problem, the new training sets must contain data sampled from the whole of the new data-generating process, not only from the part that changed. For example, if we have trained a model to distinguish between cats and dogs and we want to extend it also to tigers, we cannot simply create a tiger data set -- we need to create a new set containing all three classes to learn to distinguish among features.
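The cats, dogs and tigers example can be sketched as a small data-preparation step. This is only an illustration under stated assumptions: the file names are hypothetical, and `build_retraining_set` is a helper invented for this sketch rather than part of any library.

```python
import random


def build_retraining_set(old_samples, new_samples, seed=0):
    # Merge the original classes with the new one and shuffle, so that
    # retraining sees all classes instead of only the newly added class.
    combined = list(old_samples) + list(new_samples)
    random.Random(seed).shuffle(combined)
    return combined


# Hypothetical (file name, label) pairs.
old_samples = [
    ("cat_01.jpg", "cat"), ("cat_02.jpg", "cat"),
    ("dog_01.jpg", "dog"), ("dog_02.jpg", "dog"),
]
new_samples = [("tiger_01.jpg", "tiger"), ("tiger_02.jpg", "tiger")]

retraining_set = build_retraining_set(old_samples, new_samples)
labels = {label for _, label in retraining_set}

print(len(retraining_set), "samples covering", sorted(labels))
```

Retraining on `new_samples` alone would expose the network only to tigers, which is the situation in which earlier knowledge tends to be overwritten.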

Current learning algorithms are very sensitive to drastic changes in the training sets, so it's important to keep this concept in mind when you need to update or retrain a model. A more complex problem arises when the current architecture doesn't have enough capacity to learn more classes. In this case, the model would underfit, showing very low accuracy, and the data scientist would have to consider a deeper or more complex architecture, along with a larger data set to avoid overfitting the new model.

Images and volumetric data contain features that can be detected efficiently with special operators (like convolutions), which work with the geometric structure of the samples. This intuition is based on direct observations of the structure of biological vision systems, where subsequent layers are responsible for extracting more and more detailed features.

Convolutional deep neural networks are the starting point of every image-related problem, and, given the advancements in neural computation, their complexity is becoming easier to manage. Of course, convolutional layers are not enough to solve all the problems. Other helpful layers (like pooling, padding and up/down-sampling) are necessary to achieve specific goals. Aspiring data scientists should study and learn how to apply all layers in the most accurate and reasonable way.
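For readers who have not seen these operators written out, here is a minimal, dependency-free sketch of zero padding, a 2-D convolution and max pooling in plain Python. The 5x5 image and the 2x2 edge-detecting kernel are assumptions chosen to keep the numbers easy to follow; in practice, a framework's own layers would be used and the kernel values would be learned during training.

```python
def pad2d(image, pad):
    # Surround the image with `pad` rows and columns of zeros.
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


def conv2d(image, kernel):
    # "Valid" convolution as deep learning frameworks compute it
    # (cross-correlation): slide the kernel over the image and sum
    # the element-wise products at every position.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    # Down-sample by keeping the largest value in each size x size block.
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# Dark columns on the left, bright columns on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d(image, kernel)   # 4x4 map, non-zero only at the edge
pooled = max_pool2d(features)      # 2x2 summary of the feature map
padded = pad2d(image, 1)           # 7x7 image with a zero border

print(features)
print(pooled)
```

The feature map is non-zero only in the column where the image goes from dark to bright, and pooling keeps that response while halving the resolution. That combination of a local detector followed by down-sampling is the basic building block the answer refers to.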

Dive into Mastering Machine Learning Algorithms

Click here to read Chapter 17, "Modeling Neural Networks," of Bonaccorso's book.
