4 types of simulation models used in data analytics

Combining different types of simulation models with predictive analytics enables organizations to forecast events and improve the accuracy of data-driven decisions.

Kurt Cagle, The Cagle Report

Published: 14 May 2025

Simulation models are finding new uses as organizations advance in predictive analytics and data-driven decision-making.

Many foundational data analytics techniques grew out of the study of gambling games. For example, a player might want to determine the likelihood of rolling a total of 14 with three six-sided dice -- the kind of question that leads to binomial and normal distributions -- or know the odds in roulette or poker. Such games are essentially simulations, and the data analyst's goal is to create a simplified model that captures the behavior of a complex system.
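
The dice question above is small enough to answer exactly by listing every outcome. The following is a minimal sketch, not a prescribed method; the function name `probability_of_total` and its parameters are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def probability_of_total(target, dice=3, sides=6):
    """Exact probability that `dice` fair dice sum to `target`."""
    # Enumerate every equally likely outcome, e.g. (1, 1, 1) ... (6, 6, 6).
    rolls = list(product(range(1, sides + 1), repeat=dice))
    hits = sum(1 for roll in rolls if sum(roll) == target)
    return Fraction(hits, len(rolls))

p = probability_of_total(14)
print(p, float(p))  # 15 of the 216 outcomes total 14, or about 6.9%
```

Enumeration works here because there are only 216 outcomes. Once a system has too many states to list, random sampling takes over, which is where the simulation models below come in.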

These simulations have become the only viable way to solve complex real-world problems in biology, physics, economics and other domains with many interacting components. Data analytics professionals should know these four types of simulation models:

Monte Carlo method.
Agent-based modeling.
Discrete event simulation.
System dynamic modeling.

These four types of simulation models underlie many games, visual and audio synthesis techniques, machine learning algorithms, processing kernels and controller systems. Simulations can test systems virtually before an organization commits to a decision or design.

Monte Carlo method

In many simulations, it is difficult to determine whether the selected variables and the distributions of data from those variables represent the model in question. The Monte Carlo method tests this by drawing random samples from those distributions many times and comparing the aggregate results with what the model predicts. The name Monte Carlo comes from roulette, a game made famous at Monte Carlo resorts. The roulette wheel has 37 slots numbered 0 to 36, with 18 red slots, 18 black slots and one green slot. Players have a 48.65% chance of landing on red, the same chance of landing on black and a 2.7% chance of landing on green (the zero). Together, these three probabilities represent one distribution.

Any individual spin results in a random value. Repeat the same process 1,000 times or more and the distribution of results should follow those percentages. If it doesn't, other variables could be at work, such as a pedal that an unscrupulous dealer uses to slow down the wheel.
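
The repeated-spin idea can be sketched in a few lines. This is an illustrative example under stated assumptions: the names `spin_wheel` and `simulate_spins` are hypothetical, and the colors are assigned by odd and even numbers purely to get 18 red and 18 black slots, which does not match the layout of a real wheel.

```python
import random
from collections import Counter

def spin_wheel(rng):
    """Return the color of one spin on a 37-slot wheel (0 to 36)."""
    slot = rng.randrange(37)
    if slot == 0:
        return "green"
    # Simplifying assumption: odd slots are red, even slots are black.
    return "red" if slot % 2 else "black"

def simulate_spins(n_spins, seed=42):
    """Spin n_spins times and return the observed share of each color."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(spin_wheel(rng) for _ in range(n_spins))
    return {color: counts[color] / n_spins for color in ("red", "black", "green")}

observed = simulate_spins(100_000)
expected = {"red": 18 / 37, "black": 18 / 37, "green": 1 / 37}
for color in expected:
    print(f"{color:5s} observed={observed[color]:.4f} expected={expected[color]:.4f}")
```

If the observed shares stay far from the expected ones as the number of spins grows, that is a signal that the model is missing a variable.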

One of the oldest known applications of the Monte Carlo method is estimating the value of pi. Reaching even a few digits of accuracy can take millions of data points, which points to a limitation of Monte Carlo simulations: They are usually not very efficient.
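
One common way to estimate pi by random sampling is to scatter points in a unit square and count how many fall inside a quarter circle. The sketch below assumes that approach; `estimate_pi` is a hypothetical name, and the sample sizes are arbitrary.

```python
import math
import random

def estimate_pi(samples, seed=0):
    """Estimate pi from the share of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area / square area = pi / 4.
    return 4 * inside / samples

for n in (1_000, 100_000, 1_000_000):
    estimate = estimate_pi(n)
    print(f"{n:>9,d} samples: {estimate:.5f} (error {abs(estimate - math.pi):.5f})")
```

The error typically shrinks in proportion to the square root of the number of samples, so each additional digit of accuracy costs roughly 100 times more points.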

This kind of simulation is often used with Bayesian analysis, which relies upon prior findings to determine the likelihood of an event occurring. Political analysts often use this technique, where polls generate a set of variables that can then be aggregated to create a model, with Monte Carlo methods used to test the model. Ensemble modeling for weather events also uses Monte Carlo, for example, to determine the likely path of a hurricane.

Agent-based modeling

Anyone who has watched a flock of birds take off has seen seemingly random initial behavior give way to synchronized activity, with birds flying in a distinct formation even though no one bird controls the flock. Birds in flight follow simple rules that tell them what to do based on what they see around them. Each bird avoids obstacles as it flies and adjusts its position in real time based on the location of the birds around it. Think of these birds as agents; the coordinated formation that results is an emergent behavior. It arises from a discrete set of rules that each agent applies in reaction to what other agents do. Identifying those rules and simulating the agents that follow them is called agent-based modeling.

Agent systems were studied in the 1960s as an early application of cybernetics and are still significant. For instance, the traffic on a typical busy highway can be difficult to model as a single system. Instead, many modelers simulate each car as an agent that generally follows a set of rules, but with periodic hiccups, to see how cars act in the aggregate.
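
The highway example can be sketched as a small agent-based simulation. The version below is loosely modeled on the Nagel-Schreckenberg traffic model and makes several simplifying assumptions: a single-lane circular road divided into cells, no overtaking, and illustrative parameter values. The names `step` and `simulate` are hypothetical.

```python
import random

def step(positions, speeds, road_length, max_speed, p_slow, rng):
    """Advance every car one time step; cars are listed in road order."""
    n = len(positions)
    new_speeds = []
    for i in range(n):
        ahead = positions[(i + 1) % n]
        gap = (ahead - positions[i] - 1) % road_length  # empty cells in front
        v = min(speeds[i] + 1, max_speed)    # rule 1: speed up if possible
        v = min(v, gap)                      # rule 2: never hit the car ahead
        if v > 0 and rng.random() < p_slow:  # rule 3: occasional hiccup
            v -= 1
        new_speeds.append(v)
    new_positions = [(p + v) % road_length for p, v in zip(positions, new_speeds)]
    return new_positions, new_speeds

def simulate(n_cars=30, road_length=100, steps=200, max_speed=5, p_slow=0.3, seed=1):
    """Run the model and return the final positions and speeds."""
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(road_length), n_cars))
    speeds = [0] * n_cars
    for _ in range(steps):
        positions, speeds = step(positions, speeds, road_length, max_speed, p_slow, rng)
    return positions, speeds

positions, speeds = simulate()
print("mean speed:", sum(speeds) / len(speeds))
```

No rule mentions traffic jams, yet runs with enough cars on the road tend to produce stop-and-go waves. That aggregate pattern is the emergent behavior the modeler is looking for.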

Agent systems are also used with IoT devices and drones. These devices are not dependent on coordinating activities through a central processor, which creates latency and bottlenecks through complex processing. Instead, they react to their nearest neighbors. They check in with the central controller only when they get ambiguous information or put themselves into a safe mode if they cannot interact either with neighbors or with the central controller.

This interaction scenario is the downside of the agent system. An outage or similar disruption among a small number of agents can spread quickly. This phenomenon can cause major power outages that are difficult to recover from because the root cause is emergent behavior among autonomous power stations rather than a single failed component. In the process of rebooting, the problem that led to the outage may be resolved without leaving any indication of its cause.

Agent systems can be simulated, with software objects replacing hardware ones. Cellular biology, for instance, lends itself well to agent-based modeling, as cell behavior tends to influence nearby cells of varying types.

Discrete event simulation

Related to agent systems is the notion of cellular automata, made famous by John Horton Conway's Game of Life in the 1970s and later by the work of Stephen Wolfram, the creator of Mathematica. The same idea of updating a cell based on its neighbors underpins the transformational filters and kernels used in image processing and machine learning.

Such systems are examples of discrete event simulations. In these simulations, time is broken up into distinct steps or chunks rather than being continuous, with the model's state at each step being a function of the model's state at the previous steps.

In these simulations, stable or quasi-stable components emerge without explicit programming.
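
Conway's Game of Life makes a compact illustration of a step-by-step update in which each new state depends only on the previous one. The following is a minimal sketch that stores live cells as a set of `(x, y)` coordinates on an unbounded grid; the function name `life_step` is an assumption for illustration.

```python
from collections import Counter

def life_step(live):
    """Return the next generation given a set of live (x, y) cells."""
    # Count how many live neighbors each cell has.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next step with exactly 3 live neighbors,
    # or with 2 live neighbors if it is already alive.
    return {
        cell
        for cell, count in neighbor_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }

blinker = {(0, 1), (1, 1), (2, 1)}  # three cells in a row
block = {(0, 0), (0, 1), (1, 0), (1, 1)}  # 2x2 square

print(sorted(life_step(blinker)))  # the row flips to a column
print(life_step(life_step(blinker)) == blinker)  # True: it repeats every 2 steps
print(life_step(block) == block)  # True: the block never changes
```

The block is a stable component and the blinker a quasi-stable one. Neither is coded explicitly; both fall out of the update rule.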

Data analysts use discrete event simulations in areas where proximity determines the state of a grid or space. When working with mesh-based models, the finer the mesh used to describe the map, the more accurate the results. Corrections need to be made to the model to account for the shape (or topology) of the mesh. Triangular or hexagonal meshes are often more accurate than rectangular ones.

System dynamic modeling

In an ideal mathematical world, it would be possible to describe systems with independent functions that can be treated as if they were linear. In reality, most variables that describe systems are coupled with one another -- changing the value of one variable may change another variable due to their interaction. These are nonlinear systems described by coupled differential equations.

With computing, we can solve such equations numerically using difference equations. Difference equations use discrete mathematics to find specific solutions that can then be generalized through building up ensembles of solutions.

A good example of such a system is a predator-prey simulation. In the simplest case, there's prey, and the number of prey animals increases until their food runs out. At that point, the prey population drops to a level where its food supply can recover. Add a predator to the mix, however, and things get more complex. The prey is now coupled to two variables: its food supply and the number of predators that will kill prey animals. The populations of all three species become nonlinear and somewhat unpredictable, even chaotic. The classic predator-prey equations are known as the Lotka-Volterra equations. Stability analysis techniques named for Aleksandr Lyapunov can be applied to such systems, as well as to systems like fluid and airflow dynamics equations.
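
A coupled predator-prey system can be stepped forward numerically with difference equations. The sketch below uses the classic two-species predator-prey equations, commonly called the Lotka-Volterra equations, with a simple forward Euler step. It leaves out the food supply for simplicity, and the parameter values, starting populations and the name `simulate_predator_prey` are illustrative assumptions.

```python
def simulate_predator_prey(prey=10.0, predators=5.0, alpha=1.0, beta=0.1,
                           delta=0.075, gamma=1.5, dt=0.001, steps=20_000):
    """Step the coupled equations forward and return both population histories.

    d(prey)/dt      = alpha * prey - beta * prey * predators
    d(predators)/dt = delta * prey * predators - gamma * predators
    """
    prey_history, predator_history = [prey], [predators]
    for _ in range(steps):
        # Each population's change depends on the other: the variables are coupled.
        d_prey = (alpha * prey - beta * prey * predators) * dt
        d_predators = (delta * prey * predators - gamma * predators) * dt
        prey += d_prey
        predators += d_predators
        prey_history.append(prey)
        predator_history.append(predators)
    return prey_history, predator_history

prey_history, predator_history = simulate_predator_prey()
print(f"prey range:     {min(prey_history):.1f} to {max(prey_history):.1f}")
print(f"predator range: {min(predator_history):.1f} to {max(predator_history):.1f}")
```

Forward Euler is the simplest choice, not the most accurate one: its error accumulates over long runs, so a smaller time step or a higher-order method is usually preferable for serious work.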

System dynamic modeling (SDM) studies such coupled, often chaotic systems. It relies on discrete event simulation and numeric methods to determine the behavior of components within that system. Beyond predator-prey models, SDM is also used in high-density particle simulations -- for instance, modeling the behavior of a galaxy based on the forces acting on idealized versions of stars. Chaotic systems give rise to fractals, which are structures with fractional dimensions often associated with iterative, recursive patterns and emergent behaviors.

Kurt Cagle is the former contributing editor for Data Science Central. A published author with more than 20 books to his credit, he has long served as technology editor, writing for Forbes, O'Reilly Media, IDC and others.