Date: 2023-11-01

Planning a machine learning architecture can be challenging because it requires balancing a range of priorities, including performance, cost and scalability.

Although these considerations apply to many types of architectures, ML environments often have specific needs, such as the ability to access bare-metal hardware. These special requirements add an extra layer of difficulty to ML architecture design.

With these challenges in mind, learn what organizations should consider when planning an ML architecture and how to design a system that best balances competing priorities.

What is an ML architecture?

An ML architecture is the complete set of components that power an ML workload. The specific elements vary by environment, but the core parts of an ML architecture typically include the following:

Data sources. These provide the data that ML models train on. Some ML architectures draw on data that already exists, such as publicly available internet content, whereas others rely on unique, original data sources.

Data quality management tools. These ensure that data meets the accuracy and completeness requirements of ML models.

Data pipelines. These move data from its source to the models that need to ingest it.

Data training processes. These build and refine models using the available data.

ML applications. These generate insights using the models trained on the architecture.

Compute and storage infrastructure. This hosts all of the components.

Orchestration tooling. This manages the various components of the ML architecture and unifies them into a coherent ML pipeline.

All these components make ML architectures more complex than many other types of IT architectures. For example, the architecture that powers a basic web application is relatively simple: a web server application, a server to host it and potentially a database to store website data. It's a simpler architecture because it doesn't have to support processes like data ingestion or model training.

In addition, ML architectures can be complex because ML workloads require special types of infrastructure and resources. For example, they often need access to bare-metal infrastructure to use GPUs. They also might require orchestrators that can manage multistep ML pipelines, such as Apache Airflow.
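To make the relationship between these components concrete, the following is a minimal Python sketch that chains a data source, a quality check, a training step and an application behind a toy orchestration function. Everything in it is an assumption made for illustration: the `Record` type, the sample values and the one-parameter least-squares "model" are hypothetical stand-ins, and a real architecture would replace each function with a dedicated tool or service.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Record:
    """Hypothetical record type; real schemas vary by workload."""
    feature: Optional[float]
    label: Optional[float]


def data_source() -> List[Record]:
    """Data source: stands in for existing or original data."""
    return [
        Record(1.0, 2.0),
        Record(2.0, 4.0),
        Record(None, 6.0),  # incomplete record, to exercise the quality check
        Record(4.0, 8.0),
    ]


def quality_check(records: List[Record]) -> List[Record]:
    """Data quality management: keep only complete records."""
    return [r for r in records if r.feature is not None and r.label is not None]


def train(records: List[Record]) -> Callable[[float], float]:
    """Training process: fit y = slope * x through the origin by least squares."""
    numerator = sum(r.feature * r.label for r in records)
    denominator = sum(r.feature ** 2 for r in records)
    slope = numerator / denominator
    return lambda x: slope * x


def run_pipeline():
    """Orchestration: run the stages in order and pass results along."""
    clean = quality_check(data_source())  # the data pipeline hands data to training
    model = train(clean)
    return clean, model


clean_records, model = run_pipeline()
prediction = model(3.0)  # ML application: use the trained model to answer a query
print(len(clean_records), prediction)
```

In this sketch the compute and storage infrastructure is simply the machine running the script. The point is only that each listed component maps to a distinct stage that the orchestration layer has to sequence.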

5 considerations when planning an ML architecture

In addition to identifying which components are necessary for a particular workload, ML architects must also consider goals related to ML workload outcomes and business priorities. The following are some of the top considerations.

1. Performance

Some ML workloads require higher levels of performance than others. If a team is under pressure to deliver models on a tight timeline, training may need to happen fast. Generally, this means that the ML architecture will require more compute resources to speed up training.

2. Scalability

Some ML workloads grow over time due to factors such as an increase in the volume of training data or the need to deploy multiple variations of the same model. If the ability to handle increased ML workload capacity is a priority, the ML architecture should be capable of scaling up.

Likewise, some ML workloads might need to scale down. For example, a team may abandon some models, requiring less infrastructure to support them. In this case, the ability to scale the environment down is important to avoid wasting money on infrastructure that's no longer needed.

3. ML lifecycle duration

ML architecture design should reflect how long an ML workload needs to be operational. In some cases, ML models and apps might be deployed for a specific, one-time purpose. Others might need to operate indefinitely.

A related factor to consider is how often models require retraining. Will the ML team train the model once and then run it for years, or will it be updated multiple times a year? The latter case will require an ML architecture that supports recurring model training.

4. Cost

Cost is another major consideration for ML architecture design. Although organizations don't want to overpay for ML infrastructure or services, it's equally important not to underinvest in requirements. Doing so could result in development delays or poor performance.

5. Security and compliance

Depending on the sensitivity of the data used for ML training as well as any compliance requirements that govern data or models, specialized infrastructure might be necessary to minimize security and data privacy risks.
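The tension between the performance and cost considerations above can be illustrated with some rough arithmetic. The Python sketch below estimates training time and cost for different GPU counts and picks the cheapest option that still meets a deadline. All of the inputs are hypothetical: the GPU-hour total, the hourly price, the deadline and the simple scaling model (each additional GPU contributes a fixed fraction of a full GPU's throughput) are assumptions for illustration, not measurements or vendor figures.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TrainingPlan:
    gpu_count: int
    hours: float  # estimated wall-clock training time
    cost: float   # estimated total spend


def estimate_plan(gpu_hours_needed: float, gpu_count: int,
                  price_per_gpu_hour: float,
                  scaling_efficiency: float = 0.9) -> TrainingPlan:
    """Estimate time and cost under an assumed, simplified scaling model."""
    # The first GPU counts fully; each extra GPU adds only a fraction of one.
    speedup = 1 + (gpu_count - 1) * scaling_efficiency
    hours = gpu_hours_needed / speedup
    cost = hours * gpu_count * price_per_gpu_hour
    return TrainingPlan(gpu_count, hours, cost)


def cheapest_plan_within_deadline(gpu_hours_needed: float,
                                  deadline_hours: float,
                                  price_per_gpu_hour: float,
                                  gpu_options: Iterable[int]) -> Optional[TrainingPlan]:
    """Return the lowest-cost plan that finishes in time, or None if none does."""
    plans = [estimate_plan(gpu_hours_needed, n, price_per_gpu_hour)
             for n in gpu_options]
    feasible = [p for p in plans if p.hours <= deadline_hours]
    return min(feasible, key=lambda p: p.cost) if feasible else None


# Hypothetical workload: 100 single-GPU hours, 30-hour deadline, 2.0 per GPU-hour.
plan = cheapest_plan_within_deadline(100, 30, 2.0, [1, 2, 4, 8])
print(plan)
```

Under these assumptions, adding GPUs shortens training but raises total cost because scaling is imperfect, so the cheapest feasible plan is the smallest cluster that still meets the deadline. Real workloads would need measured scaling behavior and actual pricing in place of these placeholder values.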