In the past, businesses collected and integrated data using tools that they set up and managed themselves. Today, organizations can build data integration pipelines with hosted, fully managed data integration services, including Azure Data Factory.
Azure Data Factory collects and integrates data from nearly 100 discrete sources without users having to install and manage data integration software.
For perspective on whether Azure Data Factory is right for your workloads, review its main benefits compared to more traditional tools and other hosted data integration services.
Azure Data Factory benefits
Azure Data Factory is a good option if you have a multi-cloud architecture. With it, you can integrate and centralize data stored across various clouds. It's also a good choice if your applications write user data to different locations. For example, some data may go to relational databases and some to object storage in the cloud. To extract user information, you need to integrate those sources.
While it isn't the only offering that can handle use cases like these, Azure Data Factory does have some standout benefits compared to other available data integration tools.
No-code data workflows
You can configure Azure Data Factory to collect and integrate data from most mainstream data sources -- such as databases, proprietary cloud storage services and file systems -- without having to write a single line of code. This makes Azure Data Factory a good fit for businesses that want to use citizen-integrators who aren't coders. These users can build data integration workflows without having to learn special skills.
Some other data integration services offer similar code-free abilities. Users can configure Microsoft SQL Server Integration Services (SSIS), a more traditional tool, to do basic integration without needing to do much coding. However, Azure Data Factory places a key focus on being accessible to non-technical users, which tools like SSIS do not.
Easy SSIS migration
For companies accustomed to using SSIS, one of the strongest selling points of Azure Data Factory is that they can lift and shift SSIS data pipelines to Azure Data Factory with minimal effort. Azure Data Factory can natively execute SSIS packages within its Integration Runtime capabilities. It also provides a migration wizard to move SSIS workloads fully into Azure.
Other data integration services also offer some SSIS compatibility. AWS Glue, the Amazon equivalent to Azure Data Factory, lets you convert SSIS packages, for example. But because Microsoft develops both Azure Data Factory and SSIS, Azure Data Factory provides the smoothest transition from on-premises data integration to cloud data integration. On platforms without similar native compatibility, the migration process is generally more complicated.
Large collection of data connectors
Azure Data Factory currently offers nearly 100 prebuilt data connectors to import data from external sources. Many of Azure Data Factory's online data connectors can be set up instantly.
This compares favorably to most other cloud-based data integration services. For example, as of May 2022, the AWS Marketplace listed 73 connectors for Glue. On-premises offerings typically don't make prebuilt connectors as easy to install.
Built-in monitoring and alerting
Azure Data Factory offers built-in monitoring visualizations. These native visibility features let you easily track the status of data integration operations. They also help you proactively identify and react to problems, such as a failed data transformation, that could disrupt workflows. You can also set up alerts that warn you about such failed operations.
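To make the idea of alerting on failed operations concrete, here is a minimal sketch of the underlying logic: scan a list of pipeline runs and produce one alert per failure. It does not call Azure. The `PipelineRun` class, its fields, the `failed_run_alerts` function and the sample run data are all hypothetical, and the status strings are assumed values for illustration. In Azure Data Factory itself, you would configure alerts in the service rather than write this code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineRun:
    """Hypothetical record of one pipeline run (not an Azure SDK type)."""
    pipeline_name: str
    run_id: str
    status: str          # assumed values: "Succeeded", "Failed", "InProgress"
    message: str = ""    # error detail, if any


def failed_run_alerts(runs: List[PipelineRun]) -> List[str]:
    """Return one alert string for every run whose status is "Failed"."""
    alerts = []
    for run in runs:
        if run.status == "Failed":
            detail = run.message or "no error message"
            alerts.append(
                f"ALERT: pipeline '{run.pipeline_name}' "
                f"run {run.run_id} failed: {detail}"
            )
    return alerts


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample_runs = [
        PipelineRun("copy-orders", "run-001", "Succeeded"),
        PipelineRun("transform-users", "run-002", "Failed", "schema mismatch"),
        PipelineRun("copy-logs", "run-003", "InProgress"),
    ]
    for alert in failed_run_alerts(sample_runs):
        print(alert)
```

The point of the sketch is that an alert rule is just a condition evaluated against run status. A managed service evaluates that condition for you and handles delivery of the notification.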
Pay-as-you-go pricing
Unlike on-premises data integration tools, which typically require a large upfront investment, Azure Data Factory offers pay-as-you-go pricing.
Azure Data Factory limitations
Alongside Azure Data Factory's benefits, it's important to consider its limitations.
Custom data connectors
While you can create data pipelines in Azure Data Factory from a variety of common sources -- including mainstream databases and cloud storage services -- without writing code, you'll need to write custom code to connect nonstandard data sources. This may prove challenging if, for example, you rely on proprietary databases that can't integrate with Azure Data Factory through the prebuilt connectors.
Focus on Azure
Azure Data Factory supports some data sources hosted outside of Azure, but it's designed first and foremost for building integration pipelines that connect to Azure or other Microsoft resource types. This is a downside if you have a multi-cloud strategy that runs most of your workloads outside Azure. In that case, you may be better served by a third-party offering such as Apache Airflow that's not tied to specific vendors or platforms.
Long-term cost
While consumption-based pricing is attractive in some ways, the long-term total cost of ownership of a pay-as-you-go service may be higher than that of on-premises options. If you plan to run data integration workloads for years, you may save money by hosting them on your own infrastructure.
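The tradeoff between pay-as-you-go and upfront investment comes down to simple arithmetic, sketched below. The function name, parameters and every dollar figure are hypothetical and chosen only for illustration. They are not Azure Data Factory prices, and a real comparison would also need to account for staffing, hardware refreshes and changes in usage over time.

```python
from typing import Optional


def break_even_month(
    monthly_usage_cost: float,
    upfront_cost: float,
    monthly_upkeep_cost: float,
    horizon_months: int = 120,
) -> Optional[int]:
    """Return the first month in which cumulative pay-as-you-go spending
    exceeds cumulative self-hosted spending, or None if that never
    happens within the horizon."""
    for month in range(1, horizon_months + 1):
        hosted_total = monthly_usage_cost * month
        self_hosted_total = upfront_cost + monthly_upkeep_cost * month
        if hosted_total > self_hosted_total:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical figures: $1,500/month consumption charges versus a
    # $30,000 upfront investment plus $500/month in upkeep.
    month = break_even_month(1500.0, 30000.0, 500.0)
    if month is None:
        print("Pay-as-you-go stays cheaper over the whole horizon.")
    else:
        print(f"Self-hosting becomes cheaper starting in month {month}.")
```

With these made-up numbers, the hosted service is cheaper for the first two and a half years, and self-hosting only pulls ahead after that. If monthly consumption charges stay below the self-hosted upkeep cost, the function returns `None` and pay-as-you-go remains the cheaper option.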