
Govern citizen development to avoid data pipeline downtime

Low-code/no-code and vibe coding require data leaders to shift from gatekeepers to architects of trust to maintain data integrity and keep pipelines running reliably.

Low-code and no-code tools and AI-powered vibe coding accelerate data-related development work, but unexpected fallout can put enterprise analytics and AI applications at risk.

These development approaches help shorten the distance from idea to impact by expanding who can innovate with data. In addition to IT teams, application owners and other citizen developers can take the initiative and address unmet data needs themselves. Wishful thinking by business users becomes creative problem-solving. Work gets done faster. But the speed of technology-enabled citizen development is colliding with data pipeline reliability in the enterprise. The downside of unfettered development is that hidden integration layers and ad hoc data pipelines create visibility gaps for data management teams. Despite the best intentions, the results can prove costly.

A survey of 500 senior data and technology leaders conducted in late 2025 by data management vendor Fivetran found that legacy and DIY pipelines using traditional ETL processes break 30% to 47% more often than fully managed ELT systems, resulting in an average monthly downtime of 60 hours. It also found that data engineers spend 53% of their time on pipeline maintenance, taking time away from supporting new analytics and AI uses.

Anjan Kundavaram, chief product officer at Fivetran, said low-code, no-code and vibe coding all contribute to the DIY nature of many data conduits that escape oversight.

"There's a hidden data integration layer, and there are hidden data pipelines," he said.

Lina Vaskelė, chief risk and security officer at aircraft maintenance and repair services provider FL Technics, underscored this risk as businesses increasingly adopt low-code, no-code and AI-powered development tools.

"I think the main issue is losing some visibility about the data, and how the outputs are used later," she said.

Pain points in the pipeline

Attention to edge cases ranks among the key considerations in making data pipelines reliable, Kundavaram said. Potential wrinkles include changes to data sources.

"There's a lot of incidental complexity in the actual data source, like Salesforce, Workday or any [application]," he said. "Those applications are changing over time. They are not static."

An unmanaged, DIY pipeline might not be prepared to handle such changes. For instance, if a field in a data source's API is renamed, a pipeline built to look for that field will stumble when it encounters a different name. As a result, the pipeline delivers incomplete data to the target system, which could be a dashboard, data lake or data warehouse -- or, increasingly, a data lakehouse that combines elements of the latter two platforms.
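A minimal defensive check illustrates the point. The sketch below is a hypothetical Python example -- the field names and expected schema are assumptions, not drawn from any particular source system -- that validates each record against the fields a pipeline was built to expect and quarantines rows with renamed or missing fields instead of loading incomplete data into the target.

```python
# Hypothetical sketch: guard a DIY pipeline against renamed or missing source fields.
# The expected schema and sample record are illustrative assumptions.

EXPECTED_FIELDS = {"account_id", "account_name", "annual_revenue"}


def validate_record(record: dict) -> list[str]:
    """Return a list of problems instead of silently loading partial rows."""
    problems = []
    missing = EXPECTED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields (possibly renamed upstream): {sorted(missing)}")
    unexpected = record.keys() - EXPECTED_FIELDS
    if unexpected:
        problems.append(f"unexpected fields (schema may have changed): {sorted(unexpected)}")
    return problems


def load(records: list[dict]) -> None:
    for record in records:
        problems = validate_record(record)
        if problems:
            # Quarantine and alert rather than writing incomplete data to the target.
            print(f"Skipping record {record.get('account_id', '<unknown>')}: {problems}")
            continue
        # write_to_warehouse(record)  # placeholder for the actual load step


if __name__ == "__main__":
    # Upstream renamed 'annual_revenue' to 'yearly_revenue'; the check catches it.
    load([{"account_id": "001", "account_name": "Acme", "yearly_revenue": 5_000_000}])
```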

Changes on the target side or in data throughput pose similar risks. A pipeline designed to handle a predictable capacity will break when there's a sudden spike. A source or target change combined with a spike in data volume could introduce cascading failures.
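Throughput changes can be handled in a similarly defensive way. The following sketch is a generic illustration rather than any vendor's implementation: it loads data in bounded batches and retries transient target failures with exponential backoff, so a volume spike degrades performance gracefully instead of cascading into an outage.

```python
# Illustrative sketch: bounded batches plus retry with exponential backoff,
# so a sudden spike in volume slows the pipeline down instead of breaking it.
import time


def load_batch(batch):
    """Placeholder for the real write to a warehouse, lake or lakehouse."""
    ...


def load_with_backpressure(records, batch_size=500, max_retries=5):
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        for attempt in range(max_retries):
            try:
                load_batch(batch)
                break
            except Exception as exc:  # e.g., the target throttling under load
                wait = 2 ** attempt
                print(f"Batch failed ({exc}); retrying in {wait}s")
                time.sleep(wait)
        else:
            raise RuntimeError("Batch failed after retries; alert the pipeline owner")
```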

No-code and vibe-coded pipelines often fail to anticipate those variables, Kundavaram said.

"Some core parts of the data infrastructure are going to be very brittle. It's going to create a lot more outages," he said.

The rise of AI systems only increases the risk of ungoverned data workflows. Hastily built and potentially hidden pipelines can feed flawed data into AI models, reducing their accuracy and eroding decision quality and trust.

"The focus is shifting from just model performance to the reliability and transparency of the entire data ecosystem," said Amir Kazmi, chief technology and growth officer at Ralliant, which makes test and measurement tools, industrial sensors and other precision instruments. "The issue isn't dramatic model corruption as much as it is gradual decision degradation, where small inconsistencies compound over time."

Kazmi said pipelines that are fragmented or not fully observable make it harder to trace how outputs are generated, validate the inputs and detect model drift.
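Detecting that kind of gradual degradation does not require elaborate tooling. As a rough illustration -- the feature values and threshold below are assumptions, not a recommended standard -- a pipeline can compare the distribution of an incoming feature against a stored baseline and raise an alert when it shifts beyond a tolerance.

```python
# Rough sketch of a drift check: compare an incoming feature's distribution
# against a baseline captured when the model was trained. Thresholds are illustrative.
import statistics


def drifted(baseline: list[float], current: list[float], tolerance: float = 0.25) -> bool:
    """Flag drift when the mean shifts by more than `tolerance` baseline standard deviations."""
    base_mean = statistics.fmean(baseline)
    base_std = statistics.stdev(baseline) or 1.0
    shift = abs(statistics.fmean(current) - base_mean) / base_std
    return shift > tolerance


if __name__ == "__main__":
    baseline = [100.0, 102.0, 98.0, 101.0, 99.0]
    current = [130.0, 128.0, 133.0, 131.0, 129.0]  # small upstream change, big downstream effect
    if drifted(baseline, current):
        print("Input distribution has drifted; review the model before trusting its outputs")
```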

Governing low-code, no-code and vibe coding endeavors

Dealing with this situation is a matter of governance, said Thomas Squeo, CTO for the Americas at Thoughtworks, a technology consulting firm. But governance mechanisms should recognize the benefits of tools that better enable citizen development.

"I don't think the answer is to lock these tools down, because the value is outsized relative to the risk," Squeo said.

Vaskelė said many employees outside of IT have a passion for exploring tools and using them to save time on daily tasks. Advising citizen developers on best practices works better than trying to control tool use, she added.

"I think that shadow IT, or shadow systems, will exist because people will always build, will always experiment," she said. "The best approach is to guide them and to define the boundaries."

At Lithuania-based FL Technics, policies determine those boundaries, while employee training provides guidance on proper tool use, Vaskelė said. Employees are required to use tools within the corporate account. New tools need IT approval and a vendor cybersecurity assessment.

Jason Brucker, a managing director at consulting firm Protiviti, said low-code, no-code and AI platforms simplify data integration and support process modernization, but raise governance and maintainability concerns.   

As a result, some organizations might be tempted to prevent the use of low-code/no-code data integrations outside specific IT teams, Brucker noted. But doing so does not guarantee success and can lead users to circumvent internal policies and procedures, he said. Strong governance, paired with data awareness and training, is a better approach, Brucker said, adding that well-designed governance initiatives establish clear guidance and include validation expectations and monitoring.

Boosting visibility, sticking to the basics

The balancing act of managing these tools while preserving development speed calls for comprehensive monitoring, particularly of key data pipelines.

"Make the invisible, visible," Vaskelė said.

To that end, enterprises should map their processes and prioritize those most critical to the business, she said. For FL Technics, examples include workflows in aircraft maintenance hangars. Vaskelė said that, in such cases, data lineage is crucial for showing where data flows and identifying its original source.
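One lightweight way to make that lineage explicit, sketched below with hypothetical names, is to carry provenance metadata alongside each record as it moves through a pipeline, so the original source and the transformations applied are always recoverable at the target.

```python
# Hypothetical sketch: attach provenance metadata to each record so the target
# system can trace a value back to its original source and pipeline steps.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageTag:
    source_system: str          # e.g., a maintenance-hangar work order system
    pipeline: str               # which pipeline produced this record
    steps: list[str] = field(default_factory=list)

    def record_step(self, step_name: str) -> None:
        self.steps.append(f"{step_name}@{datetime.now(timezone.utc).isoformat()}")


def transform(record: dict, tag: LineageTag) -> dict:
    tag.record_step("normalize_units")
    # ... actual transformation logic would go here ...
    record["_lineage"] = {"source": tag.source_system, "pipeline": tag.pipeline, "steps": tag.steps}
    return record


if __name__ == "__main__":
    tag = LineageTag(source_system="hangar_work_orders", pipeline="maintenance_kpis_v2")
    print(transform({"work_order_id": "WO-123", "hours": 6.5}, tag))
```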

Visibility into new tools entering the market at astonishing speed is also important.

"It's almost like a biome where, all of a sudden, there's all this new life growing at every turn, and you don't know necessarily how you are going to control it," Squeo said.

He said the goal is to be aware of the tools and categorize them: "These are the ones that are dangerous, these are the ones that are acceptable, and these are the ones that are unknown and need to be determined."

That assessment is an ongoing activity for chief data officers and other leaders, Squeo added.

On controls for these citizen development tools, Squeo emphasized the fundamentals: zero-trust principles, data loss prevention (DLP), and managing north-south and east-west traffic.

"Those are just good practices to overlay, and that's a combination of network management and DLP with a good security posture," he said. "Verify that you are doing the basics."

In addition, Squeo said some organizations deploy network proxies that provide baseline visibility into AI tool access and govern outbound data flows, especially when combined with DLP and network security policies. Proxies also help identify usage patterns, enforce access restrictions and reduce obvious risk exposure. However, he noted that because proxy servers operate primarily at the network layer, they lack visibility into the context, sensitivity or intent of data -- especially with encrypted traffic.
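As a simplified illustration of that baseline control -- the domain lists and categories below are invented for the example, not taken from any product -- an egress policy can sort outbound AI-tool destinations into acceptable, dangerous and unknown buckets, echoing the categorization Squeo describes; what it cannot do is inspect the sensitivity of encrypted payloads.

```python
# Simplified illustration of proxy-style egress categorization. Domains are invented
# examples; a real deployment would pair this with DLP, since this check sees
# destinations, not the content or sensitivity of encrypted payloads.

ALLOWED = {"approved-ai-tool.example.com"}
BLOCKED = {"unvetted-ai-tool.example.net"}


def categorize(destination_host: str) -> str:
    if destination_host in ALLOWED:
        return "acceptable"
    if destination_host in BLOCKED:
        return "dangerous"
    return "unknown -- route to review"


if __name__ == "__main__":
    for host in ["approved-ai-tool.example.com", "new-ai-assistant.example.org"]:
        print(host, "->", categorize(host))
```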

Proxies improve detection but don't fully resolve ungoverned data flows or shadow AI architectures, Squeo said.

Kazmi said Ralliant's approach to preventing issues with data pipelines and integration layers consists of three pillars:

  • Ownership. Treat critical data flows as products with accountable owners for quality, availability and use.
  • Platform. Standardize on governed environments with embedded validation, access controls and policies.
  • Visibility. Continuously monitor pipelines, understand dependencies and validate performance in real time.

"But what organizations are learning, ourselves included, is that speed redistributes complexity," Kazmi said.

Moving beyond vibe coding

Yasmeen Ahmad, managing director of product management for Google Cloud's data and AI cloud platform, said low-code and no-code are firmly establishing themselves in the enterprise because AI and agent technologies remove some of the manual toil in data preparation and pipeline work.

"This is one of the first use cases where we have seen almost instant ROI," she said, noting the historically labor-intensive work of software engineers, data practitioners and developers.

"Think about data engineering pipelines: There are armies of humans across our largest customers building data pipelines," Ahmad said.

Skilled data engineers, rather than business users, tend to adopt agentic AI tools to quickly vibe-code a data pipeline, she said. At Google, that is evolving toward intent-driven engineering: A data engineer states the intent of a pipeline, and Google's Gemini chatbot provides a plan. The engineer iterates on the plan, and the AI tool writes the code against the updated version. 

John Moore is a freelance writer who has covered business and technology topics for 40 years. He focuses on enterprise IT strategy, AI adoption, data management and partner ecosystems.
