Data preparation
Top Stories
-
Feature
28 Dec 2022
10 top data architect and data engineer certifications in 2023
Learn what it takes to achieve and accelerate a rewarding career in data architecture and choose from among some of the best data architect and data engineer certifications. Continue Reading
-
Feature
02 Feb 2022
Self-service data preparation: What it is and how it helps users
Using self-service tools to properly prepare data simplifies analytics and visualization tasks for business users and speeds complex modeling processes for data scientists. Continue Reading
-
Feature
31 Jan 2022
6 data preparation best practices for analytics applications
Amid issues of data sources, data silos and data quality, the process of collecting and prepping data for analytics applications requires a practical and effective approach. Continue Reading
-
Feature
27 Jan 2022
Data preparation in machine learning: 6 key steps
Trustworthy analytics outcomes depend on the right data, requiring data scientists to focus on these steps when they prepare data for use in machine learning applications. Continue Reading
-
Feature
24 Jan 2022
Top data preparation challenges and how to overcome them
Data preparation is a crucial part of analytics applications, but it's complicated. Here are seven common challenges that can send the data prep process off track. Continue Reading
-
News
21 Jan 2022
Apache Hop data orchestration hits open source milestone
The open source technology moves beyond its roots to enable a full data platform as data moves from one source to another for operations, business intelligence and analytics. Continue Reading
-
Feature
13 Dec 2021
8 analytics startups to watch over the next year
Differentiation is key to the success of any new enterprise, and a crop of analytics vendors are delivering new capabilities to try to distinguish themselves. Continue Reading
-
News
08 Dec 2021
Calyptia spurs observability data with Fluent Bit Enterprise
Collecting first-mile log data that is as close to the source as possible is the open source startup's goal as it builds out new commercial services. Continue Reading
-
News
06 Dec 2021
LogDNA raises $50M for observability data
The LogDNA CEO discusses challenges and the need for organizations to collect and analyze log data to enable operations, DevOps and security applications. Continue Reading
-
Tip
11 Nov 2021
Business shift to a data monetization strategy elevates CDOs
As their focus dramatically swings from compliance issues to data monetization, chief data officers are on track to take their rightful place among C-level executives, but slowly. Continue Reading
-
News
09 Nov 2021
Datafold raises $20M for data reliability engineering
Datafold's founder and CEO details the data observability challenges the startup is looking to address with its suite of data tools that provide visibility into data pipelines. Continue Reading
-
News
27 Oct 2021
Informatica goes public again as data management grows
Informatica has transformed in recent years from an on-premises software vendor to a SaaS-based subscription model in the cloud as new services for data have emerged. Continue Reading
-
News
21 Oct 2021
SolarWinds expands DataOps portfolio with Database Mapper
A year after acquiring SentryOne, the IT management software vendor is out with updated tools to help organizations better understand where data came from and where it is going. Continue Reading
-
News
14 Oct 2021
Alation looks to boost data intelligence with Lyngo Analytics
Fresh off a big funding round, Alation seeks to enable users to benefit from data catalogs with technology that will translate natural language queries into SQL. Continue Reading
-
Feature
11 Oct 2021
Data architect skills required, responsibilities and salaries
When considering a career as a data architect, learn about the required skills and education, plus job responsibilities, salary ranges and typical interview questions and answers. Continue Reading
-
News
14 Sep 2021
Alation boosts platform with new data governance application
The new Alation service provides a series of capabilities to improve governance, including a centralized dashboard and data steward workbench for managing workflow. Continue Reading
-
Feature
02 Sep 2021
How to choose exactly the right data story for your audience
A data practitioner has two jobs: tell the right data story and tell it in the right way to win over project stakeholders, data expert Larry Burns says in his latest book. Continue Reading
-
News
25 Aug 2021
Cribl brings in $200M to advance data lake observability
Cribl looks to grow its LogStream platform capabilities with new funding that will help the vendor grow operations and advance its data lake visibility technology. Continue Reading
-
News
19 Aug 2021
Hitachi Vantara expands Lumada DataOps suite
Hitachi Vantara integrated its data management technologies into a new product suite intended to help organizations with data operations. Data governance is coming next. Continue Reading
-
News
13 Aug 2021
Augmented analytics capabilities mark the new era of BI
Augmented intelligence capabilities like automated data prep and natural language processing are now common, showing that BI has advanced to a new era of technological innovation. Continue Reading
-
Feature
02 Aug 2021
The pros and cons of big data outsourcing
More companies are seeking outside help to capitalize on data's value. Examine the benefits and drawbacks that come with outsourcing big data processing projects. Continue Reading
-
News
28 Jul 2021
Trifacta moves beyond data wrangling to DataOps
'Data wrangling' is a term that Trifacta assigns to data preparation, but there is more than that to the concept as part of the vendor's Data Engineering Cloud platform. Continue Reading
-
News
27 Jul 2021
DataRobot acquires Algorithmia to further MLOps goal
DataRobot continues to pursue its growth-by-acquisition strategy by buying MLOps vendor Algorithmia in a bid to become a full-service machine learning platform provider. Continue Reading
-
Feature
16 Jul 2021
The value of PDF data extraction: Sifting for hidden data
During the process of data cleaning, there's a way to extract valuable hidden data. Learn how in this excerpt from 'Cleaning Data for Effective Data Science.' Continue Reading
-
News
14 Jul 2021
Informatica brings data governance and data catalog to cloud
Informatica moves more of its on-premises capabilities into the cloud as it aims to provide an integrated data platform for multiple use cases, including analytics and AI. Continue Reading
-
Feature
01 Jul 2021
EY CTO outlines data governance challenges
Multinational professional services firm EY has taken a strategic view of how to manage and use data in a federated approach powered by a trusted data fabric. Continue Reading
-
Feature
18 Jun 2021
How data governance and data quality work together
High-quality, reliable data is essential to the data governance process. Here are strategies to ensure data quality standards are ingrained in governance processes. Continue Reading
-
News
03 Jun 2021
Cribl aims to ease data observability with LogStream update
LogStream 3.0 brings new configuration capabilities to Cribl's pipeline technology that can help organizations optimize log and metrics data. Continue Reading
-
News
21 May 2021
Superconductive raises $21M for open source data quality
The open source Great Expectations project is becoming increasingly popular, as the commercial vendor seeks to build out a cloud service to expand the project's reach. Continue Reading
-
Feature
11 May 2021
How to build an all-purpose big data pipeline architecture
Like a superhighway system and its many on- and off-ramps, an enterprise's big data pipeline transports infinite amounts of collected data from its sources to its destinations. Continue Reading
-
News
04 May 2021
Syniti boosts DataOps chops with DMR merger
The CEO of an enterprise data management vendor explains why his firm is merging with Data Migration Resources to expand their reach and data quality offerings. Continue Reading
-
Feature
27 Apr 2021
Data quality for big data: Why it's a must and how to improve it
As data volumes increase exponentially, methods to improve and ensure big data quality are critical in making accurate, effective and trusted business decisions. Continue Reading
-
Feature
22 Apr 2021
Enterprise augmented data management benefits and growth
Gartner predicts plenty of growth in the booming augmented data management market, which helps data professionals focus on insights over administrative tasks. Continue Reading
-
News
15 Apr 2021
Soda launches cloud service to improve data observability
Data quality vendor Soda has had a busy 2021, building out new services and raising funding to help organizations identify and remediate data quality problems. Continue Reading
-
News
15 Apr 2021
Bigeye raises $17M Series A funding to boost data quality
The former Uber product manager and current CEO and co-founder of a startup outlines the challenges and opportunities of enabling a new data trust platform with data quality at its foundation. Continue Reading
-
News
07 Apr 2021
Alation brings data catalog technology to the public cloud
Alation's data intelligence technology is getting easier for organizations to use, with a new managed service that can help enterprises better use the data they have. Continue Reading
-
Feature
11 Mar 2021
Bias in big data: How to find it and mitigate influence
It's no secret that bias exists in large data sets; the key is addressing it. With transparency, diversity and accountability, limiting that bias can be possible. Continue Reading
-
News
10 Mar 2021
Precisely looks to boost data integrity software platform
Precisely is getting an infusion of capital as it moves forward with its decades-long business mission to help organizations use data effectively to make key decisions. Continue Reading
-
News
09 Mar 2021
Apache Daffodil advancing Data Format Description Language
Data integration and data loading efforts could get easier to execute, as open source DFDL project becomes a Top-Level Project at the Apache Software Foundation. Continue Reading
-
News
19 Feb 2021
Alation 2021.1 data catalog improves data intelligence
Alation updated its platform with new features for data discovery and search, as well as more data governance capabilities to enhance data trust. Continue Reading
-
News
09 Feb 2021
Monte Carlo gets new funding to expand data observability
The CEO of data management startup Monte Carlo, which raised $25 million in Series B funding Tuesday, details her views on the key pillars of data reliability. Continue Reading
-
Feature
21 Jan 2021
Augmented data preparation the next step for self-service BI
Augmented data tools play a key role in expanding data use across organizations. Read on to find out how augmented data preparation tools democratize data in self-service BI. Continue Reading
-
Feature
30 Dec 2020
What FAIR data management means for your enterprise
The FAIR principles were made to promote the sharing of data in the research field, but their guidance can help organizations in other industries improve their own data practices. Continue Reading
-
Tip
29 Dec 2020
Data lineage documentation imperative to data quality
Understanding the detailed journey of data elements throughout the data pipeline can help an enterprise maintain data quality and improve trustworthiness. Continue Reading
-
Feature
04 Dec 2020
Collibra grows enterprise data governance for the cloud
Collibra CEO discusses the importance of data governance for enterprises and how to tie data governance to business terminology to go beyond simply controlling data. Continue Reading
-
News
24 Nov 2020
IBM to deliver refurbished Db2 for the AI and cloud era
IBM has a tuned-up version of Db2 planned, featuring a handful of AI and machine learning capabilities to make it easier for users to send and manage Db2 data across clouds. Continue Reading
-
Feature
20 Nov 2020
Maintaining data integrity key for data quality
Maintaining data integrity through improved communication and data literacy is paramount for organizations in the enterprise seeking to ensure data quality and trust. Continue Reading
-
News
19 Nov 2020
Enhanced admin controls highlight Alteryx platform update
New features to augment administrative control over the data management process highlight Alteryx 2020.4, the latest update from the data management vendor. Continue Reading
-
News
12 Nov 2020
AWS Glue DataBrew a new no-code data preparation tool
AWS Glue DataBrew is a new feature that will enable users to extract, transform and load data to get it ready for analysis without having to write code. Continue Reading
-
Feature
02 Nov 2020
Healthcare data management challenges hold back adoption
The healthcare industry has had difficulty adopting data management best practices, but a few organizations have tackled the challenges successfully. Continue Reading
-
News
16 Oct 2020
The new normal for enterprise data governance
Mastercard data exec highlights the foundational role of enterprise data governance during the pandemic era with more people working from home and new demands on businesses. Continue Reading
-
News
13 Oct 2020
Upsolver advances open cloud data lake, data pipeline efforts
Upsolver enhanced its data preparation platform to transform data lake content into a data lakehouse structure that enables data queries and analysis. Continue Reading
-
News
29 Sep 2020
Cloudera adds data engineering capability to enable DataOps
Big data vendor Cloudera is looking to help data engineers use its platform with a new service that brings more power and management to running Spark for building data pipelines. Continue Reading
-
Feature
28 Sep 2020
3 growing applications of AI in data management
There are plenty of ways AI can augment data professionals throughout the data pipeline, from sifting through large data sets for duplicates to easing the preparation process. Continue Reading
-
Feature
25 Sep 2020
Key steps in the feature engineering process
Feature engineering is key to machine learning algorithms. Read on to learn how those features are created and chosen to increase the accuracy of those models. Continue Reading
-
Feature
28 Aug 2020
Investment in talent key for data quality in healthcare
Investing in training employees on proper data gathering and management practices is crucial for healthcare organizations seeking to ensure data quality and patient care. Continue Reading
-
Feature
14 Aug 2020
How to streamline your data cleansing process
Data cleansing is an important part of maintaining data quality, and the process is easier if you keep ahead of it by upholding governance and quality standards. Continue Reading
-
Feature
17 Jul 2020
Top 5 feature engineering tips for better models
From understanding a model's expected goal to factoring in subject matter expertise, experts talk about the best ways to improve your feature engineering. Continue Reading
-
Feature
04 Jun 2020
Organization and automation ease data preparation process
By laying down proper groundwork and investing in automated checks, companies can ease the data preparation process and ensure they are getting the most out of their data. Continue Reading
-
Tip
28 May 2019
The evolution of the data preparation process and market
Organizations have long struggled with inconsistent data and other issues. Expert Andy Hayler explores how that has led to the rise of the data preparation tools market. Continue Reading
-
Feature
27 Feb 2019
8 tips to improve the data curation process
A data curation and modeling strategy can ensure accuracy and enhance governance. Experts offer eight best practices for curating data. First, start at the source. Continue Reading