[2] Platform: Data Governance Vendors Coexistence

updated on 14 February 2024

IT stakeholders within large enterprises have long recognized the value of purpose-built data governance solutions like Collibra, Atlan, Microsoft Purview and Alation. However, these same organizations often struggle to fully realize the potential of these tools. The situation has been exacerbated by the uncertainty and complexity introduced by overlapping governance capabilities found within data & analytics platforms (sometimes referred to as "tactical catalogs").

Tactical vs. Enterprise Catalogs

Vendors like Collibra position their solutions as spanning the entire organization. These are unified governance solutions, which can manage assets across the enterprise, both on premises and in the cloud.

These enterprise vendors refer to solutions like Databricks' Unity Catalog as "tactical catalogs". Tactical catalogs are traditionally focused on operational and technical metadata, with a reduced emphasis on features built for business data owners and consumers. A tactical catalog like Unity Catalog is primarily concerned with governance for its own data, and similar types of data in alternative data platforms.

Unified data governance tools like Collibra position Databricks' Unity Catalog (left side of diagram) as a "tactical catalog".

Integrating Tactical & Enterprise Governance Tools

Data governance vendors have recognized the importance of tight integration between their solutions and data & analytics platforms. Improved integration between governance tools like Microsoft Purview and Databricks has been a promising trend in recent years.

Microsoft Purview and Collibra offer the ability to connect to and manage Databricks' Unity Catalog "from above". However, not all features are supported, and there are some limitations.

Examples of limitations to look out for include:

  1. Lineage
  2. Labeling
  3. Scope, Incremental and Full Scan
  4. Data Sharing
  5. Classification & Tagging
  6. Access Policies
  7. Behavior when an object is deleted or renamed in each tool
  8. Table vs. column detail for lineage
  9. Capture of temporary tables or special cases (e.g. use of APIs)
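
The areas above can be tracked during a proof of concept with a simple checklist. The following is a minimal sketch, not a vendor-provided tool: the class name, the status values, and the tool and platform names are hypothetical, and the statuses recorded in the usage lines are placeholders rather than findings about any product.

```python
from dataclasses import dataclass, field

# Areas taken from the list of limitations above (names shortened).
AREAS = [
    "Lineage",
    "Labeling",
    "Scope, Incremental and Full Scan",
    "Data Sharing",
    "Classification & Tagging",
    "Access Policies",
    "Delete/rename behavior",
    "Table vs. column lineage detail",
    "Temporary tables and special cases",
]

# Hypothetical status values for an evaluation; not vendor terminology.
STATUSES = {"untested", "supported", "partial", "unsupported"}


@dataclass
class IntegrationChecklist:
    """Tracks what was observed for one governance-tool/platform pairing."""

    governance_tool: str
    platform: str
    results: dict = field(
        default_factory=lambda: {area: "untested" for area in AREAS}
    )

    def record(self, area: str, status: str) -> None:
        """Store the observed status for one area, rejecting unknown values."""
        if area not in self.results:
            raise KeyError(f"unknown area: {area}")
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.results[area] = status

    def open_items(self) -> list:
        """Return areas not yet confirmed as fully supported."""
        return [a for a, s in self.results.items() if s != "supported"]


# Placeholder values for illustration only; they are not test findings.
checklist = IntegrationChecklist("ExampleGovernanceTool", "ExamplePlatform")
checklist.record("Lineage", "supported")
checklist.record("Access Policies", "partial")
print(checklist.open_items())
```

Keeping "partial" and "untested" items in the open list makes it easier to see which integration gaps still need a decision before committing to a custom integration.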

Exploring the integration of data governance tools and data platforms, including custom integration, is not a prerequisite for initial proofs of concept. One successful approach taken by some organizations is to implement the data governance and data & analytics platform tools with a limited initial scope and only loose integration. After the tools have been successfully adopted for a particular department, business function, or data area, teams are better positioned to choose the preferred combination of tools with confidence, and to decide which (if any) custom integrations are worth investing in.

Start Tactical?

Support and leadership for strong governance tools and processes can come from various parts of the organization. Depending on your organization's culture, size, and initial champions, one option worth considering is a "bottom-up" approach: implement a tactical data catalog and governance capabilities first, and evaluate a more unified solution with support for on-premises systems at a later date.

For organizations that have low levels of data & analytics platform maturity, there are often more pressing and immediate concerns to be addressed before the benefits of a unified enterprise governance tool can be realized. If data is fragmented and there are major data quality issues, it's best to address those first, and a tactical catalog like Databricks' Unity Catalog is a logical first step.

Consider adopting Databricks' Unity Catalog first if data teams are ready to be early adopters and demonstrate value to business and other stakeholders. Unity Catalog is also a logical starting point while addressing data fragmentation and quality issues before taking on a fully unified enterprise data governance solution.
