Surviving and Thriving in a Hybrid Data Management World

The vast majority of companies that are moving to cloud applications also have a significant current investment in on-premise operational applications and on-premise capabilities around data warehousing, business intelligence and analytics. That means that most of them will be working with a hybrid cloud/on-premise data management environment for the foreseeable future.

Being able to move at “cloud speed,” standing up a new application in a matter of hours, is a big advantage in terms of business agility, but one cost is that managing data becomes more complicated.

Of course, often a hybrid architecture isn’t really planned—it just happens. The “typical” use-case pattern I see looks something like this:
  • Organizations start with a cloud application, perhaps for CRM or HR.
  • Next, they add a cloud database.
  • Then they add a cloud data warehouse and/or analytics capability.

A lot of times it’s the business that drives this pattern in an effort to solve particular problems. Most often IT is very involved in the effort, but the cloud analytics decision is made by the business side. As a result, IT inherits significant new data management complexity.

Data management approaches

A question I often hear lately is, “What’s the best way to implement data governance (now that we’re well under way with our analytics program and systems)?” It can be difficult to retrofit governance to existing systems, but don’t feel bad if you’re in that position; most people are in the same boat.

Often, the focus is on the initial data migration to the new operational application or analytics, where a simple bulk data loader is employed in the interest of speed and agility. That has several downsides:

  • No metadata. (Using a data integration tool with metadata support, instead of a bulk loader, will give you an end-to-end view of your data lineage throughout your environment. This will be critical as the complexity grows and you need to make changes, quickly and without errors.)
  • You lose the opportunity to do a data clean-up while moving the data, and risk populating a new application with bad data.
  • You miss the chance to think about ongoing data security and data governance on a broader scale.
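To make the contrast with a plain bulk loader concrete, here is a minimal sketch of a migration step that cleans records as it moves them and records basic lineage metadata for the run. Everything in it is an assumption made for illustration: the field names (name, email), the validation rules, and the system names legacy_crm and cloud_crm are hypothetical, and a real data integration tool would capture far richer metadata.

```python
from datetime import datetime, timezone


def clean_record(raw):
    """Return a cleaned copy of raw, or None if it fails validation.

    The rules here (non-empty name, '@' in email) are illustrative assumptions.
    """
    name = str(raw.get("name", "")).strip()
    email = str(raw.get("email", "")).strip().lower()
    if not name or "@" not in email:
        return None
    return {"name": name, "email": email}


def migrate(records, source, target):
    """Clean each record and return (loaded, rejected, lineage)."""
    loaded, rejected = [], []
    for raw in records:
        cleaned = clean_record(raw)
        if cleaned is None:
            rejected.append(raw)   # keep bad rows for review instead of loading them
        else:
            loaded.append(cleaned)
    # Minimal lineage record: where the data came from, where it went, and when.
    lineage = {
        "source": source,
        "target": target,
        "loaded": len(loaded),
        "rejected": len(rejected),
        "run_at": datetime.now(timezone.utc).isoformat(),
    }
    return loaded, rejected, lineage


# Hypothetical sample data and system names.
legacy_rows = [
    {"name": " Ada Lovelace ", "email": "ADA@example.com"},
    {"name": "", "email": "missing-name@example.com"},
    {"name": "Bob", "email": "not-an-email"},
]
loaded, rejected, lineage = migrate(legacy_rows, "legacy_crm", "cloud_crm")
```

Rejected rows are returned rather than silently dropped, so someone can fix them at the source instead of populating the new application with bad data.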

Once the new applications have gone live, the focus will shift to ensuring data consistency and freshness across the data management environment. Moving data between cloud and on-premise systems, and from cloud to cloud, brings its own challenges and leaves fewer resources dedicated to overall data management.

If you don’t want to slow down the business initiatives that are driving the new applications, but still want to prevent that data complexity or chaos, it will pay to have a data management architecture and best practices in place beforehand.

What to think about when planning for hybrid data management

Data management in a large organization is challenging. It gets even more complex when the environment is hybrid. The key is to plan ahead so that you’re not a roadblock to the business. Key considerations for a successful program include:

  • Does your vendor provide out-of-the-box, high-performance connectivity to all of the on-premise and cloud sources and targets you’ll need to integrate?
  • Does your vendor have compatibility across their on-premise and cloud data integration capabilities: shared skills, shared code (mappings), shared management tools?
  • Do they support multiple integration patterns: batch, real-time, API integration, etc.?
  • Can they enable you to smoothly grow your data management capabilities as you need them: data quality, data governance, master data management, metadata management, security, B2B, etc.?
  • Do they support wizards and template-driven development for “citizen integrators”?
  • Do they have metadata management tools for data lineage and business meaning and context? This is critical for reducing errors, enabling self-service, and speeding changes to the environment.

One additional thing to think about is a data integration hub. This replaces dozens or hundreds of point-to-point data integrations with a simple publish-and-subscribe model. Data publishers publish data updates to the hub, once. Data subscribers receive the data they’re interested in from the hub in the format, timeframe, and quality level they require. Modern data hubs can now support both publishers and subscribers across on-premise, cloud, and big data systems, which radically simplifies the job of data management.
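The publish-and-subscribe idea can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions, not how any particular vendor's hub works: the DataHub class, the topic name "customer", and the per-subscriber transform function are all hypothetical names introduced here. A real hub would also handle persistence, scheduling, and data quality rules.

```python
from collections import defaultdict


class DataHub:
    """Toy in-memory publish-and-subscribe hub (illustrative only)."""

    def __init__(self):
        # topic name -> list of (transform, deliver) pairs, one per subscriber
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, deliver, transform=lambda record: record):
        """Register a subscriber; transform reshapes data into its preferred format."""
        self._subscribers[topic].append((transform, deliver))

    def publish(self, topic, record):
        """Publish once; the hub fans the record out to every subscriber."""
        for transform, deliver in self._subscribers[topic]:
            # Each subscriber gets its own copy, shaped by its own transform.
            deliver(transform(dict(record)))
        return len(self._subscribers[topic])


hub = DataHub()
warehouse_rows = []   # stands in for an on-premise data warehouse
crm_rows = []         # stands in for a cloud CRM application

# The warehouse takes records as-is; the CRM only wants an upper-cased name.
hub.subscribe("customer", warehouse_rows.append)
hub.subscribe(
    "customer",
    crm_rows.append,
    transform=lambda record: {"name": str(record["name"]).upper()},
)

# The publisher sends the update once, without knowing who consumes it.
delivered = hub.publish("customer", {"id": 1, "name": "Acme"})
```

The point of the design is that adding a new subscriber means one more subscribe call against the hub, rather than a new point-to-point integration with every publisher.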

At the end of the day, the business challenge is to deliver value faster than the competition. The IT challenge is meeting the speed and quality requirements of the business while enabling them to accelerate their business agility. It can be done, but it takes careful planning and a good, forward-looking data management architecture.

