
Friday, April 30, 2010

Data Integration in a Nutshell: Four Essential Guidelines


Dashboard Insight published a post by Philip Russom, senior manager of TDWI Research, called Data Integration in a Nutshell: Four Essential Guidelines, where he compiled a list of four points that keep coming up in conversations, interviews, and consulting about data integration. He thinks of these points as guidelines in a nutshell that can shape how people fundamentally think of DI, as well as how people measure the quality, modernity, and maintainability of DI solutions. He hopes these nutshell guidelines can help DI specialists and the people who work with them to see a more future-facing vision of what DI can and should be.

Guideline #1: Data integration is a family of techniques and best practices

The unfortunate knee-jerk reaction of many data warehouse professionals is that the term data integration is synonymous with ETL (extract, transform, and load) simply because ETL is the most common form of data integration found in data warehousing. However, there are other techniques (and best practices to go with them), including data federation, database replication, and data synchronization. Different techniques have different capabilities and prominent use cases, so it behooves a data integration specialist to know and apply them all.
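To make the most familiar of these techniques concrete, here is a minimal ETL sketch in Python. The table names and the cents-to-dollars transformation are invented for illustration; real ETL tools add scheduling, error handling, and much richer transformations.

```python
import sqlite3

# Two in-memory databases stand in for a source system and a warehouse.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

source.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1250), (2, 8000)])

warehouse.execute("CREATE TABLE fact_orders (id INTEGER, amount_dollars REAL)")

# Extract rows from the source...
rows = source.execute("SELECT id, amount_cents FROM orders").fetchall()

# ...transform them (cents to dollars)...
transformed = [(oid, cents / 100.0) for oid, cents in rows]

# ...and load them into the warehouse.
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?)", transformed)

total = warehouse.execute("SELECT SUM(amount_dollars) FROM fact_orders").fetchone()[0]
print(total)  # 92.5
```

Data federation, by contrast, would answer the query against the source directly without copying the rows at all, which is why knowing several techniques matters.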

Guideline #2: Data integration practices reach across both analytics and operations

In Analytic DI, one or more DI techniques are applied in the context of business intelligence (BI) or data warehousing (DW). Operational DI applies DI techniques outside BI/DW, typically for the migration or consolidation of operational databases, synchronizing operational databases, or exchanging data in a business-to-business context. Analytic DI and operational DI are both growing practice areas, and both are progressively staffed from a common competency center or similar organization.

Guideline #3: Data integration is an autonomous data management practice

In some old-fashioned organizations, DI is considered a mere subset of DW. It can be that, but it can also be independent. For example, the existence of operational DI proves DI’s independence from DW. Furthermore, hundreds of DI competency centers have sprung up in the last ten years or so as a shared-service organization for staffing all DI work -- not just DI for DW.

Guideline #4: A data integration solution should have architecture

After all, other types of IT solutions have architecture. DI architecture helps you with DI development standards, the reuse of DI objects, and the maintenance of solutions. The preferred architecture among integration technologies -- whether for data or application integration -- is the hub-and-spoke. For this reason, most DI tools today lend themselves to hub-and-spoke. However, there are many variations of it, so you need to actively design an architecture for your DI solutions.
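To illustrate the hub-and-spoke idea, here is a hedged sketch (all class names and record fields are invented): each system talks only to a central hub through a canonical record format, instead of every pair of systems needing its own point-to-point link.

```python
# Hub-and-spoke sketch: spokes translate to and from a canonical record,
# and the hub is the only component that connects spokes to each other.

class CrmSpoke:
    def to_canonical(self, record):
        # Source-specific field names become the canonical ones.
        return {"name": record["customer_name"].title()}

class BillingSpoke:
    def from_canonical(self, record):
        # Canonical fields become the target system's names.
        return {"acct_holder": record["name"]}

class Hub:
    """Routes a record from a source spoke to a target spoke."""
    def transfer(self, source, target, record):
        return target.from_canonical(source.to_canonical(record))

hub = Hub()
out = hub.transfer(CrmSpoke(), BillingSpoke(), {"customer_name": "ada lovelace"})
print(out)  # {'acct_holder': 'Ada Lovelace'}
```

With n systems, the hub needs only n adapters instead of the roughly n² point-to-point mappings, which is the main architectural argument for hub-and-spoke.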

Friday, August 14, 2009

Collaborative Data Integration


What Works is a publication with interesting content, provided by The Data Warehousing Institute (TDWI). In Volume 27 (August 2009), Philip Russom, Senior Manager of TDWI Research, published a good article entitled Collaborative Data Integration.

He started the article with TDWI Research's definition of collaborative data integration: A collection of user best practices, software tool functions, and cross-functional project workflows that foster collaboration among the growing number of technical and business people involved in data integration projects and initiatives.

According to the article, several trends are driving up the requirements for collaboration in data integration projects:
- Data integration specialists are growing in number
- Data integration specialists are expanding their work beyond data warehousing
- Data integration work is increasingly dispersed geographically
- Data integration is now better coordinated with other data management disciplines
- More business people are getting their hands on data integration
- Data governance and other forms of oversight touch data integration

Different organizational units provide a structure in which data integration can be collaborative:
- Technology-focused organizational structures
- Business-driven organizational structures
- Hybrid structures

"Corporations and other user organizations have hired more inhouse data integration specialists in response to an increase in the amount of data warehousing work and operational data integration work outside of warehousing", he wrote.

"Although much of the collaboration around data integration consists of verbal communication, software tools for data integration include functions that automate some aspects of collaboration", he also wrote. Some features have long existed in other application development tools, but were only recently added to data integration tools, like check-out and check-in, versioning, and source code management features.
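The check-out/check-in and versioning features he mentions can be sketched in a few lines of Python. This is a simplified illustration, not any vendor's actual repository logic; the artifact name is a made-up example.

```python
# Minimal sketch of check-out/check-in with versioning: checking out an
# artifact locks it for one user; checking it in saves a new version
# and releases the lock.

class VersionedRepository:
    def __init__(self):
        self._versions = {}   # artifact name -> list of saved versions
        self._locks = {}      # artifact name -> user holding the lock

    def check_in(self, user, name, content):
        if self._locks.get(name, user) != user:
            raise PermissionError(f"{name} is checked out by {self._locks[name]}")
        self._versions.setdefault(name, []).append(content)
        self._locks.pop(name, None)   # release the lock, if any

    def check_out(self, user, name):
        if name in self._locks:
            raise PermissionError(f"{name} is checked out by {self._locks[name]}")
        self._locks[name] = user
        return self._versions[name][-1]   # hand back the latest version

repo = VersionedRepository()
repo.check_in("alice", "load_customers_mapping", "v1 mapping")
mapping = repo.check_out("bob", "load_customers_mapping")   # bob locks it
repo.check_in("bob", "load_customers_mapping", "v2 mapping")
print(len(repo._versions["load_customers_mapping"]))  # 2
```

A real DI tool would persist this state in a shared server-side repository so that geographically dispersed specialists see the same locks and history.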

Regarding data integration tool requirements for business collaboration, he wrote: "a few data integration and data quality tools today support areas within the tools for data stewards or business personnel to use. In such an area, the user may actively do some hands-on work, like select data structures that need quality or integration attention, design a rudimentary data flow (which a technical worker will flesh out later), or annotate development artifacts (e.g., with descriptions of what the data represents to the business)".

He explained that collaboration via a tool depends on a central repository: The views just described are enabled by a repository that accompanies the data integration tool. Depending on the tool brand, the repository may be a dedicated metadata or source code repository that has been extended to manage much more than metadata and development artifacts, or it may be a general database management system.

He finished with some recommendations:

- Recognize that data integration has collaborative requirements. The greater the number of data integration specialists and people who work closely with them, the greater the need is for collaboration around data integration.

- Determine an appropriate scope for collaboration. At the low end, bug fixes don’t merit much collaboration; at the top end, business transformation events require the most.

- Support collaboration with organizational structures. These can be technology focused (like data management groups), business driven (data stewardship and governance), or a hybrid of the two (BI teams and competency centers).

- Select data integration tools that support broad collaboration. For technical implementers, this means data integration tools with source code management features (especially for versioning). For business collaboration, it means an area within a data integration tool where the user can select data structures and design rudimentary process flows for data integration.

- Demand a central repository. Both technical and business team members—and their management—benefit from an easily accessed, server-based repository through which everyone can share their thoughts and documents, as well as view project information and semantic data relevant to data integration.

Thursday, December 4, 2008

Enterprise Information Management


Lyndsay Wise published two articles last month about Enterprise Information Management in Dashboard Insight. In the first article, she provides an overview of information management, and in the second, she looks at the Business Objects/SAP view of EIM.

She said that whether it is called enterprise information management, data management, or information management, the general understanding is that managing information across the organization includes the concepts of data governance, data integration, data quality, and master data management.

The Basics Of Information Management


According to her, Enterprise Information Management (EIM) is gaining momentum. Organizations are hard pressed to find ways to adequately manage their data without affecting production systems and operational processes.

Enterprise information management takes MDM and other data-related initiatives to the next level by enabling organizations to manage their data across sources, ensure a level of quality at each stage of the integration process, and govern the processes associated with the various data points.

With a unified view of data, organizations no longer see only the separate views that result from individual business units, but also see how those views relate to the overall picture of performance.

Data quality adds the final touch to the mix of integration and early MDM work. To sustain these initiatives, information constantly needs to be validated, so that only valid and accurate data is entered into and moved across the various operational systems and data stores.
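As a small illustration of that kind of ongoing validation, here is a sketch of rule-based record checks applied before data moves between systems. The fields and rules are invented examples, not a real product's rule set.

```python
import re

# Hypothetical validation rules applied before a record is moved
# from an operational system into a shared data store.
RULES = {
    "email": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) is not None,
    "age":   lambda v: isinstance(v, int) and 0 <= v <= 130,
}

def validate(record):
    """Return the list of field names that fail their rule."""
    return [field for field, rule in RULES.items()
            if field in record and not rule(record[field])]

good = {"email": "ana@example.com", "age": 34}
bad  = {"email": "not-an-email", "age": 34}
print(validate(good))  # []
print(validate(bad))   # ['email']
```

In practice such checks run at each stage of the integration flow, so a bad value is caught before it propagates into multiple data stores.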

These approaches to managing data across the organization will continue to converge. Organizations will begin to look at data management as an overall solution that includes integration and data quality initiatives as an extension of current MDM-related projects.

Organizations are clearly starting to move towards EIM. Part of this means not only adopting an EIM solution but also adopting a data governance framework, which includes developing a process for managing data and having a committee of stakeholders that helps define taxonomy, hierarchy, etc., and manages these processes across the organization.

The concept of a holistic approach to data management enables organizations to develop end to end solutions that take into account disparate business units and how the data processed within those units translates into valuable information at different touch points across the organization.

I think that, given the complexity of companies nowadays, information management is increasingly important, enabling companies to manage their data effectively and transform it into valuable information for better decisions.

The Role of Data Quality With The Business Objects/SAP EIM Platform


About the Business Objects approach, she said that it is to combine their data services for data integration and data quality. This removes the barrier of cleaning and integrating data separately and creates a single environment. The solution works by identifying how a change in the source system will affect the various downstream processes, and by giving users the ability to examine calculations and identify where numbers come from, which supports confident, data-driven decision making as well as compliance and audit requirements.

Sunday, November 30, 2008

Upgrading your data integration efforts to enable Business Intelligence (BI) 2.0


I read a post in Informatica Corporation's blog, entitled Upgrading your data integration efforts to enable Business Intelligence (BI) 2.0, written by Rick Sherman.

He mentioned two good articles that talk about the concepts of BI 2.0: the first, called Business Intelligence 2.0: Simpler, More Accessible, Inevitable, written by Neil Raden and published in Intelligent Enterprise; and the second, called BI 2.0: The Next Generation, written by Charles Nichols and published in DM Review.

He said: "With the advent of ICCs (Integration Competency Centers) and robust data integration suites, companies can eliminate integration stovepipe efforts and work towards enabling one data integration backbone. Common people, processes and procedures work towards data integration.

BI 2.0 sounds cool and makes you think you need yet another BI tool. Of course that BI tool has to be the latest and greatest BI tool on the market today but don’t be fooled, it is really not about BI 2.0 but rather DI 2.0 (Data Integration 2.0.)"

I agree with him when he talks about data integration, but I think the new concepts of Data Integration are included in the concepts of BI 2.0.

Saturday, June 7, 2008

Data Quality and Data Integration tools are converging

Gartner published this week its Data Quality Tools Magic Quadrant 2008. The evaluation is based on Completeness of Vision and Ability to Execute.

According to Gartner, “Leaders in the market demonstrate strength across a complete range of data quality functionality, including profiling, parsing, standardization, matching, validation and enrichment. They exhibit a clear understanding and vision of where the market is headed, including recognition of non-customer data quality issues and the delivery of enterprise-level data quality implementations. Leaders have an established market presence, significant size and a multinational presence (directly or as a result of a parent company).”
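To illustrate two of the functions Gartner lists, standardization and matching, here is a small Python sketch. The abbreviation table and the addresses are invented examples, not any vendor's actual cleansing logic.

```python
# Standardization: map common variants of a value to one canonical form.
STREET_ABBREV = {"st": "street", "st.": "street", "ave": "avenue", "ave.": "avenue"}

def standardize(address):
    words = address.lower().replace(",", " ").split()
    return " ".join(STREET_ABBREV.get(w, w) for w in words)

# Matching: after standardization, near-duplicate records compare equal.
def match(a, b):
    return standardize(a) == standardize(b)

print(standardize("12 Main St."))              # 12 main street
print(match("12 Main St.", "12 MAIN Street"))  # True
```

Profiling, parsing, and enrichment build on the same idea: bring values into a predictable shape first, then the downstream matching and validation become far more reliable.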

The report placed DataFlux, Business Objects, IBM, Trillium and Informatica in the leaders' quadrant.

In the niche players' quadrant are the companies: Datatactics, DataLever, Uniserv, DataMentors, Netrics, Innovative and Datanomic.

The report detected that Data Quality and Data Integration tools are converging.

Nowadays, data quality is one of the most important issues in IT.

I think Gartner's Magic Quadrant reports are a good source for checking what is happening in the market.

Monday, April 28, 2008

A good example of the lack of Data Integration


This weekend, John Soat published on his InformationWeek blog a letter that he wrote to the CIO of a healthcare group, complaining that, when he needed treatment, he had to fill out forms with his personal data twice in the same day, at two care centers belonging to the same healthcare group.

He wrote it seriously, but the letter is funny. He talks about the lack of data integration among companies of the same group, in this case, care centers of the same healthcare group.

Data integration is a very important issue for companies nowadays, because it prevents data redundancy, makes it easier to maintain historical data, and consequently helps maintain data quality.

I agree with him; it is very tedious to have to fill out a form twice with the same data.