There is a surprising amount of attention given to GIS data models in the U.S. oil & gas pipeline industry. It is not uncommon for outsiders looking in to think that too much emphasis is placed on data models and to wonder why operators don’t instead focus on the applications – that the applications are what really matter and almost any data model will do as long as it works. Why is there so much interest in data models, and does the GIS data model really matter that much? This article discusses these questions.
The more integrated an industry is, the more important uniform data practices and standards become. In the real estate industry, for example, data interchange standards allow multiple listing services to share data that can be posted to websites and searched across many databases. Many industries, such as travel and banking, have created data exchange standards for sharing information between systems. This common data vocabulary enables complex transactions to occur seamlessly and securely.
The same phenomenon is becoming evident in the oil & gas pipeline industry – the increased need for integration between service providers and operators has created a need for better data exchange mechanisms. Paper and spreadsheet data exchange is no longer viable given the demand for the timely, high-quality information required to manage asset integrity. The increasing volume of integrity management data required for risk assessment, along with increased regulatory scrutiny of data management processes, is driving the development of more integrated data storage and exchange practices.
Industry integration at the application level also drives data consistency in the oil & gas pipeline industry. The more dependent operators are on software and data from multiple service providers, the more important repeatable processes become, and this encourages more consistent data. The desire to run applications “off the shelf”, which lowers costs and improves software quality and support, leads many operators to move away from proprietary or home-grown systems and toward sustainable, standards-based models.
Fundamentally, most applications are only as good as their underlying data models. A data model is a limited representation, or schematic, of the reality it describes, and there are many ways to model pipelines. The constraints of the data model often limit the functionality and flexibility of the application, and changes to the underlying model, especially to core tables, can have a severe impact on the applications. Getting the database design wrong can have a significant negative impact on the applications, the users and the organization.
One underlying principle of application design, especially with applications that extensively interact with the database, is to get the database design right first and try to minimize the changes afterward. It is usually much easier to extend a well-designed data model than it is to fix a poorly designed model. The better a data model represents the physical assets and processes, the more useful it is and the longer it will be used.
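As a minimal sketch of this principle (the table and column names below are hypothetical illustrations, not drawn from any particular standard), a well-designed model can usually be extended by adding a new table that references an existing core table, leaving the core schema untouched:

    -- Hypothetical core table: one row per pipeline line section.
    CREATE TABLE line_section (
        line_section_id INTEGER PRIMARY KEY,
        line_name       VARCHAR(100) NOT NULL,
        begin_measure   NUMERIC(12,3),  -- stationing along the line
        end_measure     NUMERIC(12,3)
    );

    -- Extension: coating inspection history is added as a new table
    -- keyed to the core table; no core columns are altered.
    CREATE TABLE coating_inspection (
        inspection_id     INTEGER PRIMARY KEY,
        line_section_id   INTEGER NOT NULL
                          REFERENCES line_section (line_section_id),
        inspection_date   DATE NOT NULL,
        coating_condition VARCHAR(50)
    );

Contrast this with a poorly designed model, where the fix would require altering or splitting core tables, and with them every application and query that touches those tables.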
Standard vs. Template
In general, there seem to be three different viewpoints about data models. One is the idea of creating an industry-standard data model: if a standard can be created and everyone uses it then, at least in theory, applications built on the standard should run on any database implemented with it. It would also be easier to migrate data from one database to another if both adhere to the same standard.
Another basic viewpoint is the idea of creating a data model template. Users take the template and modify it to meet their unique business needs. The extra costs of customizing the template are offset by the fact that the GIS better reflects those unique business drivers, which may stem from operations, general business strategy or the particular regulatory constraints an operator faces.
A middle ground between these two viewpoints is to modify the standard itself and create a proprietary model that may have much in common with the standard but does not strictly adhere to it. In reality, this approach is the same as using the standard as the template and then modifying that template.
The third viewpoint is that the data model one uses just doesn’t matter that much, as long as it is a good design and fits the needs of the organization.
When is a Standard a Standard?
Standard data models, such as the PODS (Pipeline Open Data Standard) model, uniformly define the key objects, attributes and relationships that describe pipelines. Companies can easily create a PODS database by simply running the DDL script that is provided. The real challenges begin when data is put into the model, applications are written against it, and the model is implemented. Different understandings of how the standard should be implemented may result in non-standard, and possibly incompatible, implementations.
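To make this concrete, consider a hedged example (the table and values below are illustrative, not taken from the PODS schema): two operators can run identical DDL and still produce incompatible databases simply by adopting different conventions for the data they load.

    -- Both operators create the same (hypothetical) table.
    CREATE TABLE pipe_segment (
        segment_id    INTEGER PRIMARY KEY,
        begin_measure NUMERIC(12,3),
        end_measure   NUMERIC(12,3),
        diameter      NUMERIC(8,3)
    );

    -- In Operator A's database: measures in feet, diameter in inches.
    INSERT INTO pipe_segment VALUES (1, 0.000, 5280.000, 12.750);

    -- In Operator B's database: measures in miles, diameter in mm.
    INSERT INTO pipe_segment VALUES (1, 0.000, 1.000, 323.850);

Both rows describe the same mile of 12.75-inch pipe, yet an application written against one operator’s conventions will misread the other’s data, even though both databases are “standard”.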
There are several reasons for this. First, different user priorities and application requirements may drive different implementations. Although the standard itself meets the general requirements, the implementation may need to accommodate these differences, which can vary even within one company depending on personnel background, regulatory compliance issues and budgetary constraints.
Next, vendors may differentiate themselves by implementing the model differently. This can work both for and against vendors, and usually works against operators, especially when the time comes to upgrade to the next version of the standard. Differentiation that changes the standard itself results in proprietary models, thus losing the benefit of the standard. Differentiation that still adheres to the standard provides both the vendor and the operator with a solid platform for future growth.
Differences in implementation of data standards also occur because of a lack of understanding of the value of the standard, lack of clear documentation and training, and a simple lack of interest in adhering to the standard.
The role of the standards organization is to provide a framework that defines the standard and then to encourage compatible implementations of it.
The template approach works if an organization values the benefits of a customized model more than the benefits of a standardized one. It does, however, leave operators relying primarily on vendors, and sometimes on the individual employees within the vendor’s organization who understand and support their template and customizations.
The Bottom Line
Does the data model matter? Yes. The data model must match the current business needs but also be flexible enough to meet new challenges down the road. A proprietary data model must be very well understood by the operator without relying exclusively on the service provider/vendor. If a standard data model is selected, then the operator needs to ensure that it is implemented consistently within the enterprise and consistently with the practices of the standards organization.
The reality is that a data standard is only as important as the organization that stands behind it, the degree to which it is actually used, and the rate at which it changes. Adopting and supporting a standard is a strategy companies can use to improve the efficiency and quality of data in integrated industries with many service providers, as is the case with U.S. oil & gas pipelines today.
The increasingly integrated regulatory and service needs of the oil and gas industry are driving pipeline operators to pay more attention to their data models and not simply focus on end-user applications.