Spatial ETL
Spatial ETL tools provide the data processing functionality of traditional Extract, Transform, Load (ETL) software, but with a primary focus on the ability to manage spatial data (which may also be called geographic, map or location data).
Extract and Load
Spatial data, more than most other kinds of data, suffers from being held in many different formats (whether proprietary or open) that adhere to different standards. A key requirement for a Spatial ETL system is therefore that it can handle as many data formats as possible, in a consistent manner.
The conversion of spatial data between the source (extract) and destination (load) formats is often referred to as spatial data translation. A Spatial ETL system may translate data directly from one format to another, or via an intermediate format; the latter is more common when the data is to be transformed along the way.
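Translation via an intermediate format can be sketched as follows. This is a minimal illustration rather than the behaviour of any particular product: the CSV column names (name, lon, lat), the in-memory feature dictionaries and the function names extract_csv and load_geojson are all assumptions made for the example, with GeoJSON chosen as the destination format.

```python
import csv
import io
import json


def extract_csv(text):
    """Extract: parse CSV rows into neutral feature dicts (the intermediate format)."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        # The coordinate columns become the geometry; everything else is an attribute.
        lon = float(row.pop("lon"))
        lat = float(row.pop("lat"))
        features.append({"geometry": (lon, lat), "attributes": dict(row)})
    return features


def load_geojson(features):
    """Load: serialise the intermediate features as a GeoJSON FeatureCollection."""
    return json.dumps({
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                # GeoJSON stores positions in longitude, latitude order.
                "geometry": {"type": "Point", "coordinates": list(f["geometry"])},
                "properties": f["attributes"],
            }
            for f in features
        ],
    })


# Hypothetical source data, invented for this example.
SOURCE = "name,lon,lat\nDepot,-123.1,49.25\nWarehouse,-122.9,49.2\n"
intermediate = extract_csv(SOURCE)
geojson_text = load_geojson(intermediate)
```

Because the reader and the writer only share the intermediate structure, any transformation step can be placed between them without either side knowing about the other format.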
Transform
The transformation phase of a Spatial ETL process allows a variety of functions; some of these are similar to standard ETL, but some are unique to spatial data.
Spatial data commonly consists of a geographic element and related attribute data; Spatial ETL transformations are therefore often described as either geometric transformations (transformations of the geographic element) or attribute transformations (transformations of the related attribute data).
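The distinction can be illustrated with a small sketch. The feature layout (a dictionary with "geometry" and "attributes" keys), the field name RD_NAME and both function names are hypothetical and chosen only for this example.

```python
def shift_geometry(feature, dx, dy):
    """Geometric transformation: move every vertex; attributes are left untouched."""
    moved = [(x + dx, y + dy) for x, y in feature["geometry"]]
    return {"geometry": moved, "attributes": dict(feature["attributes"])}


def rename_attribute(feature, old, new):
    """Attribute transformation: rename a field; geometry is left untouched."""
    attrs = dict(feature["attributes"])
    attrs[new] = attrs.pop(old)
    return {"geometry": list(feature["geometry"]), "attributes": attrs}


# A hypothetical road feature: a two-vertex line with one attribute.
road = {
    "geometry": [(0.0, 0.0), (10.0, 5.0)],
    "attributes": {"RD_NAME": "High Street"},
}
shifted = shift_geometry(road, 100.0, 200.0)
renamed = rename_attribute(road, "RD_NAME", "name")
```

Each function returns a new feature instead of modifying its input, so the two kinds of transformation can be chained in either order.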
Common Geometric Transformations
- Reprojection: the ability to convert spatial data between one coordinate system and another.
- Spatial transformations: the ability to model spatial interactions and calculate spatial predicates.
- Topological transformations: the ability to create topological relationships between disparate datasets.
- Resymbolisation: the ability to change the cartographic characteristics of a feature, such as colour or line-style.
- Geocoding: the ability to convert attributes of tabular data into spatial data.
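Two of the transformations listed above, reprojection and the calculation of a spatial predicate, can be sketched in a few lines. The example assumes the common spherical Web Mercator formulas (EPSG:3857) for the reprojection and a simple ray-casting test for the predicate; the function names are illustrative, and production tools would normally rely on a dedicated projection and geometry library.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used by spherical Web Mercator


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Reproject longitude/latitude in degrees to Web Mercator metres (EPSG:3857)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def point_in_polygon(point, ring):
    """Spatial predicate: ray-casting test of whether a point lies inside a ring.

    `ring` is a list of (x, y) vertices; points exactly on the boundary are
    not handled specially in this sketch.
    """
    px, py = point
    inside = False
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
origin_xy = wgs84_to_web_mercator(0.0, 0.0)
```

Note that the spherical Mercator formula is undefined at the poles, so latitudes of exactly 90 or -90 degrees should be excluded before calling it.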
Additional Features
Desirable features of a spatial ETL application are:
- Data Comparison: Ability to carry out change detection and do incremental updates
- Conflict Management: Ability to manage conflicts between multiple users of the same data
- Data Dissemination: Ability to publish data via the internet or deliver by email regardless of source format
- Semantic Processing: Ability to understand the rules of different data formats to minimize user input whilst preserving meaning
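Change detection and incremental updating, as mentioned under Data Comparison, might look like the following sketch. It assumes that every feature carries a stable identifier and that two snapshots are available as dictionaries keyed by that identifier; the function names and sample data are hypothetical.

```python
def detect_changes(old, new):
    """Compare two {feature_id: feature} snapshots and classify the differences."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}


def apply_incremental_update(target, new, changes):
    """Apply only the detected differences to a copy of the target dataset."""
    updated = dict(target)
    for key in changes["removed"]:
        updated.pop(key, None)
    for key in changes["added"] + changes["changed"]:
        updated[key] = new[key]
    return updated


# Hypothetical snapshots: feature 2 moved, feature 3 was deleted, feature 4 is new.
yesterday = {
    1: {"geometry": (0.0, 0.0), "attributes": {"status": "open"}},
    2: {"geometry": (1.0, 1.0), "attributes": {"status": "open"}},
    3: {"geometry": (2.0, 2.0), "attributes": {"status": "open"}},
}
today = {
    1: {"geometry": (0.0, 0.0), "attributes": {"status": "open"}},
    2: {"geometry": (1.0, 1.5), "attributes": {"status": "open"}},
    4: {"geometry": (3.0, 3.0), "attributes": {"status": "new"}},
}
changes = detect_changes(yesterday, today)
synced = apply_incremental_update(yesterday, today, changes)
```

Only the three differing features are touched during the update, which is the point of an incremental load compared with reloading the whole dataset.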
Spatial ETL Uses
Spatial ETL has a number of distinct uses:
- Data cleanup: The removal of errors within a dataset.
- Data merging: The bringing together of multiple datasets into a common framework. Conflation is a good example of this.
- Data verification: The comparison of multiple datasets for verification and quality assurance purposes.
- Data translation: Conversion between different data formats.
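As an illustration of data cleanup, the sketch below removes two simple kinds of error from a line geometry: vertices outside the valid longitude/latitude range and consecutive duplicate vertices. Which conditions count as errors depends on the dataset; the rules and the function name clean_line are assumptions made for this example.

```python
def clean_line(vertices):
    """Drop out-of-range lon/lat vertices and consecutive duplicate vertices."""
    cleaned = []
    for lon, lat in vertices:
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            continue  # treated as an error: outside the valid lon/lat range
        if cleaned and cleaned[-1] == (lon, lat):
            continue  # a repeated vertex adds nothing to the line
        cleaned.append((lon, lat))
    return cleaned


# Hypothetical raw line with one duplicate vertex and one impossible longitude.
raw = [(10.0, 50.0), (10.0, 50.0), (999.0, 50.1), (10.2, 50.2)]
tidy = clean_line(raw)
```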
Spatial ETL - Origins and History
Although ETL tools for processing non-spatial data have existed for some time, ETL tools that can manage the unique characteristics of spatial data only emerged in the early 1990s.
Spatial ETL tools emerged in the GIS industry to enable interoperability (or the exchange of information) between the industry’s diverse array of mapping applications and associated proprietary formats. However, Spatial ETL tools are also becoming increasingly important in the realm of Management Information Systems as a tool to help organizations integrate spatial data with their existing non-spatial databases, and also to leverage their spatial data assets to develop more competitive business strategies.
Spatial ETL and GIS
Traditionally, GIS applications have been able to read or import a limited number of spatial data formats, but have offered few specialist ETL transformation tools; the concept was to import data and then carry out step-by-step transformation or analysis within the GIS application itself. Conversely, Spatial ETL does not require the user to import or view the data, and generally carries out its tasks in a single predefined process.
With the push to achieve greater interoperability within the GIS industry, many existing GIS applications are now incorporating Spatial ETL tools within their products; the ArcGIS Data Interoperability Extension being a good example of this.
Spatial ETL and ETL
Mindful of the degree to which any data can be assigned a fixed geographic position, and of the proliferation of spatial capabilities within standard relational or object databases, vendors of standard ETL applications are attempting to incorporate Spatial ETL functionality within their products.
Spatial ETL tools
- CISS TDI GmbH
- Dotted Eyes Ltd In Unifying the Spatial Environment (USE), Dotted Eyes are data transformation and translation experts.
- Galdos Systems Inc Galdos goes "beyond ETL" with its INtune data synchronization framework.
- Snowflake Software Ltd Solutions for Data eXchange. Free eval and support
- Safe Software Inc. provides software and consulting services focused on managing the exchange of both spatial and non-spatial data between GIS applications and/or relational databases with differing file formats and structures.
- PCI Geomatics Enterprises Inc.
- WisdomForce Technologies supplies software for Oracle Spatial.
- SpatialDataIntegrator is open source software powered by Talend and developed by camptocamp.
- GeoKettle is another open source Spatial ETL tool.
See also
- Business intelligence
- Object-relational database