Copyright © 2015-2017 Fraunhofer Institute for Computer Graphics Research IGD
1. Introduction
GeoRocket is a high-performance data store for geospatial files. It can store 3D city models (e.g. CityGML), GML files or any other XML-based geospatial data sets. GeoRocket provides the following features:
High performance data storage with multiple back-ends such as Amazon S3, MongoDB, distributed file systems (e.g. HDFS or Ceph), or your local hard drive (enabled by default)
Support for high-speed search features based on the popular Open-Source framework Elasticsearch. You can perform spatial queries and search for attributes, layers and tags.
GeoRocket is made for the Cloud. Based on the Open-Source toolkit Vert.x it is reactive and can handle big files and a large number of parallel requests.
GeoRocket exists in two editions: an Open-Source version and a Pro edition for enterprise applications.
1.1. Architecture
GeoRocket has a reactive, scalable and asynchronous software architecture. Imported files are split into chunks that are indexed individually. The data store keeps unprocessed chunks. This enables you to later retrieve the original file that you put into GeoRocket without losing any information.[1]
The following figure depicts the software architecture of GeoRocket.
The import process starts in the upper left corner. Every imported file is first split into individual chunks. Depending on the input format, chunks have different meanings. CityGML files, for example, are split into individual cityObjectMember objects, which are typically the buildings of a city model.
Attached to each chunk is metadata containing additional information describing the chunk. This includes tags specified by the client and other automatically generated attributes.
The chunks are put into the GeoRocket data store. There are several data store implementations supporting different back-ends such as Amazon S3, HDFS, MongoDB or the local hard drive (default).
Immediately after a chunk has been put into the data store the indexer starts working asynchronously in the background. It reads new chunks from the data store and analyses them for known patterns. It recognises spatial coordinates, attributes and other content. The indexer creates a directory of every item found—the ‘index’.
The export process starts with querying the indexer for chunks matching the criteria supplied by the client. These chunks are then retrieved from the data store (together with their metadata) and merged into a result file.
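The import and export flow described above can be sketched in a few lines of Python. This is a conceptual, in-memory illustration only, not GeoRocket's implementation: the names ToyStore, split_into_chunks, import_file and export are hypothetical, the indexing runs synchronously instead of in the background, and each direct child of the root element is treated as one chunk.

```python
import xml.etree.ElementTree as ET


def split_into_chunks(xml_text):
    """Split a document into one chunk per direct child of the root element."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(child, encoding="unicode") for child in root]


class ToyStore:
    """In-memory stand-in for the data store and its index (illustration only)."""

    def __init__(self):
        self.chunks = {}  # chunk id -> (unprocessed chunk text, metadata)
        self.index = {}   # term -> set of chunk ids

    def import_file(self, xml_text, tags=()):
        for chunk in split_into_chunks(xml_text):
            chunk_id = len(self.chunks)
            # The store keeps the chunk unprocessed, together with its metadata.
            self.chunks[chunk_id] = (chunk, {"tags": list(tags)})
            # The toy index holds the tags and every text value in the chunk.
            terms = set(tags)
            for element in ET.fromstring(chunk).iter():
                if element.text and element.text.strip():
                    terms.add(element.text.strip())
            for term in terms:
                self.index.setdefault(term, set()).add(chunk_id)

    def export(self, term, root_tag="result"):
        # Query the index first, then fetch and merge the matching chunks.
        ids = sorted(self.index.get(term, ()))
        body = "".join(self.chunks[i][0] for i in ids)
        return f"<{root_tag}>{body}</{root_tag}>"
```

Because the chunks are stored unmodified, merging them back together yields the original content without loss, which is the property the architecture relies on.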
1.1.1. Secondary data store
GeoRocket’s architecture allows for the creation of secondary data stores that co-exist with the main data store where the original chunks are kept. The following figure depicts the process:
Whenever a new chunk is added to the data store a custom processor can retrieve it to create a secondary data store. Data from this store can then be served directly to the client without further processing. Possible use cases for this scenario are:
Optimize 3D scenes for web-based visualisation. Create a secondary data store that contains glTF files. glTF is a specification for the efficient transmission of 3D scenes to the browser.
Convert all chunks stored in CityGML version 2 to CityGML version 1 for clients that are incompatible with version 2.
Process a 3D city model and derive LOD1 buildings from LOD2 or LOD3.
The advantage of keeping a secondary data store is that it is created automatically in the background when new data is added to GeoRocket. This avoids manual processing. Individual processors may even keep the secondary data store up to date incrementally and only re-create those parts that have changed since it was last created or updated.
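The incremental update idea can be illustrated with a small sketch. It assumes a hypothetical callback on_chunk_added that is invoked for every new or updated chunk; the class name SecondaryStore and the use of a content hash to detect changes are assumptions made for this example and do not describe GeoRocket's actual extension mechanism.

```python
import hashlib


class SecondaryStore:
    """Keeps one derived item per chunk and re-creates only what changed."""

    def __init__(self, processor):
        self.processor = processor  # function: chunk text -> derived data
        self.items = {}             # chunk id -> derived data
        self.seen = {}              # chunk id -> hash of the source chunk
        self.processed = 0          # how often the processor actually ran

    def on_chunk_added(self, chunk_id, chunk):
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if self.seen.get(chunk_id) == digest:
            return  # source chunk unchanged, keep the existing derived item
        self.items[chunk_id] = self.processor(chunk)
        self.seen[chunk_id] = digest
        self.processed += 1
```

A real processor would, for example, convert a chunk to glTF or derive a LOD1 building; here any function from text to derived data can be plugged in.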
2. Getting started
GeoRocket consists of two components: the server and the command-line interface (CLI). Download the Server and CLI bundles from the GeoRocket website and extract them to a directory of your choice.
Open your command prompt and change to the directory where you installed GeoRocket Server. Execute georocketd to run the server.
cd georocket-server-1.0.0/bin
./georocketd
Please wait a couple of seconds until you see the following message:
GeoRocket launched successfully.
The server has launched and now waits for incoming HTTP requests on port 63020 (default).
Next, open another command prompt and change to the directory where you installed GeoRocket CLI. Run georocket to access the server through a convenient command-line application.
cd georocket-cli-1.0.0/bin
./georocket
You can now import your first geospatial file. Suppose your file is called /home/user/my_file.gml. Issue the following command to import it to GeoRocket.
./georocket import /home/user/my_file.gml
GeoRocket CLI will now send the file to the server. Depending on the size of the dataset, this takes anywhere from a couple of seconds to a few minutes (for very large datasets).
Finally, export the contents of the whole store to a file using the export command.
./georocket export / > my_new_file.gml
That’s it! You have successfully imported your first file into GeoRocket.
3. Command-line application
GeoRocket comes with a handy command-line interface (CLI) letting you interact with the server in a convenient way on your command prompt. The interface provides a number of commands. The following sections describe each command and its parameters in detail.
In the following sections it is assumed that you have the georocket executable in your path. If you have not done so already, you may add it to your path with the following command (Linux):
export PATH=/path/to/georocket-cli-1.0.0/bin:$PATH
Or under Windows do:
set PATH=C:\path\to\georocket-cli-1.0.0\bin;%PATH%
3.1. Help command
Display help for the command-line interface and exit.
Examples:
georocket
or
georocket --help
or
georocket help
The help command also gives information on specific CLI commands. Just provide the name of the command you would like to have help for. For example, the following command displays help for the Import command:
georocket help import
3.2. Import command
Import one or more files into GeoRocket. Specify the name of the file to import as follows.
georocket import myfile.xml
You can also import the file to a certain layer. The layer will automatically be created for you. The following command imports the file myfile.xml to the layer CityModel.
georocket import --layer CityModel myfile.xml
Use slashes to import to sub-layers.
georocket import --layer CityModel/LOD1/Center myfile.xml
You may attach tags to imported files. Tags are human-readable labels that you can use to search for files or chunks stored in GeoRocket. Use a comma to separate multiple tags.
georocket import --tags city,district,lod1 myfile.xml
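When imports are scripted, the command line can be assembled programmatically. The following sketch builds the argument list for the import command from the options shown above. The function name import_command is hypothetical, and combining --layer and --tags in a single invocation is an assumption, since the examples above only show them separately.

```python
def import_command(files, layer=None, tags=()):
    """Build the argument list for a 'georocket import' call."""
    args = ["georocket", "import"]
    if layer:
        args += ["--layer", layer]           # sub-layers are separated by slashes
    if tags:
        args += ["--tags", ",".join(tags)]   # multiple tags are comma-separated
    return args + list(files)
```

The resulting list can be passed to subprocess.run(args, check=True), provided the georocket executable is in your path.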
3.3. Export command
Export a layer stored in GeoRocket. Provide the name of the layer you want to export.
georocket export CityModel/LOD1
By default the export command writes to standard out (your console). Redirect output to a file as follows.
georocket export CityModel/LOD1 > lod1.xml
You may also export the whole data store. Just provide the root layer / to the export command.
georocket export /
3.4. Search command
Search the GeoRocket data store and export individual geospatial features (chunks). Provide a query to the search command as follows.
georocket search myquery
You can also search individual layers.
georocket search --layer CityModel myquery
By default the search command writes to standard out (your console). Redirect output to a file as follows.
georocket search myquery > results.xml
Use a space character to separate multiple query terms. Search results will be combined by logical OR.
See the Query language section for a full description of all possible terms in a query.
3.5. Delete command
Remove geospatial features (chunks) or whole layers from the GeoRocket data store. Provide a query to the delete command to select the features to delete.
georocket delete myquery
You can also restrict the delete command to a certain layer.
georocket delete --layer CityModel myquery
Delete a whole layer (including all its chunks and sub-layers) as follows.
georocket delete --layer CityModel/LOD1
You may even delete the whole data store by specifying the root layer /.
georocket delete --layer /
4. HTTP interface
GeoRocket Server provides a (REST-like) HTTP interface that you can use to interact with the data store as well as to embed GeoRocket in your application. By default GeoRocket listens for incoming connections on port 63020.
4.1. GET information
Get information about GeoRocket (application name, version, etc.).
Resource URL
/
Parameters
None
Status codes
200 | The operation was successful |
Example request
GET / HTTP/1.1
Example response
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 100
{
"name" : "GeoRocket",
"version" : "1.0.0",
"tagline" : "It's not rocket science!"
}
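A client can use this resource to check which server it is talking to. The sketch below parses the JSON body shown above and, in fetch_info, retrieves it with Python's standard library. The function names are hypothetical, and the host localhost is an assumption; the port 63020 is the documented default.

```python
import json
from urllib.request import urlopen


def parse_info(body):
    """Extract application name and version from the JSON response body."""
    info = json.loads(body)
    return info["name"], info["version"]


def fetch_info(host="localhost", port=63020):
    """Request the information resource from a running server."""
    with urlopen(f"http://{host}:{port}/") as response:
        return parse_info(response.read().decode("utf-8"))
```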
4.2. GET file
Search the data store for chunks that match a given query. Merge the chunks found and return the result as a file.
Resource URL
/store/:path
Parameters
path | The absolute path to a layer to search. Omit this parameter to query the whole data store. |
search | A URL-encoded query string. If no query string is provided all chunks from the requested layer will be returned. |
Status codes
200 | The operation was successful |
400 | The provided information was invalid (e.g. malformed query) |
404 | The requested chunks were not found or the query returned an empty result |
500 | An unexpected error occurred on the server side |
Example request
GET /store/CityModel?search=LOD1+textured+13.378,52.515,13.380,52.517 HTTP/1.1
Example response
HTTP/1.1 200 OK
Transfer-Encoding: chunked
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CityModel ...>
...
</CityModel>
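The search parameter must be URL-encoded, which is easy to get wrong by hand. The following sketch builds the request URL with Python's standard library. The function name store_url and the host localhost are assumptions. The sketch also assumes that the root layer is addressed as /store/ when no path is given, which the description above does not spell out.

```python
from urllib.parse import quote, urlencode


def store_url(layer="", search=None, host="localhost", port=63020):
    """Build the URL of a layer in the store, optionally with a query."""
    # quote() keeps slashes, so sub-layers such as CityModel/LOD1 stay intact.
    url = f"http://{host}:{port}/store/" + quote(layer.strip("/"))
    if search:
        # Spaces become '+', commas in bounding boxes are left as they are.
        url += "?" + urlencode({"search": search}, safe=",")
    return url
```

Calling store_url("CityModel", "LOD1 textured 13.378,52.515,13.380,52.517") reproduces the path and query string of the example request above.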
4.3. POST file
Import a file into GeoRocket. Split the file into chunks and put them into the data store.
Resource URL
/store/:path
Parameters
path | The absolute path to a layer where the chunks from the imported file should be stored. Omit this parameter to put the chunks into the data store’s root layer |
tags | A comma-separated list of tags (i.e. labels) to attach to each imported chunk. |
Status codes
202 | The operation was successful. The file was accepted for importing and is now being processed asynchronously. |
400 | The provided information was invalid (e.g. malformed input file) |
500 | An unexpected error occurred on the server side |
Example request
POST /store/CityModel?tags=LOD1,textured HTTP/1.1
Content-Length: 35903517
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CityModel ...>
...
</CityModel>
Example response
HTTP/1.1 202 Accepted file - importing in progress
Content-Length: 0
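An import over HTTP can be prepared with Python's standard library as sketched below. The function name import_request and the host localhost are assumptions. The request object is only constructed here; pass it to urllib.request.urlopen to send it, and expect status 202 on success because the file is processed asynchronously.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def import_request(data, layer="", tags=(), host="localhost", port=63020):
    """Prepare a POST request that imports 'data' (bytes) into a layer."""
    url = f"http://{host}:{port}/store/" + quote(layer.strip("/"))
    if tags:
        # Tags are sent as one comma-separated list.
        url += "?" + urlencode({"tags": ",".join(tags)}, safe=",")
    return Request(url, data=data, method="POST")
```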
4.4. DELETE chunks
Delete chunks or layers from the data store.
Resource URL
/store/:path
Parameters
path | The absolute path to the layer from which chunks matching the given query should be deleted. If no query is given this is the path to the layer to delete (including all its contents—sub-layers and chunks). |
search | A URL-encoded query string specifying which chunks should be deleted. If no query string is provided the whole layer is deleted. |
If you specify neither a layer (path) nor a query (search) then the whole contents of the GeoRocket data store will be deleted.
Status codes
204 | The operation was successful. The matching chunks were deleted from the data store. |
400 | The provided information was invalid (e.g. malformed query) |
500 | An unexpected error occurred on the server side |
Example request
DELETE /store/CityModel?search=LOD1 HTTP/1.1
Example response
HTTP/1.1 204 No Content
Content-Length: 0
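Because a DELETE request without a path and without a query removes the whole contents of the data store, client code may want to guard against doing so by accident. The sketch below prepares the request and refuses to build one that would delete everything unless this is requested explicitly. The function name delete_request, the allow_delete_all parameter and the host localhost are assumptions made for this example.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def delete_request(layer="", search=None, allow_delete_all=False,
                   host="localhost", port=63020):
    """Prepare a DELETE request for chunks matching a query or a whole layer."""
    layer = layer.strip("/")
    if not layer and not search and not allow_delete_all:
        # Neither path nor query: this would wipe the whole data store.
        raise ValueError("refusing to delete the whole data store")
    url = f"http://{host}:{port}/store/" + quote(layer)
    if search:
        url += "?" + urlencode({"search": search}, safe=",")
    return Request(url, method="DELETE")
```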
5. Query language
5.1. Strings
GeoRocket performs a full-text search for strings in every tag and every indexed attribute.
Example:
string
5.2. Bounding boxes
Bounding boxes can be specified using four floating point numbers separated by a comma. The format is:
left,bottom,right,top
or
minimum_longitude,minimum_latitude,maximum_longitude,maximum_latitude
Example:
13.378,52.515,13.380,52.517
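A bounding box term can be generated and parsed with two small helpers, sketched below. The function names are hypothetical. The check that the minimum does not exceed the maximum is an assumption about what makes a sensible box, not documented server behaviour.

```python
def bounding_box(min_lon, min_lat, max_lon, max_lat):
    """Format a bounding box as 'left,bottom,right,top'."""
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("minimum must not exceed maximum")
    return ",".join(str(v) for v in (min_lon, min_lat, max_lon, max_lat))


def parse_bounding_box(text):
    """Parse 'left,bottom,right,top' into a tuple of four floats."""
    parts = tuple(float(p) for p in text.split(","))
    if len(parts) != 4:
        raise ValueError("a bounding box needs exactly four numbers")
    return parts
```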
5.3. Logical operators
The operators OR, AND and NOT can be used to logically combine terms in a query. They are applied using the following notation:
<operator>(<operand_1> <operand_2> ... <operand_n>)
Operands are separated by a space character. Logical operations can be nested.
Examples:
AND(a b)
AND(a NOT(b))
OR(NOT(a) NOT(b))
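Nested queries are easier to compose programmatically than by string concatenation. The following helpers produce the notation shown above. The function names and_, or_ and not_ are hypothetical and exist only for this sketch.

```python
def _operator(name, operands):
    """Render '<operator>(<operand_1> ... <operand_n>)'."""
    if not operands:
        raise ValueError(f"{name} needs at least one operand")
    return name + "(" + " ".join(operands) + ")"


def and_(*operands):
    return _operator("AND", operands)


def or_(*operands):
    return _operator("OR", operands)


def not_(*operands):
    return _operator("NOT", operands)
```

Since every helper returns a plain string, the result of one call can be used directly as an operand of another.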
5.3.1. OR
Use the logical OR operator to search for chunks that match at least one of the given operands.
Example:
OR(foo 13.378,52.515,13.380,52.517 bar)
This example matches all chunks that have a tag or indexed attribute with the value foo or bar as well as those that are within the bounding box 13.378,52.515,13.380,52.517.
By default, if you don’t specify a logical operation, all top-level terms in a query are combined by OR. Just use a space character to separate operands. The following query is a shorthand for the example above.
Example:
foo 13.378,52.515,13.380,52.517 bar
5.3.2. AND
Use the logical AND operator to search for chunks that match all of the given operands.
Example:
AND(13.378,52.515,13.380,52.517 foobar)
This example matches all chunks that are within the bounding box 13.378,52.515,13.380,52.517 and that have a tag or indexed attribute with a value of foobar.
5.3.3. NOT
Use the logical NOT operator to search for chunks that match none of the given operands.
Example:
NOT(13.378,52.515,13.380,52.517 foobar)
This example matches all chunks that are not within the bounding box 13.378,52.515,13.380,52.517 and that don’t have a tag or indexed attribute with a value of foobar.
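The semantics of the three operators can be made concrete with a toy evaluator. This is a simplified model, not GeoRocket's query engine: queries are written as nested Python tuples instead of query strings, string terms are compared by exact match rather than full-text search, and each chunk is reduced to a set of values and a single point. All names are hypothetical.

```python
def matches(query, chunk):
    """Return True if the chunk satisfies the query (toy model)."""
    if isinstance(query, str):
        # String term: exact match against tags and attribute values.
        return query in chunk["values"]
    if query[0] in ("OR", "AND", "NOT"):
        results = [matches(operand, chunk) for operand in query[1:]]
        if query[0] == "OR":
            return any(results)   # at least one operand matches
        if query[0] == "AND":
            return all(results)   # all operands match
        return not any(results)   # NOT: none of the operands match
    # Otherwise the query is a bounding box (left, bottom, right, top).
    left, bottom, right, top = query
    lon, lat = chunk["point"]
    return left <= lon <= right and bottom <= lat <= top
```

Note that NOT with several operands only matches chunks that fail every operand, so a chunk carrying the value foobar is excluded even if it lies outside the bounding box.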
6. Client configuration
You can configure GeoRocket’s command-line application (CLI) by editing the file conf/georocket.yaml in the application directory. The file must be a valid YAML file. The following sections describe possible configuration keys and values.
Keys are specified using the dot notation. You can use the keys in your file as they are specified here or use normal YAML notation instead. For example, the following configuration item
georocket.host: localhost
is identical to:
georocket:
  host: localhost
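The equivalence of the two notations can be illustrated by expanding dotted keys into nested mappings. The sketch below is not GeoRocket's configuration loader; the function name expand is hypothetical and the input is assumed to be a flat dictionary of dotted keys.

```python
def expand(flat):
    """Turn {'a.b.c': value} into {'a': {'b': {'c': value}}}."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate mappings on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested
```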
6.1. Server connection
georocket.host | The host where GeoRocket Server is running. |
georocket.port | The TCP port GeoRocket Server is listening on. |
7. Server configuration
You can configure GeoRocket Server by editing the file conf/georocketd.yaml in the application directory. The file must be a valid YAML file. The following sections describe possible configuration keys and values.
Keys are specified using the dot notation. You can use the keys in your file as they are specified here or use normal YAML notation instead. For example, the following configuration item
georocket.storage.class: io.georocket.storage.file.FileStore
is identical to:
georocket:
  storage:
    class: io.georocket.storage.file.FileStore
7.1. HTTP interface
georocket.host | The host GeoRocket should bind to. By default GeoRocket only listens to incoming connections from |
georocket.port | The TCP port GeoRocket should listen on. |
7.2. Back-ends
georocket.storage.class | The data store implementation to use. Possible values include: |
7.2.1. File back-end
Data store implementation
io.georocket.storage.file.FileStore
Configuration
georocket.storage.file.path | The path on the local hard drive where the data store should be located. |
7.2.2. HDFS
Data store implementation
io.georocket.storage.hdfs.HDFSStore
Configuration
georocket.storage.hdfs.defaultFS | The endpoint of the HDFS NameNode |
georocket.storage.hdfs.path | The path on the distributed file system where the chunks should be stored. The directory must exist and write permissions must have been granted to the user executing GeoRocket. |
7.2.3. MongoDB
Data store implementation
io.georocket.storage.mongodb.MongoDBStore
Configuration
georocket.storage.mongodb.connectionString | The connection string URI used to connect to MongoDB. For example: |
georocket.storage.mongodb.database | The database where the chunks should be stored |
7.2.4. Amazon S3
Data store implementation
io.georocket.storage.s3.S3Store
Configuration
georocket.storage.s3.accessKey | The Amazon S3 Access Key used for authentication |
georocket.storage.s3.secretKey | The Amazon S3 Secret Key used for authentication |
georocket.storage.s3.host | The host of the S3 endpoint |
georocket.storage.s3.port | The port of the S3 endpoint |
georocket.storage.s3.bucket | The S3 bucket where chunks should be stored |
georocket.storage.s3.pathStyleAccess |
|
georocket.storage.s3.forceSignatureV2 |
|
georocket.storage.s3.requestExpirySeconds | The number of seconds a pre-signed S3 request should stay valid |
7.3. Elasticsearch
The GeoRocket distribution contains a version of Elasticsearch that will automatically be started together with GeoRocket. You can disable this behaviour and use a remote Elasticsearch instance instead.
Set the following configuration items to disable the provided Elasticsearch instance and to configure the host and port of the remote one:
georocket:
  index:
    elasticsearch:
      embedded: false
      host: 127.0.0.1
      port: 9200
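To see how this nested block corresponds to the dotted keys used in the configuration tables, the following sketch flattens a nested mapping into dot notation. It is an illustration only; the function name flatten is hypothetical and the nested dictionary stands in for the parsed YAML file.

```python
def flatten(nested, prefix=""):
    """Turn {'a': {'b': value}} into {'a.b': value}."""
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))  # descend into sub-mappings
        else:
            flat[dotted] = value
    return flat
```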
7.3.1. Configuration
georocket.index.elasticsearch.embedded |
|
georocket.index.elasticsearch.host | Elasticsearch host address |
georocket.index.elasticsearch.port | Elasticsearch TCP port |