GeoRocket User Documentation

version 1.0.0

Copyright © 2015-2017 Fraunhofer Institute for Computer Graphics Research IGD

1. Introduction

GeoRocket is a high-performance data store for geospatial files. It can store 3D city models (e.g. CityGML), GML files or any other XML-based geospatial data sets. GeoRocket provides the following features:

  • High-performance data storage with multiple back-ends such as Amazon S3, MongoDB, distributed file systems (e.g. HDFS or Ceph), or your local hard drive (enabled by default)

  • Support for high-speed search features based on the popular Open-Source framework Elasticsearch. You can perform spatial queries and search for attributes, layers and tags.

  • GeoRocket is made for the Cloud. Based on the Open-Source toolkit Vert.x it is reactive and can handle big files and a large number of parallel requests.

  • GeoRocket exists in two editions: an Open-Source version and a Pro edition for enterprise applications

1.1. Architecture

GeoRocket has a reactive, scalable and asynchronous software architecture. Imported files are split into chunks that are indexed individually. The data store keeps unprocessed chunks. This enables you to later retrieve the original file that you put into GeoRocket without losing any information.[1]

The following figure depicts the software architecture of GeoRocket.

Figure 1. The architecture of GeoRocket

The import process starts in the upper left corner. Every imported file is first split into individual chunks. What a chunk represents depends on the input format. CityGML files, for example, are split into individual cityObjectMember objects, which are typically the buildings of a city model.

Attached to each chunk is metadata containing additional information describing the chunk. This includes tags specified by the client and other automatically generated attributes.

The chunks are put into the GeoRocket data store. There are several data store implementations supporting different back-ends such as Amazon S3, HDFS, MongoDB or the local hard drive (default).

Immediately after a chunk has been put into the data store, the indexer starts working asynchronously in the background. It reads new chunks from the data store and analyses them for known patterns. It recognises spatial coordinates, attributes and other content. The indexer creates a directory of every item found: the ‘index’.

The export process starts with querying the indexer for chunks matching the criteria supplied by the client. These chunks are then retrieved from the data store (together with their metadata) and merged into a result file.
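The import and export steps described above can be illustrated with a toy in-memory model. This is only a sketch to make the data flow concrete, not GeoRocket's implementation: the function names, the list-based store and the tag-only index are assumptions made for illustration, and the sample document omits the XML namespaces a real CityGML file would use.

```python
import xml.etree.ElementTree as ET

# Simplified sample input; real CityGML files use XML namespaces.
SAMPLE = (
    "<CityModel>"
    "<cityObjectMember><Building id='b1'/></cityObjectMember>"
    "<cityObjectMember><Building id='b2'/></cityObjectMember>"
    "</CityModel>"
)

def import_file(xml_text, tags):
    """Split a file into chunks (one per child element) and attach metadata."""
    root = ET.fromstring(xml_text)
    chunks = [
        {"xml": ET.tostring(child, encoding="unicode"), "tags": set(tags)}
        for child in root
    ]
    return root.tag, chunks

def build_index(chunks):
    """Map every tag to the positions of the chunks that carry it."""
    index = {}
    for position, chunk in enumerate(chunks):
        for tag in chunk["tags"]:
            index.setdefault(tag, []).append(position)
    return index

def export_file(root_tag, chunks, index, tag):
    """Look up matching chunks in the index and merge them into one file."""
    body = "".join(chunks[i]["xml"] for i in index.get(tag, []))
    return f"<{root_tag}>{body}</{root_tag}>"
```

In this model the store keeps the chunk text untouched and only the index interprets it, which mirrors the separation between data store and indexer described above.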

1.1.1. Secondary data store

GeoRocket’s architecture allows for the creation of secondary data stores that co-exist with the main data store where the original chunks are kept. The following figure depicts the process:

Figure 2. Secondary data store

Whenever a new chunk is added to the data store a custom processor can retrieve it to create a secondary data store. Data from this store can then be served directly to the client without further processing. Possible use cases for this scenario are:

  • Optimise 3D scenes for web-based visualisation. Create a secondary data store that contains glTF files. glTF is a specification for the efficient transmission of 3D scenes to the browser.

  • Convert all chunks stored in CityGML version 2 to CityGML version 1 for clients that are incompatible with version 2.

  • Process a 3D city model and derive LOD1 buildings from LOD2 or LOD3.

The advantage of keeping a secondary data store is that it is created automatically in the background when new data is added to GeoRocket. This avoids manual processing. Individual processors may even keep the secondary data store up to date incrementally and only re-create those parts that have changed since it was last created or updated.

2. Getting started

GeoRocket consists of two components: the server and the command-line interface (CLI). Download the Server and CLI bundles from the GeoRocket website and extract them to a directory of your choice.

GeoRocket requires Java 8 or higher to be installed on your system.

Open your command prompt and change to the directory where you installed GeoRocket Server. Execute georocketd to run the server.

cd georocket-server-1.0.0/bin
./georocketd

Please wait a couple of seconds until you see the following message:

GeoRocket launched successfully.

The server has launched and now waits for incoming HTTP requests on port 63020 (default).

Next open another command prompt and change to the directory where you installed GeoRocket CLI. Run georocket to access the server through a convenient command-line application.

cd georocket-cli-1.0.0/bin
./georocket

You can now import your first geospatial file. Suppose your file is called /home/user/my_file.gml. Issue the following command to import it to GeoRocket.

./georocket import /home/user/my_file.gml

GeoRocket CLI will now send the file to the server. Depending on the size of the dataset, this takes anywhere from a couple of seconds to a few minutes (for very large datasets).

Finally, export the contents of the whole store to a file using the export command.

./georocket export / > my_new_file.gml

You can also search for individual features (chunks) and export only a part of the previously imported file. Refer to the Search command section.

That’s it! You have successfully imported your first file into GeoRocket.

3. Command-line application

GeoRocket comes with a handy command-line interface (CLI) that lets you interact with the server conveniently from your command prompt. The interface provides a number of commands. The following sections describe each command and its parameters in detail.

In the following sections it is assumed that you have the georocket executable in your path. If you have not done so already, you may add it to your path with the following command (Linux):

export PATH=/path/to/georocket-cli-1.0.0/bin:$PATH

Or under Windows do:

set PATH=C:\path\to\georocket-cli-1.0.0\bin;%PATH%

3.1. Help command

Display help for the command-line interface and exit.

Examples:

georocket

or

georocket --help

or

georocket help

The help command also gives information on specific CLI commands. Just provide the name of the command you would like to have help for. For example, the following command displays help for the Import command:

georocket help import

3.2. Import command

Import one or more files into GeoRocket. Specify the name of the file to import as follows.

georocket import myfile.xml

You can also import the file to a certain layer. The layer will automatically be created for you. The following command imports the file myfile.xml to the layer CityModel.

georocket import --layer CityModel myfile.xml

Use slashes to import to sub-layers.

georocket import --layer CityModel/LOD1/Center myfile.xml

You may attach tags to imported files. Tags are human-readable labels that you can use to search for files or chunks stored in GeoRocket. Use a comma to separate multiple tags.

georocket import --tags city,district,lod1 myfile.xml
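If you want to script imports, the command line can be assembled programmatically. The following sketch only uses the --layer and --tags options shown above. The helper names are hypothetical. Combining both options in one call and relying on a non-zero exit code to signal failure are assumptions not confirmed by this documentation.

```python
import subprocess

def import_command(filename, layer=None, tags=None, executable="georocket"):
    """Build the argument list for 'georocket import'."""
    args = [executable, "import"]
    if layer:
        args += ["--layer", layer]
    if tags:
        # Multiple tags are separated by commas.
        args += ["--tags", ",".join(tags)]
    args.append(filename)
    return args

def run_import(filename, layer=None, tags=None):
    """Run the import; assumes the CLI exits with a non-zero code on failure."""
    subprocess.run(import_command(filename, layer, tags), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.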

3.3. Export command

Export a layer stored in GeoRocket. Provide the name of the layer you want to export.

georocket export CityModel/LOD1

By default the export command writes to standard out (your console). Redirect output to a file as follows.

georocket export CityModel/LOD1 > lod1.xml

You may also export the whole data store. Just provide the root layer / to the export command.

georocket export /

Exporting the whole data store may take a while depending on how much data you have stored in GeoRocket.

3.4. Search command

Search the GeoRocket data store and export individual geospatial features (chunks). Provide a query to the search command as follows.

georocket search myquery

You can also search individual layers.

georocket search --layer CityModel myquery

By default the search command writes to standard out (your console). Redirect output to a file as follows.

georocket search myquery > results.xml

Use a space character to separate multiple query terms. Search results will be combined by logical OR.

See the Query language section for a full description of all possible terms in a query.

3.5. Delete command

Remove geospatial features (chunks) or whole layers from the GeoRocket data store. Provide a query to the delete command to select the features to delete.

georocket delete myquery

You can also restrict the delete command to a certain layer.

georocket delete --layer CityModel myquery

Delete a whole layer (including all its chunks and sub-layers) as follows.

georocket delete --layer CityModel/LOD1

You may even delete the whole data store by specifying the root layer /.

georocket delete --layer /

This is a dangerous operation. It will remove everything that is stored in your GeoRocket instance. There is no safety net: no confirmation prompt and no recycle bin.

4. HTTP interface

GeoRocket Server provides a REST-like HTTP interface that you can use to interact with the data store as well as to embed GeoRocket in your application. By default GeoRocket listens for incoming connections on port 63020.

4.1. GET information

Get information about GeoRocket (application name, version, etc.).

Resource URL
/
Parameters

None

Status codes

200

The operation was successful

Example request
GET / HTTP/1.1

Example response

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 100

{
  "name" : "GeoRocket",
  "version" : "1.0.0",
  "tagline" : "It's not rocket science!"
}
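A client can use this endpoint to check that the server is reachable. The sketch below uses only the Python standard library. The function names are hypothetical, and the default host and port are the documented defaults.

```python
import json
from urllib.request import urlopen

def info_url(host="localhost", port=63020):
    """Return the URL of the information endpoint."""
    return f"http://{host}:{port}/"

def parse_info(body):
    """Decode the JSON document returned by GET /."""
    info = json.loads(body)
    return info["name"], info["version"]

def fetch_info(host="localhost", port=63020, timeout=5):
    """Send GET / to a running server and return (name, version)."""
    with urlopen(info_url(host, port), timeout=timeout) as response:
        return parse_info(response.read().decode("utf-8"))
```

Calling fetch_info() requires a running server. It raises an exception from urllib if the connection fails.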

4.2. GET file

Search the data store for chunks that match a given query. Merge the chunks found and return the result as a file.

Resource URL
/store/:path
Parameters

path
(optional)

The absolute path to a layer to search. Omit this parameter to query the whole data store.

search
(optional)

A URL-encoded query string. If no query string is provided all chunks from the requested layer will be returned.

Status codes

200

The operation was successful

400

The provided information was invalid (e.g. malformed query)

404

The requested chunks were not found or the query returned an empty result

500

An unexpected error occurred on the server side

Example request
GET /store/CityModel?search=LOD1+textured+13.378,52.515,13.380,52.517 HTTP/1.1

Example response

HTTP/1.1 200 OK
Transfer-Encoding: chunked

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CityModel ...>
  ...
</CityModel>
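The request URL in the example can be built from a layer and a query as sketched below. The helper names are hypothetical, and addressing the root layer as /store/ is an assumption. Commas are left unescaped so that the result matches the example request above.

```python
import shutil
from urllib.parse import quote, urlencode
from urllib.request import urlopen

def store_url(layer="", search=None, host="localhost", port=63020):
    """Build the /store/:path URL with an optional URL-encoded query."""
    url = f"http://{host}:{port}/store/{quote(layer.strip('/'))}"
    if search:
        # Spaces become '+'; commas in bounding boxes stay readable.
        url += "?" + urlencode({"search": search}, safe=",")
    return url

def export_to_file(target, layer="", search=None):
    """Stream the merged result into a local file.

    urlopen raises urllib.error.HTTPError for the 400, 404 and 500 responses.
    """
    with urlopen(store_url(layer, search)) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
```

Streaming the response keeps memory use constant even when the exported file is large.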

4.3. POST file

Import a file into GeoRocket. Split the file into chunks and put them into the data store.

Resource URL
/store/:path
Parameters

path
(optional)

The absolute path to a layer where the chunks from the imported file should be stored. Omit this parameter to put the chunks into the data store’s root layer /.

tags
(optional)

A comma-separated list of tags (i.e. labels) to attach to each imported chunk.

Status codes

202

The operation was successful. The file was accepted for importing and is now being processed asynchronously.

400

The provided information was invalid (e.g. malformed input file)

500

An unexpected error occurred on the server side

Example request
POST /store/CityModel?tags=LOD1,textured HTTP/1.1
Content-Length: 35903517

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CityModel ...>
  ...
</CityModel>

Example response

HTTP/1.1 202 Accepted file - importing in progress
Content-Length: 0
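An import can be sent from a script as sketched below. The helper names are hypothetical. The Content-Type header is an assumption, because this documentation does not state which content type the server expects. The sketch reads the whole file into memory, which is only suitable for small files.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

def import_request(data, layer="", tags=None, host="localhost", port=63020):
    """Build a POST request for /store/:path with optional tags."""
    url = f"http://{host}:{port}/store/{quote(layer.strip('/'))}"
    if tags:
        url += "?" + urlencode({"tags": ",".join(tags)}, safe=",")
    # Assumption: the expected Content-Type is not documented.
    headers = {"Content-Type": "application/xml"}
    return Request(url, data=data, headers=headers, method="POST")

def import_file(filename, layer="", tags=None):
    """Upload a file and return True if the server accepted it (202)."""
    with open(filename, "rb") as source:
        request = import_request(source.read(), layer, tags)
    with urlopen(request) as response:
        return response.status == 202
```

Because the server answers with 202 and processes the file asynchronously, a successful return value does not mean that the chunks are already searchable.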

4.4. DELETE chunks

Delete chunks or layers from the data store.

Resource URL
/store/:path
Parameters

path
(optional)

The absolute path to the layer from which chunks matching the given query should be deleted. If no query is given this is the path to the layer to delete (including all its contents, i.e. sub-layers and chunks).

search
(optional)

A URL-encoded query string specifying which chunks should be deleted. If no query string is provided the whole layer is deleted.

If you specify neither a layer (path) nor a query (search), the entire contents of the GeoRocket data store will be deleted.

Status codes

204

The operation was successful. The matching chunks were deleted from the data store.

400

The provided information was invalid (e.g. malformed query)

500

An unexpected error occurred on the server side

This HTTP method is idempotent. Even if the given query returns no results (i.e. if there is nothing to delete) the operation completes successfully with a status code of 204.

Example request
DELETE /store/CityModel?search=LOD1 HTTP/1.1

Example response

HTTP/1.1 204 No Content
Content-Length: 0
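Because a DELETE request without a layer and without a query removes the whole data store, client code may want to guard against sending it by accident. The sketch below shows one possible safeguard. The helper names and the allow_everything flag are hypothetical and not part of GeoRocket.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

def delete_request(layer="", search=None, allow_everything=False,
                   host="localhost", port=63020):
    """Build a DELETE request for /store/:path with an optional query."""
    layer = layer.strip("/")
    if not layer and not search and not allow_everything:
        # Without a layer and a query the server would delete everything.
        raise ValueError("refusing to delete the whole data store")
    url = f"http://{host}:{port}/store/{quote(layer)}"
    if search:
        url += "?" + urlencode({"search": search}, safe=",")
    return Request(url, method="DELETE")

def delete(layer="", search=None, allow_everything=False):
    """Send the request and return True if the server answered with 204."""
    with urlopen(delete_request(layer, search, allow_everything)) as response:
        return response.status == 204
```

The guard lives on the client side only. The server itself still deletes everything if it receives such a request.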

5. Query language

As of version 1.0.0 the query language is rather limited. At the moment, a query can only consist of strings and bounding boxes, optionally combined with the logical operators described below.

5.1. Strings

GeoRocket performs a full-text search for strings in every tag and every indexed attribute.

Example:

string

5.2. Bounding boxes

Bounding boxes can be specified using four floating point numbers separated by commas. The format is:

left,bottom,right,top

or

minimum_longitude,minimum_latitude,maximum_longitude,maximum_latitude

As of version 1.0.0 GeoRocket only supports spatial queries given in WGS84 coordinates (longitude/latitude). However, data stored in GeoRocket can have an arbitrary spatial reference system as long as it is specified in the original file.

Example:

13.378,52.515,13.380,52.517
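A small helper can assemble the bounding box term and catch swapped coordinates before the query is sent. This is a sketch with a hypothetical function name. The range checks assume plain WGS84 longitude/latitude values and a box that does not cross the antimeridian, which this documentation does not address.

```python
def bounding_box(min_lon, min_lat, max_lon, max_lat):
    """Format a bounding box as left,bottom,right,top."""
    if not -180 <= min_lon <= max_lon <= 180:
        raise ValueError("longitudes must satisfy -180 <= min <= max <= 180")
    if not -90 <= min_lat <= max_lat <= 90:
        raise ValueError("latitudes must satisfy -90 <= min <= max <= 90")

    def fmt(value):
        # Fixed-point notation without trailing zeros (no exponents).
        return f"{value:.8f}".rstrip("0").rstrip(".")

    return ",".join(fmt(v) for v in (min_lon, min_lat, max_lon, max_lat))
```

The result contains no spaces, so it can be used directly as a single query term.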

5.3. Logical operators

The operators OR, AND and NOT can be used to logically combine terms in a query. They are applied using the following notation:

<operator>(<operand_1> <operand_2> ... <operand_n>)

Operands are separated by a space character. Logical operations can be nested.

Examples:

AND(a b)
AND(a NOT(b))
OR(NOT(a) NOT(b))
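When queries are generated by a program, nested expressions in this notation can be composed with a few helper functions. The function names below are hypothetical. The sketch only concatenates strings and does not check whether a term itself contains spaces or parentheses.

```python
def _combine(operator, operands):
    """Render <operator>(<operand_1> ... <operand_n>)."""
    if not operands:
        raise ValueError(f"{operator} needs at least one operand")
    return f"{operator}({' '.join(operands)})"

def and_(*operands):
    return _combine("AND", operands)

def or_(*operands):
    return _combine("OR", operands)

def not_(*operands):
    return _combine("NOT", operands)
```

For example, and_("a", not_("b")) produces the second query from the examples above.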

5.3.1. OR

Use the logical OR operator to search for chunks that match at least one of the given operands.

Example:

OR(foo 13.378,52.515,13.380,52.517 bar)

This example matches all chunks that have a tag or indexed attribute with the value foo or bar as well as those that are within the bounding box 13.378,52.515,13.380,52.517.

By default, if you don’t specify a logical operation, all top-level terms in a query are combined by OR. Just use a space character to separate operands. The following query is a shorthand for the example above.

Example:

foo 13.378,52.515,13.380,52.517 bar

5.3.2. AND

Use the logical AND operator to search for chunks that match all of the given operands.

Example:

AND(13.378,52.515,13.380,52.517 foobar)

This example matches all chunks that are within the bounding box 13.378,52.515,13.380,52.517 and that have a tag or indexed attribute with a value of foobar.

5.3.3. NOT

Use the logical NOT operator to search for chunks that match none of the given operands.

Example:

NOT(13.378,52.515,13.380,52.517 foobar)

This example matches all chunks that are not within the bounding box 13.378,52.515,13.380,52.517 and that don’t have a tag or indexed attribute with a value of foobar.
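The semantics of the three operators can be summarised in a small model. This is not GeoRocket's query engine: the tuple-based query representation is hypothetical, a chunk is reduced to a set of tag and attribute values, string terms are compared by exact equality, and bounding boxes are left out.

```python
def matches(query, values):
    """Evaluate a query against the set of values attached to one chunk.

    A query is either a string term or a tuple (operator, operand, ...).
    """
    if isinstance(query, str):
        return query in values
    operator, *operands = query
    results = [matches(operand, values) for operand in operands]
    if operator == "OR":
        return any(results)       # at least one operand matches
    if operator == "AND":
        return all(results)       # every operand matches
    if operator == "NOT":
        return not any(results)   # none of the operands match
    raise ValueError(f"unknown operator: {operator}")
```

In this model NOT with several operands behaves like a negated OR, which corresponds to the description of NOT above.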

6. Client configuration

You can configure GeoRocket’s command-line application (CLI) by editing the file conf/georocket.yaml in the application directory. The file must be a valid YAML file. The following sections describe possible configuration keys and values.

Keys are specified using the dot notation. You can use the keys in your file as they are specified here or use normal YAML notation instead. For example, the following configuration item

georocket.host: localhost

is identical to:

georocket:
  host: localhost
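The equivalence of the two notations can be made explicit with a short conversion. The function below is a hypothetical helper, not part of GeoRocket. It turns keys in dot notation into the nested structure that the normal YAML notation describes.

```python
def expand_dotted_keys(flat):
    """Convert {'a.b.c': value} into {'a': {'b': {'c': value}}}."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested
```

Keys that share a prefix, such as georocket.host and georocket.port, end up under the same parent.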

6.1. Server connection

georocket.host
(default: "localhost")

The host where GeoRocket Server is running.

georocket.port
(default: 63020)

The TCP port GeoRocket Server is listening on.

7. Server configuration

You can configure GeoRocket Server by editing the file conf/georocketd.yaml in the application directory. The file must be a valid YAML file. The following sections describe possible configuration keys and values.

Keys are specified using the dot notation. You can use the keys in your file as they are specified here or use normal YAML notation instead. For example, the following configuration item

georocket.storage.class: io.georocket.storage.file.FileStore

is identical to:

georocket:
  storage:
    class: io.georocket.storage.file.FileStore

7.1. HTTP interface

georocket.host
(default: "127.0.0.1")

The host GeoRocket should bind to. By default GeoRocket only listens to incoming connections from 127.0.0.1 (localhost). If you want it to listen to connections coming from arbitrary clients set this configuration item to 0.0.0.0.

georocket.port
(default: 63020)

The TCP port GeoRocket should listen on.

7.2. Back-ends

georocket.storage.class
(defaults to the File back-end)

The data store implementation to use. Possible values include:
io.georocket.storage.file.FileStore
io.georocket.storage.hdfs.HDFSStore
io.georocket.storage.mongodb.MongoDBStore
io.georocket.storage.s3.S3Store

7.2.1. File back-end

Data store implementation
io.georocket.storage.file.FileStore
Configuration

georocket.storage.file.path
(required)

The path on the local hard drive where the data store should be located.

7.2.2. HDFS

Data store implementation
io.georocket.storage.hdfs.HDFSStore
Configuration

georocket.storage.hdfs.defaultFS
(required)

The endpoint of the HDFS NameNode

georocket.storage.hdfs.path
(required)

The path on the distributed file system where the chunks should be stored. The directory must exist and write permissions must have been granted to the user executing GeoRocket.

7.2.3. MongoDB

Data store implementation
io.georocket.storage.mongodb.MongoDBStore
Configuration

georocket.storage.mongodb.connectionString
(required)

The connection string URI used to connect to MongoDB. For example: mongodb://localhost:27017

georocket.storage.mongodb.database
(required)

The database where the chunks should be stored

7.2.4. Amazon S3

Data store implementation
io.georocket.storage.s3.S3Store
Configuration

georocket.storage.s3.accessKey
(required)

The Amazon S3 Access Key used for authentication

georocket.storage.s3.secretKey
(required)

The Amazon S3 Secret Key used for authentication

georocket.storage.s3.host
(required)

The host of the S3 endpoint

georocket.storage.s3.port
(default: 80)

The port of the S3 endpoint

georocket.storage.s3.bucket
(required)

The S3 bucket where chunks should be stored

georocket.storage.s3.pathStyleAccess
(default: true)

true if path-style access to the S3 bucket is used or false if a sub-domain is used

georocket.storage.s3.forceSignatureV2
(default: false)

true if S3 requests should be signed using the old Signature V2 algorithm instead of newer versions

georocket.storage.s3.requestExpirySeconds
(default: 600)

The number of seconds a pre-signed S3 request should stay valid
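The items marked as required in the back-end sections above can be checked before the server is started. The following sketch is a hypothetical helper that works on a configuration given in dot notation. It covers only the four back-ends listed in this section and treats any other storage class as unknown.

```python
REQUIRED_KEYS = {
    "io.georocket.storage.file.FileStore": [
        "georocket.storage.file.path",
    ],
    "io.georocket.storage.hdfs.HDFSStore": [
        "georocket.storage.hdfs.defaultFS",
        "georocket.storage.hdfs.path",
    ],
    "io.georocket.storage.mongodb.MongoDBStore": [
        "georocket.storage.mongodb.connectionString",
        "georocket.storage.mongodb.database",
    ],
    "io.georocket.storage.s3.S3Store": [
        "georocket.storage.s3.accessKey",
        "georocket.storage.s3.secretKey",
        "georocket.storage.s3.host",
        "georocket.storage.s3.bucket",
    ],
}

# The File back-end is used when no storage class is configured.
DEFAULT_STORE = "io.georocket.storage.file.FileStore"

def missing_keys(config):
    """Return the required keys that are absent for the selected back-end."""
    store = config.get("georocket.storage.class", DEFAULT_STORE)
    if store not in REQUIRED_KEYS:
        raise ValueError(f"storage class not covered by this sketch: {store}")
    return [key for key in REQUIRED_KEYS[store] if key not in config]
```

Items with default values, such as georocket.storage.s3.port, are deliberately not checked.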

7.3. Elasticsearch

The GeoRocket distribution contains a version of Elasticsearch that will automatically be started together with GeoRocket. You can disable this behaviour and use a remote Elasticsearch instance instead.

Set the following configuration items to disable the provided Elasticsearch instance and to configure the host and port of the remote one:

georocket:
  index:
    elasticsearch:
      embedded: false
      host: 127.0.0.1
      port: 9200

7.3.1. Configuration

georocket.index.elasticsearch.embedded
(default: true)

true if GeoRocket should launch the provided Elasticsearch instance. false if it should connect to an existing instance.

georocket.index.elasticsearch.host
(default: "localhost")

Elasticsearch host address

georocket.index.elasticsearch.port
(default: 9200)

Elasticsearch TCP port


1. Exported files might have slightly different formatting. Whitespace between chunks might be different, but other than that, exported files contain exactly the same information as imported ones.