GeoRocket User Documentation

version 1.4.0-SNAPSHOT

Copyright © 2015-2020 Fraunhofer Institute for Computer Graphics Research IGD

1. Introduction

GeoRocket is a high-performance data store for geospatial files. It focuses on the storage, indexing, and analysis of big vector data. GeoRocket supports GeoJSON, CityGML (3D city models), GML, and any other XML-based geospatial data format. It provides the following features:

  • Scalable high-performance data storage with multiple back-ends such as Amazon S3, MongoDB, H2 (default), distributed file systems (e.g. HDFS or Ceph), or your local hard drive.

  • Support for high-speed search features based on the popular Open-Source framework Elasticsearch. You can perform spatial queries and search for attributes, layers and tags.

  • GeoRocket is made for the Cloud. Based on the Open-Source toolkit Vert.x it is reactive and can handle big files and a large number of parallel requests.

  • GeoRocket exists in two editions: an Open-Source version and a Pro edition for enterprise applications.

1.1. Architecture

GeoRocket has a reactive, scalable, and asynchronous software architecture. Imported files are split into chunks that are indexed individually. The data store keeps the chunks unprocessed, which enables you to later retrieve the original file you put into GeoRocket without losing any information.[1]

The following figure depicts the software architecture of GeoRocket.

GeoRocket architecture diagram
Figure 1. The architecture of GeoRocket

The import process starts in the upper left corner. Every imported file is first split into individual chunks. Depending on the input format, chunks have different meanings. CityGML files, for example, are split into individual cityObjectMember objects which are typically the buildings of a city model.

Attached to each chunk, there is metadata containing additional information describing the chunk. This includes tags and properties specified by the client, as well as other automatically generated attributes.

The chunks are put into the GeoRocket data store. There are several data store implementations supporting different back-ends such as Amazon S3, MongoDB, H2 (default), HDFS or the local hard drive. Immediately after a chunk has been put into the data store, the indexer starts working asynchronously in the background. It reads new chunks from the data store and analyses them for known patterns. It recognises spatial coordinates, attributes, and other content. The indexer creates an inverted index of every item found.

The export process starts with querying the indexer for chunks matching the criteria supplied by the client. These chunks are then retrieved from the data store (together with their metadata) and merged into a result file.

1.1.1. Secondary data store

GeoRocket’s architecture allows for the creation of secondary data stores that co-exist with the main data store where the original chunks are kept. The following figure depicts the process:

Secondary data store
Figure 2. Secondary data store

Whenever a new chunk is added to the data store, a custom processor can retrieve it to create a secondary data store. Data from this store can then be served directly to the client without further processing. Possible use cases for this scenario are:

  • Optimise 3D scenes for web-based visualisation. Create a secondary data store that contains glTF files. glTF is a specification for the efficient transmission of 3D scenes to the browser.

  • Convert all chunks stored in CityGML version 2 to CityGML version 1 for clients that are incompatible with version 2.

  • Process a 3D city model and derive LOD1 buildings from LOD2 or LOD3.

The advantage of keeping a secondary data store is that it is created automatically in the background when new data is added to GeoRocket. This avoids manual processing. Individual processors may even keep the secondary data store up to date incrementally and re-create only those parts that have changed since it was last created or updated.

1.2. Glossary

This section contains a list of terms often used in this document and in GeoRocket.

Chunk

A part of an imported file, typically a geospatial feature (e.g. a building from a 3D city model). Chunks are immutable, which means they cannot be modified in GeoRocket’s data store.

Metadata

Information about a chunk (such as user-defined tags and properties, as well as derived attributes).

Secondary data store

A store for data that is automatically derived from chunks in the main data store (e.g. glTF files derived from imported CityGML chunks).

Tag

A user-defined label that can be attached to one or more chunks in order to categorise data. In contrast to a layer, multiple tags can be attached to a chunk.

Property

A user-defined key-value pair that can be attached to a chunk. Multiple properties can be attached to one chunk, but the key must be unique. Properties belong to metadata and should not be mixed up with attributes contained in the imported data (such as CityGML generic attributes or GeoJSON properties).

Layer

A way to structure the data store. Layers can be compared to folders or directories on a hard drive. In contrast to tags, a chunk can only be stored in one layer. Chunks without a layer are kept in the root layer named /. Layers can be structured hierarchically, and parent layers always include all chunks of their children.

Indexed attribute

In contrast to properties, indexed attributes do not belong to metadata. Instead, they are information inside the imported chunks, detected by the indexer (e.g. GML IDs, CityGML generic attributes, or GeoJSON properties). Since chunks cannot be modified, indexed attributes are immutable.

2. Getting started

GeoRocket consists of two components: the server and the command-line interface (CLI). Download the Server and CLI bundles from the GeoRocket website and extract them to a directory of your choice.

GeoRocket requires Java 8 or higher to be installed on your system.

Open your command prompt and change to the directory where you installed GeoRocket Server. Execute georocketd to run the server.

cd georocket-server-1.4.0-SNAPSHOT/bin
./georocketd

Please wait a couple of seconds until you see the following message:

GeoRocket launched successfully.

The server has launched and now waits for incoming HTTP requests on port 63020 (default).

Next, open another command prompt and change to the directory where you installed GeoRocket CLI. Run georocket to access the server through a convenient command-line application.

cd georocket-cli-1.4.0-SNAPSHOT/bin
./georocket

You can now import your first geospatial file. Suppose your file is called /home/user/my_file.gml. Issue the following command to import it to GeoRocket.

./georocket import /home/user/my_file.gml

GeoRocket CLI will now send the file to the server. Depending on the size of the dataset, this will take a couple of seconds up to a few minutes (for very large datasets).

Finally, export the contents of the whole store to a file using the export command.

./georocket export / > my_new_file.gml

You can also search for individual features (chunks) and export only a part of the previously imported file. Refer to the Search command section.

That’s it! You have successfully imported your first file into GeoRocket.

3. Command-line application

GeoRocket comes with a handy command-line interface (CLI) that lets you interact with the server conveniently from your command prompt. The interface provides a number of commands. The following sections describe each command and its parameters in detail.

In the following sections it is assumed that you have the georocket executable in your PATH. If you have not done so already, you may add it with the following command.

Linux:

export PATH=/path/to/georocket-cli-1.4.0-SNAPSHOT/bin:$PATH

Windows:

set PATH=C:\path\to\georocket-cli-1.4.0-SNAPSHOT\bin;%PATH%

3.1. Help command

Display help for the command-line interface and exit.

Examples:

georocket

or

georocket --help

or

georocket help

The help command also gives information on specific CLI commands. Just provide the name of the command you would like to have help for. For example, the following command displays help for the Import command:

georocket help import

3.2. Import command

Import one or more files into GeoRocket. Specify the name of the file to import as follows.

georocket import myfile.xml

You can also import the file to a certain layer. The layer will automatically be created for you. The following command imports the file myfile.xml to the layer CityModel.

georocket import --layer CityModel myfile.xml

Use slashes to import to sub-layers.

georocket import --layer CityModel/LOD1/Center myfile.xml

You may attach tags to imported files. Tags are human-readable labels that you can use to search for files or chunks stored in GeoRocket. Use a comma to separate multiple tags.

georocket import --tags city,district,lod1 myfile.xml

In addition, you may define properties. Properties are key-value pairs that can be attached to imported files. Similar to tags, you can use properties to find chunks stored in GeoRocket. Multiple properties can be attached to a chunk, but keys must be unique. Use a colon ':' to separate key and value, and a comma to specify multiple properties.

georocket import --properties type:building,lod:1 myfile.xml

Of course, you can combine tags, properties and layers:

georocket import --layer CityModel \
  --tags city,district,lod1 \
  --properties type:building,lod:1 \
  myfile.xml

For a description on how to use tags and properties to retrieve chunks from the data store, we refer to the sections on the search command and the query language.

GeoRocket is able to automatically detect the coordinate reference system (CRS) of an imported file. If this is, for any reason, not possible, you may manually specify a reference system with the parameter --fallbackCRS. GeoRocket will only use this fallback CRS if it does not find a valid one in the imported file. The CLI accepts CRS strings in the form EPSG:<code> (e.g. EPSG:25832). See the EPSG registry for more information.

3.3. Export command

Export a layer stored in GeoRocket. Provide the name of the layer you want to export.

georocket export CityModel/LOD1

By default, the export command writes to standard out (your console). Redirect output to a file as follows.

georocket export CityModel/LOD1 > lod1.xml

You may also export the whole data store. Just provide the root layer / to the export command.

georocket export /

Exporting the whole data store may take a while, depending on how much data you have stored in GeoRocket.

If your data stored in GeoRocket is homogeneous, you can enable optimistic merging to tremendously reduce the latency between the request and the first returned chunk:

georocket export --optimistic-merging /

Note that chunks that cannot be merged will be skipped. The number of skipped chunks will be written to the standard error stream (stderr). Repeat the request if you want to get all chunks (e.g. with optimistic merging disabled).

3.4. Search command

Search the GeoRocket data store and export individual geospatial features (chunks). Provide a query to the search command as follows.

georocket search myquery

You can also search individual layers.

georocket search --layer CityModel myquery

By default, the search command writes to standard out (your console). Redirect output to a file as follows.

georocket search myquery > results.xml

Use a space character to separate multiple query terms. Multiple terms are combined with a logical OR.

See the Query language section for a full description of all possible terms in a query.

Some command interpreters do not accept certain characters in query strings. You may have to escape individual characters to formulate a valid command. Consider the following example:

georocket search EQ(key value)

This command works perfectly on the Windows Command Prompt, but not under Linux/macOS with bash or zsh. For these shells, you have to escape the parentheses as follows:

georocket search EQ\(key value\)

Do not try to quote the whole query string or to escape the space character. THE FOLLOWING COMMANDS ARE MOST LIKELY NOT WHAT YOU WANT:

georocket search "EQ(key value)"
georocket search EQ\(key\ value\)

These commands search for chunks that contain the verbatim string EQ(key value) and not for those where the specified property equals the given value!
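To see why the variants above behave differently, you can inspect how a POSIX shell tokenises each command line with Python's `shlex` module (an illustrative sketch, not part of GeoRocket):

```python
import shlex

# Escaping only the parentheses: the shell passes TWO arguments,
# which the CLI joins back together into the query EQ(key value).
print(shlex.split(r'georocket search EQ\(key value\)'))
# → ['georocket', 'search', 'EQ(key', 'value)']

# Quoting the whole string: the shell passes ONE argument, so the
# CLI searches for the verbatim text "EQ(key value)" instead.
print(shlex.split('georocket search "EQ(key value)"'))
# → ['georocket', 'search', 'EQ(key value)']
```

The first invocation is the correct one for bash or zsh; the second matches chunks containing the literal string.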

If your data stored in GeoRocket is homogeneous, you can enable optimistic merging to tremendously reduce the latency between the request and the first returned chunk:

georocket search --optimistic-merging myquery

Note that chunks that cannot be merged will be skipped. The number of skipped chunks will be written to the standard error stream (stderr). Repeat the request if you want to get all chunks (e.g. with optimistic merging disabled).

3.5. Tag command

Modify tags of existing chunks in the data store. Tags are labels that you can use to categorise your data and to make it searchable. The tag command has two sub-commands that you can use to add or remove tags.

3.5.1. Add tags

Add tags to existing chunks in the data store as follows:

georocket tag add --tags city,lod1 myquery

This command adds the tags city and lod1 to all chunks matching the given query.

You may also limit the command to chunks from a given layer:

georocket tag add --layer CityModel --tags city,lod1 myquery

3.5.2. Remove tags

Remove tags from existing chunks in the data store as follows:

georocket tag rm --tags city,lod1 myquery

The command will remove the tags city and lod1 from all chunks matching the given query.

To limit the command to chunks from a certain layer use the --layer parameter:

georocket tag rm --layer CityModel --tags city,lod1 myquery

3.6. Property command

Manage properties of existing chunks in the data store. Properties are key-value pairs that you can attach to your data to make it searchable. The property command has sub-commands to set, remove, and retrieve properties.

Properties belong to metadata and should not be mixed up with attributes contained in the imported data (such as CityGML generic attributes or GeoJSON properties). Modifying properties only affects GeoRocket’s index and does not change the imported chunks!

3.6.1. Set properties

Set properties of existing chunks in the data store as follows:

georocket property set --properties type:building,lod:1 myquery

This command modifies chunks matching the given query. It sets the property type to building and lod to 1.

You may also limit the command to chunks from a given layer:

georocket property set --layer CityModel --properties type:building,lod:1 myquery

Numerical property values, dates, and times are automatically analysed by GeoRocket and can be used in combination with comparison operators (such as EQ, LT, and GT) when formulating a query. For example, if you attach a property named importDate to all chunks, denoting the date when the chunk was imported into GeoRocket, you will be able to query the data store for all chunks whose importDate is before 1 January 2017 with the following query:

LT(importDate 2017-01-01)

Dates must be given in the form YYYY-MM-DD, YYYY-MM or YYYY. Times must be given as HH:mm:ss, HH:mm or HH.
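The accepted formats correspond to the following patterns; the mapping to strptime directives is an assumption for client-side validation only (GeoRocket performs its own parsing server-side):

```python
from datetime import datetime

# Date and time formats accepted in property values, expressed as
# strptime patterns (assumed mapping for illustration).
DATE_FORMATS = ["%Y-%m-%d", "%Y-%m", "%Y"]
TIME_FORMATS = ["%H:%M:%S", "%H:%M", "%H"]

def is_valid(value, formats):
    """Return True if value matches one of the given patterns."""
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    return False

print(is_valid("2017-01-01", DATE_FORMATS))  # True
print(is_valid("2017-13-01", DATE_FORMATS))  # False: there is no month 13
```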

3.6.2. Get property values

Get all values of a property with the following command:

georocket property get --property type myquery

This command retrieves all values of the property with the key type from all chunks matching the given query.

You may limit the command to a certain layer as follows:

georocket property get --layer CityModel --property type myquery

The operation returns a list of all values of the given property from all matching chunks. Duplicate values are not filtered out. This means that, in the example above, if there are 10 chunks whose property type has the value building, you will get a list with the value building repeated 10 times.
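Since the returned list may contain duplicates, you can aggregate it client-side; a minimal sketch (the sample list is hypothetical):

```python
from collections import Counter

# Hypothetical output of `georocket property get --property type ...`
values = ["building", "building", "tree", "building", "street"]

counts = Counter(values)
print(counts["building"])  # 3
print(sorted(counts))      # distinct values: ['building', 'street', 'tree']
```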

3.6.3. Remove properties

Remove properties from existing chunks in the GeoRocket data store:

georocket property rm --properties type,lod myquery

This command removes the properties with the keys type and lod from all chunks matching the given query.

You may limit the command to chunks from a given layer:

georocket property rm --layer CityModel --properties type,lod myquery

3.7. Delete command

Remove geospatial features (chunks) or whole layers from the GeoRocket data store. Provide a query to the delete command to select the features to delete.

georocket delete myquery

You can also restrict the delete command to a certain layer.

georocket delete --layer CityModel myquery

Delete a whole layer (including all its chunks and sub-layers) as follows.

georocket delete --layer CityModel/LOD1

You may even delete the whole data store by specifying the root layer /.

georocket delete --layer /

This is a dangerous operation. It will remove everything that is stored in your GeoRocket instance. There is no safety net, no confirmation prompt, and no recycle bin.

4. HTTP interface

GeoRocket Server provides an HTTP interface (REST-like, Richardson Maturity Model 2) that you can use to interact with the data store and to embed GeoRocket in your application. By default, GeoRocket listens to incoming connections on port 63020.

4.1. GET information

Get information about GeoRocket (application name, version, etc.).

Resource URL
/
Parameters

None

Status codes

200

The operation was successful

Example request
GET / HTTP/1.1

Example response

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 100

{
  "name" : "GeoRocket",
  "version" : "1.4.0-SNAPSHOT",
  "tagline" : "It's not rocket science!"
}

4.2. GET file

Search the data store for chunks that match a given query. Merge the chunks found and return the result as a file.

Resource URL
/store/:path
Parameters

path
(optional)

The absolute path to a layer to search. Omit this parameter to query the whole data store.

search
(optional)

A URL-encoded query string. If no query string is provided all chunks from the requested layer will be returned.

optimisticMerging
(optional)

A boolean value (true or false) specifying whether optimistic merging is enabled. Normally, GeoRocket has to check all chunks matching a query and find the best merge strategy before it can return them. If your data stored in GeoRocket is homogeneous, you can enable optimistic merging to tremendously reduce the latency between the request and the first returned chunk. Note that chunks that cannot be merged will be skipped. The number of skipped chunks can be retrieved from the GeoRocket-Unmerged-Chunks HTTP trailer (see below).

scroll
(optional)

A boolean value (true or false) denoting whether scrolling should be enabled. Scrolling allows you to download large amounts of data in a progressive fashion. If it is enabled, GeoRocket will only return a given number of chunks in one request (see size parameter). Each response will include an HTTP header named X-Scroll-Id whose value can be used to retrieve more chunks in subsequent requests (see scrollId parameter). The response will also include the HTTP headers X-Total-Hits denoting the total number of chunks matching the query and X-Hits specifying the number of chunks returned in the current response. To retrieve all chunks matching a query, issue the same request with the returned scroll ID again and again until you have received X-Total-Hits chunks in total or until GeoRocket returns the HTTP status code 404 (Not Found).

size
(default: 100)

The maximum number of chunks to return in one request if scrolling is enabled (see scroll parameter). This parameter will be ignored if scrolling is not enabled.

scrollId
(optional)

The scroll ID returned in the previous response to a scrolling request (see scroll parameter).

Request headers

TE

This header should contain the string trailers if GeoRocket is allowed to return HTTP trailers after the response (see the list of response trailers below).

Response headers

Trailer

This header will be included in the response if GeoRocket is about to send HTTP trailers after the response (see the TE request header). It specifies the trailers that GeoRocket will send (see the list of response trailers below).

X-Total-Hits

The total number of chunks matching the current query. This header will only be included if scrolling is enabled (see scroll parameter).

X-Hits

The number of chunks returned in the current response. This header will only be included if scrolling is enabled (see scroll parameter).

X-Scroll-Id

An ID that can be used to retrieve further chunks in subsequent scrolling requests. This header will only be included if scrolling is enabled (see scroll parameter).

Response trailers

GeoRocket-Unmerged-Chunks

The number of chunks that were skipped during merging. Possible reasons for unmerged chunks are: (1) chunks were added to GeoRocket’s store while merging was in progress, or (2) optimistic merging was enabled and some chunks did not fit the search result. Based on this HTTP trailer, the client can decide whether to repeat the request to fetch the missing chunks (e.g. with optimistic merging disabled) or not. This HTTP trailer will only be sent if the request header TE contains the string trailers and if there actually were chunks that could not be merged.

Status codes

200

The operation was successful

400

The provided information was invalid (e.g. malformed query)

404

The requested chunks were not found or the query returned an empty result

500

An unexpected error occurred on the server side

Example requests
GET /store?search=Berlin HTTP/1.1
GET /store/CityModel?search=LOD1+textured+13.378,52.515,13.380,52.517 HTTP/1.1
Example response
HTTP/1.1 200 OK
Transfer-Encoding: chunked

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CityModel ...>
  ...
</CityModel>
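The scrolling protocol described above can be sketched as a client-side loop. Here the HTTP request is replaced by a stubbed `fetch_page` function so the sketch is self-contained; the header names match the documentation, while the stub itself is hypothetical:

```python
# Client-side scrolling loop (sketch). fetch_page stands in for an
# HTTP GET to /store/:path with scroll=true and, after the first
# request, the scrollId from the previous response.
def fetch_page(scroll_id, pages):
    """Stub: returns (chunks, headers) shaped like a GeoRocket response."""
    index = 0 if scroll_id is None else int(scroll_id)
    chunks = pages[index] if index < len(pages) else []
    headers = {
        "X-Total-Hits": str(sum(len(p) for p in pages)),
        "X-Hits": str(len(chunks)),
        "X-Scroll-Id": str(index + 1),
    }
    return chunks, headers

def scroll_all(pages):
    """Collect all chunks by following X-Scroll-Id until X-Total-Hits is reached."""
    result = []
    scroll_id = None
    while True:
        chunks, headers = fetch_page(scroll_id, pages)
        result.extend(chunks)
        if not chunks or len(result) >= int(headers["X-Total-Hits"]):
            break
        scroll_id = headers["X-Scroll-Id"]
    return result

print(scroll_all([["a", "b"], ["c", "d"], ["e"]]))
# → ['a', 'b', 'c', 'd', 'e']
```

A real client would also stop on an HTTP 404 response, which GeoRocket may return when the scroll is exhausted.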

4.3. GET property values

Get a list of all values of a property from all chunks matching a given query. Properties are key-value pairs that you can attach to your data to make it searchable.

Duplicate values are not filtered out. See the Get property values command for more information.

Properties belong to metadata and should not be mixed up with indexed attributes contained in the imported data (such as CityGML generic attributes or GeoJSON properties). For indexed attributes, see the GET indexed attribute values endpoint instead.
Resource URL
/store/:path
Parameters

property
(required)

The name of the property whose values should be returned.

path
(optional)

The absolute path to a layer to search. Omit this parameter to query the whole data store.

search
(optional)

A URL-encoded query string. If no query string is provided, the property values of all chunks from the requested layer will be returned.

Status codes

200

The operation was successful

400

The provided information was invalid (e.g. malformed query)

404

The requested chunks were not found or the query returned an empty result

500

An unexpected error occurred on the server side

Example request
GET /store/CityModel?property=type&search=LOD1+textured+13.378,52.515,13.380,52.517 HTTP/1.1
Example response
HTTP/1.1 200 OK
Transfer-Encoding: chunked

["Building", "Building", "Building", "Tree", ... "Tree", "Building", "Tree", "Street"]
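The search parameter must be URL-encoded; Python's standard library can build such a query string (a sketch of client-side encoding, not a GeoRocket API):

```python
from urllib.parse import urlencode

# Encode the query from the example request above. urlencode escapes
# spaces as '+' and reserved characters such as ',' as percent codes,
# which is equivalent to the unencoded commas shown in the example.
params = {
    "property": "type",
    "search": "LOD1 textured 13.378,52.515,13.380,52.517",
}
query = urlencode(params)
print("/store/CityModel?" + query)
# → /store/CityModel?property=type&search=LOD1+textured+13.378%2C52.515%2C13.380%2C52.517
```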

4.4. GET indexed attribute values

Get a list of all values of an indexed attribute from all chunks matching a given query. In contrast to properties, indexed attributes are information inside the imported chunks (such as CityGML generic attributes or GeoJSON properties).

Similar to the GET property values endpoint, duplicate values are not filtered out.
Resource URL
/store/:path
Parameters

attribute
(required)

The name of the indexed attribute whose values should be returned.

path
(optional)

The absolute path to a layer to search. Omit this parameter to query the whole data store.

search
(optional)

A URL-encoded query string. If no query string is provided, the attribute values of all chunks from the requested layer will be returned.

Status codes

200

The operation was successful

400

The provided information was invalid (e.g. malformed query)

404

The requested chunks were not found or the query returned an empty result

500

An unexpected error occurred on the server side

Example request
GET /store/CityModel?attribute=Street&search=LOD1 HTTP/1.1
Example response
HTTP/1.1 200 OK
Transfer-Encoding: chunked

["Main Street", "Main Street", "5th Avenue", "Lake Street", ... "5th Avenue", "5th Avenue", "Lake Street", "Main Street"]

4.5. POST file

Import a file into GeoRocket. Split the file into chunks and put them into the data store.

This operation supports GZIP. Clients may upload compressed files to GeoRocket by including a Content-Encoding header in the request with a value of gzip.
Resource URL
/store/:path
Parameters

path
(optional)

The absolute path to a layer where the chunks from the imported file should be stored. Omit this parameter to put the chunks into the data store’s root layer /.

tags
(optional)

A comma-separated list of tags (i.e. labels) to attach to each imported chunk.

fallbackCRS
(optional)

GeoRocket is able to automatically detect the coordinate reference system (CRS) of an imported file. If this is, for any reason, not possible, you may manually specify a reference system with this parameter. GeoRocket will only use it if it does not find a valid one in the imported file. Values for this parameter must be in the form EPSG:<code> (e.g. EPSG:25832). See the EPSG registry for more information.

Response headers

X-Correlation-Id

A unique identifier that can be used to query the status of importing and indexing the uploaded file through the task endpoint.

Status codes

202

The operation was successful. The file was accepted for importing and is now being processed asynchronously.

400

The provided information was invalid (e.g. malformed input file)

500

An unexpected error occurred on the server side

Example request
POST /store/CityModel?tags=LOD1,textured HTTP/1.1
Content-Length: 35903517

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<CityModel ...>
  ...
</CityModel>
Example response
HTTP/1.1 202 Accepted file - importing in progress
Content-Length: 0
X-Correlation-Id: 1234566789abcdef12345678
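As noted above, uploads may be gzip-compressed by sending a Content-Encoding: gzip header. A sketch of preparing such a request body with the Python standard library (the HTTP call itself is omitted and the payload is illustrative):

```python
import gzip

# Compress an (illustrative) GML payload before uploading it with
# Content-Encoding: gzip. GeoRocket decompresses it on the server.
payload = b'<?xml version="1.0"?><CityModel>...</CityModel>'
body = gzip.compress(payload)

headers = {
    "Content-Encoding": "gzip",
    "Content-Length": str(len(body)),
}
# e.g. POST `body` to http://localhost:63020/store/CityModel with `headers`

# Round trip to show the compression is lossless:
print(gzip.decompress(body) == payload)  # True
```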

4.6. PUT tags

Add tags to existing chunks in the data store.

Resource URL
/store/:path
Parameters

tags
(required)

A comma-separated list of tags (i.e. labels) to attach to each matching chunk.

path
(optional)

The absolute path to a layer containing the chunks to which the tags should be added. Omit this parameter to add the tags to all matching chunks in the data store.

search
(optional)

A URL-encoded query string. If no query string is provided, the tags will be added to all chunks from the given layer.

Status codes

204

The operation was successful

400

The provided information was invalid (e.g. malformed query)

405

The operation is not allowed. It is not possible to modify anything in the data store except tags and properties

500

An unexpected error occurred on the server side

Example request
PUT /store/CityModel?tags=textured&search=LOD3 HTTP/1.1
Example response
HTTP/1.1 204 No Content
Content-Length: 0

4.7. PUT properties

Add properties to existing chunks in the data store.

Resource URL
/store/:path
Parameters

properties
(required)

A comma-separated list of properties to set. Each property should be defined in the form key:value.

path
(optional)

The absolute path to a layer containing the chunks whose properties should be set. Omit this parameter to set the properties of all matching chunks in the data store.

search
(optional)

A URL-encoded query string. If no query string is provided, the properties of all chunks from the given layer will be set.

Status codes

204

The operation was successful

400

The provided information was invalid (e.g. malformed query)

405

The operation is not allowed. It is not possible to modify anything in the data store except tags and properties

500

An unexpected error occurred on the server side

Example request
PUT /store/CityModel?properties=type:building,lod:3&search=LOD3 HTTP/1.1
Example response
HTTP/1.1 204 No Content
Content-Length: 0

4.8. DELETE chunks

Delete chunks or layers from the data store.

Resource URL
/store/:path
Parameters

path
(optional)

The absolute path to the layer from which chunks matching the given query should be deleted. If no query is given this is the path to the layer to delete (including all its contents: sub-layers and chunks).

search
(optional)

A URL-encoded query string specifying which chunks should be deleted. If no query string is provided the whole layer is deleted.

async
(default: false)

A boolean value (true or false) denoting whether the operation should be performed asynchronously or not. If the value is true, GeoRocket will schedule the operation and immediately return with HTTP status code 202.

If you specify neither a layer (path) nor a query (search), the whole contents of the GeoRocket data store will be deleted.
Response headers

X-Correlation-Id

A unique identifier that can be used to query the status of deleting the chunks through the task endpoint.

Status codes

202

The request was accepted and the matching chunks will be deleted from the data store asynchronously. This status code will only be returned if the async parameter is true.

204

The operation was successful. The matching chunks were deleted from the data store. This status code will only be returned if the async parameter is false (default).

400

The provided information was invalid (e.g. malformed query)

500

An unexpected error occurred on the server side

This HTTP method is idempotent. Even if the given query returns no results (i.e. if there is nothing to delete) the operation will complete successfully with a status code of 202 or 204 (depending on the async parameter).
Example request
DELETE /store/CityModel?search=LOD1&async=true HTTP/1.1
Example response
HTTP/1.1 202 Accepted
Content-Length: 0
X-Correlation-Id: 1234566789abcdef12345678

4.9. DELETE tags

Remove tags from existing chunks in the data store.

Resource URL
/store/:path
Parameters

tags
(required)

Comma-separated list of tags to remove from the chunks

path
(optional)

The absolute path to the layer containing the chunks from which the given tags should be removed

search
(optional)

A URL-encoded query string specifying from which chunks the given tags should be removed. If no query string is provided the tags are removed from all chunks in the given layer.

Status codes

204

The operation was successful. The tags were deleted from the matching chunks.

400

The provided information was invalid (e.g. malformed query)

500

An unexpected error occurred on the server side

This HTTP method is idempotent. Even if the given query returns no results or if the given tags do not exist (i.e. if there is nothing to delete), the operation completes successfully with a status code of 204.
Example request
DELETE /store/CityModel?search=LOD3&tags=textured HTTP/1.1
Example response
HTTP/1.1 204 No Content
Content-Length: 0

4.10. DELETE properties

Remove properties from existing chunks in the data store.

Resource URL
/store/:path
Parameters

properties
(required)

Comma-separated list of property keys to remove from the chunks

path
(optional)

The absolute path to the layer containing the chunks from which the properties should be removed

search
(optional)

A URL-encoded query string specifying from which chunks the properties should be removed. If no query string is provided the properties are removed from all chunks in the given layer.

Status codes

204

The operation was successful. The properties were deleted from the matching chunks.

400

The provided information was invalid (e.g. malformed query)

500

An unexpected error occurred on the server side

This HTTP method is idempotent. Even if the given query returns no results or if the given properties do not exist (i.e. if there is nothing to delete), the operation completes successfully with a status code of 204.
Example request
DELETE /store/CityModel?search=LOD1&properties=type HTTP/1.1
Example response
HTTP/1.1 204 No Content
Content-Length: 0

4.11. GET tasks

Get information about the status of asynchronous tasks such as importing, indexing, or deleting. The operation’s response is structured as described in the task model section.

Resource URL
/tasks/:correlationId
Parameters

correlationId
(optional)

A unique task identifier (also called ‘correlation ID’). Operations such as POST file or DELETE chunks return such an identifier in their response headers (X-Correlation-Id). If this parameter is left off, GeoRocket will return information about all tasks. If it is given, GeoRocket will only return information about the specified task.

Status codes

200

The operation was successful. The response is structured as described in the task model section.

404

The requested task information was not found

Example request
GET /tasks/ HTTP/1.1
Example response
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 186

{
  "1234566789abcdef12345678": [{
    "endTime": "2018-11-06T11:27:54.705Z",
    "startTime": "2018-11-06T11:27:52.345Z",
    "type": "receiving"
  }, {
    ...
  }]
}

4.12. Task model

The response of the task endpoint is an object that maps correlation IDs to an array of tasks. Each task has a type, as well as a startTime and endTime. Depending on their type, tasks may have additional properties (see definition of task types below). A task may also have an error property containing a list of errors that occurred during the task execution.

The process of importing a file into GeoRocket through the POST file endpoint will be tracked by a receiving task, an importing task, and an indexing task. The process is finished when all these tasks are finished—​i.e. when their endTime properties are set. The indexing task will always finish last.

The process of deleting chunks through the DELETE chunks endpoint will be tracked by a removing task and a purging task. The process is finished when both tasks are finished, but the purging task will always finish last.

4.12.1. Common properties

type
(required)

The task type. Valid values are receiving, importing, indexing, removing, and purging.

startTime
(required)

An ISO-8601 timestamp specifying when the task was started

endTime
(optional)

An ISO-8601 timestamp specifying when the task has ended. This property will not be set if the task is still running.

errors
(optional)

An array of errors that occurred during the task execution. This property will not be set if the task is still running or if it was executed successfully.

4.12.2. Example response

The following response contains a correlation ID 1234566789abcdef12345678 with three successful tasks, as well as another correlation ID 2234566789abcdef12345679 with two failed tasks.

{
  "1234566789abcdef12345678": [{
    "type": "receiving",
    "startTime": "2018-12-03T13:40:50.285328Z",
    "endTime": "2018-12-03T13:40:52.607407Z"
  }, {
    "type": "importing",
    "startTime": "2018-12-03T13:40:53.582647Z",
    "endTime": "2018-12-03T13:40:56.822243Z",
    "importedChunks": 2025
  }, {
    "type": "indexing",
    "startTime": "2018-12-03T13:40:59.424719Z",
    "endTime": "2018-12-03T13:41:02.927662Z",
    "indexedChunks": 2025
  }],
  "2234566789abcdef12345679": [{
    "type": "purging",
    "startTime": "2018-12-03T14:00:07.369Z",
    "endTime": "2018-12-03T14:00:07.699Z",
    "purgedChunks": 0,
    "totalChunks": 8642,
    "errors": [{
      "reason": "One or more chunks could not be deleted",
      "type": "ReplyException"
    }]
  }, {
    "type": "removing",
    "startTime": "2018-12-03T14:00:07.641Z",
    "endTime": "2018-12-03T14:00:07.698Z",
    "removedChunks": 0,
    "totalChunks": 8642,
    "errors": [{
      "reason": "A very descriptive example error message",
      "type": "generic_error"
    }]
  }]
}
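From such a response, a client can decide whether a process is still running, has finished, or has failed. A minimal Python sketch (summarize_tasks is a hypothetical helper name, and the embedded response is an abbreviated version of the example above): a process is finished when every task under its correlation ID has an endTime, and failed when any task carries errors:

```python
# Sketch: interpreting a task-model response. A correlation ID's process is
# "failed" if any task has errors, "finished" if all tasks have an endTime,
# and "running" otherwise. summarize_tasks is a hypothetical helper name.
import json

def summarize_tasks(body):
    result = {}
    for correlation_id, tasks in json.loads(body).items():
        if any("errors" in task for task in tasks):
            result[correlation_id] = "failed"
        elif all("endTime" in task for task in tasks):
            result[correlation_id] = "finished"
        else:
            result[correlation_id] = "running"
    return result

response = """{
  "1234566789abcdef12345678": [
    {"type": "receiving",
     "startTime": "2018-12-03T13:40:50.285328Z",
     "endTime": "2018-12-03T13:40:52.607407Z"}
  ],
  "2234566789abcdef12345679": [
    {"type": "removing",
     "startTime": "2018-12-03T14:00:07.641Z",
     "removedChunks": 0,
     "totalChunks": 8642,
     "errors": [{"reason": "example", "type": "generic_error"}]}
  ]
}"""
print(summarize_tasks(response))
# {'1234566789abcdef12345678': 'finished', '2234566789abcdef12345679': 'failed'}
```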

4.12.3. Receiving task

This task tracks the progress while a file is being received by GeoRocket through the POST file endpoint. When this task is finished, the file has been uploaded to GeoRocket but it has not been imported and indexed yet. This means the file contents cannot be queried yet.

Properties

type
(required)

The value is always receiving.

4.12.4. Importing task

This task tracks the progress of importing a file into GeoRocket’s data store. The task starts immediately after the file has been received—​i.e. when the receiving task has ended.

When the importing task is finished, the file has been imported but it has not been fully indexed yet. Importing and indexing run in parallel but importing will always finish first. This means that until both tasks are finished, the file contents cannot be fully queried.

Properties

type
(required)

The value is always importing.

importedChunks
(required)

The number of chunks imported so far. When the task has finished, this value will equal the total number of imported chunks.

4.12.5. Indexing task

This task tracks the progress of indexing chunks in GeoRocket’s data store. When this task is finished, GeoRocket has processed (i.e. received, imported, and indexed) the entire contents of the file. The indexing task will always finish after the receiving and importing tasks.

Properties

type
(required)

The value is always indexing.

indexedChunks
(required)

The number of chunks indexed so far. When the task has finished, both values—​the number of indexed chunks as well as the number of imported chunks—​will be equal.

4.12.6. Removing task

This task tracks the progress of removing chunks from GeoRocket’s index. The removing task and the purging task run in parallel. When both tasks have finished, the chunks have been deleted completely from GeoRocket. The purging task will always finish after the removing task.

Properties

type
(required)

The value is always removing.

totalChunks
(required)

The total number of chunks to remove. Always equals totalChunks from the purging task with the same correlation ID.

removedChunks
(required)

The number of chunks removed so far. When the task has finished, totalChunks and removedChunks will be equal. In addition, when the corresponding purging task with the same correlation ID has also finished, removedChunks will equal purgedChunks.

4.12.7. Purging task

This task tracks the progress of removing chunks from GeoRocket’s data store. The purging task and the removing task run in parallel. When both tasks have finished, the chunks have been deleted completely from GeoRocket. The purging task will always finish after the removing task.

Properties

type
(required)

The value is always purging.

totalChunks
(required)

The total number of chunks to remove. Always equals totalChunks from the removing task with the same correlation ID.

purgedChunks
(required)

The number of chunks removed so far. When the task has finished, totalChunks and purgedChunks will be equal. In addition, when the corresponding removing task with the same correlation ID has also finished, purgedChunks will equal removedChunks.

4.13. Compression

The GeoRocket HTTP interface supports GZIP compression. If the configuration item georocket.http.compress is set to true (default), GeoRocket is able to compress responses of all operations described above. Note that this will only work if the client advertises that it understands gzip by sending an appropriate Accept-Encoding HTTP header.

In addition, the POST file operation supports GZIP compression. Clients can upload compressed files to GeoRocket by including a Content-Encoding header in the request with a value of gzip.
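The client side of a compressed upload can be sketched with Python's standard library. The snippet only prepares the body and headers (the GeoJSON content is an arbitrary example; no request is actually sent here):

```python
# Sketch: preparing a gzip-compressed body for the POST file operation.
import gzip

geojson = b'{"type": "Feature", "geometry": null, "properties": {}}'
body = gzip.compress(geojson)
headers = {
    "Content-Encoding": "gzip",   # the uploaded body is gzip-compressed
    "Accept-Encoding": "gzip",    # allow GeoRocket to compress its response
}
assert gzip.decompress(body) == geojson  # compression is lossless
```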

4.14. Error responses

All endpoints described above return standardised HTTP status codes. With these status codes you are able to determine if an operation was successful or not. The error codes are descriptive (see RFC7231), but sometimes more information is needed. Whenever an error occurs, GeoRocket returns a JSON object providing additional details. The JSON object always has the same structure:

  • It has a property named error.

  • This property is an object with the properties type and reason.

  • type is a string providing more information about what kind of error has occurred.

  • reason is a human-readable string giving details about the cause of the error.

Error types

At the moment, the following values are defined for the error type:

generic_error

A generic error occurred, see the property reason for details.

http_error

The server issued an HTTP request to a third-party system (e.g. Elasticsearch) which failed

invalid_property_syntax_error

The syntax of a property is not valid. Valid properties are in the form key:value.

More types may be added in future versions of GeoRocket.

Example response
HTTP/1.1 404 Not Found
Transfer-Encoding: chunked

{"error":{"type":"generic_error","reason":"Not Found"}}
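Client code can surface these details instead of the bare status code. A minimal Python sketch (parse_error is a hypothetical helper name):

```python
# Sketch: extracting type and reason from a GeoRocket JSON error body.
# parse_error is a hypothetical helper name.
import json

def parse_error(body):
    error = json.loads(body)["error"]
    return error["type"], error["reason"]

kind, reason = parse_error('{"error":{"type":"generic_error","reason":"Not Found"}}')
# kind == "generic_error", reason == "Not Found"
```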

5. Query language

The GeoRocket query language can be used to search the data store for chunks matching given criteria.

5.1. Strings

GeoRocket performs a full-text search for strings in every tag and every indexed attribute.

Example:

string

5.2. Bounding boxes

Bounding boxes can be specified using four floating point numbers separated by a comma. The format is:

left,bottom,right,top

or

minimum_longitude,minimum_latitude,maximum_longitude,maximum_latitude

Example:

13.378,52.515,13.380,52.517

By default, spatial queries should be given in WGS84 coordinates (longitude/latitude), but you can configure a different default coordinate reference system in GeoRocket’s configuration file.

Alternatively, you may specify a coordinate reference system (CRS) directly in the query. For this, you have to put the CRS string in front of the coordinates. For example, the following notation specifies a bounding box in the metric 'DHDN / 3-degree Gauss-Kruger zone 3' reference system:

EPSG:31467:3477533,5605738,3477534,5605739

CRS strings should be in the form EPSG:<code> (e.g. EPSG:25832). See the EPSG registry for more information.
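Both notations can be produced with a small helper, sketched below in Python (bbox_term is a hypothetical name, not part of any GeoRocket client library):

```python
# Sketch: formatting a bounding-box query term, optionally prefixed with a
# CRS string as described above. bbox_term is a hypothetical helper name.
def bbox_term(left, bottom, right, top, crs=None):
    coords = f"{left},{bottom},{right},{top}"
    return f"{crs}:{coords}" if crs else coords

print(bbox_term(3477533, 5605738, 3477534, 5605739, crs="EPSG:31467"))
# EPSG:31467:3477533,5605738,3477534,5605739
```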

5.3. Logical operators

The operators OR, AND and NOT can be used to logically combine terms in a query. They are applied using the following notation:

<operator>(<operand_1> <operand_2> ... <operand_n>)

Operands are separated by a space character. Logical operations can be nested.

Examples:

AND(a b)
AND(a NOT(b))
OR(NOT(a) NOT(b))

5.3.1. OR

Use the logical OR operator to search for chunks that match at least one of the given operands.

Example:

OR(foo 13.378,52.515,13.380,52.517 bar)

This example matches all chunks that have a tag or indexed attribute with the value foo or bar as well as those that are within the bounding box 13.378,52.515,13.380,52.517.

By default, if you don’t specify a logical operation, all top-level terms in a query are combined by OR. Just use a space character to separate operands. The following query is a shorthand for the example above.

Example:

foo 13.378,52.515,13.380,52.517 bar

5.3.2. AND

Use the logical AND operator to search for chunks that match all of the given operands.

Example:

AND(13.378,52.515,13.380,52.517 foobar)

This example matches all chunks that are within the bounding box 13.378,52.515,13.380,52.517 and that have a tag or indexed attribute with a value of foobar.

5.3.3. NOT

Use the logical NOT operator to search for chunks that match none of the given operands.

Example:

NOT(13.378,52.515,13.380,52.517 foobar)

This example matches all chunks that are not within the bounding box 13.378,52.515,13.380,52.517 and that don’t have a tag or indexed attribute with a value of foobar.

5.4. Comparison operators

These operators can be used to compare property values to literals. There are five comparison operators:

EQ

equals

The property value must be equal to the given literal

LT

less than

The property value must be less than the given literal

GT

greater than

The property value must be greater than the given literal

LTE

less or equal

The property value must be less than or equal to the given literal

GTE

greater or equal

The property value must be greater than or equal to the given literal

Similar to logical operators, comparison operators must be given in prefix notation as follows:

<operator>(<property> <literal>)

Examples:

EQ(type building)
LT(lod 3)
GTE(yearOfConstruction 1982)

You can also combine logical and comparison operators as follows:

NOT(EQ(type building))
OR(EQ(lod 1) GT(lod 2))
AND(GTE(yearOfConstruction 1982) LT(yearOfConstruction 2000))

Numerical property values, dates, and times are automatically analysed by GeoRocket and can be used in combination with the comparison operators. For example, if you attach a property named importDate to all chunks, denoting the date when the chunk was imported into GeoRocket, you will be able to query the data store for all chunks whose importDate is before 1 January 2017 with the following query:

LT(importDate 2017-01-01)

Dates must be given in the form YYYY-MM-DD, YYYY-MM or YYYY. Times must be given as HH:mm:ss, HH:mm or HH.
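Nested terms like the ones above can be composed and URL-encoded for the search parameter with a few lines of Python (the helper name op is hypothetical):

```python
# Sketch: composing query terms in prefix notation and URL-encoding the
# result for use in a search parameter. op is a hypothetical helper name.
from urllib.parse import quote

def op(name, *operands):
    # <operator>(<operand_1> <operand_2> ... <operand_n>)
    return f"{name}({' '.join(operands)})"

query = op("AND",
           op("GTE", "yearOfConstruction", "1982"),
           op("LT", "yearOfConstruction", "2000"))
print(query)   # AND(GTE(yearOfConstruction 1982) LT(yearOfConstruction 2000))
encoded = quote(query)  # safe to append as ...?search=<encoded>
```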

6. Client configuration

You can configure GeoRocket’s command-line application (CLI) by editing the file conf/georocket.yaml in the application directory. The file must be a valid YAML file. The following sections describe possible configuration keys and values.

Keys are specified using the dot notation. You can use the keys in your file as they are specified here or use normal YAML notation instead. For example, the following configuration item

georocket.host: localhost

is identical to:

georocket:
  host: localhost

6.1. Server connection

georocket.host
(default: "localhost")

The host where GeoRocket Server is running.

georocket.port
(default: 63020)

The TCP port GeoRocket Server is listening on.

7. Server configuration

You can configure GeoRocket Server by editing the file conf/georocketd.yaml in the application directory. The file must be a valid YAML file. The following sections describe possible configuration keys and values.

Keys are specified using the dot notation. You can use the keys in your file as they are specified here or use normal YAML notation instead. For example, the following configuration item

georocket.storage.class: io.georocket.storage.file.FileStore

is identical to:

georocket:
  storage:
    class: io.georocket.storage.file.FileStore

You may override items in your configuration file with environment variables. This is particularly useful if you are using GeoRocket inside a Docker container. The environment variables use a slightly different naming scheme. All variables are in capital letters and dots are replaced by underscores. For example, the configuration key georocket.storage.class becomes GEOROCKET_STORAGE_CLASS and georocket.storage.mongodb.database becomes GEOROCKET_STORAGE_MONGODB_DATABASE. You may use YAML syntax to specify the environment variable values.
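The naming scheme can be expressed in two lines of Python; this sketch simply mirrors the rule stated above (uppercase the key, replace dots with underscores):

```python
# Sketch of the environment-variable naming scheme: capitalize the
# configuration key and replace dots with underscores.
def to_env_var(config_key):
    return config_key.upper().replace(".", "_")

print(to_env_var("georocket.storage.class"))             # GEOROCKET_STORAGE_CLASS
print(to_env_var("georocket.storage.mongodb.database"))  # GEOROCKET_STORAGE_MONGODB_DATABASE
```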

7.1. General

georocket.home
(default: application directory)

An absolute path to the directory where GeoRocket can find its configuration and where it should put its internal storage directory.

georocket.logConfig
(default: false)

A boolean value (true or false) denoting whether GeoRocket should log its configuration on startup. This can be useful for debugging.

7.2. Queries

georocket.query.defaultCRS
(default: EPSG:4326)

A coordinate reference system (CRS) that should be used by default for all queries. CRS strings should be given in the form EPSG:<code> (e.g. EPSG:25832). See the EPSG registry for more information. The default value refers to World Geodetic System 1984 (WGS 84), which is the reference coordinate system used by the Global Positioning System (GPS) based on longitude and latitude.

7.3. HTTP interface

georocket.host
(default: "127.0.0.1")

The host GeoRocket should bind to. By default GeoRocket only listens to incoming connections from 127.0.0.1 (localhost). If you want it to listen to connections from arbitrary clients, set this configuration item to 0.0.0.0.

georocket.port
(default: 63020)

The TCP port GeoRocket should listen on.

georocket.http.compress
(default: true)

A boolean value (true or false) denoting whether GeoRocket should compress responses with gzip/deflate if the client supports it.

georocket.http.ssl
(default: false)

A boolean value (true or false) denoting if HTTP connections should be encrypted via SSL/TLS. This feature requires georocket.http.certPath and georocket.http.keyPath to be set.

georocket.http.certPath
(optional)

Path to an X.509 certificate file to be used for encryption. Only necessary if georocket.http.ssl is enabled.

georocket.http.keyPath
(optional)

Path to a file containing a non-encrypted private key to be used for encryption. Only necessary if georocket.http.ssl is enabled.

georocket.http.alpn
(default: false)

true if GeoRocket should support Application-Layer Protocol Negotiation (ALPN) and, hence, HTTP/2 connections. This feature requires georocket.http.ssl to be enabled.

georocket.http.cors.enable
(default: false)

A boolean value (true or false) denoting whether Cross-Origin Resource Sharing (CORS) should be enabled (i.e. whether GeoRocket can be accessed by a browser on another origin).

georocket.http.cors.allowOrigin
(defaults to no allowed origins)

A regular expression specifying allowed origins. Use * to allow all origins.

georocket.http.cors.allowCredentials
(default: false)

A boolean value (true or false) denoting whether the Access-Control-Allow-Credentials response header should be returned.

georocket.http.cors.allowHeaders
(optional)

A string or an array indicating which header field names can be used during a request.

georocket.http.cors.allowMethods
(optional)

A string or an array indicating which HTTP methods can be used during a request.

georocket.http.cors.exposeHeaders
(optional)

A string or an array indicating which response headers are safe to expose to the client (i.e. the value of the Access-Control-Expose-Headers header).

georocket.http.cors.maxAge
(optional)

The number of seconds the results of a preflight request can be cached in a preflight result cache.

7.4. Back-ends

georocket.storage.class
(defaults to the H2 back-end)

The data store implementation to use. Possible values include:
io.georocket.storage.file.FileStore
io.georocket.storage.h2.H2Store
io.georocket.storage.hdfs.HDFSStore
io.georocket.storage.mongodb.MongoDBStore
io.georocket.storage.s3.S3Store

7.4.1. File back-end

Store chunks in a folder structure on the local hard drive. Each chunk will be written to a separate file.

Data store implementation
io.georocket.storage.file.FileStore
Configuration

georocket.storage.file.path
(required)

The path on the local hard drive where the data store should be located.

7.4.2. H2 back-end

Store chunks in an H2 database on the local hard drive. This back-end is typically much faster than the file back-end. All chunks will be written to a single file (the H2 database).

Data store implementation
io.georocket.storage.h2.H2Store
Configuration

georocket.storage.h2.path
(required)

The path on the local hard drive where the H2 database file should be located.

georocket.storage.h2.compress
(default: false)

A boolean value (true or false) denoting whether the chunks stored in the H2 database should be compressed using the LZF algorithm. This can save a lot of disk space but will slow down read and write operations slightly.

7.4.3. HDFS

Store chunks on HDFS (Hadoop distributed file system). Each chunk will be written to a separate file on the distributed file system.

Data store implementation
io.georocket.storage.hdfs.HDFSStore
Configuration

georocket.storage.hdfs.defaultFS
(required)

The endpoint of the HDFS NameNode

georocket.storage.hdfs.path
(required)

The path on the distributed file system where the chunks should be stored. The directory must exist and write permissions must have been granted to the user executing GeoRocket.

7.4.4. MongoDB

Store chunks in a MongoDB database. GeoRocket uses MongoDB’s GridFS to store chunks. This back-end is recommended for applications that need very fast and efficient storage (optionally combined with other capabilities of MongoDB such as replication and sharding).

Data store implementation
io.georocket.storage.mongodb.MongoDBStore
Configuration

georocket.storage.mongodb.connectionString
(required)

The connection string URI used to connect to MongoDB. For example: mongodb://localhost:27017

georocket.storage.mongodb.database
(required)

The database where the chunks should be stored

It is possible to compress the communication between GeoRocket and MongoDB by specifying the compressors option as part of the connection string. The following connection string enables the fast Snappy compression algorithm:

mongodb://localhost:27017/?compressors=snappy

This can save a lot of bandwidth since the chunks managed by GeoRocket can typically be compressed very effectively. It is recommended to always enable this option. See the MongoDB Java driver documentation for more information.

7.4.5. Amazon S3

Store chunks in an Amazon S3 bucket. Each chunk will be written to a separate object.

Data store implementation
io.georocket.storage.s3.S3Store
Configuration

georocket.storage.s3.accessKey
(required)

The Amazon S3 Access Key used for authentication

georocket.storage.s3.secretKey
(required)

The Amazon S3 Secret Key used for authentication

georocket.storage.s3.host
(required)

The host of the S3 endpoint

georocket.storage.s3.port
(default: 80)

The port of the S3 endpoint

georocket.storage.s3.bucket
(required)

The S3 bucket where chunks should be stored

georocket.storage.s3.pathStyleAccess
(default: true)

true if path-style access to the S3 bucket should be used, or false if virtual-hosted-style access (a sub-domain) should be used

georocket.storage.s3.forceSignatureV2
(default: false)

true if S3 requests should be signed using the old Signature V2 algorithm instead of newer versions

georocket.storage.s3.requestExpirySeconds
(default: 600)

The number of seconds a pre-signed S3 request should stay valid

7.5. Index

georocket.index.maxBulkSize
(default: 200)

The maximum number of chunks GeoRocket sends to Elasticsearch for indexing in one request. Tweak this parameter if you experience problems with Elasticsearch being too busy.

georocket.index.maxParallelInserts
(default: 5)

The maximum number of files GeoRocket imports in parallel. If more files are sent to GeoRocket they will be put into a queue. Tweak this parameter if you experience problems with Elasticsearch or GeoRocket being too busy and occupying too many resources.

georocket.index.maxQueuedChunks
(default: 10000)

The maximum number of chunks the indexer queues due to backpressure before it pauses the import. If this happens, the indexer will later unpause the import as soon as at least half of the queued chunks have been indexed. Lower this value if you are importing a large amount of data and GeoRocket uses too much memory.

georocket.index.indexableChunkCache.maxSize
(default: 67108864 = 64 MB)

After chunks have been imported into the store and before they are indexed, they are temporarily put into a cache to save bandwidth and time. This configuration item specifies the maximum size of this cache in bytes. The more often GeoRocket can make use of cached chunks, the faster it will index them and the less it has to communicate with the storage back-end. A high maximum cache size may mean more memory consumption (depending on how many chunks are kept in the cache at a time). A reasonable value is the average size of the geospatial files you typically import but you may also choose a much higher value if you have enough available RAM in your system.

georocket.index.indexableChunkCache.maxTimeSeconds
(default: 60)

The maximum number of seconds a chunk stays in the cache after import and before it is indexed. If this value is too low, chunks may have to be retrieved from the storage back-end during indexing.

georocket.index.spatial.precision
(default: maximum)

The desired precision for the spatial indexer in GeoRocket. The value should be a number followed by a distance unit (e.g. 1m, 2km, 10cm, 1mi). Note that the higher the precision, the more memory GeoRocket will use. Set this configuration item to a value that is reasonable for your application. The default value is the highest precision GeoRocket (or Elasticsearch) can achieve. However, this value might not work well for geometries that cover a large area such as a whole country (or even the world). Reduce the precision in such a case to save memory and to avoid crashes.
ATTENTION: This value cannot be changed once GeoRocket has created its index. Set this value before you start GeoRocket for the first time.

7.6. Elasticsearch

The GeoRocket distribution contains a version of Elasticsearch that will automatically be started together with GeoRocket by default. You can disable this behaviour and use a remote Elasticsearch instance instead.

Set the following configuration items to disable the provided Elasticsearch instance and to configure the host and port of the remote one:

georocket:
  index:
    elasticsearch:
      embedded: false
      hosts: ["192.168.0.100:9200"]

Replace the connection string 192.168.0.100:9200 with the actual hostname (or IP address) and port of your existing Elasticsearch instance.

7.6.1. Configuration

georocket.index.elasticsearch.embedded
(default: true)

true if GeoRocket should launch the provided Elasticsearch instance. false if it should connect to an existing instance.

georocket.index.elasticsearch.hosts
(default: ["localhost:9200"])

An array of connection strings. If georocket.index.elasticsearch.embedded is false, the array defines how to connect to an existing Elasticsearch instance/cluster. Each item is a string consisting of a hostname (or IP address) and a port joined by a colon and denotes the address of an Elasticsearch node. If your Elasticsearch instance has only one node, you must specify exactly one item in the array. If you want to connect to multiple nodes of a cluster, you may specify multiple items (e.g. ["192.168.0.100:9200", "192.168.0.101:9200", "192.168.0.102:9200"]). If georocket.index.elasticsearch.embedded is true, only the first item in the array will be considered. In this case, the item specifies the host and port to which the embedded Elasticsearch instance will be bound.

georocket.index.elasticsearch.autoUpdateHostsIntervalSeconds
(default: -1)

If this configuration item is greater than 0, GeoRocket will regularly poll the configured Elasticsearch cluster and automatically update the list of nodes (georocket.index.elasticsearch.hosts). This is useful if you have a dynamic cluster with a changing number of hosts or if you do not want to specify all nodes in georocket.index.elasticsearch.hosts and wish GeoRocket to fill it automatically for you. For example, you may only specify the Elasticsearch master nodes and let GeoRocket discover the data nodes automatically. The configuration item specifies the update interval in seconds. A reasonable number is 300, which equals 5 minutes. Note that this configuration item will be ignored if georocket.index.elasticsearch.embedded is true.

georocket.index.elasticsearch.compressRequestBodies
(default: false)

true if bodies of HTTP requests sent to Elasticsearch should be compressed with GZIP. This can save bandwidth but only works if HTTP compression is enabled in Elasticsearch.

georocket.index.elasticsearch.javaOpts
(optional)

JVM options for the embedded Elasticsearch instance. This configuration item will only be taken into account if georocket.index.elasticsearch.embedded is true. It can be overridden through the environment variable ES_JAVA_OPTS. See the Elasticsearch documentation for more information about ES_JAVA_OPTS and reasonable values for the heap size.


1. Exported files might have a slightly different formatting. Whitespace between chunks might differ, but other than that, exported files contain the exact same information as imported ones.