Data Gator blog

Aggregation

Monday 13 February 2017

Updating Topic Data

In this post I will describe one way that a Topic's data can be updated and thus aggregated. Updating is done strictly on demand, which allows flexible configuration choices depending on the needs of the user. To update the data, all one needs to do is make an HTTP PUT request to the updateData endpoint of the Topic. After receiving this request, the DataGator web service requests data from the broker that was configured when the Topic was created. Once the data is received from the broker, the Topic aggregates it using the Aggregate(s) that have been configured for it. This is described in more detail in the post below titled Aggregates.

Here is a simple example of getting the Topic data to update using curl.

curl -X PUT 'http://api.datagator.tech/Topics/588b99fbfba65a25640ccc58/updateData?access_token=TOKEN'

A sample response after making the above request would look similar to this.

{
"updated": {
"Open@3600": "Aggregated 1859 observations in 0.00978148 seconds with 0 bad observations. ",
"High@3600": "Aggregated 1859 observations in 0.0095574 seconds with 0 bad observations. ",
"Low@3600": "Aggregated 1859 observations in 0.0108046 seconds with 0 bad observations. ",
"Close@3600": "Aggregated 1859 observations in 0.0123292 seconds with 0 bad observations. ",
"Total": "7436 aggregate operations performed in 0.0424727 seconds. (175077 agg ops per sec)"
}
}

Some status and timing information is currently included in the results, mainly to give a relative idea of the impact of any changes that have been made. The timings cover only opening a connection to where the Aggregate data is stored, aggregating the data, and storing it in its database. Each Aggregate uses a separate and segregated data store.
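
If you want to track these numbers over time, the status strings can be pulled apart with a small script. The Python sketch below is only an illustration: the pattern is derived from the sample response above, the function name summarize_update is my own, and the wording of the messages may well change since they are there mainly for diagnostics.

```python
import json
import re

# Pattern derived only from the sample response above; the service may
# change this wording at any time.
STATUS_PATTERN = re.compile(
    r"Aggregated (\d+) observations in ([0-9.eE+-]+) seconds with (\d+) bad observations"
)


def summarize_update(response_text):
    """Return {aggregate_key: (observations, seconds, bad_observations)}."""
    updated = json.loads(response_text)["updated"]
    summary = {}
    for key, message in updated.items():
        match = STATUS_PATTERN.search(message)
        if match:  # the "Total" entry uses a different wording and is skipped
            summary[key] = (
                int(match.group(1)),
                float(match.group(2)),
                int(match.group(3)),
            )
    return summary


sample = """{"updated": {
  "Open@3600": "Aggregated 1859 observations in 0.00978148 seconds with 0 bad observations. ",
  "Total": "7436 aggregate operations performed in 0.0424727 seconds. (175077 agg ops per sec)"
}}"""
print(summarize_update(sample))
```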

There are various ways of making the HTTP PUT request that will update the Topic data. I am currently using a Node-RED flow that periodically makes the request. Using a Webhook service triggered by just about anything is also a possibility. Ideally, the IoT gateway that collects the data should have a good idea of when it would be a good time to have it aggregated.
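
For anyone who would rather script the periodic request than use a flow or a Webhook, here is a minimal Python sketch using only the standard library. It mirrors the curl example above. The function names, the one hour interval and the 30 second timeout are my own assumptions, not part of the service.

```python
import time
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://api.datagator.tech"  # host taken from the curl example


def build_update_request(topic_id, access_token):
    """Build the HTTP PUT request for a Topic's updateData endpoint."""
    url = (
        f"{BASE_URL}/Topics/{quote(topic_id, safe='')}/updateData?"
        + urlencode({"access_token": access_token})
    )
    return urllib.request.Request(url, method="PUT")


def update_periodically(topic_id, access_token, interval_seconds=3600, rounds=None):
    """Send the update request every interval_seconds (hypothetical helper).

    rounds=None loops forever; a number limits how many requests are sent.
    """
    sent = 0
    while rounds is None or sent < rounds:
        request = build_update_request(topic_id, access_token)
        with urllib.request.urlopen(request, timeout=30) as response:
            print(response.read().decode("utf-8"))
        sent += 1
        if rounds is None or sent < rounds:
            time.sleep(interval_seconds)
```

Calling update_periodically("588b99fbfba65a25640ccc58", "TOKEN") would then behave like running the curl command once an hour.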

Later I will discuss visualizing the aggregated data.

Sunday 5 February 2017

Aggregates

After a Topic has been created as described in my previous posts, the next thing needed is a way to aggregate its data. In this next set of posts I will discuss how that can be done.

First, it helps to understand that each new Aggregate is created with a Topic as its parent. Each Aggregate can have only one parent Topic; they are otherwise unique, individual entities. Other than its parent, the main properties of an Aggregate are its method and the time period it uses to aggregate the data. These properties cannot be changed after an Aggregate has been created; in fact, the only property that can be changed afterwards is its name.

When a Topic gets new data, it sends a complete copy of this data to each of its Aggregates. Each Aggregate then aggregates the data according to its configured options and stores the result for later use.
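
To make that fan-out a little more concrete, here is a rough Python sketch of the idea. It is not the DataGator implementation. The method names come from the examples in these posts, but what each one computes, the way buckets are aligned to a begin time, and the "method@period" result keys are all assumptions on my part.

```python
# Method names taken from the examples in these posts; the reducers
# themselves are an assumption about what each name means.
METHODS = {
    "Open": lambda values: values[0],    # first value in the period
    "High": max,                         # largest value in the period
    "Low": min,                          # smallest value in the period
    "Close": lambda values: values[-1],  # last value in the period
}


def aggregate(observations, method, period, begin=0):
    """Group (timestamp_seconds, value) pairs into period-wide buckets.

    Returns {bucket_start: aggregated_value}. Observations earlier than
    `begin` are ignored.
    """
    reducer = METHODS[method]
    buckets = {}
    for timestamp, value in sorted(observations, key=lambda obs: obs[0]):
        if timestamp < begin:
            continue
        bucket_start = begin + ((timestamp - begin) // period) * period
        buckets.setdefault(bucket_start, []).append(value)
    return {start: reducer(values) for start, values in buckets.items()}


def update_topic(observations, aggregates):
    """Hand the same full copy of the data to every configured Aggregate."""
    return {
        f"{a['aggregateMethod']}@{a['period']}": aggregate(
            observations, a["aggregateMethod"], a["period"]
        )
        for a in aggregates
    }
```

With hourly periods, observations at 0, 1800 and 3599 seconds all land in the first bucket, while one at 3600 seconds starts the second.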

Creating a new Aggregate is very similar to creating a Topic, as discussed earlier. Below is the JSON configuration structure for an Aggregate, with some brief descriptions.

Aggregate {
aggregateMethod (string): The aggregation method to be applied to the Topic data at each period duration. ,
period (number): The duration of time specified in seconds to aggregate the Topic data. ,
beginDate (string, optional): The starting time and date to use as the beginning of the aggregated Topic data. ,
aggregateName (string, optional): A name to associate with this method of aggregated Topic data.
}

There are two required values (aggregateMethod and period) that need to be set in order to create a new Aggregate. If the beginDate is not set, then the brokerCreate date from the Topic to which this Aggregate belongs will be used. The aggregateName stores a more human-readable way to identify this Aggregate and can also be used as a search criterion to find available Aggregates.

Here is an example using curl that will create a new Aggregate.

curl -X POST -H 'Content-Type: application/json' -d '{
"aggregateName": "Close",
"aggregateMethod": "Close",
"period": 3600
}' 'http://api.datagator.tech/Topics/587f7c45aa784d4a3743667d/aggregates?access_token=TOKEN'

The Topic that this Aggregate will belong to is in the URI path used to create it. This is why it is not necessary to declare it in the configuration JSON.
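
The same request can be built from a script. The Python sketch below uses only the standard library and checks for the two required values before anything is sent. The function name and the client-side check are my own additions, and sending the body as application/json is an assumption about what the service expects.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "http://api.datagator.tech"  # host taken from the curl example
REQUIRED_FIELDS = ("aggregateMethod", "period")


def build_create_aggregate_request(topic_id, access_token, config):
    """Build the POST request that creates an Aggregate under a Topic.

    The parent Topic is identified by the URI path, so it does not appear
    in the JSON configuration.
    """
    missing = [field for field in REQUIRED_FIELDS if field not in config]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    url = (
        f"{BASE_URL}/Topics/{quote(topic_id, safe='')}/aggregates?"
        + urlencode({"access_token": access_token})
    )
    return urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )
```

Passing the returned request to urllib.request.urlopen would send it, and the response body could then be read with json.loads.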

The response from creating the example Aggregate is below.

{
"aggregateMethod": "Close",
"period": 3600,
"beginDate": "2014-05-20T21:50:32.000Z",
"aggregateName": "Close",
"id": "588be68d58a8a828986379b4",
"created": "2017-01-28T00:32:13.674Z",
"modified": "2017-01-28T00:32:13.674Z"
}

In a future post I will discuss how to actually get the data aggregated.