Problem Description
I'm not sure whether this is even supposed to work, but storing both numeric and string data within a single channel results in three issues:
- Doing an export on such a channel will only return the numeric data. However, doing a tile request returns both.
- Inserting a sample with string data for a timestamp which has no numeric data causes a numeric data value to be inserted as well, apparently from the neighboring timestamp.
- Doing a gettile for a data sample at a time that has a string value but no numeric value returns 0 for the value. One would expect null instead.
Before illustrating the steps to reproduce, assume we have the following two JSON data files to insert into the datastore:
data1.json
{
"channel_names" : ["foo"],
"data" : [
[1450000001, 10],
[1450000002, 20],
[1450000004, 40],
[1450000005, 50],
[1450000006, 60],
[1450000007, 70],
[1450000008, 80],
[1450000009, 90]
]
}
data2.json
{
"channel_names" : ["foo"],
"data" : [
[1450000003, "thirty"],
[1450000007, "seventy"]
]
}
Steps To Reproduce
Begin by inserting the first data file:
./bin/import --format json ./data-test 100 cpb_device ./data1.json
It should succeed with the following response:
{
"channel_specs" : {
"foo" : {
"channel_bounds" : {
"max_time" : 1450000009,
"max_value" : 90,
"min_time" : 1450000001,
"min_value" : 10
},
"imported_bounds" : {
"max_time" : 1450000009,
"max_value" : 90,
"min_time" : 1450000001,
"min_value" : 10
}
}
},
"failed_records" : 0,
"max_time" : 1450000009,
"min_time" : 1450000001,
"successful_records" : 1
}
Now do an export to verify:
./bin/export --csv ./data-test 100.cpb_device.foo
It should print the following:
EpochTime,100.cpb_device.foo
1450000001,10
1450000002,20
1450000004,40
1450000005,50
1450000006,60
1450000007,70
1450000008,80
1450000009,90
Also verify by requesting a tile:
./bin/gettile ./data-test 100 cpb_device.foo -5 90625000
You should get the following:
{
"data" : [
[1450000001, 10, 0, 1],
[1450000002, 20, 0, 1],
[1450000004, 40, 0, 1],
[1450000005, 50, 0, 1],
[1450000006, 60, 0, 1],
[1450000007, 70, 0, 1],
[1450000008, 80, 0, 1],
[1450000009, 90, 0, 1],
[1450000012.5, -1e308, 0, 0]
],
"fields" : ["time", "mean", "stddev", "count"],
"level" : -5,
"offset" : 90625000
}
So far, so good. Now insert the second data file. Note that this data file contains two string values, one at time 1450000003 and another at time 1450000007. Looking at data1.json, we see that there's no existing numeric data value for this channel at time 1450000003, but there is one (70) for time 1450000007.
./bin/import --format json ./data-test 100 cpb_device ./data2.json
It should succeed with the following response:
{
"channel_specs" : {
"foo" : {
"channel_bounds" : {
"max_time" : 1450000009,
"max_value" : 90,
"min_time" : 1450000001,
"min_value" : 10
},
"imported_bounds" : {
"max_time" : 1450000007,
"min_time" : 1450000003
}
}
},
"failed_records" : 0,
"max_time" : 1450000007,
"min_time" : 1450000003,
"successful_records" : 1
}
Now do another export to verify:
./bin/export --csv ./data-test 100.cpb_device.foo
EpochTime,100.cpb_device.foo
1450000001,10
1450000002,20
1450000003,40
1450000004,40
1450000005,50
1450000006,60
1450000007,70
1450000008,80
1450000009,90
So, there are the first two problems: no string values are exported at all, and a numeric value (which we never inserted) is returned at time 1450000003. There's of course the question of how to report two values for a single timestamp (as with time 1450000007), but that's more of an implementation detail. Regardless, one would expect to at least see a string value for time 1450000003.
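For anyone who wants to check the export mechanically, here is a minimal Python sketch that works only on the CSV text pasted above. It does not call the datastore. The helper names (`parse_export`, `phantom_numeric`, `missing_strings`) are hypothetical and exist only for this illustration.

```python
import csv
import io

# Timestamps that received numeric values via data1.json.
INSERTED_NUMERIC = {1450000001, 1450000002, 1450000004, 1450000005,
                    1450000006, 1450000007, 1450000008, 1450000009}
# String samples inserted via data2.json.
INSERTED_STRINGS = {1450000003: "thirty", 1450000007: "seventy"}

# CSV text as printed by the second export in this report.
EXPORT_CSV = """EpochTime,100.cpb_device.foo
1450000001,10
1450000002,20
1450000003,40
1450000004,40
1450000005,50
1450000006,60
1450000007,70
1450000008,80
1450000009,90
"""

def parse_export(text):
    """Map each exported timestamp to its raw (string) value."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return {int(t): v for t, v in reader}

def phantom_numeric(exported, inserted_numeric):
    """Timestamps in the export that never received a numeric value."""
    return sorted(t for t in exported if t not in inserted_numeric)

def missing_strings(exported, inserted_strings):
    """Timestamps whose inserted string value is absent from the export."""
    return sorted(t for t, s in inserted_strings.items()
                  if exported.get(t) != s)

exported = parse_export(EXPORT_CSV)
print(phantom_numeric(exported, INSERTED_NUMERIC))   # [1450000003]
print(missing_strings(exported, INSERTED_STRINGS))   # [1450000003, 1450000007]
```

The first list corresponds to the unexpected numeric sample, the second to the string values missing from the export.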
Now do a gettile to see the difference:
./bin/gettile ./data-test 100 cpb_device.foo -5 90625000
{
"data" : [
[1450000001, 10, 0, 1, null],
[1450000002, 20, 0, 1, null],
[1450000002.5, -1e308, 0, 0, null],
[1450000003, 0, 0, 1, "thirty"],
[1450000003.5, -1e308, 0, 0, null],
[1450000004, 40, 0, 1, null],
[1450000005, 50, 0, 1, null],
[1450000006, 60, 0, 1, null],
[1450000007, 70, 0, 1, "seventy"],
[1450000008, 80, 0, 1, null],
[1450000009, 90, 0, 1, null],
[1450000012.5, -1e308, 0, 0, null]
],
"fields" : ["time", "mean", "stddev", "count", "comment"],
"level" : -5,
"offset" : 90625000
}
For gettile, we get the string values, but notice that the value at time 1450000003 is no longer reported as 40; it is now 0. One would expect a null value, just like the null values in the comment field.
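The same kind of check can be applied to the tile. The sketch below parses the gettile JSON shown above and flags rows that carry a comment and a nonzero count at a timestamp where no numeric value was ever inserted. It runs on the pasted output only, and `suspicious_rows` is a hypothetical helper, not part of the datastore tools.

```python
import json

# Timestamps that received numeric values via data1.json.
INSERTED_NUMERIC = {1450000001, 1450000002, 1450000004, 1450000005,
                    1450000006, 1450000007, 1450000008, 1450000009}

# gettile output as printed in this report, after importing data2.json.
TILE_JSON = """
{
  "data" : [
    [1450000001, 10, 0, 1, null],
    [1450000002, 20, 0, 1, null],
    [1450000002.5, -1e308, 0, 0, null],
    [1450000003, 0, 0, 1, "thirty"],
    [1450000003.5, -1e308, 0, 0, null],
    [1450000004, 40, 0, 1, null],
    [1450000005, 50, 0, 1, null],
    [1450000006, 60, 0, 1, null],
    [1450000007, 70, 0, 1, "seventy"],
    [1450000008, 80, 0, 1, null],
    [1450000009, 90, 0, 1, null],
    [1450000012.5, -1e308, 0, 0, null]
  ],
  "fields" : ["time", "mean", "stddev", "count", "comment"],
  "level" : -5,
  "offset" : 90625000
}
"""

def suspicious_rows(tile, inserted_numeric):
    """Return (time, mean) for rows that have a comment and count > 0
    at a timestamp that never received a numeric value."""
    fields = tile["fields"]
    t_i, mean_i, count_i, comment_i = (
        fields.index(name) for name in ("time", "mean", "count", "comment"))
    return [(row[t_i], row[mean_i])
            for row in tile["data"]
            if row[comment_i] is not None
            and row[count_i] > 0
            and row[t_i] not in inserted_numeric]

tile = json.loads(TILE_JSON)
print(suspicious_rows(tile, INSERTED_NUMERIC))  # [(1450000003, 0)]
```

The only flagged row is the one at time 1450000003, where the mean is 0 although a null would be expected.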