SQL - Deleting redundant values in timeseries data


Consider a database schema like this:

create table log (
    observation_point_id integer primary key not null,
    datetime text not null,
    value real not null
)

which contains 'observations' of a value; for example, a temperature measurement. The observation device (i.e., a thermometer :) ) samples the temperature every 5 seconds, and each sample gets logged to the database.

There are multiple thermometers, each of which is identified (for the purposes of this simplified example) by its 'observation_point'.

Now, let's assume the precision of the thermometer is 1 degree; then I will have many observations that are redundant. Let's say I log x degrees at 9h00m00s; it's quite likely that it is still x degrees at 9h00m05s, 9h00m10s, etc. I only need to store the value and the time at which that temperature was first measured, and the time at which it was last measured.

I could check on every insert whether the preceding value has become redundant and, if so, delete it. But that's quite expensive, considering that there are many loggers writing to the database, and that the logging frequency is higher than once every 5 seconds in the real use case.
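For reference, the insert-time check could look roughly like the sketch below. It assumes SQLite (accessed through Python's sqlite3 module), a `log` table that allows several rows per observation point, ISO-8601 timestamps so that text order matches time order, and observations that arrive in time order for each point. The function name `insert_observation` is made up for illustration.

```python
import sqlite3

def insert_observation(conn, point_id, when, value):
    """Insert one observation, first dropping the previous row if it became redundant.

    Assumes rows for a given point are inserted in increasing datetime order.
    """
    with conn:  # one transaction per insert
        # Fetch the two most recent rows for this observation point.
        last_two = conn.execute(
            "SELECT rowid, value FROM log "
            "WHERE observation_point_id = ? "
            "ORDER BY datetime DESC LIMIT 2",
            (point_id,),
        ).fetchall()
        # If both of them already carry the new value, the newest stored row
        # ends up strictly inside a run of equal values, so it can go.
        if len(last_two) == 2 and all(v == value for _, v in last_two):
            conn.execute("DELETE FROM log WHERE rowid = ?", (last_two[0][0],))
        conn.execute(
            "INSERT INTO log (observation_point_id, datetime, value) "
            "VALUES (?, ?, ?)",
            (point_id, when, value),
        )
```

This costs one extra SELECT (and sometimes a DELETE) per insert, which is exactly the overhead that makes the approach unattractive at high logging rates.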

So my idea was to run a 'cleanup' every, say, 1 minute, which deletes the values strictly between the extremes e1 and e2 of each interval [e1,e2], where an interval is defined by a series of subsequent values v1, v2, ..., vn with v1 = v2 = ... = vn. 'Subsequent' here means adjacent when ordered by 'datetime'.

My question: is there a way to express this in a SQL query? Or is there another way to approach this?
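One possible shape for such a query, offered as a sketch rather than a definitive answer: a row is redundant when both its immediate predecessor and its immediate successor (for the same observation point, ordered by datetime) hold the same value. The sketch assumes SQLite, driven here through Python's sqlite3 module, and ISO-8601 timestamps so that text comparison matches time order. It also assumes a modified schema with a composite key on (observation_point_id, datetime), because the schema as posted makes observation_point_id the sole primary key and so allows only one row per point. `SCHEMA`, `CLEANUP_SQL` and `cleanup` are illustrative names.

```python
import sqlite3

# Assumed variant of the schema: a composite key, so one point can have many rows.
SCHEMA = """
CREATE TABLE log (
    observation_point_id INTEGER NOT NULL,
    datetime TEXT NOT NULL,   -- assumed ISO-8601, so text order = time order
    value REAL NOT NULL,
    PRIMARY KEY (observation_point_id, datetime)
)
"""

# Delete every row whose previous AND next row (same point) carry the same value.
# The first and last row of each run fail one of the two tests and are kept.
CLEANUP_SQL = """
DELETE FROM log
WHERE rowid IN (
    SELECT cur.rowid
    FROM log AS cur
    WHERE cur.value = (SELECT prev.value
                       FROM log AS prev
                       WHERE prev.observation_point_id = cur.observation_point_id
                         AND prev.datetime < cur.datetime
                       ORDER BY prev.datetime DESC
                       LIMIT 1)
      AND cur.value = (SELECT nxt.value
                       FROM log AS nxt
                       WHERE nxt.observation_point_id = cur.observation_point_id
                         AND nxt.datetime > cur.datetime
                       ORDER BY nxt.datetime ASC
                       LIMIT 1)
)
"""

def cleanup(conn):
    """Run the cleanup in one transaction and return the number of rows deleted."""
    with conn:
        return conn.execute(CLEANUP_SQL).rowcount
```

Rows without a predecessor or successor compare against NULL and are therefore never deleted, so the very first and the most recent observation of each point always survive. The composite key doubles as the index that keeps the two correlated lookups cheap. On databases with window functions, `LAG` and `LEAD` partitioned by observation point would be an alternative way to find the same rows.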

(My baseline is to do a 'select ... order by' and then loop over the results.) I can't do this 'before' the values hit the database (i.e., cache the values until the next measurement, and only write a value if the measurement is different), because there might also be observations at a lower frequency than once every few seconds, and I cannot afford to lose any observations. (Now that I'm typing this, maybe I could 'cache' the values in a separate database table, but I think I'm straying too far from the real question now.)
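For comparison, the 'select ... order by' baseline could be sketched as follows. It assumes SQLite via Python's sqlite3 module, a `log` table that allows several rows per observation point and still has its implicit rowid, and ISO-8601 timestamps. The function names `redundant_rowids` and `cleanup_by_loop` are hypothetical.

```python
import sqlite3

def redundant_rowids(conn):
    """Scan the log in (point, time) order; return rowids strictly inside a run."""
    doomed = []
    prev_key = None   # (point, value) of the previous row
    pending = None    # rowid of the previous row if it was NOT the first of its run
    cursor = conn.execute(
        "SELECT rowid, observation_point_id, value FROM log "
        "ORDER BY observation_point_id, datetime"
    )
    for rowid, point, value in cursor:
        key = (point, value)
        if key == prev_key:
            # The run continues, so the previous row is not its last element.
            # If it was not the first element either, it is redundant.
            if pending is not None:
                doomed.append(pending)
            pending = rowid
        else:
            # A new run starts here; its first row is always kept.
            prev_key = key
            pending = None
    return doomed

def cleanup_by_loop(conn):
    """Delete the redundant rows in one transaction; return how many were removed."""
    doomed = redundant_rowids(conn)
    with conn:
        conn.executemany("DELETE FROM log WHERE rowid = ?", [(r,) for r in doomed])
    return len(doomed)
```

The loop reads every row once and issues the deletes afterwards, so all the data travels to the client; restricting the SELECT to rows newer than the previous cleanup (plus a little overlap) would be one way to bound that cost.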

