Downsampling data over time
As data ages, storage and query performance could be further optimized by downsampling the data. There should be a process or method to run an aggregate query on data older than a configurable age. Example: my database retention is 365 days and I ingest 3 samples per minute. Once data is 31 days old, an aggregate query rolls it up to the MAX value per 1-minute window. After 90 days, another aggregate finds the MAX value per 5-minute window. The raw data is of course deleted after each aggregate runs.
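A minimal sketch of what that tiered roll-up could look like, assuming samples arrive as (timestamp, value) pairs; the `downsample` function, the `TIERS` table, and the thresholds are just illustrative, not an existing API:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical tiers matching the example above: (minimum age, aggregation window).
TIERS = [
    (timedelta(days=90), timedelta(minutes=5)),  # 90+ days old: MAX per 5 minutes
    (timedelta(days=31), timedelta(minutes=1)),  # 31-90 days old: MAX per 1 minute
]

def downsample(samples, now=None):
    """Roll up (timestamp, value) samples to the MAX per window based on age.

    Samples older than a tier's threshold are replaced by one entry per
    window; newer samples pass through untouched.
    """
    now = now or datetime.now(timezone.utc)
    buckets = {}       # (window_start, window_size) -> max value seen so far
    passthrough = []   # samples too young to downsample
    for ts, value in samples:
        age = now - ts
        window = next((w for min_age, w in TIERS if age >= min_age), None)
        if window is None:
            passthrough.append((ts, value))
            continue
        # Align the timestamp to the start of its aggregation window.
        start = datetime.fromtimestamp(
            ts.timestamp() - ts.timestamp() % window.total_seconds(),
            tz=timezone.utc,
        )
        key = (start, window)
        buckets[key] = max(buckets.get(key, value), value)
    rolled = [(start, value) for (start, _), value in buckets.items()]
    return sorted(rolled + passthrough)
```

In practice this would run as a scheduled job that writes the rolled-up rows back and deletes the raw ones, as described above.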
Jake Edwards commented
I would like something similar; it's like a smarter retention policy.
Storage could also be saved by reducing the resolution/granularity of the historical data.
For example, let's say you take samples of data every 30 seconds.
If you have a day's worth of metrics from 3 months ago that all read the same value, you could reduce the "resolution" of the history so that the entire day reads as a single value (i.e. 1 entry instead of 2,880). Basically, removing/collapsing redundant historical data.
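A rough sketch of that collapsing idea, again assuming (timestamp, value) pairs; the function name is made up for illustration:

```python
def collapse_runs(samples):
    """Collapse consecutive samples with identical values into one entry.

    Keeps the first timestamp of each run, so a day of 2,880 identical
    30-second readings becomes a single row.
    """
    collapsed = []
    for ts, value in samples:
        if collapsed and collapsed[-1][1] == value:
            continue  # same as the previous reading: redundant, skip it
        collapsed.append((ts, value))
    return collapsed
```

Reconstructing the series later just means carrying each stored value forward until the timestamp of the next entry.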