Remove Sample and reimplement #9

First, a little background. I will share some details of how we use multitick but withhold some of the specifics, since not all of that code is open source.

In many places where we use multitick, especially agents, we want the following functionality:
- Occasional ticking at larger intervals (e.g. 1 minute) during normal operation. Currently we call this "sampled ticking."
- Jitter (uneven ticking) so that our agents don't collectively DDoS our APIs. Ideally, the jitter is predictable and deterministic, to ease debugging and incident analysis.
- Guarantees about how many ticks will occur in an interval, even when jitter makes them uneven. In other words, if we ask for one jittered tick per minute, we do not want ticks that occur at random and merely average one per minute. We mean that in every minute there WILL be one and only one tick, guaranteed.
- All users of the ticker get exactly the same ticks, so if several routines are doing sampled jittered ticking, they all get it at the same time. This is important because we may be doing things like getting some data samples, and we want all of them to correspond to the same point in time so we can make sense of the overall picture they present.
- Ability to turn off the sampling and get ticks at the underlying higher frequency. This is in response to something like a fault being detected by Adaptive Fault Detection. When such a condition is present, we want to be able to increase the frequency of our occasional data samples and get a lot of data for a short time. All users of the ticker should again get all the same ticks in this case, so they're coordinated.

In our internal codebase we currently accomplish this with something called `RandomlyWithin()`, which is based on a technique I learned from a colleague years ago. It is part of a larger bit of code, but for context, I think it's enough to say that we have an internal `ts` package that provides time functionality for a `Ts` type, which represents Unix timestamps (integers). In that package, we have code like this:

``` go
// RandomlyWithin takes a timestamp and a duration, then returns a random time
// within that duration. For example, if you want to do something at an
// instant within the current hour, you can say now.RandomlyWithin(ts.Hour)
// and you'll get back a timestamp within the current hour at which you should
// do whatever it is you want. Although the choice of timestamp is random, it
// is deterministic, and you will receive the same answer all hour long. See
// http://stackoverflow.com/questions/14823792/how-to-choose-a-random-time-once-per-hour
// If you want to add some entropy, pass an optional second parameter, which
// will ensure that your process doesn't have the same behaviors as some other.
func (t Ts) RandomlyWithin(dur Ts, entropy ...uint32) Ts {
    md5hasher := md5.New()
    intervalStart := t.Floor(dur)
    toHash := uint32(intervalStart)
    if len(entropy) > 0 {
        toHash += entropy[0]
    }
    md5hasher.Write([]byte{
        uint8(toHash >> 24 & 255),
        uint8(toHash >> 16 & 255),
        uint8(toHash >> 8 & 255),
        uint8(toHash & 255)})
    randomNum := binary.BigEndian.Uint32(md5hasher.Sum(nil)[0:4])
    result := intervalStart + Ts(randomNum)%dur
    return result
}
```

So this selects an offset within the interval `dur` by hashing the start of the interval, optionally combined with an entropy value that the caller can pass in.

Every time interval will have exactly one timestamp that satisfies the test. For example, if you compare each timestamp in a minute against its own `RandomlyWithin(ts.Minute)` result, exactly one timestamp during that minute will match. This property is important because it does not rely on the application state. That is, if you have a process doing this once a minute and you restart it, there is no possibility that it will take a selected action twice in the minute.

Note that the function doesn't return true/false to indicate whether the given timestamp is the chosen one. It returns the chosen one. This can be useful because the calling application can then do things like countdowns; even if the current timestamp isn't the chosen one, it knows which one will be chosen (or whether that one has already passed).
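To make these two properties concrete, here is a minimal, self-contained sketch. It is not our internal `ts` package: `chosenWithin`, `countMatches`, and `secondsUntil` are hypothetical names, and plain `int64` Unix seconds stand in for the `Ts` type. It follows the same idea as `RandomlyWithin()` (hash the start of the interval, take the result modulo the interval length) and assumes non-negative timestamps and a positive duration.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
)

// chosenWithin is a hypothetical stand-in for Ts.RandomlyWithin. It returns
// the chosen second inside the interval of length dur that contains t.
// Assumes t >= 0 and dur > 0.
func chosenWithin(t, dur int64, entropy uint32) int64 {
	start := t - t%dur // floor t to the start of its interval
	var buf [4]byte
	binary.BigEndian.PutUint32(buf[:], uint32(start)+entropy)
	sum := md5.Sum(buf[:])
	offset := int64(binary.BigEndian.Uint32(sum[:4])) % dur
	return start + offset
}

// countMatches walks every second of the interval containing t and counts
// how many of them are equal to the chosen second for that interval.
func countMatches(t, dur int64, entropy uint32) int {
	start := t - t%dur
	n := 0
	for s := start; s < start+dur; s++ {
		if chosenWithin(s, dur, entropy) == s {
			n++
		}
	}
	return n
}

// secondsUntil returns how far away the chosen second is from t.
// A negative result means the chosen second has already passed.
func secondsUntil(t, dur int64, entropy uint32) int64 {
	return chosenWithin(t, dur, entropy) - t
}

func main() {
	const minute = 60
	now := int64(1700000000)
	fmt.Println("matches in this minute:", countMatches(now, minute, 0)) // prints 1
	fmt.Println("seconds until chosen tick:", secondsUntil(now, minute, 0))
}
```

Because the result depends only on the interval start and the entropy value, a restarted process computes the same chosen second, which is what rules out acting twice in one interval.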

We use this code in our applications roughly like the following:

``` go
for now := range ticker {
    nowTs := ts.FromTime(now)
    if nowTs.RandomlyWithin(ts.Minute) == nowTs {
        // ...
    }
}
```

We were planning to move this functionality into multitick with the `Sample()` function, and make it appropriate for more general use cases, but after implementing it a little differently and looking at it, it doesn't really do what we want in some ways. So I think we need to reimplement it.

Here are two ideas I've had while thinking about this.
1. Make an object or a function that implements something similar to `ts.RandomlyWithin()` as shown above. The advantage of this is the extra ability to do things like figure out when the tick you're waiting for is coming. The pattern for use would then look pretty similar to the above code.
2. Make `multitick.Ticker` more flexible. It can either create a ticker internally and use it, or you can give it a ticker and it'll read ticks from that and broadcast them as usual. This way we could construct a Ticker by subscribing to another Ticker. (Chaining Tickers together could be very useful in general.) If we do this, then one Ticker could generate events every second, and another one could filter them and pass through one per interval (for example, one per minute). Turning the filtering on and off could be accomplished with Sample, as it is now. However, Sample could be implemented in terms of item 1, so users of the package could either get a simple interface without bells and knobs, or use the knobs if they want to.
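As a rough illustration of idea 2, the sketch below chains a filter stage onto an upstream channel of ticks. It does not use multitick's API at all: `sampler`, `run`, `collect`, and `secondTicks` are hypothetical names, and the `keep` predicate here is a placeholder (top of the minute) where something like `RandomlyWithin()` from idea 1 would plug in. Synthetic timestamps are used so the example runs instantly.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// sampler is a hypothetical filter stage. While sampling is on, it forwards
// only ticks accepted by keep; while sampling is off, it forwards every tick.
type sampler struct {
	sampling atomic.Bool
	keep     func(time.Time) bool
}

// run reads ticks from in and returns a channel of the ticks that pass.
// The output channel is closed when the upstream channel is closed.
func (s *sampler) run(in <-chan time.Time) <-chan time.Time {
	out := make(chan time.Time)
	go func() {
		defer close(out)
		for t := range in {
			if !s.sampling.Load() || s.keep(t) {
				out <- t
			}
		}
	}()
	return out
}

// secondTicks builds n synthetic one-second ticks starting at Unix time start.
func secondTicks(start int64, n int) []time.Time {
	ticks := make([]time.Time, n)
	for i := range ticks {
		ticks[i] = time.Unix(start+int64(i), 0)
	}
	return ticks
}

// collect pushes ticks through a sampler and gathers whatever comes out.
func collect(ticks []time.Time, sampling bool) []time.Time {
	// Placeholder predicate: keep the tick at the top of each minute.
	s := &sampler{keep: func(t time.Time) bool { return t.Unix()%60 == 0 }}
	s.sampling.Store(sampling)

	in := make(chan time.Time)
	go func() {
		defer close(in)
		for _, t := range ticks {
			in <- t
		}
	}()

	var got []time.Time
	for t := range s.run(in) {
		got = append(got, t)
	}
	return got
}

func main() {
	ticks := secondTicks(1699999980, 120) // two whole minutes of 1s ticks
	fmt.Println(len(collect(ticks, true)), len(collect(ticks, false))) // prints: 2 120
}
```

Since the sampling flag is an atomic value, a fault detector could flip it from another goroutine while the stage is running; broadcasting the output to several subscribers would still be the Ticker's job.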

I'm looking for suggestions and feedback.

