MB-65860: Introducing support for fileIO Callbacks #2209
abhinavdangeti merged 28 commits into master from
Conversation
continue
}
snapshot := snapshots.Bucket(k)
metaBucket := snapshot.Bucket(util.BoltMetaDataKey)
Since it looks like every bolt write/read is now subjected to the file writer/reader operation, why not make rootBolt a struct that wraps the boltDB within it and implements the exact same bolt API methods? This struct would also have the writer and reader registered in it, so any bolt API call would perform the additional file writer/reader operation. In my opinion this gives us much better inheritance and encapsulation of the writer and reader operations, similar to how index_meta encapsulates the file writer and reader.
@Likith101, @abhinavdangeti please let me know if this is something you both agree with and are open to implementing.
I believe this is something we can look into. I agree with all of the positives, but it would be a significant effort to implement right now.
doneCh := make(chan error)
ctx = context.WithValue(ctx, mergeDoneKey, doneCh)

// ROLLBACKING IS DISABLED FOR THIS OPERATION
Quick nit :)
Refactor to - PARTIAL ROLLBACK* WILL NOT BE SUPPORTED
@Thejas-bhat would you weigh in on this? I'm still a bit uncomfortable with it.
Yeah, I've been trying to understand this part of the code today. I'm guessing we'd temporarily force the index to hold only one index snapshot in the background, which is a violation of the contract that the index should always maintain numSnapshotsToKeep snapshots, especially when there are enough snapshots in the system. I think we should be re-encrypting things here, since we need to adhere to the existing contracts as well.
I'll try to explain the scenario as best I can. Rollback works because of sequence number changes. Let's say there are 2 snapshots currently present, one with sequence number 200 and one with sequence number 300. In this case, it is useful to hold on to the older snapshot in case any KV-related issues cause us to go back to a sequence number between 200 and 300.
Now, coming to this particular drop operation, there are a couple more assumptions we need to make. Drop Keys is mostly called as soon as a key has been rotated and re-encrypted, when encryption is enabled/disabled, or when the key itself has been replaced (which is different from rotation). In all of these operations, as soon as the option is toggled, we are issued a key drop request to get rid of the data encrypted with the prior key material (an empty key id in the case of enabling/disabling encryption).
Given the nature of this request, the first assumption we make is that almost all of the data needs to be rewritten. Other situations are possible, but this is the most common one.
Secondly, with any sufficiently large index, we will encounter multiple segments/zap files that cannot be merged. We instead opt to rewrite them using different merge tasks. With the way our current merger and persister work, each merge task execution triggers the creation of a new snapshot, so we introduce a number of new snapshots for the same operation.
Now, coming back to the situation with two snapshots at sequence numbers 200 and 300, it is incredibly complicated to ensure the survival of both of these snapshots independently.
- We cannot hold on to any older data, because that is a breach of contract for drop keys. Even if we held on to it, we would not be able to read it after rolling back, because the key material would have been purged after the operation (by cbauth).
- We cannot hold on to the sequence number difference, as we will likely introduce a number of newer snapshots, all of which will have the latest sequence number.
Thus, I opted to temporarily force the index to hold only one snapshot during the operation.
> Secondly, when working with any index that is sufficiently large, we will end up encountering multiple segments/zap files that are unable to be merged.

I can't say I follow this.

> We will not be able to hold on to the sequence number difference as we will likely introduce a bunch of newer snapshots all of which will have the latest sequence number.

And this: introducing new segments with the same content from the older snapshot and binding them to a newer snapshot should not change sequence numbers. We're not really merging the segments, are we? We're simply opening them and re-encrypting them with new keys before writing them back.
I understand there is complexity involved in binding segments to an older snapshot than the root, so for now I'm OK with tabling this and introducing a separate PR if needed* after a conversation on Thursday.