|
1 | 1 |
|
2 | | -## Client-server protocol: Bookmarks |
| 2 | +This document describes what bookmarks are and how they are used in the client-server protocol. |
3 | 3 |
|
4 | 4 | A stream is a sequence of entries, where these entries belong to some specific operation, and are identified by an entry number. |
5 | 5 |
|
@@ -102,229 +102,3 @@ Next is a code of an error message, $\texttt{errorStr}$. If everything goes well |
102 | 102 | A string is also sent, which is in a more human readable text. |
103 | 103 |
|
104 | 104 | An error is sent in the form of a code and a string, where the string is just an array of bytes, and each byte is an ASCII character. |
105 | | - |
106 | | - |
107 | | -## Client-server protocol messages |
108 | | - |
109 | | -The stream client-server protocol messages are $\texttt{Start}$, $\texttt{StartBookmark}$, $\texttt{Stop}$, $\texttt{Header}$, $\texttt{Entry}$ and $\texttt{Bookmark}$. |
110 | | - |
111 | | -- The $\texttt{Start}$ message is sent from the stream client to the stream server in order to request for a stream to be sent. |
112 | | - |
113 | | - If the stream client wants to receive the entire stream file, a $\texttt{Start}$ message is sent with an $\mathtt{entryNumber}$ $= \texttt{0}$. |
114 | | - |
115 | | - If the stream client knows the entry number at which the stream should start, it sends a $\texttt{Start}$ message with that particular entry number. That is, $\mathtt{entryNumber} \not= \texttt{0}$. |
116 | | - |
117 | | -- $\texttt{StartBookmark}$ message is the type of message the stream client can send if it does not know the entry number, but knows something more meaningful to the application, like a bookmark. |
118 | | - |
119 | | - In the case of the Polygon zkEVM, if the stream client wants to receive information from a certain L2 block number, then it provides the appropriate bookmark by sending a $\texttt{StartBookmark}$ message. |
120 | | - |
121 | | - Such a bookmark is actually a codification of the L2 block number. |
122 | | - |
123 | | -- $\texttt{Stop}$ message is a message the stream client can send to the stream server if it wants to stop receiving the stream. |
124 | | - |
125 | | -- $\texttt{Header}$ message can be sent if the stream client requests just the header of a particular entry. |
126 | | - |
127 | | - We discuss the information an entry contains, in more detail, later in this document. |
128 | | - |
129 | | - In summary, an entry has two parts: a part called a $\texttt{Header}$, and a part called the $\texttt{data}$. |
130 | | - |
131 | | - So, a $\texttt{Header}$ message is used to request just the header, and not the full data, of an entry. |
132 | | - |
133 | | - It's like asking for only the block header and not the entire L2 block, in the Polygon zkEVM case. |
134 | | - |
135 | | -- $\texttt{Entry}$ message can be sent by the stream client to request a specific entry in the stream file. Thus, instead of requesting the stream server to start streaming from a particular entry onwards, only one entry is obtained by sending an $\texttt{Entry}$ message. |
136 | | - |
137 | | -- $\texttt{Bookmark}$ message is a message in which a stream client sends a bookmark to the stream server, and the stream server in turn tells the stream client what entry number is linked to the bookmark. |
138 | | - |
139 | | -The above six messages are all messages used in the client-server protocol. And they can be found [here](https://github.com/0xPolygonHermez/zkevm-data-streamer#stream-tcp-commands). |
140 | | - |
141 | | -## Stream server-source library |
142 | | - |
143 | | -Interaction between the stream source and each stream server is enabled by the Server-source library, which is a Go library with six main functions for modifying or adding entries to operations. |
144 | | - |
145 | | -### Send data functions |
146 | | - |
147 | | -When each of these functions is called, a corresponding message is generated and sent to the stream server: |
148 | | - |
149 | | -1. $\texttt{StartAtomicOp}(\ )$ starts an atomic operation. When called, a message that amounts to saying: "start an atomic operation," is generated and sent from the stream source to the stream server. |
150 | | -2. $\texttt{AddStreamEntry(u32 entryType, u8[] data)}$ adds an entry to the atomic operation and returns a $\texttt{u64 entryNumber}$. When called, a message equivalent to saying: "Add an entry of this type, with this data, to the current atomic operation," is generated and sent to the stream server. |
151 | | -3. $\texttt{AddStreamBookmark(u8[] bookmark)}$ adds a bookmark entry to the atomic operation and returns a $\texttt{u64}$ $\texttt{entryNumber}$. |
152 | | -4. $\texttt{CommitAtomicOp}(\ )$ commits an operation $\texttt{Op}$ so that its entries can be sent to stream clients. When called, a message which is tantamount to saying: "All entries associated with the current operation have been sent, the operation ends with the last sent entry," is generated and sent to the stream server. |
153 | | -5. $\texttt{RollbackAtomicOp}(\ )$ rolls back an atomic operation. |
154 | | -6. $\texttt{UpdateEntryData(u64 entryNumber, u32 entryType, u8[] newData)}$ updates an existing entry. This function only applies to entries for which the atomic operation has not been committed. |
155 | | - |
156 | | - |
157 | | - |
158 | | -### Query data functions |
159 | | - |
160 | | -The stream source can use a few more functions of the stream server-source library, to get information from the stream server. |
161 | | - |
162 | | -It uses the following functions: |
163 | | - |
164 | | -- $\texttt{GetHeader()}$: The stream source uses this function to query the header of a particular entry. The function returns a $\texttt{struct HeaderEntry}$. |
165 | | -- $\texttt{GetEntry(u64 entryNumber)}$: This function is used to get the entry that corresponds to a given entry number. It returns a $\texttt{struct FileEntry}$. |
166 | | -- $\texttt{GetBookmark(u8[ ] bookmark)}$: The stream source uses this function to look up a bookmark. The function returns the $\texttt{u64 entryNumber}$ linked to that bookmark. |
167 | | -- $\texttt{GetFirstEventAfterBookmark(u8[ ] bookmark)}$: This function is used to get the first entry after a given bookmark. It returns a $\texttt{struct FileEntry}$. |
168 | | - |
169 | | - |
170 | | - |
171 | | -The complete stream source-server library is described, but referred to as the DATA STREAMER INTERFACE (API), [here](https://github.com/0xPolygonHermez/zkevm-data-streamer#data-streamer-interface-api). |
172 | | - |
173 | | -It's possible to create, using the stream source-server library, a stream source that connects with a server, opens and commits operations. |
174 | | - |
175 | | - |
176 | | - |
177 | | -## Stream file |
178 | | - |
179 | | -Next is an explanation of stream file structure. |
180 | | - |
181 | | -The stream file is created in a binary format instead of a text file. |
182 | | - |
183 | | -It has a header page and one or more data pages. The header page is first and has a fixed size of 4096 bytes. The data pages follow immediately after the header page, and the size of each data page is 1 MB. |
184 | | - |
185 | | -Data pages contain entries. |
186 | | - |
187 | | -If an entry does not fit in the remaining page space, it gets stored in the next page. This means the unused space in the previous data page gets filled with some padding. |
188 | | - |
189 | | - |
190 | | - |
191 | | - |
192 | | -### The header page |
193 | | - |
194 | | -Let's zoom in on what the $\texttt{Header}$ page looks like. |
195 | | - |
196 | | -1. The $\texttt{HeaderEntry}$ consists of the following data: $\texttt{magicNumbers}$, $\texttt{packetType}$, $\texttt{headerLength}$, $\texttt{streamType}$, $\texttt{TotalLength}$, and $\texttt{TotalEntries}$. |
197 | | - |
198 | | - |
199 | | - |
200 | | - The $\texttt{HeaderEntry}$ starts with an array of 16 bytes, called $\texttt{magicNumbers}$. |
201 | | - |
202 | | - |
203 | | -$$ |
204 | | -\texttt{u8[16] magicNumbers} |
205 | | -$$ |
206 | | - |
207 | | - |
208 | | - The $\texttt{magicNumbers}$ identify the application to which the data in the stream file belongs. |
209 | | - |
210 | | - |
211 | | - |
212 | | -2. The $\texttt{magicNumbers}$ identify the application to which the data in the stream file belongs. |
213 | | - |
214 | | -In the Polygon zkEVM case, the $\texttt{magicNumbers}$ is the ASCII-encoding of these sixteen (16) characters: $\texttt{polygonDATSTREAM}$. |
215 | | - |
216 | | - |
217 | | - |
218 | | -3. After the $\texttt{magicNumbers}$ comes the $\texttt{packetType}$, which indicates whether the current page is a $\texttt{Header}$ page or a $\texttt{Data}$ page. |
219 | | - |
220 | | - |
221 | | -$$ |
222 | | -\texttt{u8 packetType = 1 // 1: Header entry} |
223 | | -$$ |
224 | | - |
225 | | - |
226 | | -The $\texttt{packetType}$ for the $\texttt{Header}$ entry is $\texttt{1}$, but it is $\texttt{2}$ for a $\texttt{Data}$ entry and $\texttt{0}$ for padding. |
227 | | - |
228 | | - |
229 | | - |
230 | | - |
231 | | -4. Included in the $\texttt{Header}$ page is the $\texttt{streamType}$, which has the same meaning as seen in the Server-source protocol: It indicates the application, or in particular, the stream source node to which the stream server should connect. |
232 | | - |
233 | | - |
234 | | -$$ |
235 | | -\texttt{u64 streamType // 1: zkEVM Sequencer} |
236 | | -$$ |
237 | | - |
238 | | - |
239 | | -As mentioned in the above line of code, $\texttt{streamType = 1}$ means the stream source node is the zkEVM Sequencer. |
240 | | - |
241 | | - |
242 | | - |
243 | | -5. The $\texttt{streamType}$ is then followed by the $\texttt{TotalLength}$, which is the total number of bytes used in the stream file. |
244 | | - $$ |
245 | | - \texttt{u64 TotalLength // Total bytes used in the file} |
246 | | - $$ |
247 | | - |
248 | | - |
249 | | - |
250 | | -6. After the $\texttt{TotalLength}$ is the $\texttt{TotalEntries}$, which is the total number of entries used in the file. |
251 | | - |
252 | | - |
253 | | - $$ |
254 | | - \texttt{u64 TotalEntries // Total number of data entries} |
255 | | - $$ |
256 | | - |
257 | | - |
258 | | - |
259 | | -### The data pages |
260 | | - |
261 | | -A data page contains entries and some padding. |
262 | | - |
263 | | -Since this is a $\texttt{data}$ page, and not a $\texttt{Header}$ page, the entries are preceded by $\texttt{packetType = 2}$, while the padding is preceded by $\texttt{packetType = 0}$. |
264 | | - |
265 | | - |
266 | | -$$ |
267 | | -\begin{aligned} |
268 | | -1.\quad &\texttt{u8 packetType // 2:Data entry, 0:Padding} \\ |
269 | | -2.\quad &\texttt{u32 Length // Total length of data entry} \\ |
270 | | -3.\quad &\texttt{u32 Type // 0xb0:Bookmark, 1:Event1, 2:Event2,... } \\ |
271 | | -4.\quad &\texttt{u64 Number // Entry number (sequence from 0)} \\ |
272 | | -5.\quad &\texttt{u8[] data} |
273 | | -\end{aligned} |
274 | | -$$ |
275 | | - |
276 | | - |
277 | | -The $\texttt{packetType}$ is followed by the $\texttt{Length}$ of the data entry, then the $\texttt{Type}$ of the entry. That is, whether it is a bookmark or an event entry. |
278 | | - |
279 | | -A bookmark $\texttt{Type}$ is indicated by $\texttt{0xb0}$, while each event's $\texttt{Type}$ is its position among a sequence of events. That is, each $i$-th event is of $\mathtt{Type = i}$. |
280 | | - |
281 | | -The next value after the $\texttt{Type}$ is the entry number, denoted by $\texttt{Number}$. The next values in a data page are $\texttt{data}$. |
282 | | - |
283 | | -After the last entry in a data page comes $\texttt{packetType = 0}$, followed by padding that fills any unused space. |
284 | | - |
285 | | - |
286 | | - |
287 | | - |
288 | | -## How do we implement commits and rollbacks? |
289 | | - |
290 | | -Recall that the server-source protocol begins with a call to $\texttt{StartAtomicOp}(\ )$, which sends a message to the stream server so that it prepares to receive entries related to a specific atomic operation. |
291 | | - |
292 | | -When the stream source sends the entries, the stream server appends the data of the entries to the stream file. |
293 | | - |
294 | | -Once all entries have been sent, the stream source calls the $\texttt{CommitAtomicOp}(\ )$ function, and the header of the stream file is subsequently updated, in particular the $\texttt{totalLength}$ and $\texttt{totalEntries}$ fields. |
295 | | - |
296 | | -But if the $\texttt{RollbackAtomicOp}(\ )$ is triggered instead of the $\texttt{CommitAtomicOp}(\ )$, the header is not updated. |
297 | | - |
298 | | -In other words, the header of the stream file only changes when the $\texttt{CommitAtomicOp}(\ )$ function is called. So, although some entries related to the atomic operation have already been added to the stream file, the header of the stream file is updated only with information of entries related to committed atomic operations. |
299 | | - |
300 | | -Since the $\texttt{RollbackAtomicOp}(\ )$ function can only be executed before a given atomic operation is committed, the header is not updated because the added entries (of the uncommitted atomic operation) will be overwritten with entries of the next atomic operation(s). |
301 | | - |
302 | | -This means a rollback amounts to overwriting entries in the stream file that are related to an uncommitted atomic operation. |
303 | | - |
304 | | - |
305 | | - |
306 | | -### Example (Commit and rollback) |
307 | | - |
308 | | -Suppose an operation $\mathtt{Op_A}$ has been started, and $\texttt{100}$ entries have already been added to the stream file when a rollback is triggered. |
309 | | - |
310 | | -Since a rollback was triggered before $\mathtt{Op_A}$ was committed, the header of the stream file remains unaffected by the $\texttt{100}$ added entries. |
311 | | - |
312 | | -Let's say the $\texttt{totalLength}$ of the stream file is $\texttt{1712}$ when the rollback occurred. Although the actual length of the stream is $\texttt{1812}$, the $\texttt{totalLength}$ in the header of the stream file remains unchanged. |
313 | | - |
314 | | -Suppose that the next atomic operation $\mathtt{Op_B}$ gets started and committed, but has only $40$ related entries. |
315 | | - |
316 | | -Since $\mathtt{Op_B}$ is committed but only $40$ entries have been added to the stream file, the header will now reflect the $\texttt{totalLength}$ to be $1752$. This means only $\texttt{40}$ of the $\texttt{100}$ entries of the previously uncommitted operation $\mathtt{Op_A}$ got overwritten, but the actual stream is still $\texttt{1812}$. |
317 | | - |
318 | | -How is this not a problem from an application point of view? |
319 | | - |
320 | | -It's because if a stream client requests the stream, the stream server sends the stream only up to the $\texttt{totalLength}$ recorded in the header of the stream file, $\texttt{1752}$, and not the actual length of the stream, $\texttt{1812}$. |
321 | | - |
322 | | -## Concluding remarks |
323 | | - |
324 | | -The basic trick here is for the stream server to only use the information recorded in the header of the stream file, and to change that information only if an atomic operation is committed. |
325 | | - |
326 | | -This way the stream server always sends the stream only up to the last entry of the committed operation. |
327 | | - |
328 | | -All in all, this is just an efficient way to roll back. There's no need to delete information from the stream file. The header of the stream file is updated only if an atomic operation has been committed. |
329 | | - |
330 | | -This is the main reason why parameters such as the $\texttt{totalLength}$ and $\texttt{totalEntries}$ are recorded in the header of the stream file. |
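
The commit and rollback behaviour described above can be sketched in Go. This is only an illustration under stated assumptions: the `streamFile` type and its methods are hypothetical and are not the zkevm-data-streamer API, the file is modelled as an in-memory byte slice, and every entry is taken to be one byte long so that the numbers match the example ($\texttt{1712}$, $\texttt{1812}$, $\texttt{1752}$).

```go
package main

import "fmt"

// streamFile is a hypothetical in-memory stand-in for the stream file.
// committedLen plays the role of totalLength in the header page.
type streamFile struct {
	committedLen int    // length recorded in the header
	data         []byte // bytes physically present in the file
	writePos     int    // where the next entry byte is written
}

// startAtomicOp begins writing right after the last committed byte.
func (f *streamFile) startAtomicOp() { f.writePos = f.committedLen }

// addEntry writes an entry at writePos, overwriting stale bytes left by a
// rolled-back operation, or growing the file if there are none.
func (f *streamFile) addEntry(entry []byte) {
	for _, b := range entry {
		if f.writePos < len(f.data) {
			f.data[f.writePos] = b
		} else {
			f.data = append(f.data, b)
		}
		f.writePos++
	}
}

// commitAtomicOp is the only place where the header value changes.
func (f *streamFile) commitAtomicOp() { f.committedLen = f.writePos }

// rollbackAtomicOp deletes nothing; it only resets the write position.
func (f *streamFile) rollbackAtomicOp() { f.writePos = f.committedLen }

// readable returns what a stream client is allowed to receive.
func (f *streamFile) readable() []byte { return f.data[:f.committedLen] }

func main() {
	f := &streamFile{committedLen: 1712, data: make([]byte, 1712)}

	// Op_A: 100 one-byte entries, then a rollback.
	f.startAtomicOp()
	for i := 0; i < 100; i++ {
		f.addEntry([]byte{0xA})
	}
	f.rollbackAtomicOp()

	// Op_B: 40 one-byte entries, then a commit.
	f.startAtomicOp()
	for i := 0; i < 40; i++ {
		f.addEntry([]byte{0xB})
	}
	f.commitAtomicOp()

	// Header length versus physical length.
	fmt.Println(f.committedLen, len(f.data)) // prints "1752 1812"
}
```

In this sketch the 60 stale bytes of the rolled-back operation stay in the slice, but `readable` never exposes them because it stops at the committed length.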