Hi, and thanks for your great work.
In Figure 4 of the paper, we have this diagram for the Feature Extractor module:
The second part (four DC blocks) takes $F_{t-1}^e$ as input, right? But in the implementation:
    def forward(self, x, quant):
        x1, ctx_t = self.forward_part1(x, quant)
        ctx = self.forward_part2(x1)
        return ctx, ctx_t
you pass x1 (the output of the first part BEFORE it is multiplied by $q_f$) to the second part, not the scaled feature.
Why?
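
To make the question concrete, here is a minimal sketch of the two wirings I am comparing. Everything in it is a placeholder: plain floats stand in for feature tensors, `part1` and `part2` are hypothetical stand-ins for the real layers, and the line `ctx_t = x1 * quant` is only my assumption about what forward_part1 does with $q_f$.

```python
# Toy stand-ins only: floats replace tensors, and simple arithmetic
# replaces the real layers. None of this is the repository's code.

def part1(x):
    # Hypothetical stand-in for the first part of the feature extractor.
    return 2.0 * x


def part2(f):
    # Hypothetical stand-in for the second part (the four DC blocks).
    return f + 1.0


def forward_as_implemented(x, quant):
    x1 = part1(x)
    ctx_t = x1 * quant  # assumed: the scaled feature is returned separately
    ctx = part2(x1)     # second part receives the UNSCALED x1
    return ctx, ctx_t


def forward_as_in_figure(x, quant):
    x1 = part1(x)
    ctx_t = x1 * quant
    ctx = part2(ctx_t)  # second part receives the SCALED feature
    return ctx, ctx_t


if __name__ == "__main__":
    print(forward_as_implemented(3.0, 0.5))
    print(forward_as_in_figure(3.0, 0.5))
```

The two functions return the same ctx_t but a different ctx whenever quant is not 1, which is the discrepancy I am asking about.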