Hi, hanjun. Thanks a lot for your great work! I have a question about the hierarchical Q-learning described in the paper. Equation 11 involves 2M Q functions, and the paper claims that only two distinct parametrizations are enough. However, in the code at /node_attack/q_net_node.py, line 163 initializes num_steps Q networks, which differs from the paper. Am I misunderstanding something? Looking forward to your reply!