Thank you, authors, for the incredible work.
I've been reading your paper and the OpenTrack code, and I have a simple question.
In the paper, you mention that the PD targets are the next frame's reference motion plus a scaled, squashed action, i.e. `pd_targets = reference_motion_next_frame + action_scale * tanh(policy_output)`.
But in the code, I see that the current implementation differs in two ways:
- `action_scale` is a uniform constant rather than a set of empirically designed hyperparameters;
- `tanh` is not applied to the policy output.
In the MLP `forward` function in `brax2torch.py`:
```python
def forward(self, x: torch.Tensor) -> torch.Tensor:
    # Pass through the hidden layers, activating after every layer
    # except the last (unless activate_final is set).
    for i, layer in enumerate(self.hidden):
        x = layer(x)
        if i != len(self.hidden) - 1 or self.activate_final:
            x = self.act(x)
    if self.split:
        # Split into (loc, scale) and squash only the mean with tanh.
        loc, _ = torch.chunk(x, 2, dim=-1)
        return torch.tanh(loc)
    # split is False: the raw output is returned unsquashed.
    return x
```
And `self.split` is always `False` in the current code base, so the `tanh` branch is never taken.
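For reference, here is a minimal sketch of the two variants as I understand them. All function and variable names (`pd_target_paper`, `ref_qpos_next`, `raw_action`, etc.) are mine, not from the repo, and the per-joint scales are only my reading of the paper's "empirically designed hyperparameters":

```python
import torch

# Paper's description (as I read it): next-frame reference plus a bounded
# residual, with per-joint, empirically tuned scales.
def pd_target_paper(ref_qpos_next: torch.Tensor,  # (num_joints,) next reference frame
                    action_scale: torch.Tensor,   # (num_joints,) per-joint scales
                    raw_action: torch.Tensor) -> torch.Tensor:
    return ref_qpos_next + action_scale * torch.tanh(raw_action)

# What the current code appears to do: a single uniform scale,
# with the policy output used linearly (no tanh).
def pd_target_code(ref_qpos_next: torch.Tensor,
                   action_scale: float,
                   raw_action: torch.Tensor) -> torch.Tensor:
    return ref_qpos_next + action_scale * raw_action
```

The practical difference is that with `tanh` the residual is bounded by `action_scale`, while the linear version lets the targets drift arbitrarily far from the reference.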
Could you please clarify this inconsistency? Or, if simply using a uniform α and a linear action still worked fine in your tests, that would be good to know too.