
Regarding the tanh non-linearity and per-joint action scale in AnyTracker #12

@gapbridger

Description


Thank you, authors, for the incredible work.

I've been reading your paper and the OpenTrack code, and I have a simple question:
in the paper, you mention that the PD target is the next frame of the reference motion plus action_scale * tanh(policy output).
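
For reference, here is a minimal sketch of that formulation as I read it (all names below are hypothetical, not taken from the code base):

    import torch

    # My reading of the paper's formulation; names are hypothetical,
    # not from the OpenTrack repository.
    def pd_target(q_ref_next: torch.Tensor,     # next-frame reference joint positions
                  action_scale: torch.Tensor,   # per-joint scale hyperparameters
                  policy_output: torch.Tensor,  # raw policy network output
                  ) -> torch.Tensor:
        # PD target = next-frame reference + per-joint scale * squashed action
        return q_ref_next + action_scale * torch.tanh(policy_output)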

However, in the code, the current implementation appears to differ in two ways (see the sketch after the list):

  1. the action scale is a single uniform value rather than the empirically designed per-joint hyperparameters;
  2. tanh is not applied to the policy output.
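
With the same hypothetical names, this is what the code seems to compute instead:

    import torch

    # A single scalar scale and no tanh squashing of the policy output.
    def pd_target_in_code(q_ref_next: torch.Tensor,
                          policy_output: torch.Tensor,
                          action_scale: float = 0.5,  # placeholder, not the repo's value
                          ) -> torch.Tensor:
        return q_ref_next + action_scale * policy_output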

Here is the MLP forward function in brax2torch.py:

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Plain MLP: activate every hidden layer except the last,
        # unless activate_final is set.
        for i, layer in enumerate(self.hidden):
            x = layer(x)
            if i != len(self.hidden) - 1 or self.activate_final:
                x = self.act(x)
        if self.split:
            # tanh is applied only here, to the mean (loc) half of the output.
            loc, _ = torch.chunk(x, 2, dim=-1)
            return torch.tanh(loc)
        # With split=False, the raw unbounded output is returned.
        return x

and self.split is always False in the current code base, so the tanh branch is never taken.
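
To make the practical difference concrete, here is a small self-contained example (not the repo's code, and the values are made up): with tanh, the correction added to the reference is bounded to within ±scale per joint; without it, it is unbounded.

    import torch

    raw = torch.tensor([-5.0, 0.1, 5.0])      # raw policy outputs
    scale = torch.tensor([0.25, 0.50, 0.75])  # made-up per-joint scales

    bounded = scale * torch.tanh(raw)  # ~[-0.25, 0.05, 0.75], each within ±scale
    linear = scale * raw               # [-1.25, 0.05, 3.75], grows without bound
    print(bounded, linear)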

Could you please clarify this inconsistency? Or did you find that a uniform alpha and a linear (un-squashed) action still work fine in your tests?
