trained policy networks for ANYmal c010 and c100 models.