Update xarm6_robotiq to have a more natural rest position #929
base: main
0bd2f02 to d4bd1eb
@@ -141,13 +141,9 @@ def compute_dense_reward(self, obs: Any, action: torch.Tensor, info: Dict):
        place_reward = 1 - torch.tanh(5 * obj_to_goal_dist)
        reward += place_reward * is_grasped

        qvel_without_gripper = self.agent.robot.get_qvel()
        if self.robot_uids == "xarm6_robotiq":
            qvel_without_gripper = qvel_without_gripper[..., :-6]
Actually, how come we don't keep this part? We need the joint velocity without the gripper, since the task objective is to ensure the robot stays still while grasping the cube.
Or at minimum keep the panda one (for compatibility: all our previous experiments/numbers are based on qvel without the gripper). If using the entire qvel works fine for other robots, then we can keep the exception for panda only and use the entire qvel in the reward for the other robots.
My rationale was that, in the final stage of grasping tasks like PickCube, the gripper qpos can't change anyway: if the gripper opens even a bit you drop the object, and it can't close further because the objects are rigid. Hence encouraging zero gripper qvel also makes sense.
For now I'll keep it in for panda for compatibility. Updating the PR accordingly.
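The compromise reached in this thread (drop gripper joints only for panda, use the full qvel for every other robot) could look roughly like the sketch below. It is a stdlib-only, single-environment stand-in for the batched torch code, not the code in this PR. The table `GRIPPER_DOFS_TO_DROP`, the function name `static_reward`, the value 2 for panda's finger joints, and the `1 - tanh(5 * speed)` shape (borrowed from the `place_reward` line in the diff) are all assumptions for illustration.

```python
import math

# Hypothetical table: how many trailing gripper joints to exclude per robot.
# Only "panda" is listed, mirroring the compatibility exception discussed above;
# the value 2 (two finger joints) is an assumption.
GRIPPER_DOFS_TO_DROP = {"panda": 2}


def static_reward(qvel, robot_uid):
    """Return a value between 0 and 1 that is highest when the joints are still.

    qvel is a plain list of joint velocities for one environment.
    """
    n_drop = GRIPPER_DOFS_TO_DROP.get(robot_uid, 0)
    # Guard against qvel[:-0], which would be an empty list.
    considered = qvel[:-n_drop] if n_drop else qvel
    speed = math.sqrt(sum(v * v for v in considered))
    return 1.0 - math.tanh(5.0 * speed)


# Panda: moving fingers are ignored, so a still arm gets the full reward.
panda_reward = static_reward([0.0] * 7 + [0.1, 0.1], "panda")

# Any other robot: gripper motion counts against the reward.
xarm_reward = static_reward([0.0] * 6 + [0.1] * 6, "xarm6_robotiq")
```

With this layout, adding another robot-specific exception is a one-line change to the table, while the default path penalizes gripper velocity as the PR author argues is reasonable.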
1.52969832e-04,
1.20606723e00,
1.66234924e-03,
6.672368e-4,
The e-4 values are probably supposed to be 0, right?
Yeah, I had just copied the joint positions from the GUI once I got the gripper at the right location. I quickly tested with qpos rounded to 2 decimal places, and that looks okay too. Updating the PR.
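As a small illustration of the rounding described above, the snippet below applies 2-decimal rounding to the four values visible in the diff fragment. The variable names are hypothetical, and the list is only the excerpt shown here, not the full rest keyframe.

```python
# The four qpos values visible in the diff excerpt (not the complete keyframe).
raw_fragment = [1.52969832e-04, 1.20606723e00, 1.66234924e-03, 6.672368e-4]

# Round each joint position to 2 decimal places; near-zero noise becomes 0.0.
rounded = [round(q, 2) for q in raw_fragment]

print(rounded)  # [0.0, 1.21, 0.0, 0.0]
```

The GUI-copied values on the order of 1e-3 and 1e-4 collapse to exactly zero, which matches the reviewer's expectation.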
Update xarm6_robotiq rest qpos for completing tasks in fewer steps
Clean up pick cube reward to remove robot specific checks
Following discussion in #823