hyphi_gym.common.robot#

MuJoCo-based Fetch (https://fetchrobotics.com) environment inspired by:

- 'Gymnasium Robotics' by Rodrigo de Lazcano, Kallinteris Andreas, Jun Jet Tai, Seungjae Ryan Lee, Jordan Terry (https://github.com/Farama-Foundation/Gymnasium-Robotics)
- 'Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning' by Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine (https://github.com/Farama-Foundation/Metaworld)
- 'D4RL: Datasets for Deep Data-Driven Reinforcement Learning' by Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine (https://github.com/Farama-Foundation/D4RL)

Args:

- agent (numpy array): base position of the agent gripper arm
- block_gripper (bool): whether the gripper is blocked (i.e. not movable)
- continue_task (bool): whether to spawn a new target or reset the agent upon episode completion
- distance_threshold (float): the threshold below which a goal is considered achieved
- has_object (bool): whether the environment has an object
- target (numpy array): base position of the target
- target_in_the_air (bool): whether the target should be in the air above the table or on the table surface
- target_noise (float): range of a uniform distribution for sampling a target
- frame_skip (int): number of substeps the simulation runs on every call to step (prev: n_substeps)
- position_noise (float): range of a uniform distribution for sampling initial object positions (prev: obj_range)
- render_mode (str)
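The `target_noise` / `position_noise` and `distance_threshold` arguments can be illustrated with a small sketch. The helper names below are hypothetical (not part of the hyphi_gym API); they only show the usual semantics of uniform-noise sampling around a base position and the Euclidean goal check.

```python
import numpy as np

def sample_target(base: np.ndarray, noise: float, seed=None) -> np.ndarray:
    """Sample a position around `base` from a uniform distribution of
    range `noise` (illustrative analogue of `target_noise`)."""
    rng = np.random.default_rng(seed)
    return base + rng.uniform(-noise, noise, size=base.shape)

def goal_achieved(agent: np.ndarray, target: np.ndarray,
                  distance_threshold: float = 0.05) -> bool:
    """A goal counts as achieved once the agent-target distance
    falls below `distance_threshold`."""
    return float(np.linalg.norm(agent - target)) < distance_threshold
```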

Module Contents#

Classes#

Robot

Continuous-control robot base class

class hyphi_gym.common.robot.Robot(agent: numpy.ndarray | None = np.array([1, 1, 1]), block_gripper: bool = False, continue_task: bool = True, distance_threshold: float = 0.05, has_object: bool = False, target: numpy.ndarray | None = np.array([1, 1, 1]), target_in_the_air: bool = True, target_noise: float = 0.25, frame_skip: int = 20, position_noise: float = 0.25, render_mode=None, **kwargs)#

Bases: hyphi_gym.common.base.Base, hyphi_gym.common.simulation.Simulation

Continuous-control robot base class

property tpos#
step_scale = 10#
metadata#
default_cam_config#
load_world()#

Helper function to load a generated world from self.model_path, falling back to self.base_xml

_validate(layout, error=True, setup=False)#

Override this function to validate a layout and return the steps to the solution

_generate() numpy.ndarray#

Random generator function for a layout of self.specs

_position_mocap(pos=None, rot=[1.0, 0.0, 1.0, 0.0])#
_randomize(layout: numpy.ndarray, key: str)#

Mutation function to randomize the position of key in layout

state_vector() numpy.ndarray#

Return the position and velocity joint states of the model
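In MuJoCo-backed environments, this is conventionally the flattened joint positions (`qpos`) concatenated with the joint velocities (`qvel`). A minimal sketch of that convention (the function below is a standalone illustration, not the actual method):

```python
import numpy as np

def state_vector(qpos: np.ndarray, qvel: np.ndarray) -> np.ndarray:
    """Concatenate flattened joint positions and velocities into a
    single state vector, as MuJoCo-style `state_vector` methods do."""
    return np.concatenate([qpos.ravel(), qvel.ravel()])
```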

execute(action: numpy.ndarray) tuple[dict, dict]#

Executes the action and returns the new state, info, and the distance between agent and target. The action should be a 4-d array containing x, y, z, and gripper displacement in [-1, 1]
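The 4-d action layout can be sketched as follows. This is a hypothetical helper for illustration only: the real `execute` additionally steps the simulation and returns the state and info dicts; the division by `step_scale` assumes the class attribute above scales displacements down.

```python
import numpy as np

def split_action(action, step_scale: float = 10.0):
    """Clip a 4-d action to [-1, 1], then split it into an (x, y, z)
    end-effector displacement (scaled down by `step_scale`) and a
    scalar gripper command."""
    action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    return action[:3] / step_scale, action[3]
```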

render()#

Compute the render frames as specified by render_mode during the initialization of the environment.

The environment’s metadata render modes (env.metadata[“render_modes”]) should contain the possible ways to implement the render modes. In addition, list versions of most render modes are achieved through gymnasium.make, which automatically applies a wrapper to collect rendered frames.

Note:

As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.

By convention, if the render_mode is:

  • None (default): no render is computed.

  • “human”: The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step() and render() doesn’t need to be called. Returns None.

  • “rgb_array”: Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.

  • “ansi”: Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).

  • “rgb_array_list” and “ansi_list”: List-based versions of these render modes are possible (except “human”) through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The collected frames are popped after render() or reset() is called.

Note:

Make sure that your class’s metadata "render_modes" key includes the list of supported modes.

Changed in version 0.25.0: The render function was changed to no longer accept parameters; instead, these parameters should be specified at environment initialisation, i.e., gymnasium.make("CartPole-v1", render_mode="human")
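The “rgb_array” convention above can be checked with a small sketch. The helper below is illustrative only (not part of this API); it validates that a returned frame is an np.ndarray of shape (x, y, 3) holding RGB values.

```python
import numpy as np

def is_rgb_frame(frame) -> bool:
    """Check that `frame` follows the `rgb_array` convention:
    a numpy array of shape (x, y, 3) of RGB values."""
    return isinstance(frame, np.ndarray) and frame.ndim == 3 and frame.shape[-1] == 3
```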

reset(**kwargs) tuple[gymnasium.spaces.Space, dict]#

Reset the environment simulation and randomize if needed