hyphi_gym.common.robot#
MuJoCo-based Fetch (https://fetchrobotics.com) environment inspired by:

- 'Gymnasium Robotics' by Rodrigo de Lazcano, Kallinteris Andreas, Jun Jet Tai, Seungjae Ryan Lee, Jordan Terry (https://github.com/Farama-Foundation/Gymnasium-Robotics)
- 'Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning' by Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine (https://github.com/Farama-Foundation/Metaworld)
- 'D4RL: Datasets for Deep Data-Driven Reinforcement Learning' by Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine (https://github.com/Farama-Foundation/D4RL)

Args:

- agent (numpy array): base position of the agent gripper arm
- block_gripper (bool): whether the gripper is blocked (i.e. not movable)
- continue_task (bool): whether to spawn a new target or reset the agent upon episode completion
- distance_threshold (float): the distance below which a goal is considered achieved
- has_object (bool): whether the environment has an object
- target (numpy array): base position of the target
- target_in_the_air (bool): whether the target should be in the air above the table or on the table surface
- target_noise (float): range of a uniform distribution for sampling a target
- frame_skip (int): number of substeps the simulation runs on every call to step (prev: n_substeps)
- position_noise (float): range of a uniform distribution for sampling initial object positions (prev: obj_range)
- render_mode (str)
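Both `target_noise` and `position_noise` act as the half-range of a uniform distribution around a base position. A minimal sketch of this sampling, assuming a per-axis uniform offset (the helper name and exact mechanism are illustrative, not the library's internals):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def sample_position(base, noise):
    # Hypothetical helper: offset a base position by uniform noise in
    # [-noise, noise] along each axis, mirroring target_noise / position_noise.
    return np.asarray(base, dtype=float) + rng.uniform(-noise, noise, size=3)

target = sample_position([1.0, 1.0, 1.0], noise=0.25)
```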
Module Contents#
Classes#
Robot: Continuous-control robot base class
- class hyphi_gym.common.robot.Robot(agent: numpy.ndarray | None = np.array([1, 1, 1]), block_gripper: bool = False, continue_task: bool = True, distance_threshold: float = 0.05, has_object: bool = False, target: numpy.ndarray | None = np.array([1, 1, 1]), target_in_the_air: bool = True, target_noise: float = 0.25, frame_skip: int = 20, position_noise: float = 0.25, render_mode=None, **kwargs)#
Bases: hyphi_gym.common.base.Base, hyphi_gym.common.simulation.Simulation
Continuous-control robot base class
- property tpos#
- step_scale = 10#
- metadata#
- default_cam_config#
- load_world()#
Helper function to load a generated world from self.model_path, falling back to self.base_xml
- _validate(layout, error=True, setup=False)#
Override this function to validate the layout and return the steps to the solution
- _generate() → numpy.ndarray#
Random generator function for a layout of self.specs
- _position_mocap(pos=None, rot=[1.0, 0.0, 1.0, 0.0])#
- _randomize(layout: numpy.ndarray, key: str)#
Mutation function to randomize the position of key in layout
- state_vector() → numpy.ndarray#
Return the position and velocity joint states of the model
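state_vector follows the usual MuJoCo convention of concatenating the generalized joint positions (qpos) and velocities (qvel). A hedged standalone sketch (the function and argument names here are illustrative, not the class's internals):

```python
import numpy as np

def state_vector(qpos, qvel):
    # Concatenate flattened joint positions and velocities into a single
    # state vector, as MuJoCo-based environments conventionally do.
    return np.concatenate([np.asarray(qpos).ravel(), np.asarray(qvel).ravel()])

state = state_vector(qpos=[0.1, 0.2], qvel=[0.0, -0.3])
```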
- execute(action: numpy.ndarray) → tuple[dict, dict]#
Executes the action and returns the new state, info, and the distance between agent and target. The action should be a 4d array containing x, y, z, and gripper displacement in [-1, 1].
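A valid action is therefore a 4-element array of x, y, z and gripper displacement, each in [-1, 1]. A minimal sketch of building and sanity-checking such an action (the clipping shown is a defensive assumption for illustration, not necessarily what the environment does internally):

```python
import numpy as np

def make_action(dx, dy, dz, gripper):
    # Assemble a 4d action and clip each component into [-1, 1],
    # matching the range execute() expects.
    return np.clip(np.array([dx, dy, dz, gripper], dtype=float), -1.0, 1.0)

action = make_action(0.5, -0.2, 0.1, 1.5)  # out-of-range gripper value is clipped to 1.0
```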
- render()#
Compute the render frames as specified by render_mode during the initialization of the environment. The environment's metadata render modes (env.metadata["render_modes"]) should contain the possible ways to implement the render modes. In addition, list versions for most render modes are achieved through gymnasium.make, which automatically applies a wrapper to collect rendered frames.

Note: As the render_mode is known during __init__, the objects used to render the environment state should be initialised in __init__.

By convention, if the render_mode is:

- None (default): no render is computed.
- "human": The environment is continuously rendered in the current display or terminal, usually for human consumption. This rendering should occur during step(), and render() doesn't need to be called. Returns None.
- "rgb_array": Return a single frame representing the current state of the environment. A frame is a np.ndarray with shape (x, y, 3) representing RGB values for an x-by-y pixel image.
- "ansi": Return a string (str) or StringIO.StringIO containing a terminal-style text representation for each time step. The text can include newlines and ANSI escape sequences (e.g. for colors).
- "rgb_array_list" and "ansi_list": List-based versions of render modes are possible (except "human") through the wrapper gymnasium.wrappers.RenderCollection, which is automatically applied during gymnasium.make(..., render_mode="rgb_array_list"). The frames collected are popped after render() or reset() is called.

Note: Make sure that your class's metadata "render_modes" key includes the list of supported modes.

Changed in version 0.25.0: The render function was changed to no longer accept parameters; rather, these parameters should be specified in the environment initialisation, i.e., gymnasium.make("CartPole-v1", render_mode="human")
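The render-mode conventions above can be sketched as a small dispatcher on render_mode. This is a hypothetical standalone function for illustration; the real method renders the MuJoCo scene rather than placeholder values:

```python
import numpy as np

def render(render_mode, width=64, height=64):
    # Hypothetical dispatcher mirroring the render_mode conventions above.
    if render_mode is None:
        return None  # no render is computed
    if render_mode == "rgb_array":
        # a single RGB frame with shape (height, width, 3); all-black placeholder here
        return np.zeros((height, width, 3), dtype=np.uint8)
    if render_mode == "ansi":
        # terminal-style text representation of the current step
        return "step 0: <text representation of the state>"
    raise NotImplementedError(f"unsupported render_mode: {render_mode}")
```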
- reset(**kwargs) → tuple[gymnasium.spaces.Space, dict]#
Reset the environment simulation and randomize if needed