RL [omni.isaac.gym]
Base Environment Wrapper
- class VecEnvBase(headless: bool, sim_device: int = 0, enable_livestream: bool = False, enable_viewport: bool = False, launch_simulation_app: bool = True, experience: Optional[str] = None)
This class provides a base interface for connecting RL policies with task implementations. APIs provided in this interface follow the interface in gym.Env. This class also provides utilities for initializing simulation apps, creating the World, and registering a task.
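A minimal construction sketch, assuming a working Isaac Sim install (the `omni.isaac.gym.vec_env` import path is assumed from this extension's layout). Constructing the wrapper launches the SimulationApp unless `launch_simulation_app=False` is passed; a task must still be registered via set_task() before reset() or step() can be called, as shown in the sketch under set_task below.

```python
from omni.isaac.gym.vec_env import VecEnvBase

# Constructing the wrapper starts the SimulationApp (Kit) by default.
env = VecEnvBase(headless=True)
```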
- action_space: spaces.Space[ActType]
- close() None
Closes simulation.
- create_viewport_render_product(resolution=(1280, 720))
Creates a render product of the viewport for rendering.
- metadata: dict[str, Any] = {'render_modes': []}
- property np_random: numpy.random._generator.Generator
Returns the environment’s internal _np_random generator, initialising it with a random seed if it is not already set.
- Returns
An instance of np.random.Generator.
- property num_envs
Retrieves number of environments.
- Returns
Number of environments.
- Return type
num_envs(int)
- observation_space: spaces.Space[ObsType]
- render(mode='human') None
Runs rendering without stepping through the physics.
- By convention, if mode is:
human: render to the current display and return nothing. Usually for human consumption.
rgb_array: return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- Parameters
mode (str, optional) – The mode to render with. Defaults to “human”.
- property render_enabled
Whether rendering is enabled.
- Returns
True if rendering is enabled.
- Return type
render(bool)
- render_mode: str | None = None
- reset(seed=None, options=None)
Resets the task and updates observations.
- Parameters
seed (Optional[int]) – Seed.
options (Optional[dict]) – Options as used in gymnasium.
- Returns
A tuple (observations, info), where observations (Union[numpy.ndarray, torch.Tensor]) is the buffer of observation data and info (dict) is a dictionary of extras data.
- reward_range = (-inf, inf)
- seed(seed=-1)
Sets a seed. Pass in -1 for a random seed.
- Parameters
seed (int) – Seed to set. Defaults to -1.
- Returns
Seed that was set.
- Return type
seed (int)
- set_task(task, backend='numpy', sim_params=None, init_sim=True, rendering_dt=0.016666666666666666) None
- Creates a World object and adds the task to it.
Initializes and registers the task with the environment interface and triggers task start-up.
- Parameters
task (RLTask) – The task to register to the env.
backend (str) – Backend to use for task. Can be “numpy” or “torch”. Defaults to “numpy”.
sim_params (dict) – Simulation parameters for physics settings. Defaults to None.
init_sim (Optional[bool]) – Automatically starts simulation. Defaults to True.
rendering_dt (Optional[float]) – dt for rendering. Defaults to 1/60s.
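A registration sketch continuing the construction example above; `MyTask` is a hypothetical RLTask subclass (constructor arguments vary by task implementation):

```python
# MyTask is a hypothetical RLTask subclass defined elsewhere.
task = MyTask(name="MyTask")

# Creates the World, registers the task, and starts simulation (init_sim=True).
env.set_task(task, backend="torch", init_sim=True)
```

Choosing backend="torch" keeps observation and action buffers as torch tensors, which avoids host/device copies when the policy also runs on GPU.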
- signal_handler(sig, frame)
- property simulation_app
Retrieves the SimulationApp object.
- Returns
SimulationApp.
- Return type
simulation_app(SimulationApp)
- spec: EnvSpec | None = None
- step(actions)
- Basic implementation for stepping simulation.
Can be overridden by derived Env classes to satisfy requirements of specific RL libraries. This method passes actions to the task for processing, steps simulation, and computes observations, rewards, and resets.
- Parameters
actions (Union[numpy.ndarray, torch.Tensor]) – Actions buffer from policy.
- Returns
A tuple (observations, rewards, dones, info), where observations (Union[numpy.ndarray, torch.Tensor]) is the buffer of observation data, rewards (Union[numpy.ndarray, torch.Tensor]) is the buffer of rewards data, dones (Union[numpy.ndarray, torch.Tensor]) is the buffer of reset/done flags, and info (dict) is a dictionary of extras data.
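A minimal rollout sketch combining reset() and step(), assuming the default numpy backend; the random action buffer is a stand-in for a policy, and its exact shape depends on the registered task:

```python
import numpy as np

obs, info = env.reset(seed=42)
for _ in range(1000):
    # One random action per sub-environment stands in for a policy output.
    actions = np.stack([env.action_space.sample() for _ in range(env.num_envs)])
    obs, rewards, dones, info = env.step(actions)
env.close()
```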
- property unwrapped: gymnasium.core.Env[gymnasium.core.ObsType, gymnasium.core.ActType]
Returns the base non-wrapped environment.
- Returns
The base non-wrapped gymnasium.Env instance.
- Return type
Env
- update_task_params()
Multi-Threaded Environment Wrapper
- exception TaskStopException
Exception class for signalling task termination.
- args
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- class TrainerMT
A base abstract trainer class for controlling starting and stopping of RL policy.
- abstract run()
Runs the RL loop in a new thread.
- abstract stop()
Stops the RL thread.
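A sketch of a concrete trainer, assuming a plain threading.Thread is an acceptable host for the RL loop (the import path and the threading details are assumptions, not prescribed by this class):

```python
import threading

from omni.isaac.gym.vec_env import TrainerMT

class PolicyTrainer(TrainerMT):
    """Hypothetical trainer that hosts a training function on its own thread."""

    def __init__(self, train_fn):
        self._train_fn = train_fn
        self._thread = None

    def run(self):
        # Start the RL loop on a separate thread so the simulation thread stays free.
        self._thread = threading.Thread(target=self._train_fn, daemon=True)
        self._thread.start()

    def stop(self):
        # How the loop is told to stop is application-specific; here we just join.
        if self._thread is not None:
            self._thread.join()
```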
- class VecEnvMT(headless: bool, sim_device: int = 0, enable_livestream: bool = False, enable_viewport: bool = False, launch_simulation_app: bool = True, experience: Optional[str] = None)
This class provides a base interface for connecting RL policies with task implementations in a multi-threaded fashion. RL policies using this class will run on a different thread than the simulation thread. This can be useful for interacting with the UI before, during, and after running RL policies. Data sharing between threads happens through message passing on multi-threaded queues.
- action_space: spaces.Space[ActType]
- clear_queues()
Clears all queues.
- close() None
Closes simulation.
- create_viewport_render_product(resolution=(1280, 720))
Creates a render product of the viewport for rendering.
- get_actions(block=True)
Retrieves actions from policy by waiting for actions to be sent to the queue from the RL thread.
- Parameters
block (Optional[bool]) – Whether to block thread when waiting for data.
- Returns
Actions buffer retrieved from queue.
- Return type
actions (Union[np.ndarray, torch.Tensor, None])
- get_data(block=True)
Retrieves data from task by waiting for data dictionary to be sent to the queue from the simulation thread.
- Parameters
block (Optional[bool]) – Whether to block thread when waiting for data.
- Returns
Data dictionary retrieved from queue.
- Return type
data (Union[dict, None])
- initialize(action_queue, data_queue, timeout=30)
Initializes queues for sharing data across threads.
- Parameters
action_queue (queue.Queue) – Queue for passing actions from policy to task.
data_queue (queue.Queue) – Queue for passing data from task to policy.
timeout (Optional[int]) – Seconds to wait for data when queue is empty. An exception will be thrown when the timeout limit is reached. Defaults to 30 seconds.
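A wiring sketch, under the assumption that size-1 queues are used to keep the two threads in lock-step (each side blocks until the other consumes its message):

```python
import queue

from omni.isaac.gym.vec_env import VecEnvMT

env = VecEnvMT(headless=False)

# Size-1 queues: the policy blocks until the simulation takes its actions,
# and the simulation blocks until the policy takes its data.
action_queue = queue.Queue(1)
data_queue = queue.Queue(1)
env.initialize(action_queue, data_queue, timeout=30)
```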
- metadata: dict[str, Any] = {'render_modes': []}
- property np_random: numpy.random._generator.Generator
Returns the environment’s internal _np_random generator, initialising it with a random seed if it is not already set.
- Returns
An instance of np.random.Generator.
- property num_envs
Retrieves number of environments.
- Returns
Number of environments.
- Return type
num_envs(int)
- observation_space: spaces.Space[ObsType]
- render(mode='human') None
Runs rendering without stepping through the physics.
- By convention, if mode is:
human: render to the current display and return nothing. Usually for human consumption.
rgb_array: return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
- Parameters
mode (str, optional) – The mode to render with. Defaults to “human”.
- property render_enabled
Whether rendering is enabled.
- Returns
True if rendering is enabled.
- Return type
render(bool)
- render_mode: str | None = None
- reset(seed=None, options=None)
Resets the task and updates observations.
- Parameters
seed (Optional[int]) – Seed.
options (Optional[dict]) – Options as used in gymnasium.
- Returns
A tuple (observations, info), where observations (Union[numpy.ndarray, torch.Tensor]) is the buffer of observation data and info (dict) is a dictionary of extras data.
- reward_range = (-inf, inf)
- async run(trainer)
Main loop for controlling simulation and task stepping. This method is responsible for stepping the task and simulation, collecting buffers from the task, sending data to the policy, and retrieving actions from the policy. It also handles the case where the policy terminates on completion, keeping the simulation thread running so that the UI is not affected.
- Parameters
trainer (TrainerMT) – A Trainer object that implements APIs for starting and stopping RL thread.
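A launch sketch; run() is a coroutine, so it has to be scheduled on an event loop. This assumes Kit's asyncio loop is already running, which is the usual situation inside Isaac Sim, and reuses the hypothetical PolicyTrainer from the TrainerMT sketch above:

```python
import asyncio

# my_train_loop is an assumed user-defined training function.
trainer = PolicyTrainer(train_fn=my_train_loop)

# Schedule the simulation-side loop; the trainer's RL thread runs concurrently.
asyncio.ensure_future(env.run(trainer))
```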
- seed(seed=-1)
Sets a seed. Pass in -1 for a random seed.
- Parameters
seed (int) – Seed to set. Defaults to -1.
- Returns
Seed that was set.
- Return type
seed (int)
- send_actions(actions, block=True)
Sends actions from RL thread to simulation thread by adding actions to queue.
- Parameters
actions (Union[np.ndarray, torch.Tensor]) – Actions buffer to be added to queue.
block (Optional[bool]) – Whether to block thread when writing to queue.
- send_data(data, block=True)
Sends data from task thread to RL thread by adding data to queue.
- Parameters
data (dict) – Dictionary containing task data.
block (Optional[bool]) – Whether to block thread when writing to queue.
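On the RL thread these queue methods pair into a single exchange per step: push an action buffer, then block for the resulting data. A sketch of that pairing (step() on this class is assumed to perform a similar exchange internally; this only makes it explicit):

```python
# One policy step on the RL thread.
env.send_actions(actions, block=True)  # hand actions to the simulation thread
data = env.get_data(block=True)        # wait for observations/rewards/resets
```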
- set_render_mode(render_mode)
- set_task(task, backend='numpy', sim_params=None, init_sim=True, rendering_dt=0.016666666666666666) None
- Creates a World object and adds the task to it.
Initializes and registers the task with the environment interface and triggers task start-up.
- Parameters
task (RLTask) – The task to register to the env.
backend (str) – Backend to use for task. Can be “numpy” or “torch”. Defaults to “numpy”.
sim_params (dict) – Simulation parameters for physics settings. Defaults to None.
init_sim (Optional[bool]) – Automatically starts simulation. Defaults to True.
rendering_dt (Optional[float]) – dt for rendering. Defaults to 1/60s.
- signal_handler(sig, frame)
- property simulation_app
Retrieves the SimulationApp object.
- Returns
SimulationApp.
- Return type
simulation_app(SimulationApp)
- spec: EnvSpec | None = None
- step(actions)
- Basic implementation for stepping simulation.
Can be overridden by derived Env classes to satisfy requirements of specific RL libraries. This method passes actions to the task for processing, steps simulation, and computes observations, rewards, and resets.
- Parameters
actions (Union[numpy.ndarray, torch.Tensor]) – Actions buffer from policy.
- Returns
A tuple (observations, rewards, dones, info), where observations (Union[numpy.ndarray, torch.Tensor]) is the buffer of observation data, rewards (Union[numpy.ndarray, torch.Tensor]) is the buffer of rewards data, dones (Union[numpy.ndarray, torch.Tensor]) is the buffer of reset/done flags, and info (dict) is a dictionary of extras data.
- property unwrapped: gymnasium.core.Env[gymnasium.core.ObsType, gymnasium.core.ActType]
Returns the base non-wrapped environment.
- Returns
The base non-wrapped gymnasium.Env instance.
- Return type
Env
- update_task_params()