-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split CompilerEnv.step() into two methods for singular or lists of actions (take 2) #627
Conversation
d0d464a
to
6acd592
Compare
Codecov Report
@@ Coverage Diff @@
## development #627 /- ##
===============================================
- Coverage 88.61% 79.76% -8.85%
===============================================
Files 115 115
Lines 7019 7058 39
===============================================
- Hits 6220 5630 -590
- Misses 799 1428 629
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## development #627 /- ##
===============================================
- Coverage 88.75% 88.23% -0.53%
===============================================
Files 115 115
Lines 7019 7061 42
===============================================
Hits 6230 6230
- Misses 789 831 42
Continue to review full report at Codecov.
|
@ChrisCummins, I think I fixed the tests and examples. It should be ready for a review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @sogartar, thanks for pushing ahead with this. I think overloading multistep()
makes total sense.
A few small changes requested, otherwise, LGTM!
compiler_gym/wrappers/core.py
Outdated
# Undo the episode_reward update and reapply it once we have transformed | ||
# the reward. | ||
# | ||
# TODO(cummins): Refactor step() so that we don't have to do this | ||
# recalculation of episode_reward, as this is prone to errors if, say, | ||
# the base reward returns NaN or an invalid type. | ||
if reward is not None and self.episode_reward is not None: | ||
self.unwrapped.episode_reward -= reward | ||
reward = self.reward(reward) | ||
self.unwrapped.episode_reward = reward |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this workaround needs to be included, because multistep()
runs the update to self.episode_reward
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should really be a unit test to cover this (not saying you need to add it 🙂 , just note to self)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it back.
observation_spaces=None, | ||
reward_spaces=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also needs to accept observations
and rewards
arguments for backwards compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -5,6 5,8 @@ | |||
|
|||
cg_add_all_subdirs() | |||
|
|||
return() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this meant to be in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May bad. I removed it.
tests/wrappers/core_wrappers_test.py
Outdated
@@ -264,11 272,9 @@ def reward(self, reward): | |||
env.reset() | |||
_, reward, _, _ = env.step(0) | |||
assert reward == -5 | |||
assert env.episode_reward == -5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these two deleted lines can be restored now that you're overloading multistep()
rather than raw_step()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Heads up that |
CompilerEnv.step() currently accepts two types for the "action" argument: a scalar action, or an iterable of actions. This kind of type overloading does not work for list types. This adds a new method, CompilerEnv.multistep(), that explicitly takes takes an iterable sequence of actions. If you want to run multiple actions in a single step, call this new method. Calling CompilerEnv.step() with a list of actions still works, though with a deprecation warning. In the v0.2.4 release support for lists of actions in CompilerEnv.step() will be removed. Fixes #610.
This makes the following changes: - Changes env.step() `action` to accept only a single action, with a deprecation warning if a list of actions are provided. - Renames env.step() `observations` to `observation_spaces`. The old parameter name is still accepted with a deprecation warning. - Renames env.step() `rewards` to `reward_spaces`. The old parameter name is still accepted with a deprecation warning.
4360737
to
8cd9679
Compare
@ChrisCummins, I addressed your comments. If your OK with them I think it is ready to merge after the CI tests have passed. |
Brill, thanks! Merging 🎉 |
This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727! Highlights of this release include: - [mlir] Began work on a new environment for matrix multiplication using MLIR ([#652](#652), thanks @KyleHerndon and @sogartar!). Note this environment is not yet included in the pypi package and must be [compiled from source](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md#building-from-source-with-cmake). - [llvm] Added a new `env.benchmark_from_clang_invocation()` method ([#577](#577)) that can be used for constructing LLVM environment automatically from C/C compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts. - Added three new wrapper classes: `Counter`, that provides op counts for analysis ([#683](#683)); `SynchronousSqliteLogger`, that provides logging of environment interactions to a relational database ([#679](#679)), and `ForkOnStep` that provides an `undo()` operation ([#682](#682)). - Added `reward_space` and `observation_space` parameters to `env.reset()` ([#659](#659), thanks @SoumyajitKarmakar!) This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments: - Refactored the backend to make `CompilerEnv` an abstract interface, and `ClientServiceCompilerEnv` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC ([#633](#633), thanks @sogartar!). - Extended the support for different types of action and observation spaces ([#641](#641), [#643](#643), thanks @sogartar!), including new `Permutation` and `SpaceSequence` spaces ([#645](#645), thanks @sogartar!).. - Added a new `disk/` subdirectory to compiler service's working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM ([#672](#672)). This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations ([#656](#656), thanks @uduse!), and a missing flag definition in example code ([#684](#684), thanks @anthony0727!). **Full Changelog**: v0.2.3...v0.2.4 This release brings in deprecating changes to the core `env.step()` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey! Highlights of this release include: - Added a new `TextSizeInBytes` observation space for LLVM ([#575](#575)). * Added a new PPO leaderboard entry ([#580](#580). Thanks @xtremey! - Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up ([#592](#592)). - **[Backend]** The function `createAndRunCompilerGymService` now returns an int, which is the exit return code ([#592](#592)). - Improvements to the examples documentation ([#548](#548)) and FAQ ([#586](#586)) Deprecations and breaking changes: - `CompilerEnv.step` no longer accepts a list of actions ([#627](#627)). A new method, `CompilerEnv.multistep` provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to `env.step()` which take a list of actions to use `env.multistep()`. Thanks @sogartar! - The arguments `observations` and `rewards` to `step()` have been renamed `observation_spaces` and `reward_spaces`, respectively ([#627](#627)). - `Reward.id` has been renamed `Reward.name` ([#565](#565), [#612](#612)). Thanks @parthchadha! * The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces ([#531](#531)). Thanks @sogartar!
This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727! Highlights of this release include: - [mlir] Began work on a new environment for matrix multiplication using MLIR ([#652](#652), thanks @KyleHerndon and @sogartar!). Note this environment is not yet included in the pypi package and must be [compiled from source](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md#building-from-source-with-cmake). - [llvm] Added a new `env.benchmark_from_clang_invocation()` method ([#577](#577)) that can be used for constructing LLVM environment automatically from C/C compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts. - Added three new wrapper classes: `Counter`, that provides op counts for analysis ([#683](#683)); `SynchronousSqliteLogger`, that provides logging of environment interactions to a relational database ([#679](#679)), and `ForkOnStep` that provides an `undo()` operation ([#682](#682)). - Added `reward_space` and `observation_space` parameters to `env.reset()` ([#659](#659), thanks @SoumyajitKarmakar!) This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments: - Refactored the backend to make `CompilerEnv` an abstract interface, and `ClientServiceCompilerEnv` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC ([#633](#633), thanks @sogartar!). - Extended the support for different types of action and observation spaces ([#641](#641), [#643](#643), thanks @sogartar!), including new `Permutation` and `SpaceSequence` spaces ([#645](#645), thanks @sogartar!).. - Added a new `disk/` subdirectory to compiler service's working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM ([#672](#672)). This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations ([#656](#656), thanks @uduse!), and a missing flag definition in example code ([#684](#684), thanks @anthony0727!). **Full Changelog**: v0.2.3...v0.2.4 This release brings in deprecating changes to the core `env.step()` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey! Highlights of this release include: - Added a new `TextSizeInBytes` observation space for LLVM ([#575](#575)). * Added a new PPO leaderboard entry ([#580](#580). Thanks @xtremey! - Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up ([#592](#592)). - **[Backend]** The function `createAndRunCompilerGymService` now returns an int, which is the exit return code ([#592](#592)). - Improvements to the examples documentation ([#548](#548)) and FAQ ([#586](#586)) Deprecations and breaking changes: - `CompilerEnv.step` no longer accepts a list of actions ([#627](#627)). A new method, `CompilerEnv.multistep` provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to `env.step()` which take a list of actions to use `env.multistep()`. Thanks @sogartar! - The arguments `observations` and `rewards` to `step()` have been renamed `observation_spaces` and `reward_spaces`, respectively ([#627](#627)). - `Reward.id` has been renamed `Reward.name` ([#565](#565), [#612](#612)). Thanks @parthchadha! * The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces ([#531](#531)). Thanks @sogartar!
This release adds a new compiler environment, new APIs, and a suite of backend improvements to improve the flexibility of CompilerGym environments. Many thanks to code contributors: @sogartar, @KyleHerndon, @SoumyajitKarmakar, @uduse, and @anthony0727! Highlights of this release include: - [mlir] Began work on a new environment for matrix multiplication using MLIR ([#652](#652), thanks @KyleHerndon and @sogartar!). Note this environment is not yet included in the pypi package and must be [compiled from source](https://github.com/facebookresearch/CompilerGym/blob/development/INSTALL.md#building-from-source-with-cmake). - [llvm] Added a new `env.benchmark_from_clang_invocation()` method ([#577](#577)) that can be used for constructing LLVM environment automatically from C/C compiler invocations. This makes it much easier to integrate CompilerGym with your existing build scripts. - Added three new wrapper classes: `Counter`, that provides op counts for analysis ([#683](#683)); `SynchronousSqliteLogger`, that provides logging of environment interactions to a relational database ([#679](#679)), and `ForkOnStep` that provides an `undo()` operation ([#682](#682)). - Added `reward_space` and `observation_space` parameters to `env.reset()` ([#659](#659), thanks @SoumyajitKarmakar!) This release includes a number of improvements to the backend APIs that make it easier to write new CompilerGym environments: - Refactored the backend to make `CompilerEnv` an abstract interface, and `ClientServiceCompilerEnv` the concrete implementation of this interface. This enables new environments to be implemented without using gRPC ([#633](#633), thanks @sogartar!). - Extended the support for different types of action and observation spaces ([#641](#641), [#643](#643), thanks @sogartar!), including new `Permutation` and `SpaceSequence` spaces ([#645](#645), thanks @sogartar!).. - Added a new `disk/` subdirectory to compiler service's working directories, which is symlinked to an on-disk location for devices which support in-memory working directories. This fixes a bug with leftover temporary directories from LLVM ([#672](#672)). This release also includes numerous bug fixes and improvements, many of which were reported or fixed by the community. For example, fixing a bug in cache file locations ([#656](#656), thanks @uduse!), and a missing flag definition in example code ([#684](#684), thanks @anthony0727!). **Full Changelog**: v0.2.3...v0.2.4 This release brings in deprecating changes to the core `env.step()` routine, and lays the groundwork for enabling new types of compiler optimizations to be exposed through CompilerGym. Many thanks to code contributors: @mostafaelhoushi, @sogartar, @KyleHerndon, @uduse, @parthchadha, and @xtremey! Highlights of this release include: - Added a new `TextSizeInBytes` observation space for LLVM ([#575](#575)). * Added a new PPO leaderboard entry ([#580](#580). Thanks @xtremey! - Fixed a bug in which temporary directories created by the LLVM environment were not cleaned up ([#592](#592)). - **[Backend]** The function `createAndRunCompilerGymService` now returns an int, which is the exit return code ([#592](#592)). - Improvements to the examples documentation ([#548](#548)) and FAQ ([#586](#586)) Deprecations and breaking changes: - `CompilerEnv.step` no longer accepts a list of actions ([#627](#627)). A new method, `CompilerEnv.multistep` provides this functionality. This is to provide compatibility with environments whose action spaces are lists. To update your code, replace any calls to `env.step()` which take a list of actions to use `env.multistep()`. Thanks @sogartar! - The arguments `observations` and `rewards` to `step()` have been renamed `observation_spaces` and `reward_spaces`, respectively ([#627](#627)). - `Reward.id` has been renamed `Reward.name` ([#565](#565), [#612](#612)). Thanks @parthchadha! * The backend protocol buffer schema has been updated to natively support more types of observation and action, and to support nested spaces ([#531](#531)). Thanks @sogartar!
This MR supersedes #611. It is the same thing except the source branch is different.