tokens: speed up duplication of Tokens objects #5325

Merged
merged 6 commits into from
Jan 25, 2023

oliver-sanders
Member

Sometimes we need to duplicate a Tokens object.

Currently this works by doing tokenise(detokenise(tokens)) which is really inefficient (I should know better, I've moaned about this in the cycler classes before!).

This PR switches to using the already parsed values held in the tokens object.
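
The difference between the two duplication strategies can be sketched roughly as follows. This is a minimal illustration, not the code in cylc/flow/id.py: the `MiniTokens` class, the `parse` and `render` helpers, and the simplified `workflow//cycle/task` ID format are all hypothetical stand-ins for `Tokens`, `tokenise` and `detokenise`.

```python
import re

# Hypothetical, simplified ID format: "workflow//cycle/task".
_ID_RE = re.compile(
    r'^(?P<workflow>[^/]+)//(?P<cycle>[^/]+)/(?P<task>[^/]+)$'
)


def parse(identifier):
    """Parse a string ID into a dict of fields (the comparatively slow step)."""
    match = _ID_RE.match(identifier)
    if match is None:
        raise ValueError(f'Invalid ID: {identifier}')
    return match.groupdict()


def render(fields):
    """Turn a dict of fields back into a string ID."""
    return f"{fields['workflow']}//{fields['cycle']}/{fields['task']}"


class MiniTokens(dict):
    """Hypothetical stand-in for a parsed-ID container."""

    def __init__(self, *args, **kwargs):
        if args:
            if len(args) > 1:
                raise ValueError('at most one positional argument')
            if isinstance(args[0], str):
                # String input: has to be parsed.
                kwargs = parse(args[0])
            else:
                # Mapping input: reuse the already parsed values.
                kwargs = dict(args[0])
        super().__init__(**kwargs)

    @property
    def id(self):
        return render(self)

    def duplicate_slow(self):
        # Round trip through the string form, then re-parse it.
        return MiniTokens(self.id)

    def duplicate_fast(self):
        # Copy the parsed fields directly, with no string handling.
        return MiniTokens(self)
```

Both methods return an equal but distinct object; the second simply skips the render and re-parse steps, which is where the saving comes from in a hot code path.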

Using the example from #5315 with -s TASKS=25 -s CYCLES=1 this reduces the time taken by increment_graph_window by ~50% after the changes from #5319 and #5321 have been applied.

Before (with #5319 & #5321)

Screenshot from 2023-01-24 11-32-10

After (still with #5319 & #5321)

Screenshot from 2023-01-24 11-32-21

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg and conda-environment.yml.
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users.
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@oliver-sanders oliver-sanders added this to the 8.1.1 milestone Jan 24, 2023
@oliver-sanders oliver-sanders added the efficiency For notable efficiency improvements label Jan 24, 2023
@oliver-sanders oliver-sanders changed the base branch from master to 8.1.x January 24, 2023 11:37
cylc/flow/id.py (outdated review thread, resolved)
Co-authored-by: Ronnie Dutta <61982285 [email protected]>
@hjoliver
Member

Nice!

Contributor

@datamel datamel left a comment

In the integration tests, the id is currently a posix path which is breaking them with this change. I'm guessing mypy doesn't check tests.

This diff fixes it:

diff --git a/tests/integration/test_reinstall.py b/tests/integration/test_reinstall.py
index 02be406bc..8b13c5574 100644
--- a/tests/integration/test_reinstall.py
+++ b/tests/integration/test_reinstall.py
@@ -80,7 +80,7 @@ def one_run(one_src, test_dir, run_dir):
     )
     return SimpleNamespace(
         path=w_run_dir,
-        id=w_run_dir.relative_to(run_dir),
+        id=str(w_run_dir.relative_to(run_dir)),
     )
 

@hjoliver
Member

That's a genuine test fail, in tests/integration/test_reinstall.py:

        if args:
            if len(args) > 1:
                raise ValueError()
            if isinstance(args[0], str):
                kwargs = tokenise(args[0], relative)
            else:
>               kwargs = dict(args[0])
E               TypeError: 'PosixPath' object is not iterable

cylc/flow/id.py:113: TypeError
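
The failure in the traceback above can be reproduced in isolation. The following is a small sketch, assuming a hypothetical `to_kwargs` helper that mimics only the string / non-string branch shown in the traceback (the real parsing step is replaced by a placeholder) and made-up directory names:

```python
from pathlib import PurePosixPath

# Made-up paths standing in for the test fixture's directories.
run_dir = PurePosixPath('/home/user/cylc-run')
w_run_dir = run_dir / 'one' / 'run1'
relative = w_run_dir.relative_to(run_dir)  # a path object, not a str


def to_kwargs(arg):
    """Mimic the branch in the traceback (hypothetical helper)."""
    if isinstance(arg, str):
        # Placeholder for the real string parsing step.
        return {'workflow': arg}
    # Anything that is not a str is assumed to be a mapping.
    return dict(arg)


try:
    to_kwargs(relative)
    failed = False
except TypeError:
    # A path object is neither a str nor iterable, so dict() rejects it.
    failed = True

# Converting to str first takes the string branch instead.
fixed = to_kwargs(str(relative))
```

Because the non-string branch assumes a mapping, any caller passing a path object needs to convert it with `str()` first, which is what the suggested change to the test fixture does.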

@hjoliver
Member

Fix?

diff --git a/tests/integration/test_reinstall.py b/tests/integration/test_reinstall.py
index 02be406bc..8b13c5574 100644
--- a/tests/integration/test_reinstall.py
+++ b/tests/integration/test_reinstall.py
@@ -80,7 +80,7 @@ def one_run(one_src, test_dir, run_dir):
     )
     return SimpleNamespace(
         path=w_run_dir,
-        id=w_run_dir.relative_to(run_dir),
+        id=str(w_run_dir.relative_to(run_dir)),
     )

@hjoliver
Member

hjoliver commented Jan 24, 2023

Argghhh, @datamel beat me to it!!! (she must have been working late!). Still, same fix suggested - great minds think alike 😁

@oliver-sanders
Member Author

Ta for the fix both :)

@oliver-sanders
Member Author

Added a collective changelog for all three efficiency changes.

Member

@MetRonnie MetRonnie left a comment

(only linkcheck failing)

@MetRonnie MetRonnie merged commit c122bec into cylc:8.1.x Jan 25, 2023
@oliver-sanders oliver-sanders deleted the tokens-duplicate branch January 25, 2023 18:50
Labels
efficiency For notable efficiency improvements small