Add source code for Lumos v0.0.3 #1

Merged
merged 1 commit into from
Aug 17, 2022


@tlp19 tlp19 commented Aug 16, 2022

Lumos v0.0.3 - Changelog



Enhancement to existing QC pipeline

Performance

1. Modified pipeline

What

The resizing operation of the images in the pipeline was moved to be performed before all other computations on the images.

Why

This resizing was previously performed after all other operations. With a resizing factor that is usually 0.1, every earlier step was processing and storing far more pixels than the final output needed, which left a large opportunity for performance and storage savings.

As all other computations are element-wise, moving this resizing operation should have no impact on the output, and was an obvious improvement to make.

This would both accelerate the computation and reduce the temporary storage space taken by the program.

How

The resizing operation now runs right after the images are loaded into the program, instead of after all other computations (such as the conversion of the images from 16-bit to 8-bit).

Compare the changes in the code for more details.
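
The reordering can be illustrated with a minimal sketch. It is not the Lumos code: the helper names are hypothetical, the image is synthetic, and the resize is assumed to be a plain nearest-neighbour subsampling, for which swapping the order gives identical results. With an interpolating resize the two orders would only agree approximately.

```python
def to_8bit(image16):
    """Element-wise conversion: keep the 8 most significant bits of each pixel."""
    return [[pixel >> 8 for pixel in row] for row in image16]


def subsample(image, step):
    """Nearest-neighbour downscale: keep every `step`-th row and column."""
    return [row[::step] for row in image[::step]]


def count_pixels(image):
    """Total number of pixels in a 2D list-of-lists image."""
    return sum(len(row) for row in image)


# Synthetic 16-bit image, 100 x 100 pixels (hypothetical data).
image16 = [[(x * 641 + y * 97) % 65536 for x in range(100)] for y in range(100)]

step = 10  # stands in for a resizing factor of 0.1 (assumption)

old_order = subsample(to_8bit(image16), step)  # convert every pixel, then resize
new_order = to_8bit(subsample(image16, step))  # resize first, convert far fewer pixels

assert old_order == new_order
print(count_pixels(image16), "->", count_pixels(new_order))  # prints "10000 -> 100"
```

In this sketch the conversion step touches 100 pixels instead of 10,000 once the resize comes first, which is where the time and temporary-storage savings come from.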

Side effects

There should be no significant side effects on the final render of the images. All other operations performed on the image are element-wise, so resizing it before or after them should not have any impact on the output.

2. Parallelism

What

Lumos can now utilize multiple CPU cores to perform the computation of each of a plate's channels in parallel.

Why

One of the most requested features was better performance. To this end, parallelism seemed like a relatively simple solution, as Python scripts only use one CPU core by default.

How

The multiprocessing module of the standard library was used to implement this feature. Worker processes are spawned from the parallelized functions using the Pool object.

To ensure concurrency, the only parallelized function is currently render_single_plate_plateview (its parallel form being render_single_plate_plateview_parallelism). It spawns a new process for each of the channels of the current plate to be rendered. The render_single_run_plateview function calls one form or the other, depending on whether the user requested more than one core for the execution of the program.

Compare the changes in the code for more details.
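
As a rough illustration of this pattern (not the actual Lumos functions), the sketch below dispatches one task per channel to a multiprocessing Pool and falls back to a serial loop when a single core is requested. The names render_channel and render_plate, and the channel labels, are hypothetical.

```python
from multiprocessing import Pool


def render_channel(channel):
    """Hypothetical stand-in for the rendering of one channel of a plate."""
    return f"{channel}: rendered"


def render_plate(channels, cores=1):
    """Render every channel of a plate, serially or with a pool of processes."""
    if cores <= 1 or not channels:
        # Serial form: used when parallelism is not requested.
        return [render_channel(channel) for channel in channels]
    # Parallel form: one task per channel, with at most `cores` worker processes.
    with Pool(processes=min(cores, len(channels))) as pool:
        return pool.map(render_channel, channels)
```

A call such as render_plate(["channel_1", "channel_2"], cores=2) would go through the pool. On Windows, where worker processes are started by spawning a fresh interpreter, such a call has to sit under an `if __name__ == "__main__":` guard in the entry script.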

Side effects

Keyboard interrupts (Ctrl+C) do not work when parallelism is enabled. The only way to halt the execution of the program prematurely is to close the terminal.

In the Windows Terminal (this did not occur in the Ubuntu terminal during testing), the TQDM progress bars that track the progress of the program sometimes get printed on a new line, which makes the console illegible. To counteract this issue, during a parallelized execution of the program, prints are limited to a single progress bar tracking the plates of a run being processed.

Logging also breaks when running several processes in parallel. This is because they are all trying to write to the same file at the same time. To prevent errors from happening in the middle of the program's execution, logs are disabled when parallelism is enabled.


Functionality

1. Improved Command-Line Interface (CLI)

What

a. Some arguments have been given aliases/shortcuts to make the Lumos commands clearer and more succinct.

b. New arguments have been introduced:

  • --brightfield/-b
  • --parallelism/-p

c. A debug argument, --keep-temp-files, has been added but should not be used by end-users.

Why

a. Because it makes the usage of the CLI cleaner, and was very easy to do.

b.

  • --brightfield is to control which Brightfield channels get rendered. This feature was requested because Brightfield channels are not often used in QC, and omitting them from the rendering process speeds up the total execution time of the program.
  • --parallelism is to control how many CPU cores get used for parallelism.

c. This argument has been added to speed up testing on-the-fly during development.

How

All those modifications were implemented by changing the click interface of the program.

Refer to the readme.md documentation for more details on the refactored and newly introduced arguments.
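
To give an idea of what long options with short aliases look like, here is a small sketch. Lumos itself uses click; argparse from the standard library is used here only as a stand-in so the example has no dependencies. The value types and defaults are assumptions, and only the option names come from this changelog.

```python
import argparse


def build_parser():
    """Build a parser exposing the new options with their short aliases."""
    parser = argparse.ArgumentParser(prog="lumos")
    # Value format of --brightfield is an assumption; see readme.md for the real one.
    parser.add_argument("-b", "--brightfield", default=None,
                        help="which Brightfield channels to render")
    # Assumed to be an integer number of cores, defaulting to serial execution.
    parser.add_argument("-p", "--parallelism", type=int, default=1,
                        help="number of CPU cores to use")
    # Debug-only switch, off unless explicitly passed.
    parser.add_argument("--keep-temp-files", action="store_true",
                        help="debug only: keep temporary files")
    return parser


args = build_parser().parse_args(["-b", "1", "-p", "4"])
print(args.brightfield, args.parallelism, args.keep_temp_files)
```

Both spellings of an option fill the same attribute, so "-p 4" and "--parallelism 4" are interchangeable.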

Side effects

Those modifications have no side effects.

However, the changes made to the CLI for the new Cell Painting mode of Lumos break the previous command-line interface. These changes and their effects are detailed in the relevant section of this document.

2. Missing images markers

What

This functionality makes images that are missing from the database, or that may have been corrupted during the copying process, stand out more clearly than before.

It adds custom markers to each of those missing images, on top of a solid background, and the result serves as a placeholder for those missing images through the rest of the QC pipeline.

Why

This improved feature was requested as previously a simple solid grey-colored placeholder image was used to replace missing files. This was not very visible and a more visually distinctive placeholder was needed.

How

A new function draw_markers has been added to the lumos/toolbox.py module. This function first computes the properties of the figures to draw on the input image and then draws them in the specified input color.

This function is used after a dummy placeholder image is created when the loading of a site image fails.

Compare the changes in the code for more details.
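
The two steps (compute the figure, then draw it) can be sketched as follows. This is not the draw_markers implementation: the function names, the cross shape, and the grey and white values are all hypothetical, and a plain list of lists stands in for the image array.

```python
def make_placeholder(height, width, background=64):
    """Solid-color dummy image used when a site image fails to load."""
    return [[background] * width for _ in range(height)]


def draw_cross(image, color=255):
    """Draw both diagonals of the image in place and return it."""
    height, width = len(image), len(image[0])
    # Step 1: compute the geometry of the figure (here, the diagonal pixels).
    steps = max(height, width)
    points = set()
    for i in range(steps):
        y = i * (height - 1) // max(steps - 1, 1)
        x = i * (width - 1) // max(steps - 1, 1)
        points.add((y, x))
        points.add((y, width - 1 - x))
    # Step 2: draw the figure in the requested color.
    for y, x in points:
        image[y][x] = color
    return image


placeholder = draw_cross(make_placeholder(5, 5))
for row in placeholder:
    print("".join("#" if pixel == 255 else "." for pixel in row))
```

On a 5 x 5 placeholder this prints an X shape, which is much easier to spot in a plate view than a uniformly grey tile.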

Side effects

There are no side effects to this. This is purely cosmetic.

However, this cannot be used in the new Cell Painting operation mode, as the markers would go through the RGB blending pipeline and produce distracting images.

3. Upgraded logger

What

a. The logger now writes to a new file when the current one reaches 2MB. (Note: a better behavior for the logger, however, would be to keep only the last 2MB of logs in a single file. This does not appear to be easy to implement with the logging package of the Python standard library.)

b. Also, the logger instance now changes behavior depending on whether the program is running with parallelism.

Why

a. This allows the user to freely delete older logs if wanted, while keeping the most recent ones.

b. As detailed in the previous section on parallelism, the logging functionality breaks when using parallelism, so logs are disabled in that case.

How

a. A new RotatingFileHandler from the logging library is used to define the behavior of the logger.

b. A new lumos/logger.py module has been created to allow the logger to have an internal state variable. This state variable is a boolean that indicates if the Lumos program currently uses parallelism. If it is the case, no logs are stored, and selected prints are sent to the console.

Compare the changes in the code for more details.
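
A minimal sketch of these two ideas, under stated assumptions: the function names, the logger name, and the backupCount value are hypothetical, and the real lumos/logger.py may be organized differently. Only the use of RotatingFileHandler, the 2MB limit, and the boolean state come from this changelog.

```python
import logging
from logging.handlers import RotatingFileHandler

# Module-level state: True while the program runs with parallelism enabled.
_parallelism = False


def set_parallelism(enabled):
    """Record whether the program currently uses parallelism."""
    global _parallelism
    _parallelism = enabled


def make_logger(path, name="lumos_sketch"):
    """Create a logger that starts a new file once the current one reaches 2MB."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:
        # backupCount must be at least 1 for rollover to happen; 5 is an assumption.
        handler = RotatingFileHandler(path, maxBytes=2 * 1024 * 1024, backupCount=5)
        logger.addHandler(handler)
    return logger


def log_info(logger, message):
    """Write to the log file, unless parallelism is enabled."""
    if _parallelism:
        return  # several processes would otherwise write to the same file
    logger.info(message)
```

With this layout, callers never check the parallelism flag themselves; they call log_info and the module decides whether anything reaches the file.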

Side effects

There are no side effects "per se" to this implementation, as everything works as intended: in some cases logs won't get written to the log file, or prints won't get printed to the console, but this is the desired behavior.



New implementation of Cell Painting

What

A Cell Painting mode is now included in Lumos. It allows the user to render an RGB image of a plate with all its channels color-coded and blended together.

The style and algorithm used for the blending can be chosen by the user, and an "accurate" style is provided to try to match the emission wavelength of each channel when colorizing them.

Why

A Cell Painting mode was requested for 2 reasons:

  • Explore its possible usefulness in the Quality Control process carried out by the team.
  • Produce "pretty" and somewhat accurate images for future communications about the project.

How

The basic pipeline for Cell Painting was taken from the previous exploratory work carried out by Nicolas Boisseau (@nicolasboisseau) and was built upon.

Most notably, an algorithm was designed to blend the different channels of each plate together, with colors as close as possible to the actual emission wavelength of each channel.
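
The general idea of colorizing and blending channels can be sketched as below. This is a plain additive blend with clipping, not the algorithm designed for Lumos. The function name, the channel data, and the colors are purely illustrative, and no real emission wavelengths are encoded here.

```python
def blend_channels(channels, colors):
    """Additively blend grayscale channels (values 0-255) into one RGB image.

    `channels` is a list of equally sized 2D lists; `colors` gives one
    (r, g, b) weight triple per channel, each weight between 0.0 and 1.0.
    """
    height, width = len(channels[0]), len(channels[0][0])
    rgb = [[[0, 0, 0] for _ in range(width)] for _ in range(height)]
    for channel, color in zip(channels, colors):
        for y in range(height):
            for x in range(width):
                for k in range(3):
                    # Tint the grayscale value with the channel's color weight.
                    rgb[y][x][k] += channel[y][x] * color[k]
    # Clip to the 8-bit range after summing all contributions.
    return [[tuple(min(255, round(v)) for v in pixel) for pixel in row] for row in rgb]


# Two 1 x 2 channels with illustrative colors (blue and green).
channel_a = [[200, 0]]
channel_b = [[100, 255]]
print(blend_channels([channel_a, channel_b], [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]))
# prints [[(0, 100, 200), (0, 255, 0)]]
```

An "accurate" style in this scheme would amount to choosing each weight triple from the channel's emission wavelength, while other styles would simply use a different color table.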

Side effects

The syntax of the $ lumos command has changed. This is a breaking change.

To distinguish which operation mode to use, its identifier needs to be indicated after the lumos keyword before typing any arguments.

The two operation modes' identifiers are:

  • qc for the Quality Control (legacy) mode
  • cp for the Cell Painting (new) mode

E.g. $ lumos qc --scope run --source-path ./source/run1 --output-path ./output/run1

For more information on the new operation mode and its associated arguments, please refer to the readme.md documentation.
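
The mode-then-arguments syntax can be sketched with subcommands. Lumos uses click; the argparse version below is only a dependency-free stand-in, the function name is hypothetical, and giving both modes the same three arguments is an assumption made for brevity.

```python
import argparse


def build_mode_parser():
    """Build a parser where the operation mode comes right after `lumos`."""
    parser = argparse.ArgumentParser(prog="lumos")
    modes = parser.add_subparsers(dest="mode", required=True)
    for name, description in (("qc", "Quality Control (legacy) mode"),
                              ("cp", "Cell Painting (new) mode")):
        mode = modes.add_parser(name, help=description)
        # Arguments taken from the example command in this changelog.
        mode.add_argument("--scope")
        mode.add_argument("--source-path")
        mode.add_argument("--output-path")
    return parser


args = build_mode_parser().parse_args(
    ["qc", "--scope", "run", "--source-path", "./source/run1", "--output-path", "./output/run1"]
)
print(args.mode, args.scope, args.source_path)
```

Because the mode is required, a command written in the old syntax (arguments directly after lumos) is rejected instead of silently running in the wrong mode, which is why the change is breaking.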



Other work and changes

HTML documentation

What

More thorough XML documentation has been written throughout the codebase, and HTML documentation has been generated in the /docs folder. The latter can be viewed in any web browser by opening the docs/index.html file.

Why

Simple and clean documentation of the codebase makes the onboarding of new developers wishing to improve or maintain the program easier.

How

The HTML documentation was generated from the new XML code documentation using the pdoc3 package.

Side effects

There are no side effects to this.

Testing

What

Two new tests were added to the /tests folder to check that both the Quality Control and Cell Painting pipelines still work as expected during development:

  • test_2_qc_pipeline.py
  • test_3_cp_pipeline.py

Why

These tests can be run during development to check that modifications made to the codebase do not break the basic functionality of the program.

How

These tests are written and intended to be used with the Pytest package.

For exact implementation of these tests, please refer to their respective script files.

Side effects

There are no side effects to this.


@nicolasboisseau nicolasboisseau left a comment


Addition of our old 0.0.3 version coded by @tlp19 !

@tlp19 tlp19 merged commit 6111da6 into main Aug 17, 2022
@tlp19 tlp19 deleted the lumos-0.0.3 branch August 25, 2022 06:46
@tlp19 tlp19 self-assigned this Aug 25, 2022