
A few optimizations #280

Merged 7 commits into abrignoni:master on Aug 16, 2022

Conversation

@bconstanzo (Contributor) commented Aug 16, 2022

I've been profiling and testing a few changes that have a major impact on the Windows performance of ALEAPP.

The reported time goes down by about 3x on my benchmarks. As far as I could test, the changes don't affect behavior: the results are consistent between runs; it just goes faster.

Mainly this is achieved by adding a few caches (functools.lru_cache with maxsize set to None) and by avoiding duplicated work (fnmatch.fnmatch usage is replaced with a "deconstructed" version of it). I tried to keep the code as simple and readable as possible while netting some nice speedups.
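
As an illustration of the caching pattern (the helper name here is mine, not the PR's actual code), it boils down to putting an unbounded lru_cache in front of a hot, pure function:

import os
from functools import lru_cache

# Hypothetical helper: os.path.normcase is pure, so with an unbounded
# cache each distinct path is normcased once instead of once per lookup.
@lru_cache(maxsize=None)
def cached_normcase(path: str) -> str:
    return os.path.normcase(path)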

There's also a small change specific to snappy decompression, where I managed to improve the algorithm just enough to get a 4-5% performance increase without resorting to overly complicated code.

Methodology:

  • Ran ALEAPP against Josh Hickman's Android 12 test image.
  • Checked reported run time.
  • Profiled everything and analyzed the call graphs and functions (with cProfile, gprof2dot, and the amazing line_profiler); a typical gprof2dot invocation is shown after this list.
  • Got millions of files and hundreds of gigabytes of storage space used up on my disk.
  • Found out a few spots where things could be improved.
  • Rinse and repeat.
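
For reference, a typical way (my example, not quoted from the PR) to turn a cProfile dump into a call graph with gprof2dot, assuming Graphviz's dot is installed:

gprof2dot -f pstats profile_file.pstats | dot -Tpng -o callgraph.png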

The individual commit messages:

This substantially speeds up the program under Windows (about twice as fast) without any changes to the behavior or results.

…epeatedly

This brings a ~40% speedup* on top of the previous commit.

* Note: I'm basing these timings on cProfile runs, but they hold up quite nicely on "normal" runs.
This changes how things are processed and speeds things up a bit; however, shutil.copy2() now takes a significant share of this function's time, probably due to some of the newer artifacts.
Helps another ~10% or so with running times (because we were normcasing every filepath over and over again for every artifact).
Makes this function about twice as fast.

Replaced a for loop that wrote one byte at a time with a single bulk write, plus a condition for the (rare) case where the copy reaches past the current end of the output and part of the already-written output has to be repeated.
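
A minimal sketch of that idea, assuming a snappy-style copy operation that reads length bytes starting offset bytes back in the already-decoded output (my illustration, not the PR's exact code):

def copy_op(out: bytearray, offset: int, length: int) -> None:
    start = len(out) - offset
    if length <= offset:
        # Common case: the source bytes already exist, one bulk slice copy.
        out += out[start:start + length]
    else:
        # Rare overlapping case: the copy reaches past the current end,
        # so the available chunk has to be repeated to fill the length.
        chunk = out[start:]
        reps, rem = divmod(length, offset)
        out += chunk * reps + chunk[:rem]

buf = bytearray(b"abc")
copy_op(buf, offset=3, length=7)
assert buf == bytearray(b"abcabcabca")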
I checked the fnmatch.filter() code in the standard library, and then went with the same style that is used in the other seekers. fnmatch.filter() basically does the same thing the other seekers were already doing, so the result is the same (I haven't tested it, though).
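
For illustration, a sketch of that style (again my own, not the PR's code): the win is translating the glob pattern to a regex once and reusing the compiled matcher, which is essentially what fnmatch.filter() does internally:

import fnmatch
import os
import re

def match_names(names, pattern):
    # Translate the pattern to a regex once and reuse the compiled
    # matcher, instead of calling fnmatch.fnmatch() for every name.
    regex = re.compile(fnmatch.translate(os.path.normcase(pattern)))
    return [name for name in names if regex.match(os.path.normcase(name))]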
@jijames (Contributor) commented Aug 16, 2022

@bconstanzo do you have the tests you used for this? Just for reference.
@abrignoni looks great.

Real results without usagestats on Linux:
Before
real 0m24.984s
user 0m9.424s
sys 0m0.541s

After
real 0m22.285s
user 0m9.364s
sys 0m0.469s

@abrignoni merged commit f7541e8 into abrignoni:master on Aug 16, 2022
@bconstanzo (Contributor, Author)

Under Windows 10 Home 64-bit, I set up a virtual env with Python 3.10.3 and all the dependencies from the requirements.txt file.

Then it was just a matter of cloning ALEAPP on Sunday night and running:

python aleapp.py -t tar -i path_to_image -o path_for_reports

I have a Ryzen 4600H with 24 GB RAM and ran against a SATA HDD to cover the worst-case scenario, though I also tested against an NVMe drive and it ran just as fast.

For profiling I'd run it as python -m cProfile -o profile_file.pstats aleapp.py ... or add the @profile decorator and go with kernprof -lv aleapp.py ....
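
(As a side note, not something from the PR: a dump produced that way can then be inspected with the standard pstats module.)

import pstats

# Load the cProfile dump written by the command above and print the
# 20 entries with the highest cumulative time.
stats = pstats.Stats("profile_file.pstats")
stats.sort_stats("cumulative").print_stats(20)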

The timings and speedups I mentioned in the commits are based on what I saw during the profile runs (under cProfile the tool ran about half as fast as it would normally) and on the time reported by the tool.

Right now I just benchmarked with a very simple script:

import time
import subprocess

# Wall-clock the whole ALEAPP run as a subprocess.
t0 = time.perf_counter()
p = subprocess.run(
    r'python aleapp.py -t tar -i "D:\Test\xLEAPP\test_data\Magnet Acquire\Android 12 - Data.tar" -o "D:\Test\xLEAPP\output\aleapp-magnet"',
    shell=True,
)
t1 = time.perf_counter()

# Blank lines separate the timing from ALEAPP's own output.
print()
print()
print(f"Time taken: {t1 - t0}")

And it gave me 477 seconds for a clone of ALEAPP (which the tool reported as 6 minutes and 34 seconds) and 202 seconds for the patched version (which the tool reported as 2 minutes sharp). That is just shy of 2.4x faster, and there clearly is something off with the reported time.
