reformulated neighboring codes, introducing full wavelet transformation and image denoising #64
Conversation
- Neighboring code was changed: neighboring indices were reassigned to 1-56, with 1-24 = sides, 25-48 = corners and 49-56 = edges
- hvy_neighbor now has length 3*56, so that for CVS different definitions can coexist
- The whole code for calc send bounds and calc recv bounds has been rewritten; it now uses formulas instead of lists, drastically reducing code clutter
- All patch definitions moved to the treelib library in neighborhood.f90; module_mpi and module_wavelet just include those and call them
- The neighborhood definition is now consistent between 2D and 3D
- More documentation will follow, but for now there is a small function write_neighborhood_info that can help
- Unit tests have moved into their own unitTest module, which already existed but now has more tests
- Some basic tests got more command-line functionality and a new sync test was added to check if patches are synced correctly; this test is not yet added to any testing framework though
- find neighbors was rewritten to accommodate the new neighbor relations
- IO now has an option to save ghosts as an additional parameter; full hdf2vtk support for that will be added soon
- CE now modifies the interior of the domain and the ghost patches separately; the latter is actually just used for overwriting values for coarser neighbors
- dump block was updated, it now also exists in a fancy variant; this helps me with debugging what is going on if I don't want to use paraview and just want to check block results
- Adaptive tests save and compare refined values only for unlifted wavelets to save space
- Ghost patches and interior patches are now always defined from the block that is requesting the indices; inverting takes place once we go from sender-perspective to receiver-perspective. This means that ijk_patches with the same indices for sender and receiver are not the corresponding parts, we need to invert the relation beforehand
- Code was tested and should be consistent with what is written in the output
- Some aesthetic output changes

- Because it was made unnecessary and in order to avoid confusion I've removed it; to accommodate this, family relations now stretch over 16 indices for all levels
- I also did some more aesthetics and added the sync test to the ghost-sync testing block for simulation runs

- Some very minor further preparation for CVS

- There is now a string which defines what we want to do; I hope that's clearer now

- Made all available sync cases compatible with the overfull CVS grid
- A special case for the RHS, which is leaf-only, is implemented
- All other cases always assume that the full grid has valid values
- The buffer zone was changed to Nwc+2; this changed some results for lifted cases. To be exact it changed blob_adaptive_CDF42, *_CDF44, *_CDF62 and acm_CDF44; I updated only the values that changed
- Also reduced the data load of acm by reducing the saving frequency
- It works (for now); some differences of order 1e-11 are seen in the blob tests though, I need to investigate that
- Something in the init function also crashes for the adapt_tree loop, which I am still investigating
- Included more cases for the overfull grid so that coarse extension is applied correctly

- The loop now correctly creates mothers
- I did a rudimentary check of the block size criteria and yes, CDF42 works with BS=12 and CDF44 with BS=16
- There is still an induced difference between CVS and non-CVS loop results which I need to discuss with Thomas
- Checked performance with simple tests: for 3D it takes around 15% longer. This possibly worsens with more blocks but it's not too bad

- I finished the CVS adapt_tree loop; it still needs to be polished and checked more, but it looks good for now
- All parts of the CVS loop got neat functions
- Azzalini's iterative loop is implemented to estimate the thresholding for CVS or image denoising
- Image denoising got a post function, testing invertibility as well; invertibility also got its own unit test
- Added two new tests, test results should look nice now
- More smaller changes

- The loop now first creates the whole grid and then fills it; this means there are more refinement flags yejjjj (I am sorry, this is getting out of hand)
- I think this should drastically reduce the overhead of sync_lgt_data and update_metaData as their usage was reduced
- For adapt_tree_CVS I fixed a bug where interior blocks tried to sync with finer neighbors; this should not be allowed, as we would need to respect it with CE, and the less CE the better (only at the leaf layer where we can't avoid it)
- A few more family functions
- Some functions were adapted to handle the new refinement flag that basically tells us a block exists but has trash in it

- This sounds confusing, but basically we decompose all leaf blocks first, as this is the most expensive thing to do, and then go level-wise, update the mother blocks and decompose them until all are there. With this we do not need to do any lgt_data sync within the decomposition or reconstruction loop, which is very nice for parallelization
- Added a prototype for refinement of only significant blocks, but this is very experimental

- The old adapt_tree loop is made obsolete and exchanged with the new full tree version. Why? There was a bug in there, not too bad a one, but then we made the design choice to just stick with the full wavelet transformation, as I will focus on this anyway and it is clearer
- This was made even cleaner: we now have only 3 grid updates and no lgt_data sync in the decomposition or reconstruction loops
- I polished the functions a bit, did some renaming and moving around. Coarsening_indicator_tree looks cleaner now
- Some flags are still to be moved, but the refinement-flag hell was dampened a bit
- Increased the block limit checker in the main loop to accommodate the increased number of blocks in the full wavelet transformation (factor 8/7 or 4/3)

- Updated test results for the new full wavelet transformation adaption
- Updated test initial conditions to be more complex; this takes a bit more time but catches more of the things which are going on
- Updated blockmin to what I've found from unit tests
- Image denoising test in post-processing

- Got rid of all the weird temporary refinement flags that I had introduced
- I love how clean it is starting to look

- New insect tests for dry-run in order to verify the creation and movement of objects
- I revisited the block estimate and adapted it to recent changes. Now we can squeeze in more blocks. Since we are seldom memory-bound this is probably only nice for development testing
- It seems I forgot some test results, so in this commit they are all included
LIB/INI/setInitialCondition_tree.f90
Outdated
@@ -160,9 +160,9 @@ subroutine setInitialCondition_tree(params, hvy_block, tree_ID, adapt, time, ite
     ! NOTE: the grid adaptation can take the mask function into account (such that the fluid/solid
     ! interface is on the finest level).
     if (present(hvy_mask) .and. params%threshold_mask) then
-        call adapt_tree( time, params, hvy_block, tree_ID, params%coarsening_indicator_inicond, hvy_tmp, hvy_mask=hvy_mask )
+        call adapt_tree( time, params, hvy_block, tree_ID, params%coarsening_indicator, hvy_tmp, hvy_mask=hvy_mask)
why has this been changed?
Good catch, I accidentally overwrote it by copying.
In master it is currently only applied when NOT reading in from a file. Is this correct or should it be applied in both cases?
call refine_tree( params, hvy_block, hvy_tmp, "everywhere", tree_ID=tree_ID_flow )
! refine the mesh, normally "everywhere"
if (params%threshold_mask) then
call refine_tree( params, hvy_block, hvy_tmp, "everywhere", tree_ID=tree_ID_flow, hvy_mask=hvy_mask )
this is for your experimental "refine_selective" indicator, I assume?
Exactly, the selective indicator needs the mask for the coarsening indicator, so it is passed on here.
end function
!> Convert a neighborhood to a different lvl_diff and give back the lower bound of possible indices.
!> This is needed as neighborhood relations might have multiple counterparts (up to 4)
I don't get what this function does. can you add a few more words?
Any neighborhood patch can have a specific level-difference. However, for different level-differences those patches are occupied differently. For example, a left (-x) neighbor with possible indices 1-4 looks as follows:
lvl_diff = 0 (same level): 1 is occupied; for lvl_diff = 0 it is always the first available number of a patch
lvl_diff = +1 (coarser neighbor): one of the entries from 1-4 is occupied. It depends on which part of the face of the larger block faces us as a neighbor (-y-z, +y-z, -y+z, +y+z) and the number is assigned accordingly
lvl_diff = -1 (finer neighbor): all entries 1-4 are occupied by different blocks. Their number tells us in which quarter of the face they lie (again -y-z, +y-z, -y+z, +y+z)
When I now want to know "Do I have a same-level neighbor for the same patch?" or "Do I have a coarser neighbor for the same patch?", the easiest would be to add or subtract multiples of 56, but I might miss neighbors that do not have exactly this difference (from 1 to 56+2 for example).
I therefore simply compute the bounds of possible indices (1 and 4) and check if ANY of those are occupied with the given level-difference. Then I know that there is a neighbor if one of them is occupied. For faces we have 4 possible indices, for edges 2 and for corners 1.
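A minimal standalone sketch of that bounds check (not the actual WABBIT routine; the layout of one block of 56 entries per level-difference and the slot ordering -1, 0, +1 are assumptions for illustration):

program neighbor_bounds_sketch
    implicit none
    integer, parameter :: n_patches = 56
    integer :: hvy_neighbor(3*n_patches)   ! -1 = free, otherwise id of the neighboring block
    logical :: has_coarser

    hvy_neighbor = -1
    hvy_neighbor(2*n_patches + 3) = 17     ! pretend: -x face, third configuration, lvl_diff = +1

    ! "do I have a coarser neighbor behind the -x face?" -> check all 4 face slots
    has_coarser = any_neighbor_on_level(hvy_neighbor, i_base=1, n_config=4, lvl_diff=+1)
    print *, 'coarser neighbor behind -x face: ', has_coarser

contains

    ! check if ANY index of a patch (i_base .. i_base+n_config-1) is occupied for the
    ! requested level-difference; n_config is 4 for faces, 2 for edges, 1 for corners
    logical function any_neighbor_on_level(neigh, i_base, n_config, lvl_diff) result(found)
        integer, intent(in) :: neigh(:), i_base, n_config, lvl_diff
        integer :: offset
        offset = (lvl_diff + 1) * n_patches   ! assumed slot order: lvl_diff = -1, 0, +1
        found = any(neigh(offset + i_base : offset + i_base + n_config - 1) /= -1)
    end function any_neighbor_on_level

end program neighbor_bounds_sketch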
- Obviously Jmax of a simulation still cannot be smaller than JMaxActive of the file, because I am not a magician who wants to make blocks disappear
- The rest works fine, for both treecode representations in files
! values from MPI - reference at module_mpi%init_ghost_nodes
mem_per_block = mem_per_block &
    + 2.0 * real(N_MAX_COMPONENTS) * real(product(Bs(1:dim)+2*g) - product(Bs(1:dim))) & ! real buffer ghosts, 1send 1recv
    + 3.0 * real(max_neighbors_used) * 5.0 / 2.0 & ! int buffer (4byte hence /2), sending 6 values each, 3 buffers exist
Just a detail, but if it's 6 values each, then 5.0 --> 6.0, right?
I looked at it and made sure it coincides with what is in module_mpi:
- One buffer has 7 values each
- The other has 4 or 5 (DEV)
max_family = 9
max_neighbors = 56*2 + 32 ! 2D has only 32 possibilites but 56 from 3D is used
max_neighbors_used = 56 + 24 + 24 ! lvl J+1, J, J-1 for full tree grid
Is this the maximum number of possibly active neighbors? I.e. a block can have at most 56 + 24 + 24 neighbors on the three levels?
Yes!
- Level J+1 can have all possible relations occupied (56 in 3D and 12 in 2D)
- Level J can have 24 relations occupied (8 in 2D), which is 1 for each patch
- Level J-1 has at most 24 relations? In practice I think even drastically fewer? I guess it could be 3 for 2D as illustrated below; for 3D I don't know (6 maybe?) but I just cap it at 24 (and 8 for 2D)
1122
1122
33BX
33XX
B: block where we look at neighbors
1-3: Neighbors on level J-1
X: sisters of B on level J
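So the cap used above works out to 56 + 24 + 24 = 104 simultaneously possible neighbor relations per block in 3D, and, with the 2D values given in this reply, 12 + 8 + 8 = 28 in 2D.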
logical, intent(in), optional :: no_deletion
character(len=*) :: sync_case !< String representing which kind of syncing we want to do
!> Additional value to be considered for syncing logic, can be level or refinement status from which should be synced, dependend on sync case
integer(kind=ik), intent(in), optional :: s_val
It looks to me like this is not actually optional, or otherwise an IF PRESENT(s_val) should appear. Maybe just omit the OPTIONAL?
Also, please add a few lines on what sync_case does.
sync_case='level' -> only sync blocks on the given level "s_val"
sync_case='ref' -> only sync blocks with identical refinement status "s_val"
is that correct?
Good catch, it is not optional.
At first I thought to include a sync case "full" like for the ghost-point syncing, but eventually I decided it makes no sense for family sync on the full tree grid.
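For illustration, a hedged sketch of how the sync_case / s_val logic could behave. The routine name filter_block is a placeholder, and the case strings 'level' and 'ref' follow the interpretation in the question above; neither is confirmed as the actual WABBIT interface:

program sync_case_sketch
    implicit none
    call filter_block(sync_case='level', s_val=3,  block_level=3, block_ref=0)
    call filter_block(sync_case='ref',   s_val=-1, block_level=5, block_ref=0)

contains

    subroutine filter_block(sync_case, s_val, block_level, block_ref)
        character(len=*), intent(in) :: sync_case
        integer, intent(in) :: s_val, block_level, block_ref
        logical :: participates

        select case (sync_case)
        case ('level')   ! only blocks on level s_val take part in the sync
            participates = (block_level == s_val)
        case ('ref')     ! only blocks whose refinement status equals s_val take part
            participates = (block_ref == s_val)
        case default
            participates = .false.
        end select

        print *, 'sync_case=', sync_case, ' -> block participates: ', participates
    end subroutine filter_block

end program sync_case_sketch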
Reformulated neighboring codes
Previously, the code was only able to handle one neighbor for each ghost-layer patch. However, for the full wavelet transformation several neighbors on different levels will be needed for the same patch. A new neighbor formulation enables the coexistence of neighbors on different levels: for all 56 possible patches in 3D we now have 3x56 possible neighbor relations. The 2D and 3D codes have been merged to avoid confusion of neighbor codes. The neighbor formulations are now constructed as follows (the defining patch code lives in neighborhood.f90 in the treelib library):
As one can see, sides have 4 possible codes for the 4 possible configurations of blocks on a finer level, edges have 2 and corners 1 possible configuration. 2D codes only have the X- and Y-sides with 2 configurations each and the X-Y edges with 1 configuration each.
With the change in code formulation, the code concerned with computing the patch boundaries has been revisited and greatly simplified. It should hopefully be clear enough now.
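As a consistency check on this counting (using the usual 6 faces, 12 edges and 8 corners of a 3D block): 6*4 + 12*2 + 8*1 = 24 + 24 + 8 = 56 patch codes in 3D, and 4*2 + 4*1 = 12 in 2D.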
Full wavelet transformation - adapt_tree_cvs
The full wavelet transformation on all levels has been implemented. It works by doing the following steps:
- The grid is first completed into a full tree down to JMin. New blocks are marked as empty with the refinement flag.
- Leaf blocks are decomposed first; mother blocks can only be decomposed once they have correctly synched blocks from all neighbors on the same level. This way we avoid the need for any coarse extension on blocks besides the leaf-layer blocks. It also effectively creates an iterative decomposition algorithm that starts on the leaf layer and for later steps mostly works level-wise down to JMin, due to restrictions of blocks waiting for their medium-level neighbors (a toy sketch of this ordering follows below). Luckily, as the grid information is already present, only synching block values and lgt-data is needed for each step.

This algorithm has the following benefits:
- no lgt_data sync is needed within the decomposition or reconstruction loops, which is very nice for parallelization
- coarse extension is only needed on the leaf layer
- the adaption loop gets by with only 3 grid updates
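A minimal toy sketch of the resulting processing order; it only prints the order of operations, and all names and loop bounds are illustrative rather than the actual WABBIT implementation:

program levelwise_order_sketch
    implicit none
    integer, parameter :: Jmin = 2, Jmax = 5
    integer :: level

    print *, 'create empty mother blocks down to level', Jmin
    print *, 'decompose all leaf blocks (the most expensive step)'
    ! mothers are then filled and decomposed level-wise, each level waiting for
    ! correctly synched same-level neighbors, so no lgt_data sync inside the loop
    do level = Jmax - 1, Jmin, -1
        print *, 'sync, update and decompose mother blocks on level', level
    end do
end program levelwise_order_sketch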
Image denoising
The iterative algorithm by Azzalini was implemented in order to compute image denoising. This was introduced as a new step after the full wavelet decomposition and before the grid adaption. Possibly, the CVS iterative algorithm will be introduced here as well, but it is not yet tested. A new post-processing tool (wabbit-post --denoise) applies the denoising algorithm to WABBIT files on a uniform grid, and a test was added to check the results. Tests show that the denoising algorithm works quite well for the bi-orthogonal wavelets.
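As a rough illustration of the iterative threshold estimation, a standalone sketch in the spirit of Azzalini-style denoising (iteratively re-estimating the noise variance from the coefficients below the current threshold); the toy data, convergence criterion and iteration cap are assumptions, not the WABBIT implementation:

program azzalini_threshold_sketch
    use iso_fortran_env, only: dp => real64
    implicit none
    integer, parameter :: N = 10000
    real(dp) :: coeffs(N), thresh, thresh_old, var_incoherent
    integer :: it, n_below

    ! toy "wavelet coefficients": pure noise, so nearly everything should end up below the threshold
    call random_number(coeffs)
    coeffs = coeffs - 0.5_dp

    ! initial guess: universal threshold with the variance of all coefficients
    var_incoherent = sum(coeffs**2) / real(N, dp)
    thresh = sqrt(2.0_dp * var_incoherent * log(real(N, dp)))

    do it = 1, 100
        thresh_old = thresh
        ! re-estimate the noise variance from the coefficients currently classified as incoherent
        n_below = count(abs(coeffs) < thresh)
        if (n_below == 0) exit
        var_incoherent = sum(coeffs**2, mask=(abs(coeffs) < thresh)) / real(n_below, dp)
        thresh = sqrt(2.0_dp * var_incoherent * log(real(N, dp)))
        if (abs(thresh - thresh_old) <= 1.0e-12_dp * thresh_old) exit
    end do

    print *, 'estimated denoising threshold:', thresh, ' after iterations:', it
end program azzalini_threshold_sketch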
Miscellaneous changes