stencil blocking may have foobarred performance... #261

Open · jeffhammond opened this issue Aug 16, 2017 · 8 comments

@jeffhammond (Member)

Need to investigate, but the recent commits have shown a massive regression in some cases.

jeffhammond added the C label Aug 16, 2017
jeffhammond self-assigned this Aug 16, 2017
@rfvander (Contributor)

That's strange. I used to have tiling for all three stencil implementations, in the good old days of SERIAL, MPI, and OpenMP. But I hardly ever saw a benefit, and it complicated the code, so I eliminated it for all but the serial implementation. In principle it should allow better reuse, but it takes a LARGE grid to see that happen. If performance drops precipitously because of it, there's a pathology (a bug).

@jeffhammond (Member, Author)

This was because omitting the blocking argument meant that measurements used star 2 instead of star 4, but we still have to deal with the fact that huge tile sizes led to inadequate parallelism. We should branch on (grid_size/tile_size)^2 < num_threads and not bother tiling in that case.
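A minimal sketch of that guard, assuming a PRK-style OpenMP driver (the names `n`, `tile_size`, and `use_tiling` are illustrative, not the actual PRK variables):

```c
/* Sketch only, not the actual PRK code: skip tiling whenever it
 * would leave threads without work. */
#include <omp.h>

static int use_tiling(int n, int tile_size)
{
    if (tile_size <= 1 || tile_size >= n) return 0;  /* tiling disabled or degenerate */
    long tiles_per_dim = n / tile_size;
    long ntiles = tiles_per_dim * tiles_per_dim;
    /* Fewer tiles than threads means some threads sit idle,
     * so fall back to the untiled sweep in that regime. */
    return ntiles >= omp_get_max_threads();
}
```

The untiled path keeps full parallelism on small grids, while large grids still get the cache reuse that tiling is meant to provide.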

@rfvander (Contributor)

Yes, we saw the same with transpose, as you may recall. But I wouldn't do anything automatic. Users should always be allowed to shoot themselves in the foot.

@rfvander (Contributor)

But maybe we can warn them of the bullet holes.

@rfvander (Contributor)

I meant to ask you if you ever get requests for box-shaped stencils (instead of star stencils). For the AMR code I effectively had to support that in MPI (too complicated to explain why, and not worth it), and it was actually very easy. I'd like to add that to our MPI variants.
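For reference, the difference between the two shapes comes down to which entries of the stencil weight table are populated. A sketch (RADIUS and the weight layout are assumptions for illustration, not PRK's actual initialization code):

```c
/* Illustrative only: populate a (2R+1)x(2R+1) weight table for a
 * star stencil (axes only) versus a box stencil (full square). */
#define RADIUS 2

static double W[2*RADIUS+1][2*RADIUS+1];

static void init_star(void)
{
    for (int i = 1; i <= RADIUS; i++) {
        double w = 1.0 / (2.0 * i * RADIUS);
        W[RADIUS][RADIUS+i] =  w;   /* east  */
        W[RADIUS][RADIUS-i] = -w;   /* west  */
        W[RADIUS+i][RADIUS] =  w;   /* north */
        W[RADIUS-i][RADIUS] = -w;   /* south */
    }
}

static void init_box(void)
{
    double w = 1.0 / ((2*RADIUS+1) * (2*RADIUS+1) - 1);
    for (int j = -RADIUS; j <= RADIUS; j++)
        for (int i = -RADIUS; i <= RADIUS; i++)
            if (i != 0 || j != 0)
                W[RADIUS+j][RADIUS+i] = w;  /* every neighbor in the square */
}
```

The sweep loop itself is unchanged; only the set of nonzero weights changes, though for MPI the halo exchange must also cover the corner neighbors.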

@jeffhammond (Member, Author) commented Aug 16, 2017 via email

@jeffhammond (Member, Author) commented Aug 16, 2017 via email

@jeffhammond (Member, Author) commented Aug 16, 2017 via email
