 The only dependency for the shapes & stencils is the *edge* size of the supercell: we use up to one supercell for caching & communication (8x8x4).
 One can build a supercell with a non-integer number of warps, but you lose performance. On NVIDIA GPUs, just keep our default.
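
 As a rough illustration of the warp argument, here is a minimal Python sketch. It is not PIConGPU code: it assumes one worker thread per cell of the supercell and a warp size of 32, and the helper names `supercell_threads` and `warps_per_supercell` are made up for this example.

```python
WARP_SIZE = 32  # CUDA warp size on NVIDIA GPUs


def supercell_threads(supercell):
    """Cells in one supercell; assumed here to equal the worker thread count."""
    x, y, z = supercell
    return x * y * z


def warps_per_supercell(supercell):
    """Warps needed for one supercell; a non-integer value means a partly filled warp."""
    return supercell_threads(supercell) / WARP_SIZE


if __name__ == "__main__":
    # (8, 8, 4) is the default; (2, 8, 2) is the layout from the original question;
    # (4, 4, 3) is a made-up shape whose volume is not a multiple of 32.
    for shape in [(8, 8, 4), (2, 8, 2), (4, 4, 3)]:
        threads = supercell_threads(shape)
        print(shape, threads, warps_per_supercell(shape), threads % WARP_SIZE == 0)
```

 With these assumptions the default 8x8x4 supercell fills exactly 8 warps, while a shape such as 4x4x3 would need 1.5 warps and leave half of the second warp idle.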
 
 Axel
 
 On October 19, 2016 6:17:59 PM CEST, Axel Huebl <a.huebl@hzdr.de> wrote:
 >Hi Danila,
 >
 >32 is the CUDA "warp" size, a group of threads performing a SIMD
 >operation on a GPU:
 >  https://docs.nvidia.com/cuda/cuda-c-programming-guide/
 >
 >Axel
 >
 >On October 19, 2016 6:00:11 PM CEST, Khikhlukha Danila
 ><Danila.Khikhlukha@eli-beams.eu> wrote:
 >>Dear René,
 >>thanks a lot, it is much clearer now.
 >>However, could you recommend some further reading to develop more
 >>intuition for this dependency on the stencil and macro-particle
 >>shape?
 >>Unfortunately, at the moment I can't see where 32 comes from, for
 >>instance for the classical Yee scheme + CIC particles.
 >>
 >>Thank you in advance,
 >>Danila
 >>________________________________
 >>From: picongpu-users@hzdr.de [picongpu-users@hzdr.de] on behalf of
 >>René Widera [r.widera@hzdr.de]
 >>Sent: Wednesday, October 19, 2016 5:25 PM
 >>To: picongpu-users@hzdr.de
 >>Subject: Re: [PIConGPU-Users] [PIConGPU-Users] [PIConGPU-Users]
 >>[PIConGPU-Users] Supercell concept
 >>
 >>Dear Danila,
 >>
 >>the supercell size defines the number of worker threads and the size
 >>of the shared memory cache.
 >>256 (the volume of the default 8x8x4 supercell) is a good value for
 >>utilizing most GPUs.
 >>The supercell size in each direction needs to be greater than or
 >>equal to the number of neighbors required by the stencils of the
 >>algorithms. This condition is checked at compile time and depends on
 >>the selected solvers and the species shape.
 >>The supercell sizes in the individual directions are independent of
 >>each other. The volume x×y×z of the supercell should be a multiple
 >>of 32.
 >>
 >>best,
 >>René
 >>
 >>On October 19, 2016 5:14:28 PM CEST, Khikhlukha Danila
 >><Danila.Khikhlukha@eli-beams.eu> wrote:
 >>Dear René,
 >>thank you for the prompt reply. Indeed, the problem is quite simple.
 >>So, to wrap it up, the number of cells N in each direction should
 >>satisfy N % (N_gpu * N_supercells) == 0.
 >>
 >>Just for educational purposes: what is so special about the number
 >>128 (I guess it is the size of a cache)? Could it be, for instance,
 >>256? Would it be possible to specify the same number of supercells in
 >>the X and Z directions?
 >>
 >>Thanks a lot,
 >>Danila.
 >>________________________________
 >>From: picongpu-users@hzdr.de [picongpu-users@hzdr.de] on behalf of
 >>René Widera [r.widera@hzdr.de]
 >>Sent: Wednesday, October 19, 2016 4:59 PM
 >>To: picongpu-users@hzdr.de
 >>Subject: Re: [PIConGPU-Users] [PIConGPU-Users] Supercell concept
 >>
 >>Dear Danila,
 >>
 >>the volume per GPU needs to be a multiple of the superCell size in
 >>each direction.
 >>
 >>In your case 4176 / 8 GPUs = 522, and 522 is not divisible by 8.
 >>4160 cells in the y direction should solve your problem.
 >>
 >>Please keep in mind that if you change the superCell size to a value
 >>smaller than 128 cells, most simulations run slower.
 >>The default size of 8x8x4 shows the best results in most cases.
 >>
 >>best,
 >>
 >>René
 >>
 >>On October 19, 2016 4:46:28 PM CEST, Khikhlukha Danila
 >><Danila.Khikhlukha@eli-beams.eu> wrote:
 >>Dear all,
 >>I am having trouble specifying a computational grid with a moving
 >>window using PIConGPU v0.2.0. We discussed this topic previously,
 >>but I have the same problem again, so I likely misunderstood
 >>something last time.
 >>
 >>So I'm trying to launch the simulation using 4 K80 cards: 8 GPU
 >>devices overall. In the memory.param file I have specified the
 >>SuperCell layout as (2,8,2). I want to have one GPU in the
 >>transversal direction and 8 in the longitudinal direction. So in the
 >>cfg file I specified:
 >>TBG_gpu_x=1
 >>TBG_gpu_y=8
 >>TBG_gpu_z=1
 >>
 >>Then I would like my real computational domain to have 256 x 3712 x
 >>256 cells. Since the moving window reduces the real domain by 1 GPU
 >>in the y direction, I specified my grid as 256 x 4176 x 256
 >>(4176 = 9/8 * 3712):
 >>TBG_gridSize="-g 256 4176 256"
 >>
 >>However, when I try to submit such a cfg file, I get an assertion
 >>failure:
 >>
 >>void
 >>picongpu::MySimulation::checkGridConfiguration(PMacc::DataSpace<DIM>,PMacc::GridLayout<DIM>)
 >>[with unsigned int DIM = 3u]: Assertion`gridSizeLocal[i] %
 >>MappingDesc::SuperCellSize::toRT()[i] == 0' failed.
 >>
 >>However 4176 % 8 == 0 and 256 % 2 == 0.
 >>
 >>Could you please guide me on how to solve this issue? It looks like
 >>I misunderstand the concept of the SuperCell.
 >>
 >>Thank you in advance,
 >>Danila.
 >>
 >>
 >>
 >>--
 >>This message was sent from my Android mobile phone with K-9 Mail.
 >
 >
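
 The divisibility condition from the quoted thread can be checked with a few lines of Python before submitting a job. This is a minimal sketch, not PIConGPU code: the function name `check_grid` is hypothetical, and it assumes that the local grid on each GPU is the global grid divided evenly by the number of GPUs along that axis, mirroring the assertion `gridSizeLocal[i] % SuperCellSize[i] == 0` quoted above.

```python
def check_grid(grid, gpus, supercell):
    """Check each axis: local cells per GPU must be a multiple of the supercell edge.

    Returns a list of (local_cells, ok) tuples, one per axis. local_cells is
    None when the global grid cannot be split evenly across the GPUs.
    """
    results = []
    for n_cells, n_gpus, edge in zip(grid, gpus, supercell):
        if n_cells % n_gpus != 0:
            results.append((None, False))
            continue
        local = n_cells // n_gpus
        results.append((local, local % edge == 0))
    return results


if __name__ == "__main__":
    gpus = (1, 8, 1)       # TBG_gpu_x, TBG_gpu_y, TBG_gpu_z from the question
    supercell = (2, 8, 2)  # SuperCell layout from the question

    # Original grid: 4176 / 8 = 522 cells per GPU in y, and 522 % 8 == 2.
    print(check_grid((256, 4176, 256), gpus, supercell))
    # Suggested grid: 4160 / 8 = 520 cells per GPU in y, and 520 % 8 == 0.
    print(check_grid((256, 4160, 256), gpus, supercell))
```

 The check makes the point of the thread explicit: 4176 % 8 == 0 holds for the global grid, but the condition applies to the 522 cells each GPU receives.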
 
 