Return-Path: <a.huebl@hzdr.de>
Received: from [178.24.5.94] (account huebl@hzdr.de HELO [192.168.178.21])
  by hzdr.de (CommuniGate Pro SMTP 6.1.12) with ESMTPSA id 14797277;
  Wed, 19 Oct 2016 18:18:02 +0200
In-Reply-To: <list-14797312@cg1.fz-rossendorf.de>
References: <list-14797312@cg1.fz-rossendorf.de>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Subject: Re: [PIConGPU-Users] [PIConGPU-Users] [PIConGPU-Users] [PIConGPU-Users] [PIConGPU-Users] Supercell concept
From: Axel Huebl <a.huebl@hzdr.de>
Date: Wed, 19 Oct 2016 16:17:59 +0000
To: picongpu-users@hzdr.de,Khikhlukha Danila <Danila.Khikhlukha@eli-beams.eu>
Message-ID: <B69B872F-3C2C-4C2D-9114-FD2FEAFF8139@hzdr.de>

Hi Danila,

32 is the CUDA "warp" size, a group of threads performing a SIMD operation on a GPU:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/

Axel

On October 19, 2016 6:00:11 PM CEST, Khikhlukha Danila <Danila.Khikhlukha@eli-beams.eu> wrote:
>Dear René,
>thanks a lot, it is much clearer now.
>However, I would like to ask whether you could recommend some further
>reading to develop more intuition regarding this dependency on the
>stencil and macro-particle shape.
>Unfortunately, at the moment I can't understand where 32 comes from,
>for instance for the classical Yee scheme + CIC particles.
>
>Thank you in advance,
>Danila
>________________________________
>From: picongpu-users@hzdr.de [picongpu-users@hzdr.de] on behalf of René
>Widera [r.widera@hzdr.de]
>Sent: Wednesday, October 19, 2016 5:25 PM
>To: picongpu-users@hzdr.de
>Subject: Re: [PIConGPU-Users] [PIConGPU-Users] [PIConGPU-Users]
>[PIConGPU-Users] Supercell concept
>
>Dear Danila,
>
>the supercell size defines the number of worker threads and the size of
>the shared memory cache.
>256 is a good value to utilize most GPUs.
>The supercell size in each direction needs to be greater than or equal
>to the number of neighbors required by the stencil of the algorithms.
>This condition is checked at compile time and depends on the selected
>solvers and the species shape.
>The supercell size can be chosen independently per direction. The
>volume x×y×z of the supercell should be a multiple of 32.
>
>best,
>René
>
>On 19 October 2016 17:14:28 CEST, Khikhlukha Danila
><Danila.Khikhlukha@eli-beams.eu> wrote:
>Dear René,
>thank you for the prompt reply. Indeed, the problem is quite simple. So,
>to wrap it up, the number of cells in each direction should give
>N % (N_gpu * N_supercells) == 0.
>
>Just for educational purposes: what is so special about the number 128
>(I guess it is the size of a cache)? Could it be, for instance, 256?
>Would it be possible to specify the same number of supercells in the X
>and Z directions?
>
>Thanks a lot,
>Danila.
>________________________________
>From: picongpu-users@hzdr.de [picongpu-users@hzdr.de] on behalf of René
>Widera [r.widera@hzdr.de]
>Sent: Wednesday, October 19, 2016 4:59 PM
>To: picongpu-users@hzdr.de
>Subject: Re: [PIConGPU-Users] [PIConGPU-Users] Supercell concept
>
>Dear Danila,
>
>the grid size per GPU needs to be a multiple of the superCell size in
>each direction.
>
>In your case 4176 / 8 GPUs = 522, and 522 is not divisible by 8.
>4160 cells in y direction should solve your problem.
>
>Please keep in mind that if you change the superCell size to a value
>smaller than 128 cells, most simulations will run slower.
>The default size of 8x8x4 shows the best results in most cases.
>
>best,
>
>René
>
>On 19 October 2016 16:46:28 CEST, Khikhlukha Danila
><Danila.Khikhlukha@eli-beams.eu> wrote:
>Dear all,
>I am having some trouble specifying a computational grid with a moving
>window using picongpu v0.2.0. We discussed this topic previously;
>however, I have the same problem again, so I have likely misunderstood
>something from last time.
>
>I'm trying to launch the simulation using 4 K80 cards: 8 GPU devices
>overall. In the memory.param file I have specified the SuperCell layout
>as (2,8,2). I want to have one GPU in the transversal directions and 8
>in the longitudinal direction. So in the cfg file I specified:
>TBG_gpu_x=1
>TBG_gpu_y=8
>TBG_gpu_z=1
>
>Then I would like my real computational domain to have 256 x 3712 x 256
>cells. Since the moving window reduces the real domain by 1 GPU in y
>direction, I specified my grid as 256 x 4176 x 256 (4176 = 9/8*3712):
>TBG_gridSize="-g 256 4176 256"
>
>However, when I try to submit such a cfg file, I get an assertion
>failure:
>
>void
>picongpu::MySimulation::checkGridConfiguration(PMacc::DataSpace<DIM>,PMacc::GridLayout<DIM>)
>[with unsigned int DIM = 3u]: Assertion `gridSizeLocal[i] %
>MappingDesc::SuperCellSize::toRT()[i] == 0' failed.
>
>However, 4176 % 8 == 0 and 256 % 2 == 0.
>
>Could you please guide me on how to solve this issue? It looks like I
>misunderstand the concept of the SuperCell.
>
>Thank you in advance,
>Danila.
>
>
>
>--
>This message was sent from my Android mobile phone with K-9 Mail.
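
The arithmetic discussed in this thread can be reproduced outside of PIConGPU with a small standalone script. The following is only a sketch: `check_grid` and its parameters are hypothetical names chosen for illustration and are not part of PIConGPU or TBG. It assumes the global grid is split evenly over the GPUs per direction, then applies the two conditions named above: the local (per-GPU) grid size must be a multiple of the supercell size in each direction, and the supercell volume should be a multiple of the warp size 32.

```python
def check_grid(grid, gpus, supercell, warp=32):
    """Return a list of problems found; an empty list means all checks pass.

    grid, gpus and supercell are (x, y, z) tuples. This helper is purely
    illustrative and does not call into PIConGPU.
    """
    problems = []
    for axis, n, g, s in zip("xyz", grid, gpus, supercell):
        # Assumption: cells are distributed evenly over the GPUs per axis.
        if n % g != 0:
            problems.append(f"{axis}: {n} cells do not split evenly over {g} GPUs")
            continue
        local = n // g  # corresponds to gridSizeLocal[i] in the assertion
        if local % s != 0:
            problems.append(
                f"{axis}: {local} cells per GPU is not a multiple of "
                f"the supercell size {s}"
            )
    # Recommendation from the thread: supercell volume is a multiple of 32.
    volume = supercell[0] * supercell[1] * supercell[2]
    if volume % warp != 0:
        problems.append(f"supercell volume {volume} is not a multiple of {warp}")
    return problems


if __name__ == "__main__":
    # Layout from the original question: fails in y because 4176 / 8 = 522.
    print(check_grid((256, 4176, 256), (1, 8, 1), (2, 8, 2)))
    # Suggested fix: 4160 / 8 = 520, which is a multiple of 8.
    print(check_grid((256, 4160, 256), (1, 8, 1), (2, 8, 2)))
```

For the configuration in the original question, the script reports a single problem in the y direction; with 4160 cells in y it reports none.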