Mailing List picongpu-users@hzdr.de Message #211
From: Khikhlukha Danila <Danila.Khikhlukha@eli-beams.eu>
Subject: RE: [PIConGPU-Users] [PIConGPU-Users] Restart failure
Date: Mon, 13 Feb 2017 14:47:08 +0000
To: picongpu-users@hzdr.de <picongpu-users@hzdr.de>
Hi René,
Sure, please see the attachment. Please let me know if more information is needed.

D.
________________________________________
From: picongpu-users@hzdr.de [picongpu-users@hzdr.de] on behalf of René Widera [r.widera@hzdr.de]
Sent: Monday, February 13, 2017 3:39 PM
To: picongpu-users@hzdr.de
Subject: Re:  [PIConGPU-Users] [PIConGPU-Users] Restart failure

Dear Danila,

could you please send us the `stdout`, `stderr` and the files from the
`tbg` folder?

best,

René

On 02/13/2017 03:11 PM, Khikhlukha Danila wrote:
> Dear all,
> I am currently trying to set up PoG on the JURECA machine. Everything worked
> fine for the LWFA example, but when I tried to restart the
> simulation I received a segfault almost immediately.
> My toolchain is as follows:
>
> GCC/5.4.0
> CUDA/8.0.44
> MVAPICH2/2.2-GDR
> HDF5/1.8.17
> Boost/1.61.0
>
> The first run didn't have any problems -- pictures, save points and
> data dumps were created. When I try to launch the restart, it crashes
> even though I explicitly specify the save point directory.
>
> test$ diff -r 0002/submit/ 0002_restart/submit/
> diff -r 0002/submit/0008gpus.cfg 0002_restart/submit/0008gpus.cfg
> 39c39
> < TBG_steps="-s 1024"
> ---
>> TBG_steps="-s 2048"
> 41a42
>> TBG_restart="--restart --restart-directory /work/hhh20/hhh20z/run_0002/simOutput/checkpoints"
> 67a69
>>                    !TBG_restart      \
>
> I also checked that it exists and is accessible. I tried to switch on some
> debug information with the following command:
>
> $PICSRC/configure -c"-DCMAKE_VERBOSE_MAKEFILE=ON -DPIC_VERBOSE_LVL=29 -DPMACC_VERBOSE_LVL=7"
>
> However, I didn't find any information apart from the standard message:
> [jrc0007:mpi_rank_4][error_sighandler] Caught error: Segmentation fault (signal 11)
>
> Could you please advise me whether there is another way to diagnose the
> problem (other than launching gdb)? Maybe I'm doing something wrong?
> However, restart used to work on other machines...
>
>
> Thank you in advance,
> Danila.
>

--
René Widera
Abteilung Laser-Teilchenbeschleunigung (FWKT)
Helmholtz-Zentrum Dresden-Rossendorf
Tel: +49 (0351) 260 3543
r.widera@hzdr.de
http://www.hzdr.de

Vorstand: Prof. Dr. Dr. h. c. Roland Sauerbrey,
           Prof. Dr. Dr. h. c. Peter Joehnk
Vereinsregister: VR 1693 beim Amtsgericht Dresden
