Dear Danila,
Could you please send us the `stdout` and `stderr` output and the files from the `tbg` folder?
Best,
René
On 02/13/2017 03:11 PM, Khikhlukha Danila wrote:
Dear all,
I have been trying to set up PoG on the Jureca machine. It all worked
fine for the LWFA example; however, when I tried to restart the
simulation, I received a segfault almost immediately.
My toolchain is as follows:
GCC/5.4.0
CUDA/8.0.44
MVAPICH2/2.2-GDR
HDF5/1.8.17
Boost/1.61.0
So the first run didn't have any problems: pictures, save points and
data dumps were created. When I try to launch the restart, it crashes,
although I explicitly specify the savepoint directory.
test$ diff -r 0002/submit/ 0002_restart/submit/
diff -r 0002/submit/0008gpus.cfg 0002_restart/submit/0008gpus.cfg
39c39
< TBG_steps="-s 1024"
---
> TBG_steps="-s 2048"
41a42
> TBG_restart="--restart --restart-directory /work/hhh20/hhh20z/run_0002/simOutput/checkpoints"
67a69
> !TBG_restart \
I also checked that the directory exists and is accessible. I tried to
switch on some debug output with the following command:
$PICSRC/configure -c"-DCMAKE_VERBOSE_MAKEFILE=ON -DPIC_VERBOSE_LVL=29 -DPMACC_VERBOSE_LVL=7"
However, I didn't find any information apart from the standard message:
[jrc0007:mpi_rank_4][error_sighandler] Caught error: Segmentation fault (signal 11)
Could you please advise me whether there is another way to diagnose the
problem (other than launching gdb)? Maybe I'm doing something wrong?
However, restart used to work on other machines.
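For reference, the kind of accessibility check mentioned above can be sketched as a small shell function. This is only an illustration: `check_checkpoint_dir` is a hypothetical helper, not part of PoG or `tbg`, and the "not empty" test is an assumption about what a usable checkpoint directory should look like.

```shell
#!/bin/sh
# Hypothetical helper (not part of PoG or tbg): check that a checkpoint
# directory exists, can be listed and entered, and is not empty.
check_checkpoint_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Both read (list) and execute (enter) permission are needed.
    if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
        echo "not accessible: $dir"
        return 2
    fi
    # Assumption: an empty directory holds nothing to restart from.
    if [ -z "$(ls -A "$dir")" ]; then
        echo "empty: $dir"
        return 3
    fi
    echo "ok: $dir"
    return 0
}
```

It would be called as `check_checkpoint_dir /work/hhh20/hhh20z/run_0002/simOutput/checkpoints`. Running it from inside the batch job rather than on the login node would check the path as the compute nodes see it.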
Thank you in advance,
Danila.
--
René Widera
Abteilung Laser-Teilchenbeschleunigung (FWKT)
Helmholtz-Zentrum Dresden-Rossendorf
Tel: +49 (0351) 260 3543
r.widera@hzdr.de
http://www.hzdr.de
Vorstand: Prof. Dr. Dr. h. c. Roland Sauerbrey,
Prof. Dr. Dr. h. c. Peter Joehnk
Vereinsregister: VR 1693 beim Amtsgericht Dresden