From: "Andrei Berceanu berceanu@runbox.com"
To: "picongpu-users"
CC: "picongpu-users"
Subject: Re: [PIConGPU-Users] DGX-1
Date: Tue, 15 May 2018 17:44:04 +0200 (CEST)

That is great, thank you guys for the detailed replies!

On Mon, 14 May 2018 14:04:37 +0200, "Axel Huebl a.huebl@hzdr.de" wrote:
> Just to add to the arguments,
>
> the additional (non-tensor core) Flop/s in V100 vs. the 2 year-old P100
> justify the additional costs, that's why V100 are ok. Also, the V100
> 32GB are to my knowledge available for the same price as the "first"
> V100 (16 GB).
>
> So ideally, use the 32GB variants for larger problem sizes!
>
> Regarding NVLink & NVSwitch in newer stations: we can enable RDMA over
> those via GPUDirect in PIConGPU, although it's not yet mainline. In
> most settings outside of heavy strong-scaling though, we are hiding
> latency well enough so that a smaller BW and longer latency won't slow
> down your simulation. (Read: the intra- and interconnect is not too
> important for PIConGPU since we assume the worst.)
>
> Anyway, if you plan things like in-node global FFTs, e.g. as an in situ
> plugin to get the envelope of a laser via a Hankel transform, NVLink and
> NVSwitch will pay off.
>
>
> Cheers,
> Axel
>
> On 5/14/18 1:18 PM, René Widera r.widera@hzdr.de wrote:
> > Dear Andrei,
> >
> >> My question is, would PIConGPU run on the DGX-1 and can it make use of
> > NVLink [2] v2.0?
> >
> > Currently we are not using NVLink. But it is planned to add support for
> > MPI GPUDirect, which should then use NVLink.
> >
> >> Also, I'm guessing it can't use the tensor cores in the V100 version
> > of DGX-1?
> >
> > Currently we are not using tensor cores. It is not fully clear if tensor
> > cores will give an advantage. One drawback of the tensor cores is that
> > they use fp16. In PIConGPU we use at least fp32, unless you activate
> > fp64 support.
> >
> > Nevertheless, I think a DGX-1 with V100 is the right system.
> >
> > René (psychocoderHPC)
> >
> > On 05/14/2018 12:36 PM, Andrei Berceanu berceanu@runbox.com wrote:
> >> Hi,
> >>
> >> First of all, let me provide some context: we are considering
> >> purchasing a DGX-1 system [1] from Nvidia for PIConGPU and are trying
> >> to decide between the P100 and V100 versions.
> >>
> >> My question is, would PIConGPU run on the DGX-1 and can it make use of
> >> NVLink [2] v2.0?
> >>
> >> Also, I'm guessing it can't use the tensor cores in the V100 version
> >> of DGX-1?
> >>
> >> Regards,
> >> Andrei
> >>
> >> [1] https://en.wikipedia.org/wiki/Nvidia_DGX-1
> >> [2] https://en.wikipedia.org/wiki/NVLink
> >> #############################################################
> >> This message is sent to you because you are subscribed to
> >>   the mailing list .
> >> To unsubscribe, E-mail to:
> >> To switch to the DIGEST mode, E-mail to
> >> To switch to the INDEX mode, E-mail to
> >> Send administrative queries to
> >>
> >
>
> --
>
> Axel Huebl
> Phone: +49 351 260 3582
> Institute of Radiation Physics
> http://www.hzdr.de/crp
> Helmholtz-Zentrum Dresden - Rossendorf (HZDR)
> Bautzner Landstr. 400 | 01328 Dresden | Germany
> Board of Directors:
> Prof. Dr. Dr. h. c. Roland Sauerbrey, Dr. Ulrich Breuer
> Company Registration Number VR 1693, Amtsgericht Dresden
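
As a side note on René's remark about fp16 in the quoted reply: below is a minimal sketch (not PIConGPU code) of why half precision can be a problem when many small updates are accumulated onto a larger value, as happens when a particle position is advanced step by step. It uses only Python's standard `struct` module to round values to IEEE 754 half (`'e'`) and single (`'f'`) precision. The helper names `round_to` and `accumulate` and the numbers (start 1000.0, step 0.1, 100 steps) are illustrative assumptions, not anything taken from PIConGPU.

```python
import struct


def round_to(fmt, x):
    """Round a Python float to the nearest value representable in `fmt`.

    fmt is a struct format character: 'e' = IEEE 754 half precision (fp16),
    'f' = IEEE 754 single precision (fp32).
    """
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def accumulate(fmt, start, step, n):
    """Add `step` to `start` n times, rounding to `fmt` after every addition.

    This mimics storing a running value (hypothetically, a particle
    position) in a reduced-precision type between updates.
    """
    value = round_to(fmt, start)
    for _ in range(n):
        value = round_to(fmt, value + step)
    return value


if __name__ == "__main__":
    # Near 1000 the spacing between neighbouring fp16 values is 0.5,
    # so a step of 0.1 is rounded away on every single addition.
    print("fp16:", accumulate("e", 1000.0, 0.1, 100))
    # fp32 has a spacing of about 6e-5 near 1000 and resolves the step.
    print("fp32:", accumulate("f", 1000.0, 0.1, 100))
```

In this toy setup the fp16 value never moves away from 1000.0, while the fp32 value ends up close to 1010. That is only an illustration of the general precision concern; whether tensor cores could still be useful for specific kernels is, as stated in the quoted reply, not fully clear.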