Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations

Göddeke, Dominik; Strzodka, Robert; Turek, Stefan

doi:10.1080/17445760601122076

Göddeke, D., Strzodka, R., & Turek, S. (2007). Performance and accuracy of hardware-oriented native-, emulated- and mixed-precision solvers in FEM simulations. International Journal of Parallel, Emergent and Distributed Systems, 22(4), 221-256. doi:10.1080/17445760601122076.

Item status: Released

Basic information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-000F-204C-6
Version permalink: https://hdl.handle.net/11858/00-001M-0000-000F-204D-4

Genre: Journal article

Creators

Creators:
Göddeke, Dominik, Author
Strzodka, Robert^{1, 2}, Author
Turek, Stefan, Author

Affiliations:
1 Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047
2 Graphics - Optics - Vision, MPI for Informatics, Max Planck Society, ou_1116549

Content

Keywords: -

Abstract: In this survey paper, we compare native double precision solvers with emulated- and mixed-precision solvers of linear systems of equations as they typically arise in finite element discretisations. The emulation utilises two single float numbers to achieve higher precision, while the mixed precision iterative refinement computes residuals and updates the solution vector in double precision but solves the residual systems in single precision. Both techniques have been known since the 1960s, but little attention has been devoted to their performance aspects. Motivated by changing paradigms in processor technology and the emergence of highly parallel devices with outstanding single float performance, we adapt the emulation and mixed precision techniques to coupled hardware configurations, where the parallel devices serve as scientific co-processors. The performance advantages are examined with respect to speedups over a native double precision implementation (time aspect) and reduced area requirements for a chip (space aspect). The paper begins with an overview of the theoretical background, algorithmic approaches and suitable hardware architectures. We then employ several conjugate gradient and multigrid solvers and study their behaviour for different parameter settings of the iterative refinement technique. Concrete speedup factors are evaluated on the coupled hardware configuration of a general-purpose CPU and a graphics processor. The dual performance aspect of potential area savings is assessed on a field programmable gate array. In the last part, we test the applicability of the proposed mixed precision schemes with ill-conditioned matrices. We conclude that the mixed precision approach works very well with the parallel co-processors, gaining speedup factors of four to five and area savings of three to four, while maintaining the same accuracy as a reference solver executing everything in double precision.
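The mixed precision iterative refinement described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the test matrix (the 1D Poisson stencil tridiag(-1, 2, -1)), the tridiagonal direct solver used in place of the paper's conjugate gradient and multigrid solvers, the helper `f32` that emulates single precision by rounding after every operation, and all function names and tolerances are assumptions chosen only to keep the example self-contained.

```python
import math
import struct


def f32(x):
    """Round a Python float (IEEE double) to the nearest IEEE single value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def norm2(v):
    """Euclidean norm, evaluated in double precision."""
    return math.sqrt(sum(t * t for t in v))


def residual(b, x):
    """r = b - A x in double precision, with A = tridiag(-1, 2, -1)."""
    n = len(x)
    r = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r.append(b[i] - (2.0 * x[i] - left - right))
    return r


def solve_single(r):
    """Solve A c = r with the Thomas algorithm in emulated single precision.

    Stand-in for the inner solver (assumption): the input and every
    arithmetic result are rounded to single precision via f32().
    """
    n = len(r)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side
    cp[0] = f32(-1.0 / 2.0)
    dp[0] = f32(f32(r[0]) / 2.0)
    for i in range(1, n):
        denom = f32(2.0 + cp[i - 1])
        cp[i] = f32(-1.0 / denom)
        dp[i] = f32(f32(f32(r[i]) + dp[i - 1]) / denom)
    c = [0.0] * n
    c[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        c[i] = f32(dp[i] - f32(cp[i] * c[i + 1]))
    return c


def mixed_precision_refinement(b, tol=1e-10, max_iter=20):
    """Iterative refinement: residual and update in double, solve in single.

    Returns (x, iterations used, final relative residual norm).
    """
    x = [0.0] * len(b)
    bnorm = norm2(b)
    rel = 1.0
    for it in range(1, max_iter + 1):
        r = residual(b, x)                       # double precision residual
        c = solve_single(r)                      # single precision correction
        x = [xi + ci for xi, ci in zip(x, c)]    # double precision update
        rel = norm2(residual(b, x)) / bnorm
        if rel < tol:
            return x, it, rel
    return x, max_iter, rel


if __name__ == "__main__":
    # Hypothetical right-hand side, chosen only for illustration.
    b = [math.sin(0.3 * (i + 1)) for i in range(32)]
    x, iters, rel = mixed_precision_refinement(b)
    print("outer iterations:", iters, "relative residual:", rel)
```

Calling `solve_single(b)` once and comparing its residual with the result of `mixed_precision_refinement(b)` shows the point of the scheme on this small example: the single precision solve alone stalls near single precision accuracy, while a few outer iterations with double precision residuals recover a much smaller residual.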

Details

Language: eng - English

Dates: Modified: 2008-03-14; Published: 2007

Publication status: Published

Pages: -

Publishing info: -

Table of contents: -

Review: Peer-reviewed

Identifiers (DOI, ISBN, etc.): eDoc: 356526
DOI: 10.1080/17445760601122076
Other: Local-ID: C12573CC004A8E26-9C4C4C20B79BB54EC12573B4005E74F3-GoStTu07mixedPrec

Degree: -

Source 1

Title: International Journal of Parallel, Emergent and Distributed Systems

Source genre: Journal

Creators/editors:

Affiliations:

Publisher, place: -

Pages: -
Volume / issue: 22 (4)
Sequence number: -
Start / end page: 221 - 256
Identifiers (ISBN, ISSN, DOI, etc.): -
