[. References and . Vecil, , 2014.

J. Cáceres, C. Sampedro, A. Godoy, and F. Gámiz, A parallel deterministic solver for the Schrödinger-Poisson-Boltzmann system in ultrashort DG-MOSFETs: Comparison with Monte-Carlo, Computers and Mathematics with Applications, vol.67, issue.9, pp.1703-1721, 2014.

N. Ben-abdallah, M. J. Cáceres, J. A. Carrillo, and F. Vecil, A deterministic solver for a hybrid quantum-classical transport model in nanoMOSFETs, Journal of Computational Physics, vol.228, issue.17, pp.6553-6571, 2009.

J. M. Mantas and M. J. Cáceres, Efficient deterministic parallel simulation of 2D semiconductor devices based on WENO-Boltzmann schemes, Computer Methods in Applied Mechanics and Engineering, vol.198, issue.5-8, pp.693-704, 2009.

J. A. Carrillo, I. M. Gamba, A. Majorana, and C. Shu, A WENO-solver for the transients of Boltzmann?Poisson system for semiconductor devices: performance and comparisons with Monte Carlo methods, Journal of Computational Physics, vol.184, issue.2, pp.498-525, 2003.

J. A. Carrillo, I. M. Gamba, A. Majorana, and C. Shu, A WENO-solver for the transients of Boltzmann?Poisson system for semiconductor devices: performance and comparisons with Monte Carlo methods, Journal of Computational Physics, vol.184, issue.2, pp.498-525, 2003.

A. Suzuki, T. Kamioka, Y. Kamakura, and T. Watanabe, Particle-based Semiconductor Device Simulation Accelerated by GPU computing, Journal of Advanced Simulation in Science and Engineering, vol.2, issue.1, pp.211-224, 2015.

Y. Y. Li and S. Yu, A parallel adaptive finite volume method for nanoscale double-gate MOSFETs simulation, Journal of Computational and Applied Mathematics, vol.175, issue.1, pp.87-99, 2005.

Y. Y. Li, T. M. Chao, and S. M. Sze, A Novel Parallel Approach for Quantum Effect Simulation in Semiconductor Devices, International Journal of Modelling and Simulation, vol.23, issue.2, pp.94-102, 2003.

G. G. Kumar, M. Singh, A. Bulusu, and G. Trivedi, A Framework to Simulate Semiconductor Devices Using Parallel Computer Architecture, Journal of Physics: Conference Series, vol.759, p.012098, 2016.

A. , 2. Contents of LAPACK, LAPACK Users' Guide, pp.9-53, 1999.

V. D. Camiola, G. Mascali, and V. Romano, Simulation of a double-gate MOSFET by a non-parabolic energy-transport subband model for semiconductors based on the maximum entropy principle, Mathematical and Computer Modelling, vol.58, issue.1-2, pp.321-343, 2013.

A. Kumar-suhag and R. Sharma, Design and Simulation of Nanoscale Double Gate MOSFET using high K Material and Ballistic Transport Method, Materials Today: Proceedings, vol.4, issue.9, pp.10412-10416, 2017.

R. Prasher, D. Dass, and R. Vaid, Performance of a double gate nanoscale MOSFET (DG-MOSFET) based on novel channel materials, Journal of Nano-and Electronic Physics, vol.5, issue.1, 2013.

G. Mascali and V. Romano, A non parabolic hydrodynamical subband model for semiconductors based on the maximum entropy principle, Mathematical and Computer Modelling, vol.55, issue.3-4, pp.1003-1020, 2012.

[. N. Ben, F. Ben-abdallah, N. Méhats, and . Vauchelet, A note on the long time behavior for the drift-diffusion-Poisson system, C. R. Math. Acad. Sci. Paris, vol.339, issue.10, pp.683-688, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00378933

A. , Energy transport in semiconductor devices, Mathematical and Computer Modelling of Dynamical Systems, vol.16, issue.1, 2010.

C. Ringhofer, C. Schmeiser, A. Zwirchmayr, ;. Rupp, T. Grasser et al., On the feasibility of spherical harmonics expansions of the Boltzmann transport equation for three-dimensional device geometries, Electron Devices Meeting (IEDM), vol.39, pp.1078-1095, 2001.

S. Hong, A. Pham, and C. Jungemann, Poisson Equation, Computational Microelectronics, pp.175-176, 2011.

A. R. Brodtkorb, T. R. Hagen, and M. L. Sætra, Graphics processing unit (GPU) programming strategies and trends in GPU computing, Journal of Parallel and Distributed Computing, vol.73, issue.1, pp.4-13, 2013.

M. Ujaldon, High performance computing and simulations on the GPU using CUDA, 2012 International Conference on High Performance Computing & Simulation (HPCS), pp.1-7, 2012.

J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone et al., GPU Computing, Proceedings of the IEEE, vol.96, issue.5, pp.879-899, 2008.

. Nvidia-cuda-home, NVIDIA: CUDA Zone, 2018.

J. Lippuner, NVIDIA CUDA, NVIDIA CUDA, 2019.

[. Nvidia-cuda-c, NVIDIA: CUDA C Programming Guide, 2018.

M. De la asunción, J. M. Mantas, and M. J. Castro, Simulation of one-layer shallow water systems on multicore and CUDA architectures, The Journal of Supercomputing, vol.58, issue.2, pp.206-214, 2010.

M. J. Castro, S. Ortega, M. De-la-asunción, J. M. Mantas, and J. M. Gallardo, GPU computing for shallow water flow simulation based on finite volume schemes, Comptes Rendus Mécanique, vol.339, issue.2-3, pp.165-184, 2011.

M. De-la-asunción, M. J. Castro, E. D. Fernández-nieto, J. M. Mantas, S. O. Acosta et al., Efficient GPU implementation of a two waves TVD-WAF method for the two-dimensional one layer shallow water system on structured meshes, Computers & Fluids, vol.80, pp.441-452, 2013.

D. S. Abdi, L. C. Wilcox, T. C. Warburton, and F. X. Giraldo, A GPU-accelerated continuous and discontinuous Galerkin non-hydrostatic atmospheric model, The International Journal of High Performance Computing Applications, vol.33, issue.1, pp.81-109, 2017.

Y. Ye, K. Li, Y. Wang, and T. Dengz, New hybrid CPU-GPU solver for CFD-DEM simulation of fluidized beds, Powder Technology, vol.316, pp.233-244, 2017.

B. Devries, J. Iannelli, C. Trefftz, K. A. O?hearn, and G. Wolffe, Parallel Implementations of FGMRES for Solving Large, Sparse Non-symmetric Linear Systems, Procedia Computer Science, vol.18, pp.491-500, 2013.

Y. Ye, K. Li, Y. Wang, and T. Deng, Parallel computation of Entropic Lattice Boltzmann method on hybrid CPU?GPU accelerated system, Computers & Fluids, vol.110, pp.114-121, 2015.

K. Rupp, A. Jüngel, and T. Grasser, A GPU-Accelerated Parallel Preconditioner for the Solution of the Boltzmann Transport Equation for Semiconductors, Monte Carlo Simulation of Ultrafast Carrier Transport: Scalability Study. Procedia Computer Science, vol.7174, pp.2298-2306, 2012.

K. Rupp, P. Tillet, F. Rudolf, J. Weinbub, A. Morhammer et al., ViennaCL---Linear Algebra Library for Multi- and Many-Core Architectures, SIAM Journal on Scientific Computing, vol.38, issue.5, pp.S412-S439, 2016.

B. Chapman, G. Jost, and R. Van-der-pas, NVIDIA. NVIDIA's Next Generation CUDA TM Compute Architecture: Kepler GK110, 2008.

A. Nishida, Experience in Developing an Open Source Scalable Software Infrastructure in Japan, Computational Science and Its Applications ? ICCSA 2010, vol.6017, pp.448-462, 2010.

H. Kotakemori, H. Hasegawa, and A. Nishida, Performance evaluation of a parallel iterative method library using OpenMP, Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05), pp.432-436, 2005.

Y. ;. Saad, M. B. Sonneveld, and . Van-gijzen, IDR(s): a family of simple and fast algorithms for solving large nonsymmetric linear systems, SIAM J. Sci. Comput, vol.31, issue.2, pp.1035-1062, 2003.