NICSLU: High-performance Parallel Sparse Solver for Circuit Simulation
|
Introduction

About the name of the solver: NICS is the name of our lab, Nano-scale Integrated Circuits and Systems.

NICSLU is a high-performance and robust software package for solving large-scale sparse linear systems of equations (Ax = b) on multi-core shared-memory machines. It is written in C and can be easily used in C/C++ programs. NICSLU is well suited to SPICE-based circuit simulation problems. On average, NICSLU is faster than KLU and PARDISO for circuit matrices. NICSLU has also demonstrated high performance in several state-of-the-art SPICE-based commercial simulators from well-known EDA companies. Several EDA companies are currently evaluating NICSLU.

Sparse direct solvers have been a popular research
topic for several decades. Many open-source and closed-source sparse solvers are available, such as UMFPACK, SuperLU, PARDISO, MUMPS, WSMP, and SPARSE. Among them, SPARSE targets circuit matrices and is used in many older SPICE simulators. SPARSE stores matrices in linked lists, so it performs poorly on modern CPUs with large caches (especially for large matrices).

Circuit matrices are special because they are extremely sparse. Most available sparse solvers target general sparse matrices, such as matrices from finite element problems. This is why KLU (by Prof. Tim Davis) was developed. KLU adopts the sparse left-looking algorithm without building supernodes, so it is suitable for extremely sparse matrices. As a result, KLU is faster than many sparse solvers (including SPARSE) for circuit matrices.

NICSLU is based on KLU, but we have made
substantial improvements to both the sequential algorithm and the parallelization, resulting in much higher performance than KLU. NICSLU is the first parallel implementation of the Gilbert-Peierls sparse left-looking algorithm. We have developed a hybrid parallelization framework that performs parallel sparse LU factorization adapted to different sparsity patterns and data dependences. The latest NICSLU targets higher performance for dense circuit matrices by utilizing supernodes.

NICSLU is developed by Xiaoming Chen (陈晓明), Ling Ren (任令), Wei Wu (武伟), Xin Li (李鑫), Guohui Wu (吴国辉), Yu Wang (汪玉), and Huazhong Yang (杨华中), all from the Department of Electronic Engineering, Tsinghua University, Beijing, China.
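
The introduction notes that linked-list storage, as used by SPARSE, performs poorly on cached CPUs. A common alternative is to keep the nonzeros in a few flat arrays, for example compressed sparse column (CSC) form, which is the input format KLU accepts. The C sketch below illustrates that layout with a matrix-vector product, the operation needed to check a solution of Ax = b. It is only an illustration under stated assumptions: the type `csc_matrix` and the function `csc_matvec` are hypothetical names and are not taken from NICSLU's source or API.

```c
#include <assert.h>
#include <stddef.h>

/* Compressed sparse column (CSC) matrix held in three flat arrays.
 * Hypothetical illustration type; not a NICSLU data structure. */
typedef struct {
    int n;               /* matrix is n x n */
    const int *col_ptr;  /* length n+1: column j is stored at [col_ptr[j], col_ptr[j+1]) */
    const int *row_idx;  /* length nnz: row index of each stored entry */
    const double *val;   /* length nnz: value of each stored entry */
} csc_matrix;

/* Compute y = A * x. The entries of each column are contiguous in
 * memory, so the inner loop streams through row_idx and val instead
 * of chasing linked-list pointers. */
void csc_matvec(const csc_matrix *A, const double *x, double *y)
{
    assert(A != NULL && x != NULL && y != NULL);
    for (int i = 0; i < A->n; ++i)
        y[i] = 0.0;
    for (int j = 0; j < A->n; ++j) {
        for (int p = A->col_ptr[j]; p < A->col_ptr[j + 1]; ++p)
            y[A->row_idx[p]] += A->val[p] * x[j];
    }
}
```

For the 3 x 3 matrix with rows (4, 0, 1), (0, 3, 0), (2, 0, 5), the arrays would be `col_ptr = {0, 2, 3, 5}`, `row_idx = {0, 2, 1, 0, 2}`, and `val = {4, 2, 3, 1, 5}`.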
Figure: Sparse left-looking algorithm
Figure: Parallelization framework of NICSLU |
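
A left-looking LU factorization computes the factors one column at a time: column j is obtained by a triangular solve against the columns of L that have already been computed to its left. The C sketch below shows that column ordering on a small dense, row-major array without pivoting, purely as an illustration. It is not NICSLU's or KLU's code, and the function name `left_looking_lu` is hypothetical. The Gilbert-Peierls algorithm additionally works on sparse storage, uses a symbolic depth-first search to predict which entries of the column become nonzero, and applies partial pivoting.

```c
#include <assert.h>
#include <stddef.h>

/* In-place left-looking LU of a dense n x n row-major matrix, without
 * pivoting (illustrative simplification). On success, the strict lower
 * triangle holds L (unit diagonal implied) and the upper triangle
 * holds U. Returns 0 on success, -1 if a zero pivot is met. */
int left_looking_lu(double *a, int n)
{
    assert(a != NULL && n > 0);
    for (int j = 0; j < n; ++j) {
        /* Triangular solve: update column j using every finished
         * column k < j of L, in increasing order of k. */
        for (int k = 0; k < j; ++k) {
            double xk = a[k * n + j];
            if (xk == 0.0)
                continue; /* a sparse code finds the nonzero x[k] symbolically */
            for (int i = k + 1; i < n; ++i)
                a[i * n + j] -= a[i * n + k] * xk;
        }
        /* Entries 0..j of the column are now U(:, j); scale the rest
         * by the pivot to obtain L(:, j). */
        double pivot = a[j * n + j];
        if (pivot == 0.0)
            return -1;
        for (int i = j + 1; i < n; ++i)
            a[i * n + j] /= pivot;
    }
    return 0;
}
```

The skip on a zero `xk` hints at why the approach suits extremely sparse matrices: most columns to the left contribute nothing to column j, so a sparse implementation visits only the few that do.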
Click Here for Results & Comparisons |
Related Publications

These papers are copyrighted by ACM, IEEE, or Springer. They are posted here for your personal use, to ensure timely dissemination of research work with no commercial purpose. If you are using NICSLU in your research, please cite the corresponding papers.

[BOOK] Xiaoming Chen, Yu Wang, Huazhong Yang, "Parallel Sparse Direct Solver for Integrated Circuit Simulation", Springer, 1st edition, Feb. 2017, 136 pages. [Springer link] [Google book] [BibTeX] (Most of the techniques used in NICSLU are described in this book.)

1) CPU solver & parallel factorization algorithm

[TCAD] Xiaoming Chen, Yu Wang, Huazhong Yang, "NICSLU: An Adaptive Sparse Matrix Solver for Parallel Circuit Simulation", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 32, no. 2, pp. 261-274, Feb. 2013. [PDF] [BibTeX]

[ASPDAC’12] Xiaoming Chen, Yu Wang, Huazhong Yang, "An adaptive LU factorization algorithm for parallel circuit simulation", 17th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 359-364, Jan. 30-Feb. 2, 2012. (best paper nomination) [PDF] [BibTeX]

[DATE’16] Xiaoming Chen, Lixue Xia, Yu Wang, Huazhong Yang, "Sparsity-Oriented Sparse Solver Design for Circuit Simulation", Design, Automation, and Test in Europe (DATE) 2016, pp. 1580-1585, March 14-18, 2016. [PDF] [BibTeX]

2) CPU parallel re-factorization

[TCASII] Xiaoming Chen, Wei Wu, Yu Wang, Hao Yu, Huazhong Yang, "An EScheduler-based Data Dependence Analysis and Task Scheduling for Parallel Circuit Simulation", IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 58, no. 10, pp. 702-706, Oct. 2011. [PDF] [BibTeX]

3) CPU fast factorization

[DATE’15] Xiaoming Chen, Yu Wang, Huazhong Yang, "A Fast Parallel Sparse Solver for SPICE-based Circuit Simulators", Design, Automation, and Test in Europe (DATE) 2015, pp. 205-210, March 9-13, 2015. [PDF] [BibTeX]

4) GPU parallel re-factorization

[DAC’12] Ling Ren, Xiaoming Chen, Yu Wang, Chenxi Zhang, Huazhong Yang, "Sparse LU factorization for parallel circuit simulation on GPU", in Proceedings of the 49th Annual Design Automation Conference (DAC '12), ACM, New York, NY, USA, pp. 1125-1130. [PDF] [BibTeX]

[TPDS] Xiaoming Chen, Ling Ren, Yu Wang, Huazhong Yang, "GPU-Accelerated Sparse LU Factorization for Circuit Simulation with Performance Modeling", IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS), vol. 26, no. 3, pp. 786-795, March 2015. [PDF] [BibTeX]

5) GPU blocked algorithm

[IA3’13] Xiaoming Chen, Du Su, Yu Wang, Huazhong Yang, "Nonzero pattern analysis and memory access optimization in GPU-based sparse LU factorization for circuit simulation", in Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms (IA^3 '13). [PDF] [BibTeX] |
Please visit https://github.com/chenxm1986/nicslu to download NICSLU. |