NICSLU

High-performance Parallel Sparse Solver for Circuit Simulation

 

 

Introduction

About the name of the solver: NICS is the name of our lab, Nano-scale Integrated Circuits and Systems.

NICSLU is a high-performance and robust software package for solving large-scale sparse linear systems of equations (Ax = b) on multi-core shared-memory machines. It is written in C and can easily be used in C/C++ programs.

NICSLU is well suited to SPICE-based circuit simulation problems. On circuit matrices, NICSLU is on average faster than KLU and PARDISO. NICSLU has also shown high performance in several state-of-the-art commercial SPICE-based simulators from well-known EDA companies, and several EDA companies are currently evaluating it.

Sparse direct solvers have been a popular research topic for several decades. Many open- and closed-source sparse solvers are available, such as UMFPACK, SuperLU, PARDISO, MUMPS, WSMP, and SPARSE. Among them, SPARSE is targeted at circuit matrices and is used in many older SPICE simulators. SPARSE stores matrices in linked lists, whose scattered memory accesses make poor use of the large caches on modern CPUs, so it performs poorly, especially on large matrices. Circuit matrices are special because they are extremely sparse, whereas most available sparse solvers are targeted at general sparse matrices, such as those from finite element problems. This is why KLU (by Prof. Tim Davis) was developed. KLU adopts the sparse left-looking algorithm without building supernodes, so it is suitable for extremely sparse matrices. As a result, KLU is faster than many sparse solvers (including SPARSE) on circuit matrices.
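To make the column-by-column idea concrete, here is a minimal Python sketch of a left-looking LU factorization on a matrix stored in compressed sparse column (CSC) arrays. It is an illustration only, not code from KLU or NICSLU: the function name `left_looking_lu` and the dict-based storage of L and U are assumptions made for this example, pivoting is omitted, and the inner loop simply scans every earlier column, whereas the Gilbert-Peierls algorithm uses a symbolic depth-first search to visit only the columns that contribute.

```python
def left_looking_lu(n, col_ptr, row_idx, values):
    """Factor an n-by-n CSC matrix A into L and U, one column at a time.

    Returns (L, U) as lists of {row: value} dicts, one dict per column.
    L holds only the entries below its unit diagonal; U includes the diagonal.
    No pivoting is done, so every pivot must be nonzero.
    """
    L = [dict() for _ in range(n)]
    U = [dict() for _ in range(n)]
    for k in range(n):
        # Scatter column k of A into a sparse work vector x.
        x = {}
        for p in range(col_ptr[k], col_ptr[k + 1]):
            x[row_idx[p]] = x.get(row_idx[p], 0.0) + values[p]
        # Look left: apply every finished column j < k that x touches.
        # Ascending j matters because an update can create fill-in at a
        # row i < k that a later iteration then has to process.
        for j in range(k):
            xj = x.get(j, 0.0)
            if xj != 0.0:
                for i, lij in L[j].items():
                    x[i] = x.get(i, 0.0) - lij * xj
        pivot = x.get(k, 0.0)
        if pivot == 0.0:
            raise ZeroDivisionError(f"zero pivot in column {k}")
        # Rows up to the diagonal form U[:, k]; the rest, scaled by the
        # pivot, form L[:, k].
        for i, v in x.items():
            if i <= k:
                U[k][i] = v
            else:
                L[k][i] = v / pivot
    return L, U


# Example: A = [[4, 0, 1], [2, 3, 0], [0, 1, 5]] in CSC form.
L, U = left_looking_lu(3, [0, 2, 4, 6], [0, 1, 1, 2, 0, 2],
                       [4.0, 2.0, 3.0, 1.0, 1.0, 5.0])
```

In this example, column 2 of U gains a fill-in entry at row 1 that is not present in A, which is the kind of structure a production solver predicts symbolically before doing the numeric work.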

NICSLU is based on KLU, but we have substantially improved both the sequential algorithm and the parallelization, resulting in much higher performance than KLU. NICSLU is the first parallel implementation of the Gilbert-Peierls sparse left-looking algorithm. We have developed a hybrid parallelization framework that adapts parallel sparse LU factorization to different sparsity patterns and data dependences. The latest version of NICSLU uses supernodes to achieve higher performance on dense circuit matrices.
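One common way to expose parallelism in a column-based factorization is to group columns into levels, so that every column depends only on columns in earlier levels and all columns within one level can be factorized concurrently. The Python sketch below illustrates that idea only; it is not NICSLU's scheduler, and the function name `levelize` and the `deps` input format are hypothetical choices for this example.

```python
def levelize(deps):
    """Group columns into dependency levels.

    deps[k] lists the columns j < k whose results column k needs
    (hypothetical input format for this sketch).
    Returns a list of levels; each level is a list of column indices
    that do not depend on each other.
    """
    level = [0] * len(deps)
    for k, js in enumerate(deps):
        for j in js:
            if j >= k:
                raise ValueError(f"column {k} cannot depend on column {j}")
        # A column sits one level above the deepest column it depends on;
        # a column with no dependencies lands in level 0.
        level[k] = 1 + max((level[j] for j in js), default=-1)
    levels = [[] for _ in range(max(level, default=-1) + 1)]
    for k, lv in enumerate(level):
        levels[lv].append(k)
    return levels


# Example: columns 0 and 1 are independent, column 2 needs both,
# column 3 needs column 2, and column 4 needs only column 1.
example_levels = levelize([[], [], [0, 1], [2], [1]])
```

Wide levels offer plenty of independent work, while a long chain of single-column levels offers almost none, which is one reason a solver may want more than one parallel strategy.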

NICSLU is developed by Xiaoming Chen (陈晓明), Ling Ren (任令), Wei Wu (武伟), Xin Li (李鑫), Guohui Wu (吴国辉), Yu Wang (汪玉), and Huazhong Yang (杨华中), all from the Department of Electronic Engineering, Tsinghua University, Beijing, China.

         

Figures: Sparse left-looking algorithm (left); parallelization framework of NICSLU (right)

 

Click Here for Results & Comparisons

 

Related Publications

These papers are copyrighted by ACM, IEEE, or Springer. They are posted here for your personal, non-commercial use, to ensure timely dissemination of research work.

If you are using NICSLU in your research, please cite the corresponding papers.

 

[BOOK] Xiaoming Chen, Yu Wang, Huazhong Yang, "Parallel Sparse Direct Solver for Integrated Circuit Simulation", Springer, 1st edition, Feb. 2017, 136 pages. [Springer link] [Google book] [BibTeX] (Most of the techniques used in NICSLU are described in this book.)

 

1) CPU solver & parallel factorization algorithm

[TCAD] Xiaoming Chen, Yu Wang, Huazhong Yang, "NICSLU: An Adaptive Sparse Matrix Solver for Parallel Circuit Simulation", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 32, no. 2, pp. 261-274, Feb. 2013. [PDF] [BibTeX]

[ASPDAC’12] Xiaoming Chen, Yu Wang, Huazhong Yang, "An adaptive LU factorization algorithm for parallel circuit simulation", 17th Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 359-364, Jan. 30-Feb. 2, 2012. (best paper nomination) [PDF] [BibTeX]

[DATE’16] Xiaoming Chen, Lixue Xia, Yu Wang, Huazhong Yang, "Sparsity-Oriented Sparse Solver Design for Circuit Simulation", Design, Automation, and Test in Europe (DATE) 2016, pp. 1580-1585, March 14-18, 2016. [PDF] [BibTeX]

 

2) CPU parallel re-factorization

[TCASII] Xiaoming Chen, Wei Wu, Yu Wang, Hao Yu, Huazhong Yang, "An EScheduler-based Data Dependence Analysis and Task Scheduling for Parallel Circuit Simulation", IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 58, no. 10, pp. 702-706, Oct. 2011. [PDF] [BibTeX]

 

3) CPU fast factorization

[DATE’15] Xiaoming Chen, Yu Wang, Huazhong Yang, "A Fast Parallel Sparse Solver for SPICE-based Circuit Simulators", Design, Automation, and Test in Europe (DATE) 2015, pp. 205-210, March 9-13, 2015. [PDF] [BibTeX]

 

4) GPU parallel re-factorization

[DAC’12] Ling Ren, Xiaoming Chen, Yu Wang, Chenxi Zhang, Huazhong Yang, "Sparse LU factorization for parallel circuit simulation on GPU", in Proceedings of the 49th Annual Design Automation Conference (DAC '12), ACM, New York, NY, USA, pp. 1125-1130. [PDF] [BibTeX]

[TPDS] Xiaoming Chen, Ling Ren, Yu Wang, Huazhong Yang, "GPU-Accelerated Sparse LU Factorization for Circuit Simulation with Performance Modeling", IEEE Transactions on Parallel and Distributed Systems (IEEE TPDS), vol. 26, no. 3, pp. 786-795, March 2015. [PDF] [BibTeX]

 

5) GPU blocked algorithm

[IA3’13] Xiaoming Chen, Du Su, Yu Wang, Huazhong Yang, "Nonzero pattern analysis and memory access optimization in GPU-based sparse LU factorization for circuit simulation", In Proceedings of the 3rd Workshop on Irregular Applications: Architectures and Algorithms (IA^3 '13). [PDF] [BibTeX]

 

Please visit https://github.com/chenxm1986/nicslu to download NICSLU.