AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs Q Wang, X Zhang, Y Zhang, Q Yi Proceedings of the international conference on high performance computing …, 2013 | 285 | 2013 |
Model-driven level 3 BLAS performance optimization on Loongson 3A processor Z Xianyi, W Qian, Z Yunquan 2012 IEEE 18th international conference on parallel and distributed systems …, 2012 | 260 | 2012 |
OpenBLAS Z Xianyi, W Qian, Z Chothia URL: http://xianyi.github.io/OpenBLAS 88, 2012 | 233* | 2012 |
The BLIS framework: Experiments in portability FG Van Zee, TM Smith, B Marker, TM Low, RAVD Geijn, FD Igual, ... ACM Transactions on Mathematical Software (TOMS) 42 (2), 1-19, 2016 | 129 | 2016 |
Optimizing SpMV for diagonal sparse matrices on GPU X Sun, Y Zhang, T Wang, X Zhang, L Yuan, L Rao 2011 International conference on parallel processing, 492-501, 2011 | 40 | 2011 |
Optimizing and scaling HPCG on Tianhe-2: early experience X Zhang, C Yang, F Liu, Y Liu, Y Lu Algorithms and Architectures for Parallel Processing: 14th International …, 2014 | 36 | 2014 |
623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores Y Liu, C Yang, F Liu, X Zhang, Y Lu, Y Du, C Yang, M Xie, X Liao The International Journal of High Performance Computing Applications 30 (1 …, 2016 | 35 | 2016 |
Accelerating HPCG on Tianhe-2: a hybrid CPU-MIC algorithm Y Liu, X Zhang, C Yang, F Liu, Y Lu 2014 20th IEEE International Conference on Parallel and Distributed Systems …, 2014 | 20 | 2014 |
Accelerating Linpack performance with mixed precision algorithm on CPU+GPGPU heterogeneous cluster W Lei, Z Yunquan, Z Xianyi, L Fangfang 2010 10th IEEE International Conference on Computer and Information …, 2010 | 20 | 2010 |
CRSD: application specific auto-tuning of SpMV for diagonal sparse matrices X Sun, Y Zhang, T Wang, G Long, X Zhang, Y Li European Conference on Parallel Processing, 316-327, 2011 | 14 | 2011 |
Automatic performance tuning of SpMV on GPGPU X Zhang, Y Zhang, X Sun, F Liu, S Liu, Y Tang, Y Li HPC Asia, Kaohsiung, Taiwan, China, 173-179, 2009 | 5 | 2009 |
URL: http://xianyi.github.io Z Xianyi, W Qian, ZO Chothia OpenBLAS, 2014 | 3 | 2014 |
QuantWiz: A parallel software package for LC-MS-based label-free protein quantification J Wang, Y Zhang, X Zhang, X Sun, Z Hu, S Li, R Zeng 2009 11th IEEE International Conference on High Performance Computing and …, 2009 | 3 | 2009 |