Neal Crago
Senior Research Scientist, NVIDIA
Verified email at nvidia.com
Title
Cited by
Year
ExTensor: An Accelerator for Sparse Tensor Algebra
K Hegde, H Asghari-Moghaddam, M Pellauer, N Crago, A Jaleel, ...
Proceedings of the 52nd Annual IEEE/ACM International Symposium on …, 2019
256 2019
Rigel: An architecture and scalable programming interface for a 1000-core accelerator
JH Kelm, DR Johnson, MR Johnson, NC Crago, W Tuohy, A Mahesri, ...
Proceedings of the 36th annual international symposium on Computer …, 2009
205 2009
Triggered instructions: A control paradigm for spatially-programmed architectures
A Parashar, M Pellauer, M Adler, B Ahsan, N Crago, D Lustig, V Pavlov, ...
ACM SIGARCH Computer Architecture News 41 (3), 142-153, 2013
164 2013
Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration
M Pellauer, YS Shao, J Clemons, N Crago, K Hegde, R Venkatesan, ...
Proceedings of the Twenty-Fourth International Conference on Architectural …, 2019
101* 2019
Tradeoffs in designing accelerator architectures for visual computing
A Mahesri, D Johnson, N Crago, SJ Patel
2008 41st IEEE/ACM International Symposium on Microarchitecture, 164-175, 2008
57 2008
Executing distributed memory operations using processing elements connected by distributed channels
B Ahsan, MC Adler, NC Crago, JS Emer, A Jaleel, A Parashar, ...
US Patent 10,331,583, 2019
56 2019
Efficient control and communication paradigms for coarse-grained spatial architectures
M Pellauer, A Parashar, M Adler, B Ahsan, R Allmon, N Crago, K Fleming, ...
ACM Transactions on Computer Systems (TOCS) 33 (3), 1-32, 2015
55 2015
Efficient spatial processing element control via triggered instructions
A Parashar, M Pellauer, M Adler, B Ahsan, N Crago, D Lustig, V Pavlov, ...
IEEE Micro 34 (3), 120-137, 2014
54 2014
OUTRIDER: efficient memory latency tolerance with decoupled strands
NC Crago, SJ Patel
Proceedings of the 38th annual international symposium on Computer …, 2011
49 2011
Processors, methods, and systems for a configurable spatial accelerator with memory system performance, power reduction, and atomics support features
MC Adler, C Chou, NC Crago, K Fleming, KD Glossop, A Jaleel, ...
US Patent 10,387,319, 2019
44 2019
P-OPT: Practical Optimal Cache Replacement for Graph Analytics
V Balaji, N Crago, A Jaleel, B Lucia
2021 IEEE International Symposium on High-Performance Computer Architecture …, 2021
38 2021
Developing a parallel computational implementation of AMOEBA
MJ Widener, NC Crago, J Aldstadt
International Journal of Geographical Information Science 26 (9), 1707-1723, 2012
24 2012
Exploiting spatial architectures for edit distance algorithms
JJ Tithi, NC Crago, JS Emer
2014 IEEE International Symposium on Performance Analysis of Systems and …, 2014
23 2014
Executing distributed memory operations using processing elements connected by distributed channels
B Ahsan, MC Adler, NC Crago, JS Emer, A Jaleel, A Parashar, ...
US Patent 10,853,276, 2020
22 2020
Accelerating Sparse Data Orchestration via Dynamic Reflexive Tiling
TO Odemuyiwa, H Asghari-Moghaddam, M Pellauer, K Hegde, PA Tsai, ...
Proceedings of the 28th ACM International Conference on Architectural …, 2023
16 2023
Rigel: A scalable architecture for 1000+ core accelerators
DR Johnson, JH Kelm, NC Crago, MR Johnson, W Tuohy, W Truty, ...
Symposium on Application Accelerators in High Performance Computing, Urbana …, 2009
8 2009
Detecting irregular clusters in big spatial data
J Aldstadt, M Widener, N Crago
Proceedings of the 7th International Conference on Geographic Information …, 2012
7 2012
Symphony: Orchestrating Sparse and Dense Tensors with Hierarchical Heterogeneous Processing
M Pellauer, J Clemons, V Balaji, N Crago, A Jaleel, D Lee, M O’Connor, ...
ACM Transactions on Computer Systems 41 (1-4), 1-30, 2023
6 2023
Flexible accelerator for a tensor workload
PA Tsai, N Crago, A Parashar, JS Emer, SW Keckler
US Patent App. 17/343,582, 2022
4 2022
Exposing memory access patterns to improve instruction and memory efficiency in GPUs
NC Crago, M Stephenson, SW Keckler
ACM Transactions on Architecture and Code Optimization (TACO) 15 (4), 1-23, 2018
4 2018