Yaosheng Fu
Research Scientist at NVIDIA
Verified email at nvidia.com
Title
Cited by
Year
OpenPiton: An open source manycore research framework
J Balkind, M McKeown, Y Fu, T Nguyen, Y Zhou, A Lavrov, M Shahrad, ...
ACM SIGPLAN Notices 51 (4), 217-232, 2016
Cited by: 280; Year: 2016
Optimizing multi-GPU parallelization strategies for deep learning training
S Pal, E Ebrahimi, A Zulfiqar, Y Fu, V Zhang, S Migacz, D Nellans, ...
IEEE Micro 39 (5), 91-101, 2019
Cited by: 68; Year: 2019
BYOC: a "bring your own core" framework for heterogeneous-ISA research
J Balkind, K Lim, M Schaffner, F Gao, G Chirkov, A Li, A Lavrov, ...
Proceedings of the Twenty-Fifth International Conference on Architectural …, 2020
Cited by: 65; Year: 2020
Coherence domain restriction on large scale systems
Y Fu, TM Nguyen, D Wentzlaff
Proceedings of the 48th International Symposium on Microarchitecture, 686-698, 2015
Cited by: 59; Year: 2015
Power and Energy Characterization of an Open Source 25-Core Manycore Processor
M McKeown, A Lavrov, M Shahrad, PJ Jackson, Y Fu, J Balkind, ...
HPCA, 762-775, 2018
Cited by: 57; Year: 2018
PriME: A parallel and distributed simulator for thousand-core chips
Y Fu, D Wentzlaff
2014 IEEE International Symposium on Performance Analysis of Systems and …, 2014
Cited by: 49; Year: 2014
Need for speed: Experiences building a trustworthy system-level GPU simulator
O Villa, D Lustig, Z Yan, E Bolotin, Y Fu, N Chatterjee, N Jiang, D Nellans
2021 IEEE International Symposium on High-Performance Computer Architecture …, 2021
Cited by: 40; Year: 2021
Piton: A manycore processor for multitenant clouds
M McKeown, Y Fu, T Nguyen, Y Zhou, J Balkind, A Lavrov, M Shahrad, ...
IEEE Micro 37 (2), 70-80, 2017
Cited by: 39; Year: 2017
The architectural implications of distributed reinforcement learning on CPU-GPU systems
A Inci, E Bolotin, Y Fu, G Dalal, S Mannor, D Nellans, D Marculescu
arXiv preprint arXiv:2012.04210, 2020
Cited by: 18; Year: 2020
GPU domain specialization via composable on-package architecture
Y Fu, E Bolotin, N Chatterjee, D Nellans, SW Keckler
ACM Transactions on Architecture and Code Optimization (TACO) 19 (1), 1-23, 2021
Cited by: 11; Year: 2021
OpenPiton: an open source hardware platform for your research
J Balkind, M McKeown, Y Fu, T Nguyen, Y Zhou, A Lavrov, M Shahrad, ...
Communications of the ACM 62 (12), 79-87, 2019
Cited by: 9; Year: 2019
Piton: A 25-core academic manycore processor
M McKeown, Y Fu, T Nguyen, Y Zhou, J Balkind, A Lavrov, M Shahrad, ...
Symp. on High Performance Chips, 2016
Cited by: 6; Year: 2016
A new joint source and channel coding scheme for packet-based scalable multimedia streams
C Chi, Y Zhang, Y Fu, Z Yang
2010 IEEE Globecom Workshops, 954-959, 2010
Cited by: 6; Year: 2010
Piton: A 25-core academic manycore research processor
M McKeown, Y Fu, TM Nguyen, Y Zhou, J Balkind, A Lavrov, M Shahrad, ...
Hot Chips Symposium, 1-38, 2016
Cited by: 3; Year: 2016
Techniques for configuring parallel processors for different application domains
Y Fu, E Bolotin, N Chatterjee, SW Keckler, D Nellans
US Patent 11,609,879, 2023
Cited by: 2; Year: 2023
AutoScratch: ML-Optimized Cache Management for Inference-Oriented GPUs
Y Fu, E Bolotin, A Jaleel, G Dalal, S Mannor, J Subag, N Korem, M Behar, ...
Proceedings of Machine Learning and Systems 5, 495-512, 2023
Cited by: 2; Year: 2023
Architectural Support for Large-scale Shared Memory Systems
Y Fu
Princeton University, 2017
Cited by: 2; Year: 2017
OpenPit
J Balkind, M McKeown, Y Fu, T Nguyen, Y Zhou, A Lavrov, M Shahrad, ...
Cited by: 2; Year: 2016
Automatic method for power management tuning in computing systems
E Bolotin, Y Fu, Z Yan, G Dalal, S Mannor, D Nellans
US Patent 11,880,261, 2024
Year: 2024
Technique for autonomously managing cache using machine learning
Y Fu, S Mannor, E Bolotin, D Nellans, G Dalal
US Patent App. 17/514,735, 2023
Year: 2023