Gandiva: Introspective cluster scheduling for deep learning W Xiao, R Bhardwaj, R Ramjee, M Sivathanu, N Kwatra, Z Han, P Patel, ... 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2018 | 599 | 2018 |
Analysis of Large-Scale Multi-Tenant GPU clusters for DNN training workloads M Jeon, S Venkataraman, A Phanishayee, J Qian, W Xiao, F Yang 2019 USENIX Annual Technical Conference (USENIX ATC 19), 947-960, 2019 | 436 | 2019 |
MLaaS in the wild: Workload analysis and scheduling in Large-Scale heterogeneous GPU clusters Q Weng, W Xiao, Y Yu, W Wang, C Wang, J He, Y Li, L Zhang, W Lin, ... 19th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2022 | 299 | 2022 |
KV-Direct: High-performance in-memory key-value store with programmable NIC B Li, Z Ruan, W Xiao, Y Lu, Y Xiong, A Putnam, E Chen, L Zhang Proceedings of the 26th Symposium on Operating Systems Principles, 137-152, 2017 | 296 | 2017 |
Efficient and effective sparse LSTM on FPGA with bank-balanced sparsity S Cao, C Zhang, Z Yao, W Xiao, L Nie, D Zhan, Y Liu, M Wu, L Zhang Proceedings of the 2019 ACM/SIGDA International Symposium on Field …, 2019 | 213 | 2019 |
AntMan: Dynamic scaling on GPU clusters for deep learning W Xiao, S Ren, Y Li, Y Zhang, P Hou, Z Li, Y Feng, W Lin, Y Jia 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2020 | 212 | 2020 |
GraM: scaling graph computation to the trillions M Wu, F Yang, J Xue, W Xiao, Y Miao, L Wei, H Lin, Y Dai, L Zhou Proceedings of the Sixth ACM Symposium on Cloud Computing, 408-421, 2015 | 166 | 2015 |
Balanced sparsity for efficient DNN inference on GPU Z Yao, S Cao, W Xiao, C Zhang, L Nie Proceedings of the AAAI conference on artificial intelligence 33 (01), 5676-5683, 2019 | 131 | 2019 |
An empirical study on program failures of deep learning jobs R Zhang, W Xiao, H Zhang, Y Liu, H Lin, M Yang Proceedings of the ACM/IEEE 42nd international conference on software …, 2020 | 105 | 2020 |
SeerNet: Predicting convolutional neural network feature-map sparsity through low-bit quantization S Cao, L Ma, W Xiao, C Zhang, Y Liu, L Zhang, L Nie, Z Yang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 96 | 2019 |
Tux²: Distributed Graph Computation for Machine Learning W Xiao, J Xue, Y Miao, Z Li, C Chen, M Wu, W Li, L Zhou 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2017 | 82 | 2017 |
Multi-tenant GPU Clusters for Deep Learning Workloads: Analysis and Implications M Jeon, S Venkataraman, A Phanishayee, J Qian, W Xiao, F Yang MSR-TR-2018-13, 2018 | 81 | 2018 |
Zico: Efficient GPU memory sharing for concurrent DNN training G Lim, J Ahn, W Xiao, Y Kwon, M Jeon 2021 USENIX Annual Technical Conference (USENIX ATC 21), 161-175, 2021 | 53 | 2021 |
Whale: Efficient giant model training over heterogeneous GPUs X Jia, L Jiang, A Wang, W Xiao, Z Shi, J Zhang, X Li, L Chen, Y Li, ... 2022 USENIX Annual Technical Conference (USENIX ATC 22), 673-688, 2022 | 51 | 2022 |
Memory efficient loss recovery for hardware-based transport in datacenter Y Lu, G Chen, Z Ruan, W Xiao, B Li, J Zhang, Y Xiong, P Cheng, E Chen Proceedings of the First Asia-Pacific Workshop on Networking, 22-28, 2017 | 32 | 2017 |
Infinite-LLM: Efficient LLM service for long context with DistAttention and distributed KVCache B Lin, C Zhang, T Peng, H Zhao, W Xiao, M Sun, A Liu, Z Zhang, L Li, ... arXiv preprint arXiv:2401.02669, 2024 | 29 | 2024 |
Llumnix: Dynamic Scheduling for Large Language Model Serving B Sun, Z Huang, H Zhao, W Xiao, X Zhang, Y Li, W Lin arXiv preprint arXiv:2406.03243, 2024 | 18 | 2024 |
Goldminer: Elastic scaling of training data pre-processing pipelines for deep learning H Zhao, Z Yang, Y Cheng, C Tian, S Ren, W Xiao, M Yuan, L Chen, K Liu, ... Proceedings of the ACM on Management of Data 1 (2), 1-25, 2023 | 16 | 2023 |
CoGNN: efficient scheduling for concurrent GNN training on GPUs Q Sun, Y Liu, H Yang, R Zhang, M Dun, M Li, X Liu, W Xiao, Y Li, Z Luan, ... SC22: International Conference for High Performance Computing, Networking …, 2022 | 13 | 2022 |
Scheduling CPU for GPU-based deep learning jobs W Xiao, Z Han, H Zhao, X Peng, Q Zhang, F Yang, L Zhou Proceedings of the ACM Symposium on Cloud Computing, 503-503, 2018 | 11 | 2018 |