RLAIF: Scaling reinforcement learning from human feedback with AI feedback H Lee, S Phatale, H Mansoor, K Lu, T Mesnard, C Bishop, V Carbune, ... arXiv preprint arXiv:2309.00267, 2023 | 161 | 2023 |
Methods and systems for predicting conversion rates of content publisher and content provider pairs R Kirillov, H Mansoor US Patent 9,246,990, 2016 | 10 | 2016 |
LLMs cannot find reasoning errors, but can correct them! G Tyen, H Mansoor, P Chen, T Mak, V Cărbune arXiv preprint arXiv:2311.08516, 2023 | 9 | 2023 |
Methods and systems for providing an actionable object within a third-party content slot of an information resource of a content publisher R Kirillov, A Tyler, D Banfield, H Mansoor, DM Goodridge, LA Collard US Patent 10,067,916, 2018 | 6 | 2018 |
Methods and systems for providing an actionable object within a third-party content slot of an information resource of a content publisher R Kirillov, A Tyler, D Banfield, H Mansoor, DM Goodridge, LA Collard US Patent 9,461,936, 2016 | 5 | 2016 |
Methods and systems for providing an actionable object within a third-party content slot of an information resource of a content publisher R Kirillov, A Tyler, D Banfield, H Mansoor, DM Goodridge, LA Collard US Patent 10,210,140, 2019 | 4 | 2019 |
ScreenAI: A Vision-Language Model for UI and Infographics Understanding G Baechler, S Sunkara, M Wang, F Zubach, H Mansoor, V Etter, ... arXiv preprint arXiv:2402.04615, 2024 | 2 | 2024 |
Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs V Carbune, H Mansoor, F Liu, R Aralikatte, G Baechler, J Chen, A Sharma arXiv preprint arXiv:2403.12596, 2024 | 1 | 2024 |
PERL: Parameter Efficient Reinforcement Learning from Human Feedback H Sidahmed, S Phatale, A Hutcheson, Z Lin, Z Chen, Z Yu, J Jin, ... arXiv preprint arXiv:2403.10704, 2024 | | 2024 |
The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization S Gooding, H Mansoor arXiv preprint arXiv:2311.04919, 2023 | | 2023 |