Bertie Vidgen
Oxford, Turing
Verified email at rewire.online
Title · Cited by · Year
Dynabench: Rethinking benchmarking in NLP
D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger, Z Wu, B Vidgen, G Prasad, ...
arXiv preprint arXiv:2104.14337, 2021
Cited by 445 · 2021
Directions in abusive language training data, a systematic review: Garbage in, garbage out
B Vidgen, L Derczynski
PLOS ONE 15 (12), e0243300, 2020
Cited by 361 · 2020
HateCheck: Functional tests for hate speech detection models
P Röttger, B Vidgen, D Nguyen, Z Waseem, H Margetts, JB Pierrehumbert
arXiv preprint arXiv:2012.15606, 2020
Cited by 277 · 2020
TrustLLM: Trustworthiness in large language models
L Sun, Y Huang, H Wang, S Wu, Q Zhang, C Gao, Y Huang, W Lyu, ...
arXiv preprint arXiv:2401.05561, 2024
Cited by 273 · 2024
Learning from the worst: Dynamically generated datasets to improve online hate detection
B Vidgen, T Thrush, Z Waseem, D Kiela
arXiv preprint arXiv:2012.15761, 2020
Cited by 269 · 2020
Challenges and frontiers in abusive content detection
B Vidgen, A Harris, D Nguyen, R Tromble, S Hale, H Margetts
Proceedings of the third workshop on abusive language online, 2019
Cited by 238 · 2019
Detecting weak and strong Islamophobic hate speech on social media
B Vidgen, T Yasseri
Journal of Information Technology & Politics 17 (1), 66-78, 2020
Cited by 211 · 2020
Two contrasting data annotation paradigms for subjective NLP tasks
P Röttger, B Vidgen, D Hovy, JB Pierrehumbert
arXiv preprint arXiv:2112.07475, 2021
Cited by 166 · 2021
P-Values: Misunderstood and Misused
B Vidgen, T Yasseri
Frontiers in Physics 4, 6, 2016
Cited by 161 · 2016
XSTest: A test suite for identifying exaggerated safety behaviours in large language models
P Röttger, HR Kirk, B Vidgen, G Attanasio, F Bianchi, D Hovy
arXiv preprint arXiv:2308.01263, 2023
Cited by 153 · 2023
SemEval-2023 task 10: Explainable detection of online sexism
HR Kirk, W Yin, B Vidgen, P Röttger
arXiv preprint arXiv:2303.04222, 2023
Cited by 140 · 2023
An expert annotated dataset for the detection of online misogyny
E Guest, B Vidgen, A Mittos, N Sastry, G Tyson, H Margetts
Proceedings of the 16th conference of the European chapter of the …, 2021
Cited by 121 · 2021
Detecting East Asian prejudice on social media
B Vidgen, A Botelho, D Broniatowski, E Guest, M Hall, H Margetts, ...
arXiv preprint arXiv:2005.03909, 2020
Cited by 114 · 2020
Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback
HR Kirk, B Vidgen, P Röttger, SA Hale
arXiv preprint arXiv:2303.05453, 2023
Cited by 105 · 2023
Introducing CAD: the contextual abuse dataset
B Vidgen, D Nguyen, H Margetts, P Rossini, R Tromble
Cited by 103 · 2021
The benefits, risks and bounds of personalizing the alignment of large language models to individuals
HR Kirk, B Vidgen, P Röttger, SA Hale
Nature Machine Intelligence 6 (4), 383-392, 2024
Cited by 92 · 2024
The PRISM alignment project: What participatory, representative and individualised human feedback reveals about the subjective and multicultural alignment of large language models
HR Kirk, A Whitefield, P Röttger, A Bean, K Margatina, J Ciro, R Mosquera, ...
arXiv preprint arXiv:2404.16019, 2024
Cited by 75 · 2024
Position: TrustLLM: Trustworthiness in large language models
Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li, C Gao, Y Huang, W Lyu, ...
International Conference on Machine Learning, 20166-20270, 2024
Cited by 73 · 2024
FinanceBench: A new benchmark for financial question answering
P Islam, A Kannappan, D Kiela, R Qian, N Scherrer, B Vidgen
arXiv preprint arXiv:2311.11944, 2023
Cited by 70 · 2023
Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate
HR Kirk, B Vidgen, P Röttger, T Thrush, SA Hale
arXiv preprint arXiv:2108.05921, 2021
Cited by 64 · 2021