Thanks for confirming.
So this is definitely an encoding issue, but I'm not really sure how to fix it.
We try to convert everything to UTF-8 as part of sanitizing input, but mb_detect_encoding isn't detecting the encoding of these Vietnamese words properly, so the conversion seems to just bork them. For example, we detect ờ as UTF-8 but Mường as ISO-8859-1.
Googling suggests this Vietnamese encoding issue affects a fair amount of software, but I can't see a solid way to fix it.
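
One workaround that might be worth trying, as a rough sketch rather than a confirmed fix: skip the guessing when the input is already well-formed UTF-8, and only convert when it isn't. The helper name to_utf8 and the ISO-8859-1 fallback are hypothetical, picked for illustration; they are not what our sanitizer currently does.

```php
<?php
// Hypothetical helper, not the project's actual sanitizer.
function to_utf8(string $input, string $fallback = 'ISO-8859-1'): string
{
    // If the bytes are already well-formed UTF-8, leave them untouched.
    // This avoids re-encoding strings like "Mường" that a guess would misjudge.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    // Otherwise ASSUME the fallback encoding. This is a guess, not a detection:
    // single-byte encodings cannot be reliably told apart from the bytes alone.
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}
```

The idea is that validating UTF-8 is reliable, while guessing between single-byte encodings is not, so the guess only happens when validation fails. If we'd rather keep mb_detect_encoding, passing an explicit candidate list with strict mode on, e.g. mb_detect_encoding($input, ['UTF-8', 'ISO-8859-1'], true), might also behave better, but I haven't verified that against these strings.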