Dataset
MFAQ: a Multilingual FAQ Dataset
MFAQ is a publicly available multilingual FAQ dataset. It contains around 6M FAQ pairs collected from the web in 21 languages. Although it is significantly larger than existing FAQ retrieval datasets, it comes with its own challenges: duplicated content and an uneven distribution of topics.
Publication year: 2021
Accessibility: open
Publisher: Hugging Face
License: CC0-1.0
Format: json
Keywords: Computer. Automation, Linguistics