Dataset

MFAQ: a Multilingual FAQ Dataset

MFAQ is a publicly available multilingual FAQ dataset. It contains around 6M FAQ pairs collected from the web in 21 languages. Although it is significantly larger than existing FAQ retrieval datasets, it comes with its own challenges: duplicated content and an uneven distribution of topics.
Publication year: 2021
Accessibility: open
Publisher: Hugging Face
License: CC0-1.0
Format: json
Keywords: Computer, Automation, Linguistics