Datasets for Farsi (Persian) Natural Language Processing (NLP)
Farsi (Persian)
- Dependency Parsing
- Irony Detection
- Lexical Database
- Named Entity Recognition
- Natural Language Inference
- Parallel Corpora
- Part-of-speech Tagging
- Pre-trained Embeddings
- Pre-trained LM
- Question Answering
- Raw Text Corpora
- Sentiment Analysis
- Spell Checking
- Text Classification
- Text Summarization
- Word Similarity
- NLP Tools
This website lists datasets and tools for research and development in Farsi Natural Language Processing (NLP).
Contribute
Adding a new dataset or task
If you would like to add a new dataset (or edit an existing one), click the small edit button in the top-right corner of the corresponding .md file for the task in the GitHub repository. This lets you edit the file in Markdown. Add a row to the corresponding table, in the same format as the existing rows. After making your change, check that the table still renders correctly by clicking the “Preview changes” tab at the top of the page. If everything looks good, go to the form at the bottom of the page: add a name for your proposed change and an optional description, select “Create a new branch for this commit and start a pull request”, and click “Propose file change”.
You can also use this Google Form to contribute: https://forms.gle/hojfmiamZoeUU93m6
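For the table-editing route described above, a small helper can reduce formatting mistakes when composing a new row. The sketch below is only an illustration: the function name `markdown_row` and the example cell values are hypothetical, and the actual columns differ per task file, so match the header of the table you are editing.

```python
def markdown_row(cells):
    """Format a list of cell values as a single Markdown table row."""
    # Escape pipe characters so a value cannot split into two cells.
    escaped = [str(cell).strip().replace("|", "\\|") for cell in cells]
    return "| " + " | ".join(escaped) + " |"


# Hypothetical entry; replace with the columns used by the task's table.
row = markdown_row([
    "ExampleCorpus",
    "Placeholder description of the dataset",
    "[Link](https://example.com)",
])
print(row)
```

Paste the printed line below the last row of the table, then use the “Preview changes” tab to confirm that it renders as expected.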
About
A project initiated by Dadmatech, with the help of:
Mohammad Taher Pilehvar [Director]
Mohsen Fayyaz
Alireza Moradi
Zhivar Sourati
Sepehr Babapour
Parisa Yalsavar
Ehsan Aghazadeh