A benchmark for evaluating sentence/document embeddings of Scandinavian language models.
Important
The Scandinavian Embedding Benchmark has moved to MTEB. You can find the Scandinavian Leaderboard under the MTEB Leaderboard. To run the benchmark, add results, etc., please refer to the MTEB documentation. The reasons for the change are to 1) encourage others to evaluate on Scandinavian tasks, 2) avoid duplication of effort, and 3) make it easier for users to compare models across languages. My hope is that this will lead to better models for Scandinavian languages.
Missing a model or information? We would love to add it to MTEB. Please file an issue on MTEB and we will help get it added.
You can install the Scandinavian Embedding Benchmark (seb) via pip from PyPI:

    pip install seb

To see more examples, see the documentation.
| Documentation | |
|---|---|
| 🔧 Installation | Instructions on how to install this package |
| 👩‍💻 Usage | Introduction on how to use the package |
| 📖 Documentation | Minimal documentation, still under development |
| Type | |
|---|---|
| 🚨 Bug Reports | GitHub Issue Tracker |
| 🎁 Feature Requests & Ideas | GitHub Issue Tracker |
| 👩‍💻 Usage Questions | GitHub Discussions |
| 🗯 General Discussion | GitHub Discussions |
To cite this work, please refer to the following paper, accepted at NeurIPS:
Enevoldsen, K., Kardos, M., Muennighoff, N., & Nielbo, K. (2024). The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding. In Advances in Neural Information Processing Systems.
or use the following BibTeX:
@inproceedings{enevoldsen2024scandinavian,
title={The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding},
author={Enevoldsen, Kenneth and Kardos, M{\'a}rton and Muennighoff, Niklas and Nielbo, Kristoffer},
booktitle={Advances in Neural Information Processing Systems},
year={2024},
url={https://nips.cc/virtual/2024/poster/97869}
}
