Scandinavian Embedding Benchmark

A benchmark for evaluating sentence/document embeddings of Scandinavian language models.

Important

The Scandinavian Embedding Benchmark has moved to MTEB. You can find the Scandinavian leaderboard under the MTEB Leaderboard. To run the benchmark, add results, etc., please refer to the MTEB documentation. The reasons for the change are to 1) encourage others to evaluate on Scandinavian tasks, 2) avoid duplication of effort, and 3) make it easier for users to compare models across languages. My hope is that this will lead to better models for Scandinavian languages.

Missing a model or information? That is great; we would love to add it to MTEB. Please file an issue on MTEB and we will help get it added.

Installation

You can install the Scandinavian Embedding Benchmark (seb) via pip from PyPI:

pip install seb
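
To confirm that the install succeeded, a minimal sketch like the following can help. It uses only Python's standard library (`importlib.metadata`) and assumes the distribution name is `seb`, as in the pip command above. The helper name `installed_version` is hypothetical and not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "seb") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of seb itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("seb is not installed; run: pip install seb")
    else:
        print(f"seb {found} is installed")
```

This only reads package metadata; it does not import `seb` or run any part of the benchmark.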

For usage examples, see the documentation.

📖 Documentation

Documentation | Description
🔧 Installation | Instructions on how to install this package
👩‍💻 Usage | Introduction on how to use the package
📖 Documentation | A minimal and developing documentation

💬 Where to ask questions

Type | Platform
🚨 Bug Reports | GitHub Issue Tracker
🎁 Feature Requests & Ideas | GitHub Issue Tracker
👩‍💻 Usage Questions | GitHub Discussions
🗯 General Discussion | GitHub Discussions

Citation

To cite this work, please refer to the following paper, accepted at NeurIPS:

Enevoldsen, K., Kardos, M., Muennighoff, N., & Nielbo, K. (2024). The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding. In Advances in Neural Information Processing Systems.

or use the following BibTeX:

@inproceedings{enevoldsen2024scandinavian,
  title={The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding},
  author={Enevoldsen, Kenneth and Kardos, M{\'a}rton and Muennighoff, Niklas and Nielbo, Kristoffer},
  booktitle={Advances in Neural Information Processing Systems},
  year={2024},
  url={https://nips.cc/virtual/2024/poster/97869}
}