This is a sentence-transformers model fine-tuned from ibm-granite/granite-embedding-97m-multilingual-r2 on 12 datasets. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'ModernBertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'cls', 'include_prompt': False})
  (2): Normalize({})
)
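The last two modules are simple to state. With `'pooling_mode': 'cls'`, the Pooling step keeps the first token's vector as the sentence embedding, and Normalize rescales that vector to unit L2 length. The following is a minimal pure-Python sketch of those two steps, not the library's implementation. It uses made-up token vectors of dimension 4 instead of the real 384, and the helper names `cls_pool` and `l2_normalize` are illustrative.

```python
import math

def cls_pool(token_embeddings):
    """Return a copy of the first token's vector (the CLS position)."""
    return list(token_embeddings[0])

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]

# Toy stand-in for the Transformer output: 3 tokens, 4 dimensions each.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # first (CLS) token
    [1.0, 2.0, 3.0, 4.0],
    [0.5, 0.5, 0.5, 0.5],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity between two such embeddings is simply their dot product.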
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("bobox/synt-dataset-multi-task")
# Run inference
sentences = [
    'attachment styles, neurobiological mechanisms, risk-taking behaviors, adolescents, prefrontal cortex, amygdala, oxytocin, dopamine pathways, cortisol regulation, longitudinal correlations, limbic system, executive function',
    'Empirical investigations demonstrate that teenagers with secure caregiver bonds generally display controlled engagement in perilous activities, attributable to mature prefrontal inhibitory control. Conversely, anxious-ambivalent attachment correlates with amygdalar hyperactivation precipitating rash actions, while avoidant attachment links to diminished oxytocin reception fostering sensation-seeking. Longitudinal neuroimaging confirms insecure attachments remodel mesolimbic dopamine circuits throughout adolescence, elevating vulnerability to substance use and hazardous conduct. Additionally, glucocorticoid imbalance from persistent stress reactions in insecure dyads compromises risk evaluation capacities. These findings illustrate how early caregiving dynamics shape the maturation of emotional processing and cognitive control systems.',
    'Longitudinal studies correlate anxious-ambivalent attachment with increased adolescent anxiety disorders, manifesting as social withdrawal and academic underachievement due to altered HPA axis functioning.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9043, 0.8639],
#         [0.9043, 1.0000, 0.8681],
#         [0.8639, 0.8681, 1.0000]])
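Since the model is intended for retrieval, a typical next step is to rank candidate passages against a query by similarity. The sketch below illustrates that ranking with plain Python on toy unit-length vectors. In practice the vectors would come from `model.encode(...)`. The helper `rank_by_similarity` is a hypothetical name, not part of the library.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, dot(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy unit-length vectors standing in for real 384-dimensional embeddings.
query_vec = [1.0, 0.0, 0.0]
doc_vecs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.8, 0.6, 0.0],  # closest to the query
    [0.6, 0.0, 0.8],  # somewhat close
]

ranking = rank_by_similarity(query_vec, doc_vecs)
for doc_index, score in ranking:
    print(doc_index, round(score, 4))
```

For large collections, an approximate nearest-neighbour index would usually replace this exhaustive scan.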
Semantic similarity on the sts-b dataset, evaluated with EmbeddingSimilarityEvaluator:

| Metric | Value |
|---|---|
| pearson_cosine | 0.8442 |
| spearman_cosine | 0.8553 |
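Both metrics are correlations between the cosine similarity the model assigns to each sentence pair and the human-annotated similarity score for that pair: Pearson measures linear agreement, Spearman measures agreement in rank order. The following pure-Python sketch shows how such values can be computed. The four predicted and gold scores are hypothetical and are not taken from the evaluation above.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: model cosine similarities and gold similarity labels.
predicted_cosine = [0.91, 0.35, 0.60, 0.78]
gold_scores = [4.8, 1.0, 3.9, 3.5]

pearson_cosine = pearson(predicted_cosine, gold_scores)
spearman_cosine = spearman(predicted_cosine, gold_scores)
print(round(pearson_cosine, 4), round(spearman_cosine, 4))
```

Spearman depends only on the ordering of the scores, so it is unaffected by any monotonic rescaling of the cosine values.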
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}