SentenceTransformer based on ibm-granite/granite-embedding-97m-multilingual-r2

This is a sentence-transformers model finetuned from ibm-granite/granite-embedding-97m-multilingual-r2 on 12 datasets. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for retrieval.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: ibm-granite/granite-embedding-97m-multilingual-r2
Maximum Sequence Length: 32768 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity
Supported Modality: Text
Training Datasets:
- standard_mnrl
- multi_lingual
- STS
- translation
- cross_lingual
- entailment_logic
- information_extraction
- summaryzation
- keyword_semantic_search
- anchor_type_and_intent_symm
- anchor_type_and_intent_asymm
- topic_clustering

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}}, 'module_output_name': 'token_embeddings', 'architecture': 'ModernBertModel'})
  (1): Pooling({'embedding_dimension': 384, 'pooling_mode': 'cls', 'include_prompt': False})
  (2): Normalize({})
)
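
The last two modules above can be read as "take the first token's vector, then scale it to unit length". The following is a minimal NumPy sketch of that idea, not the library's implementation: the function name `cls_pool_and_normalize` and the random stand-in for the Transformer output are assumptions made for illustration only.

```python
import numpy as np

def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token vectors to (batch, hidden) unit vectors."""
    # CLS pooling: keep only the vector of the first token in each sequence.
    cls_vectors = token_embeddings[:, 0, :]
    # L2 normalization: divide each vector by its Euclidean norm.
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)

# Hypothetical stand-in for the Transformer module's token embeddings:
# 2 sequences, 5 tokens each, hidden size 384.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(2, 5, 384)).astype(np.float32)

sentence_embeddings = cls_pool_and_normalize(token_embeddings)
print(sentence_embeddings.shape)  # (2, 384)
```

Because the outputs have unit length, the cosine similarity between two embeddings equals their dot product.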

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/synt-dataset-multi-task")
# Run inference
sentences = [
    'attachment styles, neurobiological mechanisms, risk-taking behaviors, adolescents, prefrontal cortex, amygdala, oxytocin, dopamine pathways, cortisol regulation, longitudinal correlations, limbic system, executive function',
    'Empirical investigations demonstrate that teenagers with secure caregiver bonds generally display controlled engagement in perilous activities, attributable to mature prefrontal inhibitory control. Conversely, anxious-ambivalent attachment correlates with amygdalar hyperactivation precipitating rash actions, while avoidant attachment links to diminished oxytocin reception fostering sensation-seeking. Longitudinal neuroimaging confirms insecure attachments remodel mesolimbic dopamine circuits throughout adolescence, elevating vulnerability to substance use and hazardous conduct. Additionally, glucocorticoid imbalance from persistent stress reactions in insecure dyads compromises risk evaluation capacities. These findings illustrate how early caregiving dynamics shape the maturation of emotional processing and cognitive control systems.',
    'Longitudinal studies correlate anxious-ambivalent attachment with increased adolescent anxiety disorders, manifesting as social withdrawal and academic underachievement due to altered HPA axis functioning.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.9043, 0.8639],
#         [0.9043, 1.0000, 0.8681],
#         [0.8639, 0.8681, 1.0000]])
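
Since the model is intended for retrieval, one way to use the embeddings is to rank candidate documents by cosine similarity to a query. The sketch below assumes the embeddings are already available as arrays and uses small toy vectors so that it runs without downloading the model; `rank_documents` is a hypothetical helper, not part of Sentence Transformers.

```python
import numpy as np

def rank_documents(query_embedding, document_embeddings, top_k=3):
    """Return (document index, cosine score) pairs, best match first."""
    q = np.asarray(query_embedding, dtype=np.float32)
    d = np.asarray(document_embeddings, dtype=np.float32)
    # Normalize defensively so the dot product is a cosine similarity
    # even if the inputs are not unit length.
    q = q / np.linalg.norm(q)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]

# Toy 3-dimensional vectors standing in for real 384-dimensional embeddings.
query = [1.0, 0.0, 0.0]
documents = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]

ranking = rank_documents(query, documents, top_k=2)
print(ranking)  # document 1 ranks first, document 2 second
```

With the real model, the same helper could be called as `rank_documents(model.encode(query_text), model.encode(document_texts))`, where `query_text` is a single string and `document_texts` is a list of strings.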

Evaluation

Metrics

Semantic Similarity

Dataset: sts-b
Evaluated with EmbeddingSimilarityEvaluator

Metric	Value
pearson_cosine	0.8442
spearman_cosine	0.8553
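
The two metrics above are the Pearson and Spearman correlations between the cosine similarity of each sentence pair and its gold similarity score. The following NumPy sketch shows how such values can be computed on toy data; it is an illustration under that assumption, not the evaluator's own code, and the helper names and toy vectors are hypothetical.

```python
import numpy as np

def cosine_per_pair(a, b):
    """Cosine similarity between row i of a and row i of b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    dots = np.sum(a * b, axis=1)
    return dots / (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))

def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    values = np.asarray(values, dtype=np.float64)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=np.float64)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def pearson(x, y):
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))

# Toy embeddings for four sentence pairs, plus made-up gold scores.
emb_a = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
emb_b = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [-1.0, 0.0]])
gold = np.array([5.0, 3.5, 2.0, 0.0])

cosine_scores = cosine_per_pair(emb_a, emb_b)
pearson_cosine = pearson(cosine_scores, gold)
spearman_cosine = spearman(cosine_scores, gold)
print(pearson_cosine, spearman_cosine)
```

In this toy case the cosine scores decrease in the same order as the gold scores, so the Spearman correlation is 1 while the Pearson correlation is slightly lower, since the relationship is monotonic but not exactly linear.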

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{oord2019representationlearningcontrastivepredictive,
      title={Representation Learning with Contrastive Predictive Coding},
      author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
      year={2019},
      eprint={1807.03748},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/1807.03748},
}

Model size: 97.4M params (Safetensors)
Tensor type: BF16
