FinBERT-Multilingual

A multilingual extension of the FinBERT paradigm: domain-adapted transformer for financial sentiment classification across six languages (EN, ZH, JA, DE, FR, ES).

While the original FinBERT demonstrated the effectiveness of domain-specific pre-training for English financial NLP, this model extends that approach to a multilingual setting using XLM-RoBERTa-base as the backbone, enabling cross-lingual financial sentiment analysis without language-specific models.

Model Architecture

  • Base model: xlm-roberta-base (278M parameters)
  • Task: 3-class sequence classification (Negative / Neutral / Positive)
  • Domain adaptation: Task-Adaptive Pre-Training (TAPT) via Masked Language Modeling on 35K+ financial texts
  • Languages: English, Chinese, Japanese, German, French, Spanish

Training Pipeline

Stage 1: Task-Adaptive Pre-Training (TAPT)

Following Gururangan et al. (2020), we perform continued MLM pre-training on the unlabeled financial corpus to adapt the model's representations to the financial domain. This stage exposes the model to domain-specific vocabulary and discourse patterns across all six target languages using approximately 35,000 financial text samples.
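The MLM corruption used during TAPT can be sketched without the full Trainer stack. Below is a minimal sketch of the standard 80/10/10 masking scheme (the same behavior `transformers`' `DataCollatorForLanguageModeling` applies); the token IDs and probabilities passed in are illustrative, not taken from this model's actual configuration:

```python
import torch

def mlm_mask(input_ids, mask_token_id, vocab_size, special_ids,
             mlm_prob=0.15, seed=0):
    """Standard MLM corruption: select `mlm_prob` of non-special tokens;
    of those, 80% -> mask token, 10% -> random token, 10% -> unchanged.
    Labels are -100 (ignored by the loss) everywhere else."""
    g = torch.Generator().manual_seed(seed)
    input_ids = input_ids.clone()
    labels = input_ids.clone()

    prob = torch.full(labels.shape, mlm_prob)
    special = torch.zeros_like(labels, dtype=torch.bool)
    for sid in special_ids:
        special |= labels == sid
    prob.masked_fill_(special, 0.0)          # never mask special tokens

    selected = torch.bernoulli(prob, generator=g).bool()
    labels[~selected] = -100                 # loss only on selected positions

    # 80% of selected positions -> mask token
    masked = torch.bernoulli(torch.full(labels.shape, 0.8), generator=g).bool() & selected
    input_ids[masked] = mask_token_id

    # half of the remainder (10% overall) -> random token; rest unchanged
    random_tok = (torch.bernoulli(torch.full(labels.shape, 0.5), generator=g).bool()
                  & selected & ~masked)
    input_ids[random_tok] = torch.randint(vocab_size, labels.shape, generator=g)[random_tok]
    return input_ids, labels
```

The corrupted `input_ids` feed the encoder while `labels` supply the MLM targets, so the continued pre-training loss is computed only at the corrupted positions.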

Stage 2: Supervised Fine-Tuning

The domain-adapted model is then fine-tuned on the labeled sentiment classification task.

Hyperparameters:

| Parameter            | Value                   |
|----------------------|-------------------------|
| Learning rate        | 2e-5                    |
| LR scheduler         | Cosine annealing        |
| Label smoothing      | 0.1                     |
| Checkpoint selection | SWA (top-3 checkpoints) |
| Base model           | xlm-roberta-base        |
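A minimal sketch of the fine-tuning objective under these hyperparameters, using a stand-in linear head in place of xlm-roberta-base so it runs standalone; the batch contents and step count are illustrative:

```python
import torch

# Stand-in for the classification head; real training updates the full
# xlm-roberta-base encoder (hidden size 768) plus this head.
model = torch.nn.Linear(768, 3)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

num_steps = 1000  # illustrative schedule length
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)

features = torch.randn(8, 768)       # stand-in pooled sentence embeddings
targets = torch.randint(0, 3, (8,))  # negative / neutral / positive
for _ in range(3):                   # a few illustrative steps
    optimizer.zero_grad()
    loss = loss_fn(model(features), targets)
    loss.backward()
    optimizer.step()
    scheduler.step()                 # cosine-annealed learning rate
```

The same three settings map directly onto `transformers.TrainingArguments` (`learning_rate`, `lr_scheduler_type="cosine"`, `label_smoothing_factor`) when training via the Trainer API.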

Stochastic Weight Averaging (SWA): Rather than selecting a single best checkpoint, we average the weights of the top-3 performing checkpoints. This produces a flatter loss minimum and more robust generalization, particularly beneficial for multilingual settings where overfitting to dominant languages is a risk.
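The averaging step itself is an element-wise mean over parameter tensors; a minimal sketch, assuming the top-3 checkpoints' state dicts are already loaded:

```python
import torch

def average_checkpoints(state_dicts):
    """SWA-style parameter averaging: element-wise mean of the given
    checkpoints' weights (top-3 in this model's pipeline)."""
    avg = {}
    for key in state_dicts[0]:
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        avg[key] = stacked.mean(dim=0)
    return avg
```

The averaged dict is then loaded back with `model.load_state_dict(avg)` and saved as the final checkpoint.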

Label smoothing (0.1): Prevents overconfident predictions and improves calibration, which is important for financial applications where prediction confidence informs downstream decisions.
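Concretely, with smoothing 0.1 and three classes, the training target becomes a softened distribution rather than a one-hot vector. The sketch below builds that smoothed target and checks it against PyTorch's built-in `label_smoothing` option (the logits are illustrative):

```python
import torch
import torch.nn.functional as F

eps, num_classes, true_class = 0.1, 3, 2

# Smoothed target: (1 - eps) on the true class plus eps spread uniformly.
smoothed = torch.full((num_classes,), eps / num_classes)
smoothed[true_class] += 1.0 - eps    # -> [0.0333, 0.0333, 0.9333]

# Equivalent to the built-in option on F.cross_entropy:
logits = torch.tensor([[0.2, -0.5, 1.3]])
built_in = F.cross_entropy(logits, torch.tensor([true_class]), label_smoothing=eps)
manual = -(smoothed * logits.log_softmax(dim=-1)).sum()
```

Because the target never reaches 1.0 on any class, the model is penalized for driving probabilities to extremes, which is what improves calibration.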

Evaluation Results

Overall Metrics

| Metric               | Score  |
|----------------------|--------|
| Accuracy             | 0.8103 |
| F1 (weighted)        | 0.8102 |
| Precision (weighted) | 0.8111 |
| Recall (weighted)    | 0.8103 |

Per-Class Performance

| Class    | Precision | Recall | F1-Score |
|----------|-----------|--------|----------|
| Negative | 0.78      | 0.83   | 0.81     |
| Neutral  | 0.83      | 0.79   | 0.81     |
| Positive | 0.80      | 0.82   | 0.81     |

The balanced per-class performance (all F1 scores at 0.81) indicates that the model does not exhibit significant class bias, despite the imbalanced training distribution (Neutral: 45.5%, Positive: 30.8%, Negative: 23.7%).
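For reference, per-class and support-weighted metrics like those above can be computed directly from predictions; a minimal pure-Python sketch (label names and inputs are illustrative):

```python
from collections import Counter

def per_class_prf(y_true, y_pred, labels):
    """Per-class precision/recall/F1 plus the support-weighted F1."""
    out = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        out[c] = (prec, rec, f1)
    support = Counter(y_true)
    n = len(y_true)
    weighted_f1 = sum(out[c][2] * support[c] / n for c in labels)
    return out, weighted_f1
```

Weighting by class support is why the weighted F1 tracks accuracy closely even under the imbalanced label distribution described above.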

Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="Kenpache/finbert-multilingual")

# English
classifier("The company reported record quarterly earnings, driven by strong demand.")
# [{'label': 'positive', 'score': 0.95}]

# German: "The stock lost significant value after the profit warning."
classifier("Die Aktie verlor nach der Gewinnwarnung deutlich an Wert.")
# [{'label': 'negative', 'score': 0.92}]

# Japanese: "The company's revenue was flat year over year."
classifier("同社の売上高は前年同期比で横ばいとなった。")
# [{'label': 'neutral', 'score': 0.88}]

# Chinese: "The company announced large-scale layoffs, and its share price fell in response."
classifier("该公司宣布大规模裁员计划,股价应声下跌。")
# [{'label': 'negative', 'score': 0.91}]
```

Direct Model Loading

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("Kenpache/finbert-multilingual")
model = AutoModelForSequenceClassification.from_pretrained("Kenpache/finbert-multilingual")

# French: "The group's profits rose by 15% in the first quarter."
text = "Les bénéfices du groupe ont augmenté de 15% au premier trimestre."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = torch.argmax(probs, dim=-1).item()

labels = {0: "negative", 1: "neutral", 2: "positive"}
print(f"Prediction: {labels[pred]} ({probs[0][pred]:.4f})")
```

Training Data

The model was trained on Kenpache/multilingual-financial-sentiment, a curated dataset of ~39K financial news sentences from 80+ sources across six languages.

| Language | Samples | Sources                                          |
|----------|---------|--------------------------------------------------|
| Japanese | 8,287   | Nikkei, Nikkan Kogyo, Reuters JP, Minkabu, etc.  |
| Chinese  | 7,930   | Sina Finance, EastMoney, 10jqka, etc.            |
| Spanish  | 7,125   | Expansión, Cinco Días, Bloomberg Línea, etc.     |
| English  | 6,887   | CNBC, Yahoo Finance, Fortune, Benzinga, etc.     |
| German   | 5,023   | Börse.de, FAZ, NTV Börse, Handelsblatt, etc.     |
| French   | 3,935   | Boursorama, Tradingsat, BFM Business, etc.       |

Comparison with FinBERT

| Feature              | FinBERT                       | FinBERT-Multilingual                 |
|----------------------|-------------------------------|--------------------------------------|
| Base model           | BERT-base                     | XLM-RoBERTa-base                     |
| Languages            | English only                  | 6 languages                          |
| Domain adaptation    | Financial corpus pre-training | TAPT on multilingual financial texts |
| Classes              | 3 (Pos/Neg/Neu)               | 3 (Pos/Neg/Neu)                      |
| Checkpoint selection | Single best                   | SWA (top-3)                          |

Citation

If you use this model in your research, please cite:

```bibtex
@misc{finbert-multilingual-2025,
  title={FinBERT-Multilingual: Cross-Lingual Financial Sentiment Analysis with Domain-Adapted XLM-RoBERTa},
  author={Kenpache},
  year={2025},
  url={https://huggingface.co/Kenpache/finbert-multilingual}
}
```

License

Apache 2.0
