# FinBERT-Multilingual

A multilingual extension of the FinBERT paradigm: a domain-adapted transformer for financial sentiment classification across six languages (EN, ZH, JA, DE, FR, ES).
While the original FinBERT demonstrated the effectiveness of domain-specific pre-training for English financial NLP, this model extends that approach to a multilingual setting using XLM-RoBERTa-base as the backbone, enabling cross-lingual financial sentiment analysis without language-specific models.
## Model Architecture

- Base model: `xlm-roberta-base` (278M parameters)
- Task: 3-class sequence classification (Negative / Neutral / Positive)
- Domain adaptation: Task-Adaptive Pre-Training (TAPT) via Masked Language Modeling on 35K+ financial texts
- Languages: English, Chinese, Japanese, German, French, Spanish
## Training Pipeline

### Stage 1: Task-Adaptive Pre-Training (TAPT)
Following Gururangan et al. (2020), we perform continued MLM pre-training on the unlabeled financial corpus to adapt the model's representations to the financial domain. This stage exposes the model to domain-specific vocabulary and discourse patterns across all six target languages using approximately 35,000 financial text samples.
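Conceptually, Stage 1 is just continued MLM training on in-domain text. Assuming the standard BERT/RoBERTa masking recipe (the card does not spell it out): 15% of tokens are selected for prediction; of those, 80% are replaced with the mask token, 10% with a random token, and 10% left unchanged. A minimal PyTorch sketch of that corruption step:

```python
import torch

def mlm_mask(input_ids, mask_token_id, vocab_size, mlm_prob=0.15):
    """BERT-style MLM corruption (assumed standard recipe): select mlm_prob
    of tokens; of those, 80% -> [MASK], 10% -> random token, 10% unchanged."""
    labels = input_ids.clone()
    selected = torch.bernoulli(torch.full(labels.shape, mlm_prob)).bool()
    labels[~selected] = -100            # loss is computed only on selected positions

    corrupted = input_ids.clone()
    masked = torch.bernoulli(torch.full(labels.shape, 0.8)).bool() & selected
    corrupted[masked] = mask_token_id   # 80% of selected -> mask token
    random_pos = torch.bernoulli(torch.full(labels.shape, 0.5)).bool() & selected & ~masked
    corrupted[random_pos] = torch.randint(vocab_size, labels.shape)[random_pos]
    return corrupted, labels
```

In a real TAPT run this logic is handled by `transformers.DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)` together with `AutoModelForMaskedLM` and the standard `Trainer` loop.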
### Stage 2: Supervised Fine-Tuning
The domain-adapted model is then fine-tuned on the labeled sentiment classification task.
Hyperparameters:
| Parameter | Value |
|---|---|
| Learning rate | 2e-5 |
| LR scheduler | Cosine annealing |
| Label smoothing | 0.1 |
| Checkpoint selection | SWA (top-3 checkpoints) |
| Base model | xlm-roberta-base |
Stochastic Weight Averaging (SWA): Rather than selecting a single best checkpoint, we average the weights of the top-3 performing checkpoints. This produces a flatter loss minimum and more robust generalization, particularly beneficial for multilingual settings where overfitting to dominant languages is a risk.
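Averaging checkpoints means taking a uniform mean of their parameter tensors and loading the result back into a single model. A minimal sketch (the checkpoint-selection logic itself, i.e. how "top-3" is ranked, is not specified on this card):

```python
import torch
from torch import nn

def average_checkpoints(state_dicts):
    """Uniformly average parameter tensors across checkpoints (SWA-style).

    Assumption: all state dicts share one architecture; note that integer
    buffers (e.g. step counters) would be averaged too and may need casting.
    """
    avg = {}
    for key in state_dicts[0]:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# Illustration with three tiny models standing in for the top-3 checkpoints.
torch.manual_seed(0)
models = [nn.Linear(4, 3) for _ in range(3)]
merged = nn.Linear(4, 3)
merged.load_state_dict(average_checkpoints([m.state_dict() for m in models]))
```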
Label smoothing (0.1): Prevents overconfident predictions and improves calibration, which is important for financial applications where prediction confidence informs downstream decisions.
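The settings in the table map directly onto Hugging Face `TrainingArguments`. A hedged sketch (batch size and epoch count below are illustrative assumptions; the card does not state them):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="finbert-multilingual-ft",  # assumed path
    learning_rate=2e-5,                    # from the table
    lr_scheduler_type="cosine",            # cosine annealing
    label_smoothing_factor=0.1,            # from the table
    per_device_train_batch_size=32,        # assumption, not from the card
    num_train_epochs=3,                    # assumption, not from the card
)
```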
## Evaluation Results

### Overall Metrics
| Metric | Score |
|---|---|
| Accuracy | 0.8103 |
| F1 (weighted) | 0.8102 |
| Precision (weighted) | 0.8111 |
| Recall (weighted) | 0.8103 |
### Per-Class Performance
| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Negative | 0.78 | 0.83 | 0.81 |
| Neutral | 0.83 | 0.79 | 0.81 |
| Positive | 0.80 | 0.82 | 0.81 |
The balanced per-class performance (all F1 scores at 0.81) indicates that the model does not exhibit significant class bias, despite the imbalanced training distribution (Neutral: 45.5%, Positive: 30.8%, Negative: 23.7%).
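The weighted metrics above average per-class scores weighted by class support, which is why they stay meaningful under the imbalanced label distribution. A toy sketch with scikit-learn (the labels and data here are illustrative, not from the evaluation set):

```python
from sklearn.metrics import classification_report, f1_score

# Toy data; 0=negative, 1=neutral, 2=positive, matching the card's mapping.
y_true = [0, 1, 2, 1, 1, 0, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 0, 0, 2, 1, 1, 0]

# Per-class precision/recall/F1 plus support-weighted averages.
print(classification_report(y_true, y_pred,
                            target_names=["negative", "neutral", "positive"]))
print("weighted F1:", f1_score(y_true, y_pred, average="weighted"))
```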
## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="Kenpache/finbert-multilingual")

# English
classifier("The company reported record quarterly earnings, driven by strong demand.")
# [{'label': 'positive', 'score': 0.95}]

# German: "The stock lost significant value after the profit warning."
classifier("Die Aktie verlor nach der Gewinnwarnung deutlich an Wert.")
# [{'label': 'negative', 'score': 0.92}]

# Japanese: "The company's sales were flat year over year."
classifier("同社の売上高は前年同期比で横ばいとなった。")
# [{'label': 'neutral', 'score': 0.88}]

# Chinese: "The company announced large-scale layoffs, and its share price fell in response."
classifier("该公司宣布大规模裁员计划,股价应声下跌。")
# [{'label': 'negative', 'score': 0.91}]
```
### Direct Model Loading

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("Kenpache/finbert-multilingual")
model = AutoModelForSequenceClassification.from_pretrained("Kenpache/finbert-multilingual")

# French: "The group's profits rose 15% in the first quarter."
text = "Les bénéfices du groupe ont augmenté de 15% au premier trimestre."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred = torch.argmax(probs, dim=-1).item()
labels = {0: "negative", 1: "neutral", 2: "positive"}
print(f"Prediction: {labels[pred]} ({probs[0][pred]:.4f})")
```
## Training Data

The model was trained on `Kenpache/multilingual-financial-sentiment`, a curated dataset of ~39K financial news sentences from 80+ sources across six languages.
| Language | Samples | Sources |
|---|---|---|
| Japanese | 8,287 | Nikkei, Nikkan Kogyo, Reuters JP, Minkabu, etc. |
| Chinese | 7,930 | Sina Finance, EastMoney, 10jqka, etc. |
| Spanish | 7,125 | Expansión, Cinco Días, Bloomberg Línea, etc. |
| English | 6,887 | CNBC, Yahoo Finance, Fortune, Benzinga, etc. |
| German | 5,023 | Börse.de, FAZ, NTV Börse, Handelsblatt, etc. |
| French | 3,935 | Boursorama, Tradingsat, BFM Business, etc. |
## Comparison with FinBERT
| Feature | FinBERT | FinBERT-Multilingual |
|---|---|---|
| Base model | BERT-base | XLM-RoBERTa-base |
| Languages | English only | 6 languages |
| Domain adaptation | Financial corpus pre-training | TAPT on multilingual financial texts |
| Classes | 3 (Pos/Neg/Neu) | 3 (Pos/Neg/Neu) |
| Checkpoint selection | Single best | SWA (top-3) |
## Citation

If you use this model in your research, please cite:

```bibtex
@misc{finbert-multilingual-2025,
  title={FinBERT-Multilingual: Cross-Lingual Financial Sentiment Analysis with Domain-Adapted XLM-RoBERTa},
  author={Kenpache},
  year={2025},
  url={https://huggingface.co/Kenpache/finbert-multilingual}
}
```
## License

Apache 2.0