Evaluation tool to assess the cultural relevance of images for user-defined culture labels
Papers
Benchmark Test-Time Scaling of General LLM Agents
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
Paper • 2403.08715 • Published • 21
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Paper • 2310.11667 • Published • 4
cmu-lti/sotopia
Updated • 351 • 6
cmu-lti/sotopia-pi
Viewer • Updated • 33.4k • 294 • 8
datasets 13
cmu-lti/tau-usi
Updated • 11
cmu-lti/machine-translation-for-vision
Viewer • Updated • 696 • 1.1k • 1
cmu-lti/stateful
Viewer • Updated • 500 • 62
cmu-lti/caire-specific
Viewer • Updated • 68 • 8
cmu-lti/interactive-swe
Viewer • Updated • 500 • 35
cmu-lti/caire-universal
Viewer • Updated • 400 • 8
cmu-lti/caire-index-ckpts
Updated • 7
cmu-lti/AI-LieDar
Updated • 38
cmu-lti/agents_vs_script
Viewer • Updated • 20.3k • 35 • 3
cmu-lti/sotopia
Updated • 351 • 6