hal

community
https://hal.cs.princeton.edu/

Recent Activity

kanghengliu  updated a collection 10 days ago
CORE-bench v1.1
kanghengliu  updated a dataset 10 days ago
agent-evals/core-bench-v1.1-ood

Members: Benedikt Stroebl, Sayash Kapoor, Arvind Narayanan, Zachary Siegel, Boyi Wei, Peter Kirgis, wave, Ziru Chen, Yifei Zhou, xuetianci, Ammar, NDZOMGA Franck Stéphane, Harsh Trivedi, Kangheng Liu

agent-evals's collections (1)

CORE-bench v1.1
Benchmark for AI agents on scientific reproducibility — mainline (39) and OOD (19) splits derived from Code Ocean capsules.
  • agent-evals/core-bench-v1.1-mainline

    Viewer • Updated 10 days ago • 39 • 62
  • agent-evals/core-bench-v1.1-ood

    Viewer • Updated 10 days ago • 19 • 45