Pinned
- open-compass/opencompass (Public): OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
- PremiLab-Math/MathCheck (Public): [ICLR 2025] Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Python 31
- NLP2CT/kNN-TL (Public): [ACL 2023] kNN-TL: k-Nearest-Neighbor Transfer Learning for Low-Resource Neural Machine Translation
- NLP2CT/UaIT (Public): [EMNLP 2024] Can LLMs Learn Uncertainty on Their Own? Expressing Uncertainty Effectively in A Self-Training Manner
Python
- DevoAllen/Awesome-Reasoning-Economy-Papers (Public): Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models