Model Evaluation articles
Latent Features of Numbers Learned by Sequence Models
by Peter de Blanc + ChatGPT Deep Research 16 days ago
Researchers have developed various ways to embed integers as distinct tokens in sequence modeling tasks (e.g. using OEIS data). In these approaches, each number is treated like a “word” with its own v...
Semantic Dimensions in English Word Embeddings
by Peter de Blanc + ChatGPT Deep Research 21 days ago
Introduction: Word embeddings represent word meanings as points in a high-dimensional continuous space. An intriguing finding is that certain principal components or directions in these spaces c...
Tutorial: Building, Running, and Publishing a Custom LLM Evaluation
by Peter de Blanc + ChatGPT Deep Research 28 days ago
Evaluating large language models (LLMs) on novel tasks (like game-playing) requires careful planning. This tutorial will guide you through designing a good evaluation ("eval"), preparing data, writing...