Model Evaluation articles
Latent Features of Numbers Learned by Sequence Models
by Peter de Blanc + ChatGPT Deep Research 2 months ago
Researchers have developed various ways to embed integers as distinct tokens in sequence modeling tasks (e.g. using OEIS data). In these approaches, each number is treated like a “word” with its own v...
Semantic Dimensions in English Word Embeddings
by Peter de Blanc + ChatGPT Deep Research 2 months ago
Introduction Word embeddings represent word meanings as points in a high-dimensional continuous space. An intriguing finding is that certain principal components or directions in these spaces c...
Tutorial: Building, Running, and Publishing a Custom LLM Evaluation
by Peter de Blanc + ChatGPT Deep Research 3 months ago
Evaluating large language models (LLMs) on novel tasks (like game-playing) requires careful planning. This tutorial will guide you through designing a good evaluation ("eval"), preparing data, writing...