Each dot on this map is a research paper. The colored ones are publications from IAIFI (the NSF Institute for Artificial Intelligence and Fundamental Interactions); the faint grey dots are nearby papers from the broader arXiv that give the IAIFI work context.
Paper positions come from Specter2, a language model trained on scientific text. Specter2 reads each paper's title and abstract and produces a 768-dimensional vector that captures what the paper is about, so papers with similar content end up with similar vectors. We project those vectors down to 2D with DensMAP (a density-preserving variant of UMAP) so they can be drawn on screen. Papers that sit close together on the map generally discuss related topics. Local neighborhoods are trustworthy, but long-range distances are only approximate: papers far apart are probably about different things, though the exact distance between distant groups carries little meaning.
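To make "similar vectors" concrete, here is a minimal sketch of cosine similarity, a common way to compare embedding vectors. The three-dimensional vectors are made-up stand-ins (real Specter2 vectors have 768 dimensions), and the choice of cosine as the metric is an assumption; the pipeline's actual metric is not stated here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, for illustration only.
lattice_paper_a = [0.9, 0.1, 0.0]
lattice_paper_b = [0.8, 0.2, 0.1]
vision_paper = [0.0, 0.2, 0.9]

# Two papers on the same topic should score higher than an unrelated pair.
print(cosine_similarity(lattice_paper_a, lattice_paper_b))
print(cosine_similarity(lattice_paper_a, vision_paper))
```

For the projection step, the umap-learn package enables DensMAP by passing `densmap=True` to `umap.UMAP`; whether this project uses that package is not stated.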
Clusters are identified with HDBSCAN and labeled automatically from the most common arXiv categories and keywords in each group. The "IAIFI Theme" colors (AI, Physics, Both) come from cross-referencing each paper's arXiv categories against typical AI and physics category prefixes.
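As a rough sketch of the labeling logic described above, the snippet below assigns a theme by matching arXiv category prefixes and names a cluster after its most frequent categories. The prefix lists and the function names `classify_theme` and `label_cluster` are assumptions made for illustration; the project's actual lists and keyword handling may differ.

```python
from collections import Counter

# Assumed prefix lists; the real pipeline's lists are not given in this text.
AI_PREFIXES = ("cs.LG", "cs.AI", "cs.CV", "cs.CL", "stat.ML")
PHYSICS_PREFIXES = ("hep-", "astro-ph", "cond-mat", "gr-qc", "nucl-", "quant-ph", "physics.")


def classify_theme(categories):
    """Return "AI", "Physics", "Both", or None for a paper's arXiv categories."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    has_ai = any(cat.startswith(AI_PREFIXES) for cat in categories)
    has_physics = any(cat.startswith(PHYSICS_PREFIXES) for cat in categories)
    if has_ai and has_physics:
        return "Both"
    if has_ai:
        return "AI"
    if has_physics:
        return "Physics"
    return None


def label_cluster(papers_categories, top_n=2):
    """Build a label from the top_n most common categories across a cluster's papers."""
    counts = Counter(cat for cats in papers_categories for cat in cats)
    return " / ".join(cat for cat, _ in counts.most_common(top_n))


# Hypothetical cluster: each inner list is one paper's arXiv categories.
cluster = [["hep-th", "cs.LG"], ["hep-th"], ["hep-th", "cs.LG"], ["hep-ph"]]
print(classify_theme(cluster[0]))
print(label_cluster(cluster))
```

Papers whose categories match neither list get `None` here, which a caller could render as an uncolored dot.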
Built by Oscar Barrera. The data pipeline scrapes paper metadata from iaifi.org, pulls related background papers from the Semantic Scholar API, embeds everything with Specter2, selects the most relevant background papers with a diversity-weighted k-nearest-neighbor (kNN) search, and exports the final JSON that this page reads. The visualization is plain Canvas 2D with no external libraries.
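The text does not spell out how the diversity weighting works, so the following is only one plausible reading: a greedy selection that scores each background candidate by its similarity to the nearest IAIFI paper, minus a penalty for resembling background papers already chosen. The function name `select_background`, the `diversity` parameter, and the use of cosine similarity are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_background(targets, candidates, k, diversity=0.5):
    """Greedily pick up to k candidate indices: relevant to targets, not redundant.

    This is an assumed scheme, not the pipeline's confirmed method.
    """
    if not targets:
        raise ValueError("need at least one target embedding")
    # Relevance: similarity to the closest target (IAIFI) paper.
    relevance = [max(cosine(c, t) for t in targets) for c in candidates]
    selected = []
    remaining = set(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            # Redundancy: similarity to the closest already-selected candidate.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return relevance[i] - diversity * redundancy

        # Iterate in sorted order so ties break deterministically.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical 2D embeddings: candidates 0 and 1 are near-duplicates.
iaifi = [[1.0, 0.0], [0.0, 1.0]]
background = [[1.0, 0.1], [1.0, 0.12], [0.2, 1.0]]
print(select_background(iaifi, background, k=2, diversity=0.0))
print(select_background(iaifi, background, k=2, diversity=0.5))
```

With `diversity=0.0` this reduces to plain top-k by relevance; raising it trades some relevance for broader coverage of the surrounding literature.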