Original Reddit post

What is this? Long story short, it shows current trends in AI research and how they tend to change over time.

The idea is that we can map a piece of text to a point in semantic space. Then, if we have textual data that changes over time, the consecutive point locations form a trajectory in that space. From many such paths, we can compute a generalized flow model that shows where the trends tend to go. (A minimal sketch of this pipeline is at the end of the post.)

What I did here: for each arXiv paper category, I created a path showing how the papers' meanings and topics changed over the last 6 months. Then, from many such paths, the generalized flow model was computed.

What it found: the three main components that seem to govern the current AI research space are

X: abstraction level
Y: perception emphasis
Z: agentic emphasis

It also found two distinct global attractor basins.

The first attractor basin seems to represent AI research moving toward grounded perception and interaction with the real world. This is less about abstract model behavior and more about making AI systems understand messy, changing environments, where inputs are noisy, incomplete, distributed, or constrained by deployment conditions.

The second attractor basin seems to represent AI research moving toward agentic behavior, reasoning, and control of model objectives. This is more about making models follow the intended goal, avoid shortcut solutions, and behave reliably when trained or evaluated through imperfect signals.

So, roughly speaking, one attractor is about AI becoming better at perceiving and operating in the physical world, while the other is about AI becoming better controlled as an agentic reasoning system.

The video is from this interactive web version, which you can try here: https://pixedar.github.io/ai/tracescope/

The tool used to build these semantic flows is my open-source repo: https://github.com/Pixedar/TraceScope

If you are interested in the details of how the points are projected and how the axes are computed, there is an explanation in the repo README. I also explained more in my previous post about semantic flow, where I mapped step-by-step LLM reasoning and covered the details in the comments: https://www.reddit.com/r/learnmachinelearning/comments/1suorcm/mapped_the_semantic_flow_of_stepbystep_llm

I made this web demo version to make the semantic flow concept more accessible.

Limitations: the paper data might not be ideal, because there is a lot of randomness in when a given paper gets published, which introduces noise. Nevertheless, it should still approximate the global trends. TraceScope works better with data that is natively time-series-like, such as step-by-step reasoning. This result cannot be treated as a peer-reviewed-quality finding about current research directions, since proper statistical validation would take a lot of time. So if you want to use it for research, you should experiment with the model parameters and validate it statistically.
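To make the trajectory idea concrete, here is a minimal sketch, assuming abstracts are already bucketed by (category, month) and using a generic sentence embedder plus plain PCA. These are illustrative stand-ins: the repo's actual projection and axis computation are described in its README and may differ.

```python
# Minimal sketch of the trajectory-building idea, NOT TraceScope's actual code.
# Assumptions: abstracts are pre-bucketed by (category, month); a generic
# sentence embedder and PCA stand in for the repo's real projection method.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

def monthly_trajectories(buckets: dict[str, list[list[str]]]) -> dict[str, np.ndarray]:
    """buckets maps an arXiv category to a list of months, each a list of abstracts.
    Returns, per category, an (n_months, 3) trajectory in a shared semantic space."""
    model = SentenceTransformer("all-MiniLM-L6-v2")
    # One mean embedding per (category, month): the category's semantic position that month.
    centroids, index = [], []
    for cat, months in buckets.items():
        for m, abstracts in enumerate(months):
            centroids.append(model.encode(abstracts).mean(axis=0))
            index.append((cat, m))
    # A single shared 3D projection, so all category paths live in the same space.
    points = PCA(n_components=3).fit_transform(np.stack(centroids))
    trajs: dict[str, list] = {}
    for (cat, _), p in zip(index, points):
        trajs.setdefault(cat, []).append(p)
    return {cat: np.array(ps) for cat, ps in trajs.items()}
```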

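And a hedged sketch of what "computing a generalized flow model" could look like: the step vectors from many trajectories are kernel-averaged into a vector field, and attractors are located by integrating seed points forward. The Gaussian-kernel estimator, function names, and all parameters here are assumptions for illustration, not TraceScope's actual implementation.

```python
# Hedged sketch: estimate a flow field from trajectory step vectors, then
# integrate seeds forward to find attractor candidates. All choices here
# (kernel, bandwidth, step size) are illustrative assumptions.
import numpy as np

def fit_flow(trajs, bandwidth=0.5):
    pts = np.concatenate([t[:-1] for t in trajs])           # where each step starts
    vecs = np.concatenate([t[1:] - t[:-1] for t in trajs])  # step displacements
    def flow(x):
        # Gaussian-kernel weighted average of nearby observed step vectors.
        w = np.exp(-np.sum((pts - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        return (w[:, None] * vecs).sum(axis=0) / (w.sum() + 1e-9)
    return flow

def find_attractors(flow, seeds, steps=200, lr=0.5, tol=1e-3):
    # Follow the flow from each seed; points where the field vanishes are
    # candidate attractors, and seeds sharing an endpoint share a basin.
    ends = []
    for x in seeds:
        x = np.asarray(x, dtype=float).copy()
        for _ in range(steps):
            v = flow(x)
            if np.linalg.norm(v) < tol:
                break
            x += lr * v
        ends.append(x)
    return np.array(ends)
```

Running find_attractors on a grid of seeds and clustering the endpoints (e.g., with DBSCAN) is one way a claim like "two distinct global attractor basins" could be checked under this toy model.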
Originally posted by u/Pixedar on r/ArtificialInteligence