Original Reddit post

Hi all. First, I want to say that I was learning supervised machine learning and was comfortable with it (vectors, SVM, PCA, gradient descent, etc.), but I got an idea from physics and asked ChatGPT to map it mathematically. It did, and then I orchestrated DeepSeek, Gemini, and Claude together with GPT to understand and explore it more deeply. In the process, a new algorithm was discovered that beats traditional baselines. Here was the setup:

Setup

  • Two datasets: PathMNIST (image patches, 2000 nodes, 9 classes) and 20 Newsgroups (text, 2000 nodes, 10 classes)
  • 20 random label splits per experiment, mean ± std reported
  • Corrupted graph: 40% random edge addition (adversarial noise condition)
  • 5 labels per class (45 labeled nodes on PathMNIST, 50 on 20 Newsgroups, out of 2000)
  • All methods evaluated on identical label splits

Methods compared:

  • Linear baseline: logistic regression on raw features, no graph
  • Poisson learning (harmonic solution on graph Laplacian)
  • Heat diffusion with oracle stopping (†not deployable — uses ground truth to find T)
  • GCN: standard 2-layer, 3 random restarts, best taken
  • Optimus: my base method
  • Optimus Pro: Optimus + a specific label selection strategy

Results: PathMNIST, lpc=5 (45 labeled nodes)

| Method | Clean | Corrupted (+40% edges) | Degradation |
|---|---|---|---|
| Linear (no graph) | 0.701 ± 0.021 | 0.701 ± 0.021 | 0.000 |
| Poisson | 0.743 ± 0.027 | 0.518 ± 0.064 | −0.225 |
| Heat diffusion† | 0.724 ± 0.024 | 0.609 ± 0.021 | −0.115 |
| GCN | 0.771 ± 0.030 | 0.764 ± 0.025 | −0.007 |
| Optimus | 0.790 ± 0.021 | 0.775 ± 0.025 | −0.015 |
| Optimus Pro | 0.797 | 0.774 ± 0.010 | −0.023 |

† Oracle stopping: uses all ground-truth labels to select T. Not deployable.

Results: 20 Newsgroups, lpc=5 (50 labeled nodes)

| Method | Clean | Corrupted | Degradation |
|---|---|---|---|
| Linear | 0.605 ± 0.029 | 0.605 ± 0.029 | 0.000 |
| Poisson | 0.416 ± 0.160 | 0.293 ± 0.102 | −0.123 |
| GCN | 0.738 ± 0.026 | 0.720 ± 0.026 | −0.019 |
| Optimus | 0.788 ± 0.012 | 0.722 ± 0.020 | −0.066 |
| Optimus Pro | 0.798 | 0.728 ± 0.007 | −0.070 |

1. Extreme label scarcity (lpc=1, only 9 total labeled nodes on PathMNIST):

| Method | Accuracy |
|---|---|
| Linear | 0.534 |
| Poisson | 0.369 |
| GCN | 0.606 |
| Optimus | 0.663 |
| Optimus Pro | 0.739 |

Optimus Pro with 9 labels comes within 0.032 of GCN with 45 labels (0.739 vs 0.771) while using 5× fewer labels.

2. Optimus is training-free. No gradient descent, no learned parameters, no hyperparameter search at test time. GCN requires training. Yet on clean PathMNIST, Optimus beats GCN by +0.019 (p=0.001, Wilcoxon). On 20 Newsgroups the gap is +0.050 (p<0.001). Am I choosing GCN hyperparameters fairly? I used lr=0.01, hidden=64, weight_decay=5e-4, 200 epochs, 3 restarts, best taken.

3. Optimus has a closed-form stopping criterion, derived mathematically from the method's dynamics rather than tuned on validation data. The stopping time adapts to the graph's spectral properties. This is what prevents it from needing oracle stopping like the heat diffusion baseline.

4. Poisson learning collapses catastrophically on text graphs: std=0.160 on clean 20 Newsgroups, dropping to near-random on some seeds. Is this a known issue with Poisson on certain graph types?

5. GCN is surprisingly robust to 40% edge corruption (−0.007 on PathMNIST) compared to Poisson (−0.225) and heat diffusion (−0.115). I think this is because GCN's learned weights partially ignore the corrupted graph signal and fall back on features. But then Optimus Pro also achieves comparable robustness (−0.023) without any training. Is there a theoretical explanation for why spectral-based methods can be robust without learned regularisation?

The summary above was created by AI, and that is my dilemma. Initially I was able to follow, but the field suddenly went on such a tangent that I have no clue about terms like "spectral gap", "Fisher ratio", "topology", "metastable transient phenomenon", etc. I would like to study further using this as a base, but I need an understanding of graph-based semi-supervised learning, and searching the internet I found no clear path to develop competency in it. Could someone in this field chart out a learning path, with resources? I asked AI, but it leads straight to papers without building the basics, which is what I need. Thanks.

submitted by /u/Loner_Indian
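For anyone who wants to reproduce the comparison, here is a minimal sketch of the corruption and label-split protocol described in the setup. It is not the code behind the numbers in the post. It assumes that "40% random edge addition" means adding new undirected edges equal to 40% of the original edge count, and the function names (`add_random_edges`, `sample_labeled_nodes`, `make_splits`) are hypothetical.

```python
import random
from collections import defaultdict


def add_random_edges(edges, num_nodes, fraction, rng):
    """Return the original undirected edges plus round(fraction * |E|) new random ones.

    Assumption: the corruption level is measured relative to the original edge count.
    """
    existing = {tuple(sorted(e)) for e in edges}
    max_edges = num_nodes * (num_nodes - 1) // 2
    target = min(len(existing) + round(fraction * len(existing)), max_edges)
    corrupted = set(existing)
    while len(corrupted) < target:
        u, v = rng.sample(range(num_nodes), 2)   # two distinct nodes, so no self-loops
        corrupted.add((min(u, v), max(u, v)))    # the set silently drops duplicates
    return corrupted


def sample_labeled_nodes(labels, per_class, rng):
    """Pick `per_class` node indices from every class to serve as the labeled set."""
    by_class = defaultdict(list)
    for node, label in enumerate(labels):
        by_class[label].append(node)
    chosen = []
    for label in sorted(by_class):
        chosen.extend(rng.sample(by_class[label], per_class))
    return sorted(chosen)


def make_splits(labels, per_class, num_splits, seed=0):
    """Generate the label splits once so every method sees identical ones."""
    rng = random.Random(seed)
    return [sample_labeled_nodes(labels, per_class, rng) for _ in range(num_splits)]
```

Calling `make_splits(labels, per_class=5, num_splits=20)` once and reusing the result for every method is what keeps the comparison paired, which a Wilcoxon signed-rank test over the per-split accuracies requires.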

Originally posted by u/Loner_Indian on r/ArtificialInteligence