What question did this study set out to answer?

This research aims to explore the impact of stochastic resonance on neural networks of varying sizes and architectures.

April 14, 2026 | Open Access

SNN-Synthesis v7: Stochastic Resonance Quantization, Test-Time Compute Scaling Laws, and The Bitter Lesson — from 63K to 7B Parameters

Key Points

Conducted 60 experimental phases across multiple neural network architectures ranging from 63K to 7B parameters.
Examined the effects of different noisy beam search configurations and scaling laws on accuracy.
Implemented knowledge multiplexing to investigate dual-resource management in neural architectures.
A 1.5B-parameter model (Qwen-1.5B) achieved 80% accuracy through stochastic resonance quantization with σ-diverse NBS at K=11, outperforming a 7B model (Mistral-7B) at its K=1 baseline (42%).
Accuracy increases logarithmically with beam count, showing improved performance from K=1 to K=21 in reasoning tasks.
A critical per-action computational overhead threshold (0.5ms) was identified, beyond which complex agents lose to pure random exploration under fixed time budgets.
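
As a rough illustration of why per-action overhead matters under a fixed time budget, the sketch below counts how many actions each agent can take. The overhead figures (0.5ms for RND, 0.005ms for SimHash) are the ones reported in the abstract; the 1-second budget, the 0.5ms environment step time, and the function name are assumptions made only for this example. The sketch does not model how solve rate depends on action count.

```python
# Per-action overhead in milliseconds, as reported in the abstract.
OVERHEAD_MS = {"random": 0.0, "simhash": 0.005, "rnd": 0.5}

# Hypothetical settings, chosen only for illustration.
BUDGET_MS = 1000.0   # assumed wall-clock budget (1 second)
ENV_STEP_MS = 0.5    # assumed cost of one environment step


def actions_in_budget(budget_ms, env_step_ms, overhead_ms):
    """Return how many whole actions fit in a fixed wall-clock budget."""
    return int(budget_ms // (env_step_ms + overhead_ms))


counts = {
    name: actions_in_budget(BUDGET_MS, ENV_STEP_MS, overhead)
    for name, overhead in OVERHEAD_MS.items()
}

for name, n in counts.items():
    print(f"{name:8s} {n} actions")
```

Under these assumed numbers the RND agent gets half as many actions as random exploration, while SimHash keeps about 99% of them. Whether that gap outweighs the better per-action decisions is the empirical question the Crossover Law addresses.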

Abstract

I present SNN-Synthesis v7, a comprehensive investigation of stochastic resonance in neural networks spanning 63K-parameter CNNs to 7B-parameter LLMs across 60 experimental phases. Building upon v1–v6 (Phases 1–38: trajectory distillation, multi-layer orchestration, scale-invariant resonance, Noisy Beam Search, SNN-ExIt, Two-Condition Theory, architecture-invariant LLM NBS, multi-task NBS on GSM8K, LLM-ExIt self-evolution, knowledge multiplexing, and σ-diverse NBS), this version adds 22 new experiments (Phases 39–60) establishing three landmark results.

Three Landmark Results (v7)

(X) The Bitter Lesson: Crossover Law. In ARC-AGI-3, any per-action computational overhead >0.5ms causes complex agents (RND, CNN, N-gram) to lose to pure random exploration under fixed time budgets. SimHash curiosity (0.005ms, O(1)) is the only intelligence that survives this constraint, confirming Rich Sutton's "Bitter Lesson" for interactive environments. (Phases 44–52)

(XI) Stochastic Resonance Quantization. A 1.5B-parameter model (Qwen-1.5B) with σ-diverse NBS (K=11) achieves 80% accuracy, surpassing a 7B model (Mistral-7B) at K=1 baseline (42%) by +38pp. Lost parameters (spatial resolution) can be compensated by noise + beam search (temporal resolution), establishing a space-time duality in neural computation. (Phase 59)

(XII) Test-Time Compute Scaling Law. LLM reasoning accuracy scales logarithmically with beam count K: Mistral-7B on math problems improves from 16.7% (K=1) to 33.3% (K=21); Tower of Hanoi from 13.3% to 53.3%. Accuracy plateaus with diminishing returns beyond K=11, establishing the cost-performance frontier for σ-diverse NBS. (Phase 60)

v1–v6 Foundations (Phases 1–38)

(I) Noisy Beam Search: K=11 parallel noisy trajectories achieve 78% on CNN (from 12%) and 100% on Mistral-7B Modified Hanoi (from 16%).

(II) SNN-ExIt: Oracle-free self-evolution reaches 99% on ARC-AGI-3 LS20, surpassing Oracle CNN (78%).

(III) Knowledge Multiplexing via ID Gating (Phase 35c): A single 115K parameter CNN stores distinct knowledge for multiple games without interference, using discrete condition-ID gating (h ← h ⊙ σ(Embed(id))). Knowledge separation score reaches +0.572. Four alternative approaches (noise modulation, SNN chaotic noise, continuous-wave gating, pink noise) all fail; only discrete gating succeeds, mirroring biological neurotransmitter-based mode switching.

(IV) σ-Diverse NBS (Phase 37a): Assigning different σ values to each of K=11 beams eliminates the need for task-specific σ* tuning. Performance matches the best individually-tuned fixed σ across all tested difficulty levels, providing a hyperparameter-free exploration strategy.

(V) Capacity Scaling (Phase 38a): ID gating requires ≥2.7K parameters for effective knowledge separation. At 115K parameters, gated models surpass ungated models (0.706 > 0.625), demonstrating that gating acts as positive regularization.

(VI) Multi-Model NBS (Phase 38): Qwen2.5-7B-Instruct achieves 100% solve rate at K=11 on Modified Hanoi, matching Mistral-7B and confirming cross-model universality.

(VII) GSM8K LLM-ExIt (Phase 33): Extending LLM-ExIt to math reasoning; Mistral-7B K1 accuracy improves from 56.5% to 58.0% over 3 iterations. The modest gain confirms ExIt functions on open-ended tasks but reveals that high-baseline tasks limit self-improvement headroom.

(VIII) σ* Prediction (Phase 34): TruthfulQA MC1 achieves 100% accuracy at σ*=0.2 with K=11, extending the σ* map to four tasks: GSM8K (0.01), TruthfulQA (0.2), Hanoi (0.15), ARC-AGI (0.2).

v7 Key Findings (Phases 39–60)

RND Curiosity (Phase 39): 63.5% solve rate vs. Random 2.5% at difficulty 6 (+61pp), but 0.5ms/action overhead is fatal under time budgets.
σ-Diverse NBS on LLMs (Phase 40): Mistral-7B GSM8K achieves 70% with σ-diverse K=11, confirming LLM-scale superiority.
The Crossover Law (Phases 44–46): Overhead >0.5ms/action → intelligence loses to random exploration.
SimHash O(1) Curiosity (Phase 51): Locality-sensitive hashing matches RND at ~100× less overhead (0.005ms).
Grand Simulation (Phase 52): Final benchmark validates SimHash + σ-diverse NBS as optimal ARC-AGI-3 agent.
Asymptotic Scaling (Phase 56): At 10M actions, all agents converge to 100%; no fundamental wall exists.
6 Null Results (Phases 53–55, 57–58): Associative SimHash, temporal SimHash, macro chunks, bag-of-patches, and curiosity-diverse swarms all fail, confirming design convergence.
SR-Quantization (Phase 59): Qwen-1.5B + NBS (80%) > Mistral-7B baseline (42%): small + noise > large + greedy.
TTC Scaling Law (Phase 60): Logarithmic accuracy scaling with K; optimal cost-performance at K=5–11.

38 contributions spanning 63K–7B parameters, CNNs to Transformers, 9 task domains, 2 model families (Mistral, Qwen), 4 model scales (1B–7B), and 19 honest null results.

Code and data: https://github.com/hafufu-stack/SNN-Synthesis

Acknowledgments

This research was conducted entirely independently, without institutional affiliation or corporate funding. The author currently faces financial constraints that make it increasingly difficult to maintain subscriptions to AI services essential for this line of research. To sustain and improve the quality of future work, the author is actively seeking community sponsorship. Details are available at https://github.com/sponsors/hafufu-stack.
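
The ID-gating rule quoted for Phase 35c, h ← h ⊙ σ(Embed(id)), can be sketched in a few lines. This is a minimal illustration and not the author's implementation: it reads σ here as the logistic sigmoid (the usual meaning in a gating expression, although the abstract also uses σ for noise scale), and the class name, layer width, and random initialization are assumptions. In a trained model the embedding table would be learned rather than sampled.

```python
import math
import random


def sigmoid(x):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


class IDGate:
    """Hypothetical discrete condition-ID gate: h <- h * sigmoid(Embed(id))."""

    def __init__(self, num_ids, width, seed=0):
        rng = random.Random(seed)
        # One embedding vector per condition ID (randomly initialized here;
        # it would be a learned parameter in practice).
        self.embed = [
            [rng.gauss(0.0, 1.0) for _ in range(width)]
            for _ in range(num_ids)
        ]

    def __call__(self, h, cond_id):
        # Look up the embedding for this ID and squash it into a per-unit gate.
        gate = [sigmoid(e) for e in self.embed[cond_id]]
        # Elementwise product: each hidden unit is scaled by its gate value.
        return [h_i * g_i for h_i, g_i in zip(h, gate)]


gate = IDGate(num_ids=2, width=4)
hidden = [1.0, 1.0, 1.0, 1.0]
out_game0 = gate(hidden, 0)
out_game1 = gate(hidden, 1)
print(out_game0)
print(out_game1)
```

The same hidden vector is scaled differently for each condition ID, which is the mechanism by which one set of shared weights could carry separate per-game behavior.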
