The Carry-Adder Wall · Part III

Neural Distinguishers Expire on Carry Composition

Discussed on the blog: Tilting at Windmills VI

Abstract

A hand-built carry-aware score measures a one-round advantage cliff on reduced SHA-256: it predicts a downstream output byte one adder layer away and loses all reach one round deeper. The natural objection is that a learned model might find local structure a human feature misses, as Gohr’s neural distinguishers did for round-reduced Speck. We translate that methodology to the mining read point. A residual network receives, as features, everything computed through an interior round (the state words, their carry-free derivations, the round’s modular sums, and the schedule words, in bit, value, and Fourier encodings) and is free to learn any function of them. The network exceeds the hand-built score one adder layer downstream ($0.998$ versus $0.88$ retained advantage) and then collapses to the noise floor one round deeper, across independent stems, a shifted read point, and a $90\times$ capacity, $64\times$ data, and $8\times$ training-time scaling. On pure $k$-operand modular sums, with no SHA structure present, the same network learns the top byte for $k \le 3$ and fails for $k \ge 4$. We show the wall is not an artifact of finite feature precision (a seed-paired float32-versus-float64 comparison is null in both arms) and that it is the learner’s reach rather than an intrinsic boundary: the carry chain’s spectral gap is $\tfrac{1}{2}$ for every $k$, so nothing in the carry’s mixing singles out $k = 4$. We also identify a distinct second failure mode, feature isolation, and show it does not affect the main result.
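The spectral-gap claim can be checked by hand on a small model. The sketch below is an illustration under stated assumptions, not the paper's code: it assumes the carry chain is the standard Markov chain over bit columns when $k$ operands with independent uniform bits are added, so the carry takes values $0, \dots, k-1$. The function names (`carry_transition_matrix`, `is_eigenvalue`, `spectral_gap`) are hypothetical. It uses exact rational arithmetic to confirm that the eigenvalues are $1, \tfrac{1}{2}, \dots, 2^{-(k-1)}$, which gives a gap of $\tfrac{1}{2}$ for every $k \ge 2$.

```python
from fractions import Fraction
from math import comb


def carry_transition_matrix(k):
    """Row-stochastic carry chain for adding k operands, one bit column at a time.

    Assumption: the k bits in each column are independent and uniform, so the
    number of 1 bits is Binomial(k, 1/2). The carry takes values 0..k-1.
    """
    P = [[Fraction(0)] * k for _ in range(k)]
    for c in range(k):              # incoming carry
        for s in range(k + 1):      # number of 1 bits among the k operands
            P[c][(c + s) // 2] += Fraction(comb(k, s), 2 ** k)
    return P


def det(M):
    """Exact determinant by Gaussian elimination over Fractions."""
    M = [row[:] for row in M]
    n = len(M)
    d = Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if M[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            M[i], M[pivot] = M[pivot], M[i]
            d = -d
        d *= M[i][i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for col in range(i, n):
                M[r][col] -= f * M[i][col]
    return d


def is_eigenvalue(P, lam):
    """True when det(P - lam * I) is exactly zero."""
    n = len(P)
    shifted = [[P[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return det(shifted) == 0


def spectral_gap(k):
    """Return 1 minus the second-largest eigenvalue, for k >= 2."""
    P = carry_transition_matrix(k)
    candidates = [Fraction(1, 2 ** j) for j in range(k)]
    # k distinct eigenvalues of a k x k matrix, so the list is complete.
    assert all(is_eigenvalue(P, lam) for lam in candidates)
    return 1 - candidates[1]


for k in range(2, 7):
    print(f"k={k} gap={spectral_gap(k)}")
```

Every printed gap is `1/2`, including at $k = 3$ and $k = 4$. Under this model the chain's mixing rate does not change at the point where the network stops learning.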

@misc{hollows2026neuraldi,
  author = {Hollows, Peter},
  title  = {{Neural Distinguishers Expire on Carry Composition}},
  year   = {2026},
  month  = jun,
  note   = {The Carry-Adder Wall series, Part III},
  url    = {https://dojo7.com/papers/neural-distinguishers/}
}