Neural Distinguishers Expire on Carry Composition
Discussed on the blog: Tilting at Windmills VI
Abstract
A hand-built carry-aware score measures a one-round advantage cliff on reduced SHA-256: it predicts a downstream output byte one adder layer away and loses all reach one round deeper. The natural objection is that a learned model might find local structure a human feature misses, as Gohr’s neural distinguishers did for round-reduced Speck. We translate that methodology to the mining read point. A residual network receives, as features, everything computed through an interior round (the state words, their carry-free derivations, the round’s modular sums, and the schedule words, in bit, value, and Fourier encodings) and is free to learn any function of them. The network exceeds the hand-built score one adder layer downstream ( versus retained advantage) and then collapses to the noise floor one round deeper, across independent stems, a shifted read point, and a capacity, data, and training-time scaling. On pure -operand modular sums, with no SHA structure present, the same network learns the top byte for and fails for . We show the wall is not an artifact of finite feature precision (a seed-paired float32-versus-float64 comparison is null in both arms) and that it is the learner’s reach rather than an intrinsic boundary: the carry chain’s spectral gap is for every , so nothing in the carry’s mixing singles out . We also identify a distinct second failure mode, feature isolation, and show it does not affect the main result.
@misc{hollows2026neuraldi,
author = {Hollows, Peter},
title = {{Neural Distinguishers Expire on Carry Composition}},
year = {2026},
month = jun,
note = {The Carry-Adder Wall series, Part III},
url = {https://dojo7.com/papers/neural-distinguishers/}
}