#1 Added improvements: Implemented Pollard's Rho for large keys, para… #5
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The original script failed for moduli above 300 bits due to naive search failures. Here, we resort to Pollard's Rho, a probabilistic method based on cycle detection that is more effective for semiprimes, preserving the NN for quick gains on small keys.
Now, the load_data_and_model function will load factor pairs stored in JSON (n {p, q}), falling back to the default sequential model if .h5 is missing. The update_model_and_data function will add new data only for n < 2^53 (avoiding loss of floating-point precision); it retrains on subsets with 50 epochs. Batch_size=min(32,data_len) saves the model after fitting.
Another improvement implements the Brent variant of Pollard's Rho with cycle detection via the tortoise-hare (x, y advances). This limits iterations to 1M per attempt to avoid crashes, and uses math.gcd for divisor checks. The factor_with_rho function will orchestrate the trial division up to 10K for small factors and then execute Rho in parallel. Returning ordered (p, q) only after the Miller-Rabin primality check.
512-bit semiprimes are expected to be successfully factored in <10s on average (tested using synthetic keys, e.g., p=next_prime(2^256), q=next_prime(2^256 + 2^100)). It also returns errors in primes or failures, which will increase reliability.