Talk:Lenstra elliptic-curve factorization

Pseudocode ?


Can anyone provide pseudocode and an example for this topic to illustrate the method? Four steps are given but they are too abstract to be useful IMO. Compare with the excellent topic on Miller-Rabin primality test, for example. 14:08, 16 October 2008 (UTC) —Preceding unsigned comment added by 199.171.52.20 (talk)

I don't like pseudocode, but did try to make the steps more concrete. Will also add an example soon. --GaborPete (talk) 08:42, 1 April 2009 (UTC)
OK, I have now put in an example. --GaborPete (talk) 04:56, 3 April 2009 (UTC)
Also, I changed the "over Z/nZ" wording to "(mod n)". One reason is that the old version sounded too much as if Z/nZ was a field. Another reason is that my undergrads, for example, know what (mod n) is, but don't know "over Z/nZ". --GaborPete (talk) 08:42, 1 April 2009 (UTC)
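
Following up on the pseudocode request above: here is a rough, unoptimized Python sketch of stage 1 of the method as I understand it. The curve and point are chosen pseudo-randomly, and all names, parameter defaults and helper structure below are illustrative only, not taken from the article or from any particular implementation.

    from math import gcd, isqrt
    import random

    class FactorFound(Exception):
        """Raised when a modular inversion fails and exposes a factor of n."""
        def __init__(self, d):
            self.d = d

    def inv_mod(v, n):
        # The inverse of v mod n exists only if gcd(v, n) = 1; when it does
        # not, that gcd is exactly the factor ECM is looking for.
        g = gcd(v % n, n)
        if g != 1:
            raise FactorFound(g)
        return pow(v, -1, n)          # Python 3.8+

    def ec_add(P, Q, a, n):
        # Addition on y^2 = x^3 + a*x + b "mod n", pretending n is prime.
        if P is None:
            return Q
        if Q is None:
            return P
        x1, y1 = P
        x2, y2 = Q
        if x1 == x2 and (y1 + y2) % n == 0:
            return None                                   # point at infinity
        if P == Q:
            s = (3 * x1 * x1 + a) * inv_mod(2 * y1, n) % n
        else:
            s = (y2 - y1) * inv_mod(x2 - x1, n) % n
        x3 = (s * s - x1 - x2) % n
        return (x3, (s * (x1 - x3) - y1) % n)

    def ec_mul(k, P, a, n):
        # k * P by double-and-add.
        R = None
        while k:
            if k & 1:
                R = ec_add(R, P, a, n)
            P = ec_add(P, P, a, n)
            k >>= 1
        return R

    def lenstra_ecm(n, B=1000, curves=50):
        # Stage 1: on each curve, multiply the point by every prime power <= B.
        # A failed inversion along the way (caught below) reveals a factor.
        for _ in range(curves):
            x, y, a = (random.randrange(n) for _ in range(3))
            P = (x, y)            # b is implicitly y^2 - x^3 - a*x (mod n)
            try:
                for q in range(2, B + 1):
                    if all(q % d for d in range(2, isqrt(q) + 1)):   # q prime
                        qe = q
                        while qe * q <= B:
                            qe *= q                                  # largest q^e <= B
                        P = ec_mul(qe, P, a, n)
                        if P is None:
                            break          # hit infinity mod n itself; next curve
            except FactorFound as e:
                if e.d != n:
                    return e.d             # nontrivial factor of n
        return None                        # give up: raise B or try more curves

    print(lenstra_ecm(455839))             # 455839 = 599 * 761; usually prints 599 or 761

The bound B and the number of curves trade off against each other; real implementations add a second stage and much better curve arithmetic, so treat this only as a reading aid for the article's steps.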

Try using a better browser for your edits, or use an offline editor like TextPad or NotePad so you won't lose what you type. --Ed Poor

The text "20 digits" should be replaced by "the cube root of the number to be factored".

Average and worst-case runtime? When was the method developed? Possible to give examples?

Most people who know this stuff would characterise ECM as "probabilistic" and the sieve methods MPQS and GNFS as "deterministic". It is possible to voice minor quibbles in both cases, but to assert exactly the opposite, as the article does, is totally wrong. The rest of the article looks mostly ok. Possible deliberate vandalism by a later editor? Or just momentary brainfade? 203.164.222.80 02:03, 30 October 2006 (UTC)

There is a bound quoted on the running time of the algorithm. But that's a deterministic bound, no mention of randomness, so it cannot be right! --GaborPete (talk) 08:42, 1 April 2009 (UTC)

QS, MPQS and GNFS are classed as true random algorithms (this is unquestionably so): when the final relation x^2 = y^2 mod n [which means (x - y)(x + y) = 0 mod n] is achieved, there is a fifty-fifty chance that it will yield a trivial factor when n is the product of two primes, and a fifty-fifty chance that it will give a nontrivial factor, so there is no way to make it deterministic. If the first relation achieved yields a trivial factor, another relation must be generated. (This is thus analogous to the complexity class ZPP. In ZPP, the expected running time is polynomial, but the worst case is exponential.) It is exponentially unlikely but theoretically possible that one would have to generate an exponential number of relations x^2 = y^2 mod n to get a nontrivial factor. I don't believe current theory, even by invoking the Riemann hypothesis and its generalizations and extensions, can provably get this down to even sub-exponential. Maybe it would be possible to provably get it down to n^c trials, where c is 1/2 or so for the quadratic sieve, but I highly doubt it. One might have better luck with Dixon's algorithm, but I doubt that highly as well.

The basic problem is that if (x - y) = 0 or (x + y) = 0 mod n, one gets a trivial factor, and there is no known way to force x and y not to be so ahead of time. To do so would be equivalent to factoring n.
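
To make the trivial/nontrivial distinction concrete, a tiny Python illustration (my own, with made-up numbers, not anything from the article):

    from math import gcd

    def factor_from_congruence(x, y, n):
        # Given x^2 = y^2 (mod n), gcd(x - y, n) is a nontrivial factor
        # unless x = +/- y (mod n), which is the useless "trivial" case.
        g = gcd(x - y, n)
        return g if 1 < g < n else None    # None: generate another relation

    print(factor_from_congruence(10, 3, 91))   # 10^2 = 3^2 (mod 91): gives 7
    print(factor_from_congruence(38, 53, 91))  # 38 = -53 (mod 91): trivial, gives None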

Perhaps the most accurate and precise statement would be: "Currently there is no known way to make NFS deterministic. While no rigorous proof exists that the NFS is a probabilistic algorithm, it is widely believed to be probabilistic."

The Pollard p-1 method and the EC factoring method may in practice be implemented in a random way, but nothing in the theory says they can't be implemented in a deterministic way using pseudo-randomness. In the case of the p-1 method the base can be chosen to be two or any small number, and the power (which is a product of primes up to some limit) is chosen in advance and does not need randomness.

One can pick a seed non-randomly (say, a small number less than the log of the number to factor) and generate pseudo-random numbers from this seed to get the needed "pseudo-random" parameters for the curves and the needed points. Since the seed is very small, one has in effect derandomized the algorithm with at most a log n slowdown. Perhaps a better way to state this: with only a poly(log n) slowdown the EC factoring method can be derandomized and hence is deterministic. My algorithmic number theory professor always referred to the p-1 method as deterministic, and I see no theoretical reason why the same logic doesn't apply to the EC factoring method.

Current theory on elliptic curves, I believe, does not give any asymptotic advantage to implementing the EC factoring method in a truly random manner. The seed could be very small; one could always pick, say, 2 and run it through a decent but unchanging pseudo-random number generator that was fixed for one's implementation. By unchanging I mean a pseudo-random number generator that would not change no matter how many times it was run or which numbers were being factored. Thus one would have no real slowdown, not even a polynomial one as above, just the time to create the pseudo-random parameters and points, which would not change the asymptotic running time.
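
As an illustration of that kind of derandomization (entirely my own sketch; no implementation is claimed to do exactly this), the curve parameters and starting point can be derived from a tiny fixed seed:

    import random

    def curve_params(n, trial):
        # All "randomness" comes from the small fixed value `trial`, so the
        # whole run is reproducible: same n and trial, same curve and point.
        rng = random.Random(trial)                  # deterministic seed
        x0 = rng.randrange(n)
        y0 = rng.randrange(n)
        a = rng.randrange(n)
        b = (y0 * y0 - x0 * x0 * x0 - a * x0) % n   # put (x0, y0) on the curve
        return a, b, (x0, y0)                       # curve: y^2 = x^3 + a*x + b (mod n)

Whether one feeds in trial values 1, 2, 3, ... or true random bits, the heuristic running-time analysis is the same, which is the point being made above.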

The above is how a complexity theorist would view the matter, as even a polynomial-time slowdown "doesn't count" in the sense of changing the basic complexity, which is sub-exponential in the smallest factor of the number being factored. I should check to see how algorithmic number theorists view the matter for the EC factoring method.

It is possible that some optimized variant implementations of the EC factoring method may not be derandomized so easily, but the basic, plain-vanilla algorithm can, and hence, like the Pollard p-1 method, it should be considered deterministic in a strict technical sense. Depending on the implementation it may make sense to use true randomness, but I doubt it would make any difference in most implementations. 71.117.44.150 14:02, 19 June 2007 (UTC)

I disagree: ECM, just like QS, etc., is a probabilistic algorithm that can be made into a deterministic algorithm by using pseudorandom numbers. As I understand the gist of your argument above, you're saying that in the worst case QS could be about as slow as trial division, i.e. exponential (or perhaps fail to factor, although it would be easy to add an escape to running trial division), whereas ECM should be as fast as trial division for the smallest prime factor of the number to factor. This is a subtle distinction not captured by the words probabilistic/deterministic. As you suggest, there may be complexity classes such as ZPP which describe such things. So the article should be amended either to drop the probabilistic/deterministic distinction, or to use the same terminology as in the articles for QS, etc., and then a new section created in which the complexity-class issues, such as ZPP, are covered, citing published sources. For what it's worth, I've skimmed Henri Cohen's book A Course in Computational Algebraic Number Theory, and I didn't see anything about the deterministic or probabilistic nature of ECM, let alone any complexity classes. So I may soon amend the article to say that ECM is probabilistic, or not mention it, if the QS and other articles say so. DRLB (talk) 16:36, 29 January 2009 (UTC)

Frankly, both ECM and QS/NFS are heuristic algorithms. There's no proof that they're either deterministic (in the sense of always returning a result) or probabilistic (in the sense of having a known bound on their error rate). For example, there's no reason to believe, by merely examining a description of NFS, that it will not always yield a trivial difference of squares; the possibility exists that there is still some input for which it will produce this behavior, just as the Fermat primality test always fails on Carmichael numbers. I know less about ECM, but I'm pretty sure it also relies on unproven assumptions but works well in practice. The best thing to do is just label them heuristic algorithms, and avoid other labels. It would be valid to say that the description of ECM assumes the presence of a source of random bits (e.g. whenever it says "select a random..."), which is not an assumption made by QS/NFS; perhaps this is the distinction you're getting at. Dcoetzee 09:13, 30 January 2009 (UTC)

ECM, Pollard's p-1 and Williams' p+1 method are all easily convertible to deterministic algorithms. The running times are heuristic estimates, as the estimates of the probability of finding small factors are based on the estimated probability of finding a smooth number within a given range. In the case of ECM one also has the uncertainty of finding a curve with a smooth order. Thus, one can use a randomized or a derandomized version of ECM, but still not find a factor, and even if one runs either version of the algorithm for a long time, still not know for sure what the bound on the smallest factor is. (Not including any information from a prior trial division.)

With the p-1 method, one can clearly compute, say, a^R (mod n), where R is a product of the first k primes as well as the powers of the smaller primes, without using any randomness. The base a doesn't have to be randomly chosen. I likely wouldn't use a = 2, but a = 13 would be just fine.
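
A minimal sketch of that stage-1 computation (my own illustration of the textbook formulation, with the fixed base a = 13):

    from math import gcd, isqrt

    def pollard_p_minus_1(n, B, a=13):
        # Raise a to every prime power q^e <= B, taking a gcd as we go.
        # A prime factor p of n is found whenever p - 1 is B-smooth (more
        # precisely, whenever the order of a mod p divides the accumulated
        # exponent) -- no randomness anywhere.
        x = a % n
        for q in range(2, B + 1):
            if all(q % d for d in range(2, isqrt(q) + 1)):   # q is prime
                qe = q
                while qe * q <= B:
                    qe *= q                                  # largest q^e <= B
                x = pow(x, qe, n)
                g = gcd(x - 1, n)
                if 1 < g < n:
                    return g
        return None

    print(pollard_p_minus_1(23713, 11))   # 23713 = 23 * 1031; 23 - 1 = 2 * 11, so B = 11 finds 23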

QS/NFS cannot be derandomized with today's knowledge. —Preceding unsigned comment added by 76.200.95.119 (talk) 20:28, 2 May 2009 (UTC)

Suggestion for article to state that ECM is fastest factoring algorithm for random numbers


As I understand things, ECM is slower than GNFS and QS for products of two large primes, and perhaps for three or four, etc. But when an integer has small enough factors, which most numbers do, then ECM finds the small factors faster than QS and GNFS. Actually, most numbers have some very small factors, and I guess that trial division is the fastest way to find really small factors, faster than ECM. It's clear, though, that trial division only beats ECM for sufficiently smooth integers, and that a random number has negligible chance of being sufficiently smooth. I'm not quite sure, but I think that if what one wants to do is take a random integer and completely factor it, then, for the numbers of sizes that we can currently factor, trial division for small factors followed by ECM for the remaining larger factors is the fastest approach known. My basis for this assertion is hearing that many implementations of GNFS use ECM to factor intermediate integers, because this is the optimal way to factor the intermediate integers. To summarize, I believe that ECM is fastest for random integers, but GNFS is faster for RSA-type numbers. However, I'm not confident enough about this to amend the article to say so. Nevertheless, if this is true, then perhaps it may be worth adding to the article, since some readers may be more interested in factoring random-like integers than RSA-type integers. DRLB (talk) 16:56, 29 January 2009 (UTC)

It's complicated and depends on exactly what you want. The best average clock time to factor a random number is achieved by combining different algorithms, moving between them as the searched factors become larger, while hoping that a factor is suddenly found for which the cofactor (the original number divided by all found factors) is a prime, so later, more time-consuming steps can be avoided. It's easy to do primality tests on huge cofactors (a sketch of such a test is appended at the end of this thread). Depending on the precise strategy and when a prime cofactor is reached, some of the algorithms will not be used on some input numbers. Currently recommended strategies (which may be cut short by suddenly completing the factorization) typically include:
  1. Trial division by small primes.
  2. Switching multiple times between Pollard's p - 1 algorithm, Williams' p + 1 algorithm and Lenstra elliptic curve factorization (ECM). All three have variable parameters optimized to search factors of a certain size (but none of them give guarantees of finding all factors of that size). If one set of parameters does not complete the factorization then go back to p - 1, p + 1, ECM with parameters optimized for larger factors (but taking longer to run). p - 1 and p + 1 run faster than ECM but can find fewer factors (the factors must have certain properties to be found).
  3. General number field sieve (GNFS).
If the cofactor remains larger than a few hundred digits then GNFS is infeasible to run and you either give up or continue with p - 1, p + 1, ECM in the hope of there still being factors small enough (below around 70 digits) to have a reasonable chance of being found with those methods.
If you only use one of the algorithms and want to minimize the average time, then the average time will be dominated by those cases where there is more than one really large factor, and GNFS is best for that. Many of the input numbers would be factored faster with ECM alone than with GNFS alone, but GNFS still wins on average. The difficulty of completely factoring numbers with more than one really large factor means that the goal in practice is often something like: "Only x% of the input numbers have to be completely factored, or only x% of all prime factors have to be found, or maximize the expected number of found prime factors in the available cpu time". Then ECM alone will be better on average than GNFS alone in many cases. If it's known in advance that the input numbers are RSA-like with two large prime factors (no known test can determine that without knowing the factors), then GNFS is best. PrimeHunter (talk) 18:09, 29 January 2009 (UTC)
A random number has negligible chance of being smooth, but a high chance of having small factors. If nothing else, you'd definitely need a trial division step in there. Personally, I think the p - 1 and p + 1 tests are useless for random integers because they're quite unlikely to apply. Either ECM or GNFS could be useful on the remaining value. A good demonstration of this is the factorization of the Fermat numbers. Dcoetzee 09:04, 30 January 2009 (UTC)
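
Regarding "it's easy to do primality tests on huge cofactors" above: what is meant in practice is a probable-prime test such as Miller-Rabin, sketched below in Python with a few fixed bases (my own minimal version; for huge cofactors this gives "probably prime", not a proof).

    def is_probable_prime(m):
        # Miller-Rabin with fixed bases; with these 12 bases the answer is
        # provably exact below about 3.3e24, and overwhelmingly reliable beyond.
        if m < 2:
            return False
        small = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
        for p in small:
            if m % p == 0:
                return m == p
        d, s = m - 1, 0
        while d % 2 == 0:
            d //= 2
            s += 1
        for a in small:
            x = pow(a, d, m)
            if x in (1, m - 1):
                continue
            for _ in range(s - 1):
                x = x * x % m
                if x == m - 1:
                    break
            else:
                return False        # a witnesses that m is composite
        return True

In the strategy above, this is what lets one stop as soon as the remaining cofactor is (probably) prime instead of running heavier methods on it.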

Faster online site??


I know an online site called Alpertron that uses this method to find prime factors, and it's the fastest online method I know of, but sometimes it doesn't succeed well. That is, it sometimes says that a number is composite but doesn't find any factors within 325 curves. (Why 325? Because it already takes a lot of patience to get to curve #325, and after that each additional curve takes even more patience.) Any faster online site? Georgia guy (talk) 21:40, 13 January 2012 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Lenstra elliptic-curve factorization. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 17:41, 20 December 2017 (UTC)

Order of the subgroup of number of points on the curve


I think that the Hasse theorem gives only an upper bound on the order of the subgroup. We have to consider the subgroup generated by the starting point. Its order is a divisor of the number of points on the curve, and it can be much smaller than the prime p.

This means the algorithm is in fact even faster. 95.222.84.67 (talk) 20:08, 6 December 2021 (UTC)
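
For reference, the two facts being combined here, writing p for the unknown prime factor, E for the curve reduced mod p and P for the starting point (notation mine), are:

    |#E(F_p) - (p + 1)| <= 2*sqrt(p)    (Hasse: bounds the group order)
    ord(P) divides #E(F_p)              (Lagrange: the point's order divides the group order)

So stage 1 succeeds as soon as the accumulated multiplier is a multiple of ord(P), which can indeed be much smaller than #E(F_p), roughly p; the usual running-time analysis only asks for the group order to be smooth, which is sufficient but not necessary.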

Statement of algorithm


The algorithm is currently described as three steps. But Step 2 is mainly background material about the group law on an elliptic curve, right? And then there's some ill-specified stuff about u and v. And then Step 3 is a legitimate algorithmic step, involving v not at all, except that decisions must be made based on v.

A competent programmer who is trained in number theory could not implement the algorithm as stated. It should be stated much more clearly. Mgnbar (talk) 01:22, 19 September 2019 (UTC)
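
For what it's worth, the u, v case split the article gestures at reduces (as I read it; the names and shape below are mine, not the article's) to what happens when one tries to invert the denominator v mod n while adding two points:

    from math import gcd

    def slope_or_verdict(u, v, n):
        # To add two points mod n one needs the slope u / v mod n.
        d = gcd(v % n, n)
        if d == 1:
            return "slope", u * pow(v, -1, n) % n   # keep computing on this curve
        if d < n:
            return "factor", d                      # success: d is a nontrivial factor of n
        return "new curve", None                    # v = 0 mod n: discard this curve, start over

Step 3 then only has to say which of the three outcomes occurred; spelling the case split out this way in the article would probably address the complaint.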