Fastest Square Root Algorithm












15












$begingroup$


What is the fastest algorithm for finding the square root of a number?



I created one that can find the square root of "987654321" to 16 decimal places in just 20 iterations



I've now tried Newton's method as well as my own method (Newtons code seen below)



What is the fastest known algorithm for taking the second root of a number?



My code for Newton's Method (*Edit: there was an error in my code, it is fixed in the comments below):



a=2//2nd root
b=97654321//base
n=1//initial guess
c=0//current iteration (this is a changing variable)
r=500000 //total number of iterations to run
while (c<r) {
m = n-(((n^a)-b)/(a*b))//Newton's algorithm
n=m
c++;
trace(m + " <--guess ... iteration--> " + c)
}









share|cite|improve this question











$endgroup$








  • 6




    $begingroup$
    Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations.
    $endgroup$
    – Hagen von Eitzen
    Feb 6 '13 at 6:55








  • 2




    $begingroup$
    There is no way it could take 100,000 iterations with Newton's method. You need more precision.
    $endgroup$
    – copper.hat
    Feb 6 '13 at 6:59










  • $begingroup$
    Have you seen these methods ? en.wikipedia.org/wiki/Methods_of_computing_square_roots
    $endgroup$
    – k1next
    Feb 6 '13 at 7:09






  • 1




    $begingroup$
    Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator.
    $endgroup$
    – Rahul
    Feb 6 '13 at 7:26








  • 2




    $begingroup$
    check out this post on SO stackoverflow.com/questions/295579/… It shows the fastest way to find the square root and to check if it an integer or not.
    $endgroup$
    – Jeel Shah
    Feb 6 '13 at 11:27
















15












$begingroup$


What is the fastest algorithm for finding the square root of a number?



I created one that can find the square root of "987654321" to 16 decimal places in just 20 iterations



I've now tried Newton's method as well as my own method (Newtons code seen below)



What is the fastest known algorithm for taking the second root of a number?



My code for Newton's Method (*Edit: there was an error in my code, it is fixed in the comments below):



a=2//2nd root
b=97654321//base
n=1//initial guess
c=0//current iteration (this is a changing variable)
r=500000 //total number of iterations to run
while (c<r) {
m = n-(((n^a)-b)/(a*b))//Newton's algorithm
n=m
c++;
trace(m + " <--guess ... iteration--> " + c)
}









share|cite|improve this question











$endgroup$








  • 6




    $begingroup$
    Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations.
    $endgroup$
    – Hagen von Eitzen
    Feb 6 '13 at 6:55








  • 2




    $begingroup$
    There is no way it could take 100,000 iterations with Newton's method. You need more precision.
    $endgroup$
    – copper.hat
    Feb 6 '13 at 6:59










  • $begingroup$
    Have you seen these methods ? en.wikipedia.org/wiki/Methods_of_computing_square_roots
    $endgroup$
    – k1next
    Feb 6 '13 at 7:09






  • 1




    $begingroup$
    Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator.
    $endgroup$
    – Rahul
    Feb 6 '13 at 7:26








  • 2




    $begingroup$
    check out this post on SO stackoverflow.com/questions/295579/… It shows the fastest way to find the square root and to check if it an integer or not.
    $endgroup$
    – Jeel Shah
    Feb 6 '13 at 11:27














15












15








15


12



$begingroup$


What is the fastest algorithm for finding the square root of a number?



I created one that can find the square root of "987654321" to 16 decimal places in just 20 iterations



I've now tried Newton's method as well as my own method (Newtons code seen below)



What is the fastest known algorithm for taking the second root of a number?



My code for Newton's Method (*Edit: there was an error in my code, it is fixed in the comments below):



a=2//2nd root
b=97654321//base
n=1//initial guess
c=0//current iteration (this is a changing variable)
r=500000 //total number of iterations to run
while (c<r) {
m = n-(((n^a)-b)/(a*b))//Newton's algorithm
n=m
c++;
trace(m + " <--guess ... iteration--> " + c)
}









share|cite|improve this question











$endgroup$




What is the fastest algorithm for finding the square root of a number?



I created one that can find the square root of "987654321" to 16 decimal places in just 20 iterations



I've now tried Newton's method as well as my own method (Newtons code seen below)



What is the fastest known algorithm for taking the second root of a number?



My code for Newton's Method (*Edit: there was an error in my code, it is fixed in the comments below):



a=2//2nd root
b=97654321//base
n=1//initial guess
c=0//current iteration (this is a changing variable)
r=500000 //total number of iterations to run
while (c<r) {
m = n-(((n^a)-b)/(a*b))//Newton's algorithm
n=m
c++;
trace(m + " <--guess ... iteration--> " + c)
}






sequences-and-series algorithms convergence roots computational-mathematics






share|cite|improve this question















share|cite|improve this question













share|cite|improve this question




share|cite|improve this question








edited Dec 26 '18 at 23:49







Albert Renshaw

















asked Feb 6 '13 at 6:50









Albert RenshawAlbert Renshaw

7371627




7371627








  • 6




    $begingroup$
    Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations.
    $endgroup$
    – Hagen von Eitzen
    Feb 6 '13 at 6:55








  • 2




    $begingroup$
    There is no way it could take 100,000 iterations with Newton's method. You need more precision.
    $endgroup$
    – copper.hat
    Feb 6 '13 at 6:59










  • $begingroup$
    Have you seen these methods ? en.wikipedia.org/wiki/Methods_of_computing_square_roots
    $endgroup$
    – k1next
    Feb 6 '13 at 7:09






  • 1




    $begingroup$
    Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator.
    $endgroup$
    – Rahul
    Feb 6 '13 at 7:26








  • 2




    $begingroup$
    check out this post on SO stackoverflow.com/questions/295579/… It shows the fastest way to find the square root and to check if it an integer or not.
    $endgroup$
    – Jeel Shah
    Feb 6 '13 at 11:27














  • 6




    $begingroup$
    Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations.
    $endgroup$
    – Hagen von Eitzen
    Feb 6 '13 at 6:55








  • 2




    $begingroup$
    There is no way it could take 100,000 iterations with Newton's method. You need more precision.
    $endgroup$
    – copper.hat
    Feb 6 '13 at 6:59










  • $begingroup$
    Have you seen these methods ? en.wikipedia.org/wiki/Methods_of_computing_square_roots
    $endgroup$
    – k1next
    Feb 6 '13 at 7:09






  • 1




    $begingroup$
    Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator.
    $endgroup$
    – Rahul
    Feb 6 '13 at 7:26








  • 2




    $begingroup$
    check out this post on SO stackoverflow.com/questions/295579/… It shows the fastest way to find the square root and to check if it an integer or not.
    $endgroup$
    – Jeel Shah
    Feb 6 '13 at 11:27








6




6




$begingroup$
Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations.
$endgroup$
– Hagen von Eitzen
Feb 6 '13 at 6:55






$begingroup$
Your attempt with Newton's method sounds unrealistic: Newton / Heron has quadratic convergence. Even starting with $x_0=1$ gives an error $<10^{-24}$ after 20 iterations.
$endgroup$
– Hagen von Eitzen
Feb 6 '13 at 6:55






2




2




$begingroup$
There is no way it could take 100,000 iterations with Newton's method. You need more precision.
$endgroup$
– copper.hat
Feb 6 '13 at 6:59




$begingroup$
There is no way it could take 100,000 iterations with Newton's method. You need more precision.
$endgroup$
– copper.hat
Feb 6 '13 at 6:59












$begingroup$
Have you seen these methods ? en.wikipedia.org/wiki/Methods_of_computing_square_roots
$endgroup$
– k1next
Feb 6 '13 at 7:09




$begingroup$
Have you seen these methods ? en.wikipedia.org/wiki/Methods_of_computing_square_roots
$endgroup$
– k1next
Feb 6 '13 at 7:09




1




1




$begingroup$
Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator.
$endgroup$
– Rahul
Feb 6 '13 at 7:26






$begingroup$
Jeez, you couldn't be bothered to compare your code with Hagen's answer yourself? You need a*n instead of a*b in the denominator.
$endgroup$
– Rahul
Feb 6 '13 at 7:26






2




2




$begingroup$
check out this post on SO stackoverflow.com/questions/295579/… It shows the fastest way to find the square root and to check if it an integer or not.
$endgroup$
– Jeel Shah
Feb 6 '13 at 11:27




$begingroup$
check out this post on SO stackoverflow.com/questions/295579/… It shows the fastest way to find the square root and to check if it an integer or not.
$endgroup$
– Jeel Shah
Feb 6 '13 at 11:27










8 Answers
8






active

oldest

votes


















15












$begingroup$

If you use Halley's method, you exhibit cubic convergence! This method is second in the class of Householder's methods.



Halley's method is:
$$
x_{n+1} = x_n - frac{2f(x_n)f'(x_n)}{2[f'(x_n)]^2-f(x_n)f''(x_n)}
$$
If we let $$f(x) = x^2 - a$$
which meets the criteria, (continuous second derivative)


Then Halley's method is:



$$
x_{n+1} = x_n - frac{left(2x_n^3 - 2ax_nright)}{3x_n^2 + a}
$$
Which has the simplification:
$$
x_{n+1} = frac{x_n^3 + 3ax_n}{3x_n^2 + a}
$$
I also will add this document which discusses extensions of the newtonian method.



There exists an extension due to Potra and Pták called the “two-step method” that may be re-written as the iterative scheme
$$x_{n+1} =x_n − frac{f(x_n)+fleft(x_n − frac{f(x_n)}{f′(x_n)}right)}{f'(x_n)}$$
that converges cubically in some neighborhood of of the root $x^{*}$ which does not require the computation of the second derivative.



See: On Newton-type methods with cubic convergence for more information on this topic.



As Hurkyl and others have noted, your best bet is to just use Newton's Method. These alternative methods generally come with more operations per iteration. They aren't really worth the computational cost, but they are a good comparison.






share|cite|improve this answer











$endgroup$













  • $begingroup$
    The householder method is the fastest known algorithm?
    $endgroup$
    – Albert Renshaw
    Feb 6 '13 at 7:31










  • $begingroup$
    @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
    $endgroup$
    – Rustyn
    Feb 6 '13 at 7:40








  • 7




    $begingroup$
    You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
    $endgroup$
    – Hurkyl
    Feb 6 '13 at 7:44










  • $begingroup$
    @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
    $endgroup$
    – Rustyn
    Feb 6 '13 at 7:46








  • 1




    $begingroup$
    @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
    $endgroup$
    – orlp
    Feb 7 '13 at 17:49





















7












$begingroup$

Newton's method for solving $f(x)=x^2-N=0$ leads to the recurrence
$x_{n+1}=x_n-frac{x_n^2-N}{2x_n}=frac{x_n+N/x_n}2$, also known as Heron's method. Since $f'(x)ne0$ at the root, the convergence is quadrattic (i.e. the number of correct decimals doubles with each step once a threshold precision is reached).
The results depend on the starting value, of course. Simply guessing $x_0=1$ leads to
$$x_{1} = 493827161.00000000000\
x_{2} = 246913581.49999999899\
x_{3} = 123456792.74999998937\
x_{4} = 61728400.374999909634\
x_{5} = 30864208.187499266317\
x_{6} = 15432120.093744108961\
x_{7} = 7716092.0468278285538\
x_{8} = 3858110.0230600438248\
x_{9} = 1929183.0086989850523\
x_{10} = 964847.48170274167713\
x_{11} = 482935.55973452582660\
x_{12} = 242490.33277426247529 \
x_{13} = 123281.64823302696814 \
x_{14} = 65646.506775513694016 \
x_{15} = 40345.773393104621684 \
x_{16} = 32412.760144718719221 \
x_{17} = 31441.958847358050036 \
x_{18} = 31426.971626562861740 \
x_{19} = 31426.968052932067262 \
x_{20} = 31426.968052931864079 $$

with small enough error.






share|cite|improve this answer









$endgroup$













  • $begingroup$
    I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
    $endgroup$
    – Albert Renshaw
    Feb 6 '13 at 7:30










  • $begingroup$
    If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
    $endgroup$
    – Gottfried Helms
    Feb 6 '13 at 8:00



















5












$begingroup$

Not a bona fide "alogrithm", but a cute hack nevertheless that I once used in code that required taking an inverse square root millions of times (back when I was doing computational astrophysics) is found here:



http://en.wikipedia.org/wiki/Fast_inverse_square_root



It does use a few iterations of Newton's method, but only after some very, very clever trickery.



I remember naively using trial-and-error optimization to find a "magic number" that would come closest to a direct square root, though of course it was much slower (still faster than just called "sqrt" from math.h) and had a higher error than the above hack.






share|cite|improve this answer









$endgroup$













  • $begingroup$
    Pretty neat. ${}{}{}$
    $endgroup$
    – copper.hat
    Feb 6 '13 at 7:35










  • $begingroup$
    That's the line where the "more magic" does its work.
    $endgroup$
    – Joe Z.
    Feb 6 '13 at 13:46



















3












$begingroup$

I just noticed nobody's pointed out the following trick: to compute $1/sqrt{n}$, Newton's method used to find a root of $f(y) = 1/y^2 - n$ gives the iteration



$$ y leftarrow frac{3y - ny^3}{2}$$



I believe that in some ranges, it is faster to compute an estimate of $sqrt{n}$ by using Newton's method to first compute $1/sqrt{n}$ then invert the answer than it is to use Newton's method directly.



It is likely faster to compute this as



$$ frac{3y - ny^3}{2} = y - frac{n y^2 - 1}{2}y$$



The point being that if $y$ is a good approximation of $1/sqrt{n}$, then $n y^2 - 1$ is a good approximation of $0$, which reduces the amount of precision you need to keep around, and you can play tricks to speed up the calculation of a product if you already know many of its digits.





But it's possible you might be able to do even better by computing an approximation of both $x sim sqrt{n}$ and $y sim 1/sqrt{n}$ simultaneously. I haven't worked through the details.



The reason to hope that this might work out is that a better update calculation for $x$ may be available because $ny sim n/x$, and then a faster way to update $y$ based on $y sim 1/x$ rather than $y sim 1/sqrt{n}$.






share|cite|improve this answer











$endgroup$





















    3












    $begingroup$

    In case you meant not the theoretical speed but the algorithm that runs the fastest on a computer, then it's the "quake 3" algorithm or one of its derivatives which, I believe, is implemented as the GCC's sqrt function at optimization levels 2 and 3. It's ironic that the determining factor here is a clever value and implementation of the initial condition rather than the actual iterative scheme. On my normal laptop it takes about 2.0e-09s to call sqrt with gcc -O2 or gcc -O3, which is about 2.8 times less than what the runner up, the standard implementation of sqrt gives.






    share|cite|improve this answer











    $endgroup$





















      2












      $begingroup$

      A related problem. You can use the Taylor series of $sqrt{x}$ at a point $a$



      $$ sqrt{x} = sum _{n=0}^{infty }frac{sqrt {pi }}{2},{frac {{a}^{frac{1}{2}-n} left( x-aright)^{n}}{Gammaleft( frac{3}{2}-n right)n! }}. $$



      If you pick $a$ to be close to $987654321$, say $ a=987654320 $, then you get fast convergence.






      share|cite|improve this answer











      $endgroup$













      • $begingroup$
        Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
        $endgroup$
        – Frank Science
        Feb 6 '13 at 9:44












      • $begingroup$
        @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
        $endgroup$
        – orlp
        Feb 6 '13 at 10:59










      • $begingroup$
        I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
        $endgroup$
        – Albert Renshaw
        Jun 29 '16 at 19:28



















      1












      $begingroup$

      You can get 20 exact digits with only 3 iterations, but it require a better initial
      guess ( do not use X0=1 )
      This guess is easy to find with a simple table of square roots between 1 and 100.
      Then you start with 3,16 x 10^4 for X0...
      Apply Newton/Hero 3 times, and you get:
      31426,968052931864079... that's 20 correct digits!
      Hello again,
      With my " 101 Dalmatians " algoritm, I can get 5 exact digits with 1 iteration.
      and 15 exact digits with only 2 Newton/Heron iterations.
      I use linear interpolation and second order polynomial to find a very good initial guess.
      I use also a "lookup" in a simple table from 1 to 101, with 4 digits accuracy.
      It is very fast...but an assembly langage code will do beter of course.






      share|cite|improve this answer











      $endgroup$













      • $begingroup$
        Here's a MathJax tutorial :)
        $endgroup$
        – Shaun
        Oct 15 '14 at 17:47



















      0












      $begingroup$

      Babylonian algorithm.



      To determine square root of a.



      $ x_{n+1} = frac{1}{2}(x_n + frac{a}{x_n}) $



      Exponential convergence with $x_0 approx sqrt{a} $. Results independent of starting value $x_0$, shows fastest possible convergence.
      http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method






      share|cite|improve this answer











      $endgroup$









      • 1




        $begingroup$
        Babylonian method is newton's method
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:44










      • $begingroup$
        Yes, but is it the best known? Or Householder's?
        $endgroup$
        – Vigneshwaren
        Feb 6 '13 at 7:50










      • $begingroup$
        Babylonian will do fine. I'm not sure which is faster in the long run.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:55











      Your Answer





      StackExchange.ifUsing("editor", function () {
      return StackExchange.using("mathjaxEditing", function () {
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
      });
      });
      }, "mathjax-editing");

      StackExchange.ready(function() {
      var channelOptions = {
      tags: "".split(" "),
      id: "69"
      };
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function() {
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled) {
      StackExchange.using("snippets", function() {
      createEditor();
      });
      }
      else {
      createEditor();
      }
      });

      function createEditor() {
      StackExchange.prepareEditor({
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: true,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      imageUploader: {
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      },
      noCode: true, onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      });


      }
      });














      draft saved

      draft discarded


















      StackExchange.ready(
      function () {
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f296102%2ffastest-square-root-algorithm%23new-answer', 'question_page');
      }
      );

      Post as a guest















      Required, but never shown

























      8 Answers
      8






      active

      oldest

      votes








      8 Answers
      8






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      15












      $begingroup$

      If you use Halley's method, you exhibit cubic convergence! This method is second in the class of Householder's methods.



      Halley's method is:
      $$
      x_{n+1} = x_n - frac{2f(x_n)f'(x_n)}{2[f'(x_n)]^2-f(x_n)f''(x_n)}
      $$
      If we let $$f(x) = x^2 - a$$
      which meets the criteria, (continuous second derivative)


      Then Halley's method is:



      $$
      x_{n+1} = x_n - frac{left(2x_n^3 - 2ax_nright)}{3x_n^2 + a}
      $$
      Which has the simplification:
      $$
      x_{n+1} = frac{x_n^3 + 3ax_n}{3x_n^2 + a}
      $$
      I also will add this document which discusses extensions of the newtonian method.



      There exists an extension due to Potra and Pták called the “two-step method” that may be re-written as the iterative scheme
      $$x_{n+1} =x_n − frac{f(x_n)+fleft(x_n − frac{f(x_n)}{f′(x_n)}right)}{f'(x_n)}$$
      that converges cubically in some neighborhood of of the root $x^{*}$ which does not require the computation of the second derivative.



      See: On Newton-type methods with cubic convergence for more information on this topic.



      As Hurkyl and others have noted, your best bet is to just use Newton's Method. These alternative methods generally come with more operations per iteration. They aren't really worth the computational cost, but they are a good comparison.






      share|cite|improve this answer











      $endgroup$













      • $begingroup$
        The householder method is the fastest known algorithm?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:31










      • $begingroup$
        @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:40








      • 7




        $begingroup$
        You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
        $endgroup$
        – Hurkyl
        Feb 6 '13 at 7:44










      • $begingroup$
        @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:46








      • 1




        $begingroup$
        @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
        $endgroup$
        – orlp
        Feb 7 '13 at 17:49


















      15












      $begingroup$

      If you use Halley's method, you exhibit cubic convergence! This method is second in the class of Householder's methods.



      Halley's method is:
      $$
      x_{n+1} = x_n - frac{2f(x_n)f'(x_n)}{2[f'(x_n)]^2-f(x_n)f''(x_n)}
      $$
      If we let $$f(x) = x^2 - a$$
      which meets the criteria, (continuous second derivative)


      Then Halley's method is:



      $$
      x_{n+1} = x_n - frac{left(2x_n^3 - 2ax_nright)}{3x_n^2 + a}
      $$
      Which has the simplification:
      $$
      x_{n+1} = frac{x_n^3 + 3ax_n}{3x_n^2 + a}
      $$
      I also will add this document which discusses extensions of the newtonian method.



      There exists an extension due to Potra and Pták called the “two-step method” that may be re-written as the iterative scheme
      $$x_{n+1} =x_n − frac{f(x_n)+fleft(x_n − frac{f(x_n)}{f′(x_n)}right)}{f'(x_n)}$$
      that converges cubically in some neighborhood of of the root $x^{*}$ which does not require the computation of the second derivative.



      See: On Newton-type methods with cubic convergence for more information on this topic.



      As Hurkyl and others have noted, your best bet is to just use Newton's Method. These alternative methods generally come with more operations per iteration. They aren't really worth the computational cost, but they are a good comparison.






      share|cite|improve this answer











      $endgroup$













      • $begingroup$
        The householder method is the fastest known algorithm?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:31










      • $begingroup$
        @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:40








      • 7




        $begingroup$
        You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
        $endgroup$
        – Hurkyl
        Feb 6 '13 at 7:44










      • $begingroup$
        @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:46








      • 1




        $begingroup$
        @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
        $endgroup$
        – orlp
        Feb 7 '13 at 17:49
















      15












      15








      15





      $begingroup$

      If you use Halley's method, you exhibit cubic convergence! This method is second in the class of Householder's methods.



      Halley's method is:
      $$
      x_{n+1} = x_n - frac{2f(x_n)f'(x_n)}{2[f'(x_n)]^2-f(x_n)f''(x_n)}
      $$
      If we let $$f(x) = x^2 - a$$
      which meets the criteria, (continuous second derivative)


      Then Halley's method is:



      $$
      x_{n+1} = x_n - frac{left(2x_n^3 - 2ax_nright)}{3x_n^2 + a}
      $$
      Which has the simplification:
      $$
      x_{n+1} = frac{x_n^3 + 3ax_n}{3x_n^2 + a}
      $$
      I also will add this document which discusses extensions of the newtonian method.



      There exists an extension due to Potra and Pták called the “two-step method” that may be re-written as the iterative scheme
      $$x_{n+1} =x_n − frac{f(x_n)+fleft(x_n − frac{f(x_n)}{f′(x_n)}right)}{f'(x_n)}$$
      that converges cubically in some neighborhood of of the root $x^{*}$ which does not require the computation of the second derivative.



      See: On Newton-type methods with cubic convergence for more information on this topic.



      As Hurkyl and others have noted, your best bet is to just use Newton's Method. These alternative methods generally come with more operations per iteration. They aren't really worth the computational cost, but they are a good comparison.






      share|cite|improve this answer











      $endgroup$



      If you use Halley's method, you exhibit cubic convergence! This method is second in the class of Householder's methods.



      Halley's method is:
      $$
      x_{n+1} = x_n - frac{2f(x_n)f'(x_n)}{2[f'(x_n)]^2-f(x_n)f''(x_n)}
      $$
      If we let $$f(x) = x^2 - a$$
      which meets the criteria, (continuous second derivative)


      Then Halley's method is:



      $$
      x_{n+1} = x_n - frac{left(2x_n^3 - 2ax_nright)}{3x_n^2 + a}
      $$
      Which has the simplification:
      $$
      x_{n+1} = frac{x_n^3 + 3ax_n}{3x_n^2 + a}
      $$
      I also will add this document which discusses extensions of the newtonian method.



      There exists an extension due to Potra and Pták called the “two-step method” that may be re-written as the iterative scheme
      $$x_{n+1} =x_n − frac{f(x_n)+fleft(x_n − frac{f(x_n)}{f′(x_n)}right)}{f'(x_n)}$$
      that converges cubically in some neighborhood of of the root $x^{*}$ which does not require the computation of the second derivative.



      See: On Newton-type methods with cubic convergence for more information on this topic.



      As Hurkyl and others have noted, your best bet is to just use Newton's Method. These alternative methods generally come with more operations per iteration. They aren't really worth the computational cost, but they are a good comparison.







      share|cite|improve this answer














      share|cite|improve this answer



      share|cite|improve this answer








      edited Feb 6 '13 at 23:23

























      answered Feb 6 '13 at 7:06









      RustynRustyn

      6,86411843




      6,86411843












      • $begingroup$
        The householder method is the fastest known algorithm?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:31










      • $begingroup$
        @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:40








      • 7




        $begingroup$
        You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
        $endgroup$
        – Hurkyl
        Feb 6 '13 at 7:44










      • $begingroup$
        @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:46








      • 1




        $begingroup$
        @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
        $endgroup$
        – orlp
        Feb 7 '13 at 17:49




















      • $begingroup$
        The householder method is the fastest known algorithm?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:31










      • $begingroup$
        @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:40








      • 7




        $begingroup$
        You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
        $endgroup$
        – Hurkyl
        Feb 6 '13 at 7:44










      • $begingroup$
        @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
        $endgroup$
        – Rustyn
        Feb 6 '13 at 7:46








      • 1




        $begingroup$
        @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
        $endgroup$
        – orlp
        Feb 7 '13 at 17:49


















      $begingroup$
      The householder method is the fastest known algorithm?
      $endgroup$
      – Albert Renshaw
      Feb 6 '13 at 7:31




      $begingroup$
      The householder method is the fastest known algorithm?
      $endgroup$
      – Albert Renshaw
      Feb 6 '13 at 7:31












      $begingroup$
      @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
      $endgroup$
      – Rustyn
      Feb 6 '13 at 7:40






      $begingroup$
      @AlbertRenshaw I'm not sure if it is the fastest algorithm_ but you should check out the article I included. Cubic convergence is really good, especially if you don't need the second derivative.
      $endgroup$
      – Rustyn
      Feb 6 '13 at 7:40






      7




      7




      $begingroup$
      You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
      $endgroup$
      – Hurkyl
      Feb 6 '13 at 7:44




      $begingroup$
      You can do even better: there's an easy way to get quartic convergence: each iteration, you do two steps of Newton's algorithm. :) Cubic convergence means you only need 2/3 as many iterations as quadratic convergence, but that isn't helpful if each iteration takes 3/2 times as long or more.
      $endgroup$
      – Hurkyl
      Feb 6 '13 at 7:44












      $begingroup$
      @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
      $endgroup$
      – Rustyn
      Feb 6 '13 at 7:46






      $begingroup$
      @Hurkyl I would agree, Newton's method is pretty much the most efficient route. I wanted to include other methods so that he could compare.
      $endgroup$
      – Rustyn
      Feb 6 '13 at 7:46






      1




      1




      $begingroup$
      @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
      $endgroup$
      – orlp
      Feb 7 '13 at 17:49






      $begingroup$
      @Hurkyl: I've been focusing on the former actually. But after benchmarking, I found this method to be slower - the overhead of the additions, squarings and multiplications is just not worth saving a division (on x64 with SSE, that is).
      $endgroup$
      – orlp
      Feb 7 '13 at 17:49













      7












      $begingroup$

      Newton's method for solving $f(x)=x^2-N=0$ leads to the recurrence
      $x_{n+1}=x_n-frac{x_n^2-N}{2x_n}=frac{x_n+N/x_n}2$, also known as Heron's method. Since $f'(x)ne0$ at the root, the convergence is quadrattic (i.e. the number of correct decimals doubles with each step once a threshold precision is reached).
      The results depend on the starting value, of course. Simply guessing $x_0=1$ leads to
      $$x_{1} = 493827161.00000000000\
      x_{2} = 246913581.49999999899\
      x_{3} = 123456792.74999998937\
      x_{4} = 61728400.374999909634\
      x_{5} = 30864208.187499266317\
      x_{6} = 15432120.093744108961\
      x_{7} = 7716092.0468278285538\
      x_{8} = 3858110.0230600438248\
      x_{9} = 1929183.0086989850523\
      x_{10} = 964847.48170274167713\
      x_{11} = 482935.55973452582660\
      x_{12} = 242490.33277426247529 \
      x_{13} = 123281.64823302696814 \
      x_{14} = 65646.506775513694016 \
      x_{15} = 40345.773393104621684 \
      x_{16} = 32412.760144718719221 \
      x_{17} = 31441.958847358050036 \
      x_{18} = 31426.971626562861740 \
      x_{19} = 31426.968052932067262 \
      x_{20} = 31426.968052931864079 $$

      with small enough error.






      share|cite|improve this answer









      $endgroup$













      • $begingroup$
        I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:30










      • $begingroup$
        If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
        $endgroup$
        – Gottfried Helms
        Feb 6 '13 at 8:00
















      7












      $begingroup$

      Newton's method for solving $f(x)=x^2-N=0$ leads to the recurrence
      $x_{n+1}=x_n-frac{x_n^2-N}{2x_n}=frac{x_n+N/x_n}2$, also known as Heron's method. Since $f'(x)ne0$ at the root, the convergence is quadrattic (i.e. the number of correct decimals doubles with each step once a threshold precision is reached).
      The results depend on the starting value, of course. Simply guessing $x_0=1$ leads to
      $$x_{1} = 493827161.00000000000\
      x_{2} = 246913581.49999999899\
      x_{3} = 123456792.74999998937\
      x_{4} = 61728400.374999909634\
      x_{5} = 30864208.187499266317\
      x_{6} = 15432120.093744108961\
      x_{7} = 7716092.0468278285538\
      x_{8} = 3858110.0230600438248\
      x_{9} = 1929183.0086989850523\
      x_{10} = 964847.48170274167713\
      x_{11} = 482935.55973452582660\
      x_{12} = 242490.33277426247529 \
      x_{13} = 123281.64823302696814 \
      x_{14} = 65646.506775513694016 \
      x_{15} = 40345.773393104621684 \
      x_{16} = 32412.760144718719221 \
      x_{17} = 31441.958847358050036 \
      x_{18} = 31426.971626562861740 \
      x_{19} = 31426.968052932067262 \
      x_{20} = 31426.968052931864079 $$

      with small enough error.






      share|cite|improve this answer









      $endgroup$













      • $begingroup$
        I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:30










      • $begingroup$
        If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
        $endgroup$
        – Gottfried Helms
        Feb 6 '13 at 8:00














      7












      7








      7





      $begingroup$

      Newton's method for solving $f(x)=x^2-N=0$ leads to the recurrence
      $x_{n+1}=x_n-frac{x_n^2-N}{2x_n}=frac{x_n+N/x_n}2$, also known as Heron's method. Since $f'(x)ne0$ at the root, the convergence is quadrattic (i.e. the number of correct decimals doubles with each step once a threshold precision is reached).
      The results depend on the starting value, of course. Simply guessing $x_0=1$ leads to
      $$x_{1} = 493827161.00000000000\
      x_{2} = 246913581.49999999899\
      x_{3} = 123456792.74999998937\
      x_{4} = 61728400.374999909634\
      x_{5} = 30864208.187499266317\
      x_{6} = 15432120.093744108961\
      x_{7} = 7716092.0468278285538\
      x_{8} = 3858110.0230600438248\
      x_{9} = 1929183.0086989850523\
      x_{10} = 964847.48170274167713\
      x_{11} = 482935.55973452582660\
      x_{12} = 242490.33277426247529 \
      x_{13} = 123281.64823302696814 \
      x_{14} = 65646.506775513694016 \
      x_{15} = 40345.773393104621684 \
      x_{16} = 32412.760144718719221 \
      x_{17} = 31441.958847358050036 \
      x_{18} = 31426.971626562861740 \
      x_{19} = 31426.968052932067262 \
      x_{20} = 31426.968052931864079 $$

      with small enough error.






      share|cite|improve this answer









      $endgroup$



      Newton's method for solving $f(x)=x^2-N=0$ leads to the recurrence
      $x_{n+1}=x_n-frac{x_n^2-N}{2x_n}=frac{x_n+N/x_n}2$, also known as Heron's method. Since $f'(x)ne0$ at the root, the convergence is quadrattic (i.e. the number of correct decimals doubles with each step once a threshold precision is reached).
      The results depend on the starting value, of course. Simply guessing $x_0=1$ leads to
      $$x_{1} = 493827161.00000000000\
      x_{2} = 246913581.49999999899\
      x_{3} = 123456792.74999998937\
      x_{4} = 61728400.374999909634\
      x_{5} = 30864208.187499266317\
      x_{6} = 15432120.093744108961\
      x_{7} = 7716092.0468278285538\
      x_{8} = 3858110.0230600438248\
      x_{9} = 1929183.0086989850523\
      x_{10} = 964847.48170274167713\
      x_{11} = 482935.55973452582660\
      x_{12} = 242490.33277426247529 \
      x_{13} = 123281.64823302696814 \
      x_{14} = 65646.506775513694016 \
      x_{15} = 40345.773393104621684 \
      x_{16} = 32412.760144718719221 \
      x_{17} = 31441.958847358050036 \
      x_{18} = 31426.971626562861740 \
      x_{19} = 31426.968052932067262 \
      x_{20} = 31426.968052931864079 $$

      with small enough error.







      share|cite|improve this answer












      share|cite|improve this answer



      share|cite|improve this answer










      answered Feb 6 '13 at 7:08









      Hagen von EitzenHagen von Eitzen

      280k23271503




      280k23271503












      • $begingroup$
        I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:30










      • $begingroup$
        If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
        $endgroup$
        – Gottfried Helms
        Feb 6 '13 at 8:00


















      • $begingroup$
        I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
        $endgroup$
        – Albert Renshaw
        Feb 6 '13 at 7:30










      • $begingroup$
        If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
        $endgroup$
        – Gottfried Helms
        Feb 6 '13 at 8:00
















      $begingroup$
      I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
      $endgroup$
      – Albert Renshaw
      Feb 6 '13 at 7:30




      $begingroup$
      I found my error in Newtons method now! I coded it wrong, thank you! +1 Do you know of algorithms faster than this one?
      $endgroup$
      – Albert Renshaw
      Feb 6 '13 at 7:30












      $begingroup$
      If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
      $endgroup$
      – Gottfried Helms
      Feb 6 '13 at 8:00




      $begingroup$
      If the representation of your number x allows this, begin with a number with half of the number of digits of x (where the digits left of the decimal point are taken). Then in Hagen's list you step in immediately at $x_{17}$ ... (take care: you've given $97654321$ while Hagen uses $987654321$ )
      $endgroup$
      – Gottfried Helms
      Feb 6 '13 at 8:00











      5












      $begingroup$

      Not a bona fide "alogrithm", but a cute hack nevertheless that I once used in code that required taking an inverse square root millions of times (back when I was doing computational astrophysics) is found here:



      http://en.wikipedia.org/wiki/Fast_inverse_square_root



      It does use a few iterations of Newton's method, but only after some very, very clever trickery.



      I remember naively using trial-and-error optimization to find a "magic number" that would come closest to a direct square root, though of course it was much slower (still faster than just called "sqrt" from math.h) and had a higher error than the above hack.






      share|cite|improve this answer









      $endgroup$













      • $begingroup$
        Pretty neat. ${}{}{}$
        $endgroup$
        – copper.hat
        Feb 6 '13 at 7:35










      • $begingroup$
        That's the line where the "more magic" does its work.
        $endgroup$
        – Joe Z.
        Feb 6 '13 at 13:46
















      5












      $begingroup$

      Not a bona fide "alogrithm", but a cute hack nevertheless that I once used in code that required taking an inverse square root millions of times (back when I was doing computational astrophysics) is found here:



      http://en.wikipedia.org/wiki/Fast_inverse_square_root



      It does use a few iterations of Newton's method, but only after some very, very clever trickery.



      I remember naively using trial-and-error optimization to find a "magic number" that would come closest to a direct square root, though of course it was much slower (still faster than just called "sqrt" from math.h) and had a higher error than the above hack.






      share|cite|improve this answer









      $endgroup$













      • $begingroup$
        Pretty neat. ${}{}{}$
        $endgroup$
        – copper.hat
        Feb 6 '13 at 7:35










      • $begingroup$
        That's the line where the "more magic" does its work.
        $endgroup$
        – Joe Z.
        Feb 6 '13 at 13:46














      5












      5








      5





      $begingroup$

      Not a bona fide "alogrithm", but a cute hack nevertheless that I once used in code that required taking an inverse square root millions of times (back when I was doing computational astrophysics) is found here:



      http://en.wikipedia.org/wiki/Fast_inverse_square_root



      It does use a few iterations of Newton's method, but only after some very, very clever trickery.



      I remember naively using trial-and-error optimization to find a "magic number" that would come closest to a direct square root, though of course it was much slower (still faster than just called "sqrt" from math.h) and had a higher error than the above hack.






      share|cite|improve this answer









      $endgroup$



      Not a bona fide "alogrithm", but a cute hack nevertheless that I once used in code that required taking an inverse square root millions of times (back when I was doing computational astrophysics) is found here:



      http://en.wikipedia.org/wiki/Fast_inverse_square_root



      It does use a few iterations of Newton's method, but only after some very, very clever trickery.



      I remember naively using trial-and-error optimization to find a "magic number" that would come closest to a direct square root, though of course it was much slower (still faster than just called "sqrt" from math.h) and had a higher error than the above hack.







      share|cite|improve this answer












      share|cite|improve this answer



      share|cite|improve this answer










      answered Feb 6 '13 at 7:30









      gmossgmoss

      57239




      57239












      • $begingroup$
        Pretty neat. ${}{}{}$
        $endgroup$
        – copper.hat
        Feb 6 '13 at 7:35










      • $begingroup$
        That's the line where the "more magic" does its work.
        $endgroup$
        – Joe Z.
        Feb 6 '13 at 13:46


















      • $begingroup$
        Pretty neat. ${}{}{}$
        $endgroup$
        – copper.hat
        Feb 6 '13 at 7:35










      • $begingroup$
        That's the line where the "more magic" does its work.
        $endgroup$
        – Joe Z.
        Feb 6 '13 at 13:46
















      $begingroup$
      Pretty neat. ${}{}{}$
      $endgroup$
      – copper.hat
      Feb 6 '13 at 7:35




      $begingroup$
      Pretty neat. ${}{}{}$
      $endgroup$
      – copper.hat
      Feb 6 '13 at 7:35












      $begingroup$
      That's the line where the "more magic" does its work.
      $endgroup$
      – Joe Z.
      Feb 6 '13 at 13:46




      $begingroup$
      That's the line where the "more magic" does its work.
      $endgroup$
      – Joe Z.
      Feb 6 '13 at 13:46











      3












      $begingroup$

      I just noticed nobody's pointed out the following trick: to compute $1/sqrt{n}$, Newton's method used to find a root of $f(y) = 1/y^2 - n$ gives the iteration



      $$ y leftarrow frac{3y - ny^3}{2}$$



      I believe that in some ranges, it is faster to compute an estimate of $sqrt{n}$ by using Newton's method to first compute $1/sqrt{n}$ then invert the answer than it is to use Newton's method directly.



      It is likely faster to compute this as



      $$ frac{3y - ny^3}{2} = y - frac{n y^2 - 1}{2}y$$



      The point being that if $y$ is a good approximation of $1/sqrt{n}$, then $n y^2 - 1$ is a good approximation of $0$, which reduces the amount of precision you need to keep around, and you can play tricks to speed up the calculation of a product if you already know many of its digits.





      But it's possible you might be able to do even better by computing an approximation of both $x sim sqrt{n}$ and $y sim 1/sqrt{n}$ simultaneously. I haven't worked through the details.



      The reason to hope that this might work out is that a better update calculation for $x$ may be available because $ny sim n/x$, and then a faster way to update $y$ based on $y sim 1/x$ rather than $y sim 1/sqrt{n}$.






      share|cite|improve this answer











      $endgroup$


















        3












        $begingroup$

        I just noticed nobody's pointed out the following trick: to compute $1/sqrt{n}$, Newton's method used to find a root of $f(y) = 1/y^2 - n$ gives the iteration



        $$ y leftarrow frac{3y - ny^3}{2}$$



        I believe that in some ranges, it is faster to compute an estimate of $sqrt{n}$ by using Newton's method to first compute $1/sqrt{n}$ then invert the answer than it is to use Newton's method directly.



        It is likely faster to compute this as



        $$ frac{3y - ny^3}{2} = y - frac{n y^2 - 1}{2}y$$



        The point being that if $y$ is a good approximation of $1/sqrt{n}$, then $n y^2 - 1$ is a good approximation of $0$, which reduces the amount of precision you need to keep around, and you can play tricks to speed up the calculation of a product if you already know many of its digits.





        But it's possible you might be able to do even better by computing an approximation of both $x sim sqrt{n}$ and $y sim 1/sqrt{n}$ simultaneously. I haven't worked through the details.



        The reason to hope that this might work out is that a better update calculation for $x$ may be available because $ny sim n/x$, and then a faster way to update $y$ based on $y sim 1/x$ rather than $y sim 1/sqrt{n}$.






        share|cite|improve this answer











        $endgroup$
















          3












          3








          3





          $begingroup$

          I just noticed nobody's pointed out the following trick: to compute $1/sqrt{n}$, Newton's method used to find a root of $f(y) = 1/y^2 - n$ gives the iteration



          $$ y leftarrow frac{3y - ny^3}{2}$$



          I believe that in some ranges, it is faster to compute an estimate of $sqrt{n}$ by using Newton's method to first compute $1/sqrt{n}$ then invert the answer than it is to use Newton's method directly.



          It is likely faster to compute this as



          $$ frac{3y - ny^3}{2} = y - frac{n y^2 - 1}{2}y$$



          The point being that if $y$ is a good approximation of $1/sqrt{n}$, then $n y^2 - 1$ is a good approximation of $0$, which reduces the amount of precision you need to keep around, and you can play tricks to speed up the calculation of a product if you already know many of its digits.





          But it's possible you might be able to do even better by computing an approximation of both $x sim sqrt{n}$ and $y sim 1/sqrt{n}$ simultaneously. I haven't worked through the details.



          The reason to hope that this might work out is that a better update calculation for $x$ may be available because $ny sim n/x$, and then a faster way to update $y$ based on $y sim 1/x$ rather than $y sim 1/sqrt{n}$.






          share|cite|improve this answer











          $endgroup$



          I just noticed nobody's pointed out the following trick: to compute $1/sqrt{n}$, Newton's method used to find a root of $f(y) = 1/y^2 - n$ gives the iteration



          $$ y leftarrow frac{3y - ny^3}{2}$$



          I believe that in some ranges, it is faster to compute an estimate of $sqrt{n}$ by using Newton's method to first compute $1/sqrt{n}$ then invert the answer than it is to use Newton's method directly.



          It is likely faster to compute this as



          $$ frac{3y - ny^3}{2} = y - frac{n y^2 - 1}{2}y$$



          The point being that if $y$ is a good approximation of $1/sqrt{n}$, then $n y^2 - 1$ is a good approximation of $0$, which reduces the amount of precision you need to keep around, and you can play tricks to speed up the calculation of a product if you already know many of its digits.





          But it's possible you might be able to do even better by computing an approximation of both $x sim sqrt{n}$ and $y sim 1/sqrt{n}$ simultaneously. I haven't worked through the details.



          The reason to hope that this might work out is that a better update calculation for $x$ may be available because $ny sim n/x$, and then a faster way to update $y$ based on $y sim 1/x$ rather than $y sim 1/sqrt{n}$.







          share|cite|improve this answer














          share|cite|improve this answer



          share|cite|improve this answer








          edited Feb 7 '13 at 0:14

























          answered Feb 6 '13 at 23:57









          HurkylHurkyl

          111k9120262




          111k9120262























              3












              $begingroup$

              In case you meant not the theoretical speed but the algorithm that runs the fastest on a computer, then it's the "quake 3" algorithm or one of its derivatives which, I believe, is implemented as the GCC's sqrt function at optimization levels 2 and 3. It's ironic that the determining factor here is a clever value and implementation of the initial condition rather than the actual iterative scheme. On my normal laptop it takes about 2.0e-09s to call sqrt with gcc -O2 or gcc -O3, which is about 2.8 times less than what the runner up, the standard implementation of sqrt gives.






              share|cite|improve this answer











              $endgroup$


















                3












                $begingroup$

                In case you meant not the theoretical speed but the algorithm that runs the fastest on a computer, then it's the "quake 3" algorithm or one of its derivatives which, I believe, is implemented as the GCC's sqrt function at optimization levels 2 and 3. It's ironic that the determining factor here is a clever value and implementation of the initial condition rather than the actual iterative scheme. On my normal laptop it takes about 2.0e-09s to call sqrt with gcc -O2 or gcc -O3, which is about 2.8 times less than what the runner up, the standard implementation of sqrt gives.






                share|cite|improve this answer











                $endgroup$
















                  3












                  3








                  3





                  $begingroup$

                  In case you meant not the theoretical speed but the algorithm that runs the fastest on a computer, then it's the "quake 3" algorithm or one of its derivatives which, I believe, is implemented as the GCC's sqrt function at optimization levels 2 and 3. It's ironic that the determining factor here is a clever value and implementation of the initial condition rather than the actual iterative scheme. On my normal laptop it takes about 2.0e-09s to call sqrt with gcc -O2 or gcc -O3, which is about 2.8 times less than what the runner up, the standard implementation of sqrt gives.






                  share|cite|improve this answer











                  $endgroup$



                  In case you meant not the theoretical speed but the algorithm that runs the fastest on a computer, then it's the "quake 3" algorithm or one of its derivatives which, I believe, is implemented as the GCC's sqrt function at optimization levels 2 and 3. It's ironic that the determining factor here is a clever value and implementation of the initial condition rather than the actual iterative scheme. On my normal laptop it takes about 2.0e-09s to call sqrt with gcc -O2 or gcc -O3, which is about 2.8 times less than what the runner up, the standard implementation of sqrt gives.







                  share|cite|improve this answer














                  share|cite|improve this answer



                  share|cite|improve this answer








                  edited May 4 '14 at 7:08

























                  answered May 4 '14 at 6:30









                  rytisrytis

                  1365




                  1365























                      2












                      $begingroup$

                      A related problem. You can use the Taylor series of $sqrt{x}$ at a point $a$



                      $$ sqrt{x} = sum _{n=0}^{infty }frac{sqrt {pi }}{2},{frac {{a}^{frac{1}{2}-n} left( x-aright)^{n}}{Gammaleft( frac{3}{2}-n right)n! }}. $$



                      If you pick $a$ to be close to $987654321$, say $ a=987654320 $, then you get fast convergence.






                      share|cite|improve this answer











                      $endgroup$













                      • $begingroup$
                        Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
                        $endgroup$
                        – Frank Science
                        Feb 6 '13 at 9:44












                      • $begingroup$
                        @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
                        $endgroup$
                        – orlp
                        Feb 6 '13 at 10:59










                      • $begingroup$
                        I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
                        $endgroup$
                        – Albert Renshaw
                        Jun 29 '16 at 19:28
















                      2












                      $begingroup$

                      A related problem. You can use the Taylor series of $sqrt{x}$ at a point $a$



                      $$ sqrt{x} = sum _{n=0}^{infty }frac{sqrt {pi }}{2},{frac {{a}^{frac{1}{2}-n} left( x-aright)^{n}}{Gammaleft( frac{3}{2}-n right)n! }}. $$



                      If you pick $a$ to be close to $987654321$, say $ a=987654320 $, then you get fast convergence.






                      share|cite|improve this answer











                      $endgroup$













                      • $begingroup$
                        Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
                        $endgroup$
                        – Frank Science
                        Feb 6 '13 at 9:44












                      • $begingroup$
                        @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
                        $endgroup$
                        – orlp
                        Feb 6 '13 at 10:59










                      • $begingroup$
                        I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
                        $endgroup$
                        – Albert Renshaw
                        Jun 29 '16 at 19:28














                      2












                      2








                      2





                      $begingroup$

                      A related problem. You can use the Taylor series of $sqrt{x}$ at a point $a$



                      $$ sqrt{x} = sum _{n=0}^{infty }frac{sqrt {pi }}{2},{frac {{a}^{frac{1}{2}-n} left( x-aright)^{n}}{Gammaleft( frac{3}{2}-n right)n! }}. $$



                      If you pick $a$ to be close to $987654321$, say $ a=987654320 $, then you get fast convergence.






                      share|cite|improve this answer











                      $endgroup$



                      A related problem. You can use the Taylor series of $sqrt{x}$ at a point $a$



                      $$ sqrt{x} = sum _{n=0}^{infty }frac{sqrt {pi }}{2},{frac {{a}^{frac{1}{2}-n} left( x-aright)^{n}}{Gammaleft( frac{3}{2}-n right)n! }}. $$



                      If you pick $a$ to be close to $987654321$, say $ a=987654320 $, then you get fast convergence.







                      share|cite|improve this answer














                      share|cite|improve this answer



                      share|cite|improve this answer








                      edited Apr 13 '17 at 12:21









                      Community

                      1




                      1










                      answered Feb 6 '13 at 7:52









                      Mhenni BenghorbalMhenni Benghorbal

                      43.2k63675




                      43.2k63675












                      • $begingroup$
                        Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
                        $endgroup$
                        – Frank Science
                        Feb 6 '13 at 9:44












                      • $begingroup$
                        @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
                        $endgroup$
                        – orlp
                        Feb 6 '13 at 10:59










                      • $begingroup$
                        I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
                        $endgroup$
                        – Albert Renshaw
                        Jun 29 '16 at 19:28


















                      • $begingroup$
                        Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
                        $endgroup$
                        – Frank Science
                        Feb 6 '13 at 9:44












                      • $begingroup$
                        @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
                        $endgroup$
                        – orlp
                        Feb 6 '13 at 10:59










                      • $begingroup$
                        I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
                        $endgroup$
                        – Albert Renshaw
                        Jun 29 '16 at 19:28
















                      $begingroup$
                      Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
                      $endgroup$
                      – Frank Science
                      Feb 6 '13 at 9:44






                      $begingroup$
                      Hey, how do you calculate $pi$ and $Gamma(x)$ quickly in computer?
                      $endgroup$
                      – Frank Science
                      Feb 6 '13 at 9:44














                      $begingroup$
                      @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
                      $endgroup$
                      – orlp
                      Feb 6 '13 at 10:59




                      $begingroup$
                      @FrankScience: well you'd hardcode Pi of course. The gamma function is the largest hurdle here, I think.
                      $endgroup$
                      – orlp
                      Feb 6 '13 at 10:59












                      $begingroup$
                      I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
                      $endgroup$
                      – Albert Renshaw
                      Jun 29 '16 at 19:28




                      $begingroup$
                      I think I remember reading something a while back about the natural log of gamma being easier to compute than gamma itself, then you just raise e to that power, but these are all iteratively long steps.
                      $endgroup$
                      – Albert Renshaw
                      Jun 29 '16 at 19:28











                      1












                      $begingroup$

                      You can get 20 exact digits with only 3 iterations, but it require a better initial
                      guess ( do not use X0=1 )
                      This guess is easy to find with a simple table of square roots between 1 and 100.
                      Then you start with 3,16 x 10^4 for X0...
                      Apply Newton/Hero 3 times, and you get:
                      31426,968052931864079... that's 20 correct digits!
                      Hello again,
                      With my " 101 Dalmatians " algoritm, I can get 5 exact digits with 1 iteration.
                      and 15 exact digits with only 2 Newton/Heron iterations.
                      I use linear interpolation and second order polynomial to find a very good initial guess.
                      I use also a "lookup" in a simple table from 1 to 101, with 4 digits accuracy.
                      It is very fast...but an assembly langage code will do beter of course.






                      share|cite|improve this answer











                      $endgroup$













                      • $begingroup$
                        Here's a MathJax tutorial :)
                        $endgroup$
                        – Shaun
                        Oct 15 '14 at 17:47
















                      1












                      $begingroup$

                      You can get 20 exact digits with only 3 iterations, but it require a better initial
                      guess ( do not use X0=1 )
                      This guess is easy to find with a simple table of square roots between 1 and 100.
                      Then you start with 3,16 x 10^4 for X0...
                      Apply Newton/Hero 3 times, and you get:
                      31426,968052931864079... that's 20 correct digits!
                      Hello again,
                      With my " 101 Dalmatians " algoritm, I can get 5 exact digits with 1 iteration.
                      and 15 exact digits with only 2 Newton/Heron iterations.
                      I use linear interpolation and second order polynomial to find a very good initial guess.
                      I use also a "lookup" in a simple table from 1 to 101, with 4 digits accuracy.
                      It is very fast...but an assembly langage code will do beter of course.






                      share|cite|improve this answer











                      $endgroup$













                      • $begingroup$
                        Here's a MathJax tutorial :)
                        $endgroup$
                        – Shaun
                        Oct 15 '14 at 17:47














                      1












                      1








                      1





                      $begingroup$

                      You can get 20 exact digits with only 3 iterations, but it require a better initial
                      guess ( do not use X0=1 )
                      This guess is easy to find with a simple table of square roots between 1 and 100.
                      Then you start with 3,16 x 10^4 for X0...
                      Apply Newton/Hero 3 times, and you get:
                      31426,968052931864079... that's 20 correct digits!
                      Hello again,
                      With my " 101 Dalmatians " algoritm, I can get 5 exact digits with 1 iteration.
                      and 15 exact digits with only 2 Newton/Heron iterations.
                      I use linear interpolation and second order polynomial to find a very good initial guess.
                      I use also a "lookup" in a simple table from 1 to 101, with 4 digits accuracy.
                      It is very fast...but an assembly langage code will do beter of course.






                      share|cite|improve this answer











                      $endgroup$



                      You can get 20 exact digits with only 3 iterations, but it require a better initial
                      guess ( do not use X0=1 )
                      This guess is easy to find with a simple table of square roots between 1 and 100.
                      Then you start with 3,16 x 10^4 for X0...
                      Apply Newton/Hero 3 times, and you get:
                      31426,968052931864079... that's 20 correct digits!
                      Hello again,
                      With my " 101 Dalmatians " algoritm, I can get 5 exact digits with 1 iteration.
                      and 15 exact digits with only 2 Newton/Heron iterations.
                      I use linear interpolation and second order polynomial to find a very good initial guess.
                      I use also a "lookup" in a simple table from 1 to 101, with 4 digits accuracy.
                      It is very fast...but an assembly langage code will do beter of course.







                      share|cite|improve this answer














                      share|cite|improve this answer



                      share|cite|improve this answer








                      edited Oct 16 '14 at 17:40

























                      answered Oct 15 '14 at 17:43









                      SergeSerge

                      113




                      113












                      • $begingroup$
                        Here's a MathJax tutorial :)
                        $endgroup$
                        – Shaun
                        Oct 15 '14 at 17:47


















                      • $begingroup$
                        Here's a MathJax tutorial :)
                        $endgroup$
                        – Shaun
                        Oct 15 '14 at 17:47
















                      $begingroup$
                      Here's a MathJax tutorial :)
                      $endgroup$
                      – Shaun
                      Oct 15 '14 at 17:47




                      $begingroup$
                      Here's a MathJax tutorial :)
                      $endgroup$
                      – Shaun
                      Oct 15 '14 at 17:47











                      0












                      $begingroup$

                      Babylonian algorithm.



                      To determine square root of a.



                      $ x_{n+1} = frac{1}{2}(x_n + frac{a}{x_n}) $



                      Exponential convergence with $x_0 approx sqrt{a} $. Results independent of starting value $x_0$, shows fastest possible convergence.
                      http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method






                      share|cite|improve this answer











                      $endgroup$









                      • 1




                        $begingroup$
                        Babylonian method is newton's method
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:44










                      • $begingroup$
                        Yes, but is it the best known? Or Householder's?
                        $endgroup$
                        – Vigneshwaren
                        Feb 6 '13 at 7:50










                      • $begingroup$
                        Babylonian will do fine. I'm not sure which is faster in the long run.
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:55
















                      0












                      $begingroup$

                      Babylonian algorithm.



                      To determine square root of a.



                      $ x_{n+1} = frac{1}{2}(x_n + frac{a}{x_n}) $



                      Exponential convergence with $x_0 approx sqrt{a} $. Results independent of starting value $x_0$, shows fastest possible convergence.
                      http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method






                      share|cite|improve this answer











                      $endgroup$









                      • 1




                        $begingroup$
                        Babylonian method is newton's method
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:44










                      • $begingroup$
                        Yes, but is it the best known? Or Householder's?
                        $endgroup$
                        – Vigneshwaren
                        Feb 6 '13 at 7:50










                      • $begingroup$
                        Babylonian will do fine. I'm not sure which is faster in the long run.
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:55














                      0












                      0








                      0





                      $begingroup$

                      Babylonian algorithm.



                      To determine square root of a.



                      $ x_{n+1} = frac{1}{2}(x_n + frac{a}{x_n}) $



                      Exponential convergence with $x_0 approx sqrt{a} $. Results independent of starting value $x_0$, shows fastest possible convergence.
                      http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method






                      share|cite|improve this answer











                      $endgroup$



                      Babylonian algorithm.



                      To determine square root of a.



                      $ x_{n+1} = frac{1}{2}(x_n + frac{a}{x_n}) $



                      Exponential convergence with $x_0 approx sqrt{a} $. Results independent of starting value $x_0$, shows fastest possible convergence.
                      http://en.wikipedia.org/wiki/Methods_of_computing_square_roots#Babylonian_method







                      share|cite|improve this answer














                      share|cite|improve this answer



                      share|cite|improve this answer








                      edited Feb 6 '13 at 7:57

























                      answered Feb 6 '13 at 7:42









                      VigneshwarenVigneshwaren

                      308111




                      308111








                      • 1




                        $begingroup$
                        Babylonian method is newton's method
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:44










                      • $begingroup$
                        Yes, but is it the best known? Or Householder's?
                        $endgroup$
                        – Vigneshwaren
                        Feb 6 '13 at 7:50










                      • $begingroup$
                        Babylonian will do fine. I'm not sure which is faster in the long run.
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:55














                      • 1




                        $begingroup$
                        Babylonian method is newton's method
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:44










                      • $begingroup$
                        Yes, but is it the best known? Or Householder's?
                        $endgroup$
                        – Vigneshwaren
                        Feb 6 '13 at 7:50










                      • $begingroup$
                        Babylonian will do fine. I'm not sure which is faster in the long run.
                        $endgroup$
                        – Rustyn
                        Feb 6 '13 at 7:55








                      1




                      1




                      $begingroup$
                      Babylonian method is newton's method
                      $endgroup$
                      – Rustyn
                      Feb 6 '13 at 7:44




                      $begingroup$
                      Babylonian method is newton's method
                      $endgroup$
                      – Rustyn
                      Feb 6 '13 at 7:44












                      $begingroup$
                      Yes, but is it the best known? Or Householder's?
                      $endgroup$
                      – Vigneshwaren
                      Feb 6 '13 at 7:50




                      $begingroup$
                      Yes, but is it the best known? Or Householder's?
                      $endgroup$
                      – Vigneshwaren
                      Feb 6 '13 at 7:50












                      $begingroup$
                      Babylonian will do fine. I'm not sure which is faster in the long run.
                      $endgroup$
                      – Rustyn
                      Feb 6 '13 at 7:55




                      $begingroup$
                      Babylonian will do fine. I'm not sure which is faster in the long run.
                      $endgroup$
                      – Rustyn
                      Feb 6 '13 at 7:55


















                      draft saved

                      draft discarded




















































                      Thanks for contributing an answer to Mathematics Stack Exchange!


                      • Please be sure to answer the question. Provide details and share your research!

                      But avoid



                      • Asking for help, clarification, or responding to other answers.

                      • Making statements based on opinion; back them up with references or personal experience.


                      Use MathJax to format equations. MathJax reference.


                      To learn more, see our tips on writing great answers.




                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function () {
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmath.stackexchange.com%2fquestions%2f296102%2ffastest-square-root-algorithm%23new-answer', 'question_page');
                      }
                      );

                      Post as a guest















                      Required, but never shown





















































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown

































                      Required, but never shown














                      Required, but never shown












                      Required, but never shown







                      Required, but never shown







                      Popular posts from this blog

                      Bressuire

                      Cabo Verde

                      Gyllenstierna