OctPerformance – oct


Here are some simple benchmarks on the performance of this quad-double implementation. These benchmarks were run using CMUCL on a 1.42 GHz PPC. Each column gives the time relative to the same operation on a double-float. The %quad-double column shows the time using the internal implementation, without the overhead of CLOS. The QD-REAL column shows the effect of CLOS dispatch.

Operation	%quad-double	`QD-REAL`	Notes
Addition	36	73
Multiplication	420	950
Division	900	1200
Square root	125	133	There is no FP sqrt instruction on a PPC

Here are some timing results using CMUCL on a 1.5 GHz UltraSparc IIIi:

Operation	%quad-double	`QD-REAL`	Notes
Addition	120	240
Multiplication	390	660
Division	1100	1450
Square root	13400	13600	UltraSparc has a FP sqrt instruction

Here are some timing results using CMUCL with SSE2 support on a 3.06 GHz Core i3:

Operation	%quad-double	`QD-REAL`	Notes
Addition	288	390
Multiplication	536	673
Division	2528	2785
Square root	3572	3739
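
As a rough reading of the three tables above, dividing the QD-REAL entry by the %quad-double entry in each row gives an estimate of what CLOS dispatch costs for that operation. The sketch below only does that arithmetic on the numbers copied from the tables; the names `TABLES` and `clos_overhead` are made up for illustration and are not part of Oct.

```python
# Rows copied from the three tables above: (operation, %quad-double, QD-REAL).
# Every entry is a time relative to the same double-float operation.
TABLES = {
    "1.42 GHz PPC": [
        ("Addition", 36, 73),
        ("Multiplication", 420, 950),
        ("Division", 900, 1200),
        ("Square root", 125, 133),
    ],
    "1.5 GHz UltraSparc IIIi": [
        ("Addition", 120, 240),
        ("Multiplication", 390, 660),
        ("Division", 1100, 1450),
        ("Square root", 13400, 13600),
    ],
    "3.06 GHz Core i3": [
        ("Addition", 288, 390),
        ("Multiplication", 536, 673),
        ("Division", 2528, 2785),
        ("Square root", 3572, 3739),
    ],
}


def clos_overhead(tables):
    """Return QD-REAL / %quad-double for every machine and operation."""
    return {
        machine: {op: qd_real / internal for op, internal, qd_real in rows}
        for machine, rows in tables.items()
    }


if __name__ == "__main__":
    for machine, ratios in clos_overhead(TABLES).items():
        print(machine)
        for op, ratio in ratios.items():
            print(f"  {op:15s} {ratio:.2f}x")
```

On these numbers the ratio is largest for the cheap operations (about 2x for addition on the PPC and the UltraSparc) and close to 1 for square root, where the arithmetic itself dominates the dispatch cost.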

Hida's QD package includes a few timing tests. A Lisp equivalent was written, and the timing results are given here. The Lisp version was meant to be exactly the same as the QD reference, but there are no guarantees on that.

Test	QD (µs/op)	Oct (µs/op)	Relative time Oct/QD
add	0.236	1.16	4.91
mul	0.749	1.54	2.06
div	3.00	3.11	1.03
sqrt	10.57	12.2	1.15
sin	57.33	64.5	1.12
log	194	119	0.613

The second and third columns are microseconds per operation. The last column is the time for Oct relative to QD, so values above 1 mean Oct is slower. All of these were run on a 1.5 GHz UltraSparc III. Sun Studio 11 was used to compile the C code. CMUCL 2007-10 was used for the Lisp code.

It's surprising that Oct does as well as it does. To be fair, the times for Oct include the cost of CLOS dispatch, since QD uses templates and classes in the tests. Except for add and mul, Oct is within about 15% of QD: div is within a few percent, while sqrt and sin are 12 to 15% slower. I don't know why the sin test is slower in Oct, but the test did include the accurate argument reduction. The log test is quite a bit faster for Oct, taking about 60% of the QD time. This is probably due to using a different algorithm: QD uses a Newton iteration to compute the log, while Oct uses Halley's iteration.
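
To make the Newton versus Halley comparison concrete, here is a minimal sketch of both iterations applied to f(x) = exp(x) - a, whose root is log(a). It is not the code from QD or Oct. It uses Python's `decimal` module at 70 digits as a stand-in for quad-double arithmetic, and the function names `log_newton` and `log_halley` are made up for illustration. Both start from a double-precision guess.

```python
import math
from decimal import Decimal, getcontext

# Assumption: 70 decimal digits, a little more than a quad-double carries.
getcontext().prec = 70


def log_newton(a, steps):
    """Newton's iteration on f(x) = exp(x) - a:  x <- x + a*exp(-x) - 1."""
    x = Decimal(math.log(a))  # double-precision starting guess
    for _ in range(steps):
        x = x + a * (-x).exp() - 1
    return x


def log_halley(a, steps):
    """Halley's iteration on the same f:  x <- x + 2*(a - e^x)/(a + e^x)."""
    x = Decimal(math.log(a))  # double-precision starting guess
    for _ in range(steps):
        e = x.exp()
        x = x + 2 * (a - e) / (a + e)
    return x


if __name__ == "__main__":
    a = Decimal(2)
    reference = a.ln()  # Decimal's own log, used only to measure the error
    for steps in (1, 2):
        err_n = abs(log_newton(a, steps) - reference)
        err_h = abs(log_halley(a, steps) - reference)
        print(f"steps={steps}  newton error={err_n:.2E}  halley error={err_h:.2E}")
```

Newton's iteration converges quadratically and Halley's cubically, so each Halley step gains more digits for one exp evaluation: starting from a double, one Newton step reaches roughly 32 digits and one Halley step roughly 48. Whether this accounts for the measured difference between QD and Oct is not something the sketch can show.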

Last modified on 02/10/13 18:01:08
