Yorick Performance
The table below reports the results of a series of tests that compare the performance, in terms of speed, of Yorick (1.6) versus IDL (6.0). The tests were performed on two computers: a G5 (dual 2GHz G5 processors, 3.5GB RAM, running MacOS 10.3.5) and a Linux box (dual 2.8GHz Athlon processors, 2GB RAM, running Redhat 9). The compilation flags are listed in the Compilation section below.
These tests are rather arbitrary. I had to keep to relatively simple tests (basically single commands) to avoid comparing implementations rather than intrinsic speed. The selection of subjects/commands certainly reflects my own interests, but it's a start. Tests of this kind are always difficult to design, so take these with a grain of salt. I have not reported the results of graphics tests in this table (I did some, which also showed comparable performance). All tests were performed while there was no other significant load on the computers.
Overall, Yorick does well compared to IDL. On the Linux box, Yorick (again, for this limited and selective series of tests) was 21% faster than IDL. On the G5, Yorick was 35% faster than IDL! Last data point: Yorick runs 35% faster on the G5 than on the Linux box on which these tests were run.
All results reported in the table below are in seconds. Results were averaged over 10 instances of each operation; the average and rms are given. The code for each test is listed at the end of this page.
Most of these operations were performed on 1024x1024 arrays (see the code for details).
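As a minimal sketch of the measurement protocol described above (10 runs per operation, reporting the average and the rms), here is the same loop in Python. The helper name time_operation and its clock parameter are hypothetical and not part of Yorick or IDL. The rms is taken here as the root-mean-square deviation about the mean, which is an assumption about how the spread in the table was computed.

```python
import statistics
import time


def time_operation(operation, repeats=10, clock=time.perf_counter):
    """Run `operation` `repeats` times; return (mean, rms spread) of elapsed seconds.

    Hypothetical helper for illustration only.
    """
    elapsed = []
    for _ in range(repeats):
        start = clock()
        operation()
        elapsed.append(clock() - start)
    # pstdev is the root-mean-square deviation about the mean (divides by N);
    # assumed here to correspond to the "rms" quoted in the table.
    return statistics.mean(elapsed), statistics.pstdev(elapsed)


if __name__ == "__main__":
    # Time the allocation of a zero-filled list of 1024x1024 floats.
    mean, rms = time_operation(lambda: [0.0] * (1024 * 1024))
    print(f"{mean:.4f}+/-{rms:.4f}")
```

Passing a different clock function makes the helper easy to check against synthetic timestamps.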
| Test | Yorick (G5) | IDL (G5) | Yorick (Linux box) | IDL (Linux box) |
| Memory allocation | 1.24+/-0.05 | 1.27+/-0.06 | 1.338+/-0.005 | 1.33+/-0.01 |
| Array Operations | 0.0207+/-0.0030 | 0.0183+/-0.0025 | 0.0265+/-0.0001 | 0.0191+/-0.0001 |
| FFT, stock | 0.65+/-0.02 | 0.90+/-0.04 | 0.75+/-0.07 | 1.21+/-0.01 |
| Optimized FFT | 0.245+/-0.027 | - | - | - |
| Matrix multiply | 2.47+/-0.04 | 2.48+/-0.05 | 9.47+/-0.02 | 5.15+/-0.01 |
| SVD decomposition | 0.62+/-0.04 | 1.42+/-0.05 | 1.137+/-0.001 | 1.971+/-0.003 |
| Matrix Transpose | 0.170+/-0.018 | 0.131+/-0.005 | 0.0730+/-0.0001 | 0.245+/-0.002 |
| Poidev | 1.23+/-0.03 | 4.25+/-0.17 | 1.208+/-0.002 | 3.47+/-0.09 |
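To read the table above as relative speeds, one can divide the IDL time by the Yorick time for each test. The Python sketch below does this and also averages the ratios with a geometric mean. This page does not state how its overall percentages were aggregated, so the geometric mean is only one possible choice and need not reproduce those figures. The names TIMINGS, speedups and geometric_mean are hypothetical, and the Optimized FFT row is left out because it has no IDL counterpart.

```python
import math

# Mean timings in seconds, copied from the table: (Yorick, IDL) per test.
TIMINGS = {
    "G5": {
        "Memory allocation": (1.24, 1.27),
        "Array Operations": (0.0207, 0.0183),
        "FFT, stock": (0.65, 0.90),
        "Matrix multiply": (2.47, 2.48),
        "SVD decomposition": (0.62, 1.42),
        "Matrix Transpose": (0.170, 0.131),
        "Poidev": (1.23, 4.25),
    },
    "Linux Box": {
        "Memory allocation": (1.338, 1.33),
        "Array Operations": (0.0265, 0.0191),
        "FFT, stock": (0.75, 1.21),
        "Matrix multiply": (9.47, 5.15),
        "SVD decomposition": (1.137, 1.971),
        "Matrix Transpose": (0.0730, 0.245),
        "Poidev": (1.208, 3.47),
    },
}


def speedups(machine):
    """Return {test: IDL time / Yorick time}; values above 1 mean Yorick is faster."""
    return {test: idl / yorick for test, (yorick, idl) in TIMINGS[machine].items()}


def geometric_mean(values):
    """Geometric mean of positive values (an assumed way to average the ratios)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    for machine in TIMINGS:
        ratios = speedups(machine)
        for test, ratio in ratios.items():
            print(f"{machine:10s} {test:18s} {ratio:5.2f}")
        print(f"{machine:10s} {'geometric mean':18s} {geometric_mean(ratios.values()):5.2f}")
```

A ratio below 1 flags the tests where IDL is ahead, such as the matrix multiply on the Linux box.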
Compilation:
Compilation flags for Yorick (obviously, I don't have access to the IDL compilation flags, as only the binaries are distributed):
- Linux box:
-O3
- G5:
-O3 -fomit-frame-pointer -fstrict-aliasing -mcpu=970 -mtune=970 -mpowerpc-gpopt
Full test commands
Memory allocation
Just a malloc() here, so it is no wonder the two codes perform so similarly.
- Yorick
etime=[];
for (i=1;i<=10;i++) {
  tic; im=array(float,[3,1024,1024,100]); grow,etime,tac();
}
etime(avg); etime(rms)
- IDL
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  im=fltarr(1024,1024,100)
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
Array Operations
- Yorick
im1=im2=array(float,[2,1024,1024]);
etime=[];
for (i=1;i<=10;i++) {
  tic; im=im1*im2; grow,etime,tac();
}
etime(avg); etime(rms)
- IDL
im1=fltarr(1024,1024)
im2=im1
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  im=im1*im2
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
Stock FFTs
Note that the Yorick FFT operator only processes doubles (the Yorick complex type is made of 2 doubles).
- Yorick
im=array(complex,[2,1024,1024]); // each plane is of type double
etime=[];
for (i=1;i<=10;i++) {
  tic; out=fft(im,1); grow,etime,tac();
}
etime(avg); etime(rms)
- IDL
im1=fltarr(1024,1024)
im=complex(im1,im1)
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  out=fft(im,1)
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
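Because of the precision difference noted for this test, the two FFT runs do not move the same amount of data. Here is a short Python sketch of the arithmetic, assuming 8-byte doubles for the two components of Yorick's complex type and 4-byte floats for the single-precision complex array built by IDL's complex(im1,im1). The helper complex_bytes is hypothetical.

```python
import struct


def complex_bytes(n_elements, component_format):
    """Bytes used by n_elements complex values (two components each).

    `component_format` is a struct format code for one component,
    e.g. "=d" for an 8-byte double or "=f" for a 4-byte float.
    """
    return n_elements * 2 * struct.calcsize(component_format)


n = 1024 * 1024                        # elements in a 1024x1024 array
yorick_bytes = complex_bytes(n, "=d")  # two 8-byte doubles per element
idl_bytes = complex_bytes(n, "=f")     # two 4-byte floats per element
print(yorick_bytes // 2**20, "MiB vs", idl_bytes // 2**20, "MiB")  # prints "16 MiB vs 8 MiB"
```

Under these assumptions, the Yorick timing in the table is for an input array twice as large in bytes as the IDL one.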
Optimized FFT
This one uses a special Yorick plugin built on Apple's vector libraries (and is thus only available on Apple computers). Listed for information.
- Yorick
im1=im2=array(float,[2,1024,1024]);
etime=[];
for (i=1;i<=10;i++) {
  tic; out=fftVE(im1,im2,1); grow,etime,tac();
}
etime(avg); etime(rms)
Matrix multiply
I don't know why Yorick is so far behind on the Linux box for this test.
- Yorick
ar1=ar2=array(float,[2,1024,1024]);
etime=[];
for (i=1;i<=10;i++) {
  tic; ar=ar1(,+)*ar2(+,); grow,etime,tac();
}
etime(avg); etime(rms)
- IDL
ar1=fltarr(1024,1024)
ar2=ar1
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  ar=ar1#ar2
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
SVD decomposition
- Yorick
a=unit(500);
etime=[];
for (i=1;i<=10;i++) {
  tic; w=SVdec(a,u,vt); grow,etime,tac();
}
etime(avg); etime(rms)
- IDL
a=identity(500)
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  svdc,a,w,u,v
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
Matrix Transpose
- Yorick
ar=array(float,[2,1024,1024]);
etime=[];
for (i=1;i<=10;i++) {
  tic; ar1=transpose(ar); grow,etime,tac();
}
etime(avg); etime(rms)
- IDL
ar=fltarr(1024,1024)
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  ar1=transpose(ar)
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
Poidev
Poidev is not in the standard Yorick distribution. It belongs to a plugin (yao).
- Yorick:
etime=[];
for (i=1;i<=10;i++) {
  tic; im=poidev(array(100,1024*1024)); grow,etime,tac();
}
etime(avg); etime(rms)
- IDL:
etime=fltarr(10)
for i=0,9 do begin
  tic=systime(1)
  im=poidev(replicate(100,1024*1024L))
  etime(i)=systime(1)-tic
endfor
print,avg(etime),stdev(etime)
Page written by Francois Rigaut, 2004sep15.