Installing ATLAS on an Athlon 64 X2 for Windows Users


1.1 Requirements: Cygwin (gcc, g++, g77, make, gdb)
(Note: start Cygwin from cygwin.bat. Do not use the RXVT or VT102 terminal emulators, or you will not be able to get past step 1.5.6.)

1.2 Go to http://www.netlib.org/atlas/ and download atlas3.6.0.gz.

1.3 After downloading atlas3.6.0.gz to /cygwin/usr/local/, unpack it with

$ gunzip -c atlas3.6.0.gz | tar xvf -

or

$ tar xvfz atlas3.6.0.gz

1.4 The current directory is still /cygwin/usr/local/, so change to /cygwin/usr/local/ATLAS/ with

$ cd ATLAS

1.5 Run make to build xconfig.exe, then answer its questions as shown below.

1.5.1 160
159
158
...
 3
 2
 1
Enter number at top left of screen [0]: 160

1.5.2 Have you scoped the errata file? [y]: y

1.5.3 Are you ready to continue? [y]: y

1.5.4 Enter your machine type:
1. Other/Unknown
2. AMD Athlon
3. 32 bit AMD Hammer
4. 64 bit AMD Hammer
5. Pentium PRO
6. Pentium Ⅱ
7. Pentium Ⅲ
8. Pentium 4
Enter machine number [1]: 2

1.5.5 enable Posix threads support? [n]: y

1.5.6 Enter the number processors in system [0]: 2

1.5.7 use express setup? [y]: y

1.5.8 Enter Architecture name (ARCH) [WinNT_ATHLONSSE2_2]:
WinNT_ATHLONSSE2_2

1.5.9 Enter Maximum cache size (KB) [4096]: 4096

1.5.10 Enter File creation delay in seconds [0]: 0

1.5.11 Tune the Level 1 BLAS? [y]: y

1.6 Build ATLAS with

$ make install arch=WinNT_ATHLONSSE2_2

The build takes more than 2.5 hours to complete.

1.7 Copy the library files to /lib and index them with ranlib:

$ cd ./lib/WinNT_ATHLONSSE2_2
$ cp *.a /lib
$ ranlib /lib/liblapack.a
$ ranlib /lib/libatlas.a
$ ranlib /lib/libcblas.a
$ ranlib /lib/libf77blas.a
$ ranlib /lib/libptf77blas.a
$ ranlib /lib/libptcblas.a
$ ranlib /lib/libtstatlas.a
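
The repeated ranlib calls above can also be written as a loop. The following is only a sketch: the function name index_libs and its optional second argument are made up for illustration, and the loop touches every *.a file in the given directory, not only the ATLAS libraries.

```shell
# index_libs DIR [RUNNER]
# Runs RUNNER (default: ranlib) on every static library (*.a) found in DIR.
# index_libs is a hypothetical helper name, not part of ATLAS or Cygwin.
index_libs() {
    dir="$1"
    runner="${2:-ranlib}"
    for lib in "$dir"/*.a; do
        # If nothing matched, the pattern stays literal; skip it.
        [ -e "$lib" ] || continue
        "$runner" "$lib"
    done
}

# Dry run: print the libraries that would be indexed, without changing them.
index_libs /lib echo
```

Calling index_libs /lib without the second argument runs ranlib itself on each file.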

1.8 ATLAS is now ready to use. Compile and link a program with

$ g77 -o file01 file01.f -llapack -lf77blas -lcblas -latlas -lg2c -lm

or

$ g95 -o file01 file01.f90 -llapack -lf77blas -lcblas -latlas -lg2c -lm

file01 is only an example name; any other Fortran 77 or Fortran 95 program can be built the same way.

2.1 The LINPACK benchmark
The benchmark programs 1000s, 1000d, linpacks, and linpackd are available from http://www.netlib.org/benchmark/. Each program was run several times, and the run with the highest Mflops is reported here.

2.1.1 Results of the LINPACK 1000s benchmark

$ g77 -o 1000s 1000s.f
$ ./1000s
 norm resid      resid           machep
 9.56832123E+00  5.70633099E-04  1.19209290E-07
 X(1)            X(n)
 1.00003088E+00  9.99999046E-01
 factor     solve      total
 2.418E+00  0.000E+00  2.418E+00
 mflops     unit       ratio
 2.765E+02  7.232E-03  4.318E+01

2.1.2 Results of the LINPACK 1000d benchmark

$ g77 -o 1000d 1000d.f
$ ./1000d
 norm resid      resid           machep
 1.05174252E+01  1.16766853E-12  2.22044605E-16
 X(1)            X(n)
 1.00000000E+00  1.00000000E+00
 factor     solve      total
 2.995E+00  0.000E+00  2.995E+00
 mflops     unit       ratio
 2.233E+02  8.958E-03  5.348E+01
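
As a cross-check, the mflops values in both runs can be reproduced from the total time. The sketch below assumes the usual LINPACK operation count of 2/3*n^3 + 2*n^2 for a system of order n (here n = 1000); the function name linpack_mflops is made up for illustration.

```shell
# linpack_mflops N SECONDS
# Prints Mflops for solving an N x N system in SECONDS,
# using the LINPACK operation count 2/3*N^3 + 2*N^2.
linpack_mflops() {
    awk -v n="$1" -v t="$2" \
        'BEGIN { printf "%.1f\n", (2*n*n*n/3 + 2*n*n) / t / 1e6 }'
}

linpack_mflops 1000 2.418   # total time of the 1000s run; prints 276.5
linpack_mflops 1000 2.995   # total time of the 1000d run; prints 223.3
```

Both values agree with the mflops column printed by the benchmark (2.765E+02 and 2.233E+02).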

2.2 The DGEMM benchmark
Go to http://www.mcs.anl.gov/index.php, follow Software > MPICH > Win IA32 Binary (1.2.1p1), and download mpich2-1.2.1p1-win-ia32.msi. Install it on Windows, add C:\Program Files\MPICH2\bin to the PATH environment variable, copy libfmpich2g.a, libmpi.a, and libmpicxx.a into /lib, and index them with the commands below.

$ ranlib /lib/libfmpich2g.a
$ ranlib /lib/libmpi.a
$ ranlib /lib/libmpicxx.a

Next, go to https://computecanada.org/, follow Committees > TECC > Working groups > Benchmarking > Benchmark Collection > Microbenchmarks > DGEMM, and download dgemm-1.0.0.tar.gz. Unpack it and copy the makefile template into the top-level directory:

$ tar xvfz dgemm-1.0.0.tar.gz
$ cd dgemm-1.0.0
$ cp ./setup/Make.Linux_AtlasFBLAS_Lam ./

Edit Make.Linux_AtlasFBLAS_Lam so that the message-passing library (MPI) setting refers to the MPICH2 libraries:

$ vi Make.Linux_AtlasFBLAS_Lam
  MPlib = -lfmpich2g -lmpi -lmpicxx
  # The build also succeeded when MPlib was
  # left empty.

Build dgemm-1.0.0 with

$ make arch=Linux_AtlasFBLAS_Lam

mpiexec.exe and hpcc-dgemm.exe are now available. Use the benchmark to compare the performance of ATLAS+BLAS with that of the reference BLAS included in LAPACK.

$ mpiexec -np 1 hpcc-dgemm 100

Figure 1 shows that, measured by Single DGEMM Gflop/s, ATLAS+BLAS is approximately 5.6 times faster than the reference BLAS mentioned above. Differences in calculation accuracy are not considered here.
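
For reference, a DGEMM Gflop/s figure is conventionally derived from the 2*n^3 floating-point operations of an n x n matrix multiply. The sketch below assumes square matrices and that conventional count; hpcc-dgemm may count operations differently. The function name dgemm_gflops and the inputs are made up for illustration and are not measurements from this article.

```shell
# dgemm_gflops N SECONDS
# Prints Gflop/s for one N x N matrix multiply finished in SECONDS,
# assuming the conventional count of 2*N^3 floating-point operations.
dgemm_gflops() {
    awk -v n="$1" -v t="$2" \
        'BEGIN { printf "%.2f\n", 2*n*n*n / t / 1e9 }'
}

# Hypothetical input: a 1000 x 1000 multiply taking 0.5 seconds.
dgemm_gflops 1000 0.5   # prints 4.00
```

Dividing the Gflop/s value of one library by that of another gives a speed ratio of the kind quoted above.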

Figure 1  The DGEMM Performance

Finally, I hope this article helps you install ATLAS and the DGEMM benchmark and evaluate the performance differences between software libraries.

Revised November 15, 2016

This article was written by Suzuki Takashi on May 18, 2008 at 6:52 PM.
