smart BLAP95

This software is a Fortran 95 interface to the Level 3 BLAS routines and key LAPACK computational and driver routines. It supersedes the older smart BLAS95 package. Please note that the interfaces are still under development. Bug reports are appreciated.

All the interfaces provided are in modern F95 style, using assumed-shape arrays and optional arguments. The interfaces do not (well, at least at this moment) deal with packed and banded matrices – these are used less often, and are not handled so elegantly using Fortran array sections (for instance, the number of sub- and super-diagonals of a banded matrix in LAPACK storage cannot be determined from its shape).
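To illustrate what "F95-style" means here, the following is a minimal sketch only. The name gemm_sketch, its argument list and its default values are hypothetical and are not taken from the BLAP95 sources; matmul stands in for the call to the F77 routine so that the example is self-contained.

```fortran
module f95_style_sketch
  implicit none
contains
  ! Hypothetical wrapper (not part of BLAP95): c = alpha*op(a)*b + beta*c.
  ! matmul stands in for the call to the underlying F77 routine.
  subroutine gemm_sketch(a, b, c, transa, alpha, beta)
    real, intent(in)                :: a(:,:), b(:,:)
    real, intent(inout)             :: c(:,:)
    character, intent(in), optional :: transa
    real, intent(in), optional      :: alpha, beta
    character :: ta
    real      :: al, be
    ! Defaults assumed for this sketch when an argument is omitted.
    ta = 'N'; al = 1.0; be = 0.0
    if (present(transa)) ta = transa
    if (present(alpha))  al = alpha
    if (present(beta))   be = beta
    ! Dimensions come from the array shapes, not from extra arguments.
    if (ta == 'N') then
      c = al*matmul(a, b) + be*c
    else
      c = al*matmul(transpose(a), b) + be*c
    end if
  end subroutine gemm_sketch
end module f95_style_sketch

program demo
  use f95_style_sketch
  implicit none
  real :: a(3,2), b(3,4), c(2,4)
  a = 1.0; b = 2.0; c = 0.0
  ! Only the argument that differs from the defaults is given.
  call gemm_sketch(a, b, c, transa='T')
  print *, c(1,1)
end program demo
```

Compared with a direct F77 call, the caller passes no M, N, K or LD* values, and the options it does not care about are simply left out.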

This software differs from other implementations (such as LAPACK95, the F95 BLAS proposal by Zohair Maany and Sven Hammarling, or Intel MKL) in the following ways:

  1. It aims to exploit the INC* and LD* arguments of the F77 BLAS and LAPACK routines to deal with noncontiguous arrays, avoiding copy-in/copy-out where possible. This cannot, however, be achieved in standard Fortran, which hides the memory layout. For this purpose the interface uses one non-standard intrinsic function, loc, which is provided as an extension by a number of compilers, including gfortran, g95, Intel, Lahey, Portland and PathScale. (If you know of a compiler that is not listed here, please email me.) Alternatively, an external LOC coded in C may be used. The byte sizes of the Fortran types are specified directly in the m4 sources and can be altered if necessary.

  2. It is far more compact. Each package of routines consists of only one source file. This is because I believe that a "wrapper" library whose sole purpose is convenience (any Fortran programmer can call BLAS or LAPACK directly) must be very simple to add to a project in order to be useful.



Currently, the following packages are available:

blas3

Interfaces to level 3 BLAS routines and their associated level 2 counterparts (for instance, xGEMM with xGEMV, xGER, xGERU and xGERC). The interface is a subset of the BLAS Technical Forum standard. By "subset" I mean that working code using these interfaces should also work with a standard implementation of the F95 (dense) BLAS (not counting utility routines). Arguments controlling transpositions, lower/upper side etc. are ordinary characters; however, the blas_* constants defined by the BLAST standard are also provided.

Routines covered:

xGEMM, xGEMV, xGER, xGERU, xGERC,

xSYMM, xSYMV, xSYR, xSYR2, xSYRK, xSYR2K,

xHEMM, xHEMV, xHER, xHER2, xHERK, xHER2K,

xTRMV, xTRSV, xTRMM, xTRSM

laleq

Interfaces to the LAPACK linear equations routines (the first category in the LAPACK Users' Guide).

Routines covered:

xGETRF, xGETRS, xGETRI

xPOTRF, xPOTRS, xPOTRI

xTRTRS, xTRTRI

More will be added in the future (SV and SVX drivers, xSY routines, etc.).

lalsq

Interfaces to the LAPACK linear least squares routines and orthogonal factorizations (the second category in the LAPACK Users' Guide).

Routines covered:

xGEQRF, xORMQR, xUNMQR, xORGQR, xUNGQR

xGELQF, xORMLQ, xUNMLQ, xORGLQ, xUNGLQ

xGEQP3

More will be added (RQ factorizations, LS/LSX/LSY drivers, etc.).

lasvd

Interfaces to xGESVD and xGESDD.

They can be found in this directory or this archive.

All packages are completely self-contained. For example, to use orthogonal factorizations, you only need the lalsq.f90 file. Some options can be tweaked by predefining some symbols for m4 – this is described in the makefile.

Important remarks:

The INCx argument for vectors in BLAS routines is usually able to handle any assumed-shape vector. The requirement is that the difference between the physical addresses of any two elements must be a multiple of the size of an element. This is almost always the case, unless the vector is a component of an array of SEQUENCEd or misaligned derived types (compilers normally do not create such misaligned vectors unless they are forced to).
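The address requirement on vectors can be sketched as follows. This is an illustration, not code from the package: it assumes a compiler that provides the non-standard loc extension and a 4-byte default real, and the helper name vec_inc is hypothetical.

```fortran
program inc_from_addresses
  implicit none
  integer, parameter     :: ik = selected_int_kind(18)
  integer(ik), parameter :: elem_bytes = 4_ik  ! assumed size of a default real
  real :: a(100,100)
  a = 0.0
  print *, 'a(:,3)   inc =', vec_inc(a(:,3))    ! a column
  print *, 'a(3,:)   inc =', vec_inc(a(3,:))    ! a row
  print *, 'a(::2,3) inc =', vec_inc(a(::2,3))  ! every other column element
contains
  ! Returns the element stride of x, or 0 if the byte distance between
  ! its first two elements is not a multiple of the element size.
  function vec_inc(x) result(inc)
    real, intent(in) :: x(:)
    integer     :: inc
    integer(ik) :: nbytes
    inc = 1
    if (size(x) < 2) return
    nbytes = loc(x(2)) - loc(x(1))  ! loc is a compiler extension
    if (mod(nbytes, elem_bytes) == 0) then
      inc = int(nbytes / elem_bytes)
    else
      inc = 0
    end if
  end function vec_inc
end program inc_from_addresses
```

A result of 0 in this sketch marks the rare misaligned case, in which the vector would have to be copied before being passed to BLAS.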

The LDx argument, however, does not suffice to cover arbitrary matrices without repacking. A matrix (2D array) must, in addition, be contiguous along its first dimension (i.e. the sections a(:,i) must all be contiguous). For example, given the declaration

real:: a(100,100),b(20,50,50)

a(:,:50), a(:,::2), a(:50,51:), a(5:10,::3) and b(:10,:20,10) are all OK. a(::2,:) and b(1,30:,30:) are not, and have to be repacked.
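The rule can be checked for the sections above with a small test program. It is a sketch under the same assumptions as the package itself (a compiler providing the non-standard loc extension) plus an assumed 4-byte default real; the helper name report is hypothetical.

```fortran
program ld_check
  implicit none
  integer, parameter     :: ik = selected_int_kind(18)
  integer(ik), parameter :: elem_bytes = 4_ik  ! assumed size of a default real
  real :: a(100,100), b(20,50,50)
  a = 0.0; b = 0.0
  call report('a(:,:50)',      a(:,:50))
  call report('a(:,::2)',      a(:,::2))
  call report('a(:50,51:)',    a(:50,51:))
  call report('a(5:10,::3)',   a(5:10,::3))
  call report('b(:10,:20,10)', b(:10,:20,10))
  call report('a(::2,:)',      a(::2,:))
  call report('b(1,30:,30:)',  b(1,30:,30:))
contains
  ! Prints whether m can be passed with an LD argument alone.
  subroutine report(name, m)
    character(*), intent(in) :: name
    real, intent(in)         :: m(:,:)
    integer(ik) :: s1, ld
    s1 = 1
    ld = max(1, size(m, 1))
    ! Stride between neighbours within a column, in elements.
    if (size(m, 1) > 1) s1 = (loc(m(2,1)) - loc(m(1,1))) / elem_bytes
    ! Distance between the starts of consecutive columns, in elements.
    if (size(m, 2) > 1) ld = (loc(m(1,2)) - loc(m(1,1))) / elem_bytes
    if (s1 == 1) then
      print '(a,a,i0)', name, ': usable directly, LD = ', ld
    else
      print '(a,a,i0)', name, ': needs repacking, first-dimension stride = ', s1
    end if
  end subroutine report
end program ld_check
```

The first five sections should be reported as usable directly, since only the distance between columns differs from the contiguous case; the last two have a non-unit stride within each column, which LD cannot express.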

The software is still under development. Many routines have not been tested. I encourage every user to write test programs. I'll try to fix bugs as soon as possible.

Recommendation: link to this page as www.highegg.matfyz.cz/blap95/, because the physical location of the site might change in the future (eventually, I am planning to move the project to SourceForge).

Please send questions, bug reports or comments to highegg@gmail.com.

Software last updated 16.11.2006

Page last updated 16.11.2006