Forpedo
Originally published on macresearch.org, around 2007. Reproduced from the author's archive; some links may no longer resolve.
Advanced Fortran: Polymorphism and Generic Programming
As a regular user of languages like Objective-C, C++, and Python, I have at times become frustrated by the lack of features in Fortran. Although Fortran 90 brought the language into the modern age, adding user-defined types amongst other things, if you are looking for the sort of high-level features you find in just about every other programming language, the only option is to wait for wide support of the new Fortran 2003 standard, which could still be many years away. To bridge the gap, I developed a Python preprocessor called Forpedo, which adds a few advanced features to Fortran 90/95.
Forpedo supports two programming paradigms: generic programming and object-oriented programming. Generic programming is a paradigm in which a single piece of code can be used with multiple data types. For example, code for a linked list could be used to create lists of integers, reals, or even strings. The same source code is used to define each list; the compiler produces different instances of the code, substituting concrete data types (e.g., integer, real) as appropriate to generate a list capable of storing the required data. Without generic programming, a programmer would typically need to make virtually identical copies of the linked list source code for each data type used.
Object-oriented programming is supported through polymorphic types, which are called protocols in Forpedo terminology. A protocol is very similar to a Java interface, for those familiar with that language: it defines a set of procedures that a conforming type must include. The term ‘protocol’ derives from Objective-C, and has been adopted because ‘interface’ is already used in Fortran.
Generic Programming
To give you a rough idea of how Forpedo works, I will present a few simple examples. Here is some generic Forpedo code:
#definetype WorldIdType Int integer
#definetype WorldIdType Real real
module HelloWorld<WorldIdType>
  @WorldIdType :: worldId<WorldIdType>
contains
  subroutine setId<WorldIdType>(id)
    @WorldIdType :: id
    worldId<WorldIdType> = id
  end subroutine
  subroutine print<WorldIdType>()
    print *,'Hello World "',worldId<WorldIdType>,'"'
  end subroutine
end module
The definetype preprocessor directive is used to define generic types, and to stipulate the concrete data types that will be substituted in the eventual Fortran code. In the code above, there is one generic type: WorldIdType. In this example, WorldIdType is a placeholder for the data type of the variable that stores a world identifier (the module variable worldId, and the id argument used to set it).
The definetype directive takes three arguments: the first is the generic type label; the second is a tag that is used by Forpedo to generate unique names; and the third is one of the Fortran types that will be substituted for the generic placeholder in the output Fortran. There is one definetype directive for each concrete Fortran type that will be substituted for any given generic type. In this case, two directives are included for WorldIdType, one that results in code for an integer world identifier, and one for a real world identifier.
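To make the three arguments concrete, here is a minimal Python sketch of how such directives could be collected into a table of instances. It is an illustration only, not Forpedo's actual source; the function name parse_definetypes and the dictionary layout are assumptions made for this example.

```python
def parse_definetypes(source):
    """Map each generic type label to its (tag, concrete type) pairs.

    Hypothetical helper for illustration; not taken from Forpedo itself.
    """
    instances = {}
    for line in source.splitlines():
        # At most 4 fields, so a concrete type containing spaces stays whole.
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[0] == "#definetype":
            _, label, tag, concrete = parts
            instances.setdefault(label, []).append((tag, concrete.strip()))
    return instances


directives = """#definetype WorldIdType Int integer
#definetype WorldIdType Real real"""

print(parse_definetypes(directives))
# {'WorldIdType': [('Int', 'integer'), ('Real', 'real')]}
```

Each entry in the list corresponds to one instance of the generic code that will appear in the output.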
Whenever the generic type is needed in the Forpedo code, the type label is given, prepended with an @ symbol. For example, to define the type of the id parameter in the first subroutine, the following appears:
@WorldIdType :: id
When this code is run through Forpedo, it will be substituted with a concrete type. For example, in the case of an integer world identifier, it will become
integer :: id
The tag supplied in the definetype directive is used to avoid naming conflicts. This process is known as name mangling,
and is usually performed by the compiler. A C++ compiler, for example, would modify the name of a class template in
order to generate a unique class name for any given combination of data types.
With Forpedo, the programmer is responsible for determining where naming clashes could occur, and for avoiding them by inserting a tag placeholder. The tag placeholder is the name of the generic type, enclosed in angle brackets. You will typically need to use the tag for any named entity with global scope. A module name is a typical example:
module HelloWorld<WorldIdType>
The tag placeholder <WorldIdType> will be replaced in the generated Fortran code with the tag corresponding to a
concrete data type. For example, when WorldIdType is integer, the module name will be HelloWorldInt, because the
integer data type corresponds to the Int tag.
The tag placeholder <WorldIdType> has also been used to mangle the names of other globally accessible entities, such
as the procedure names, and the name of the variable included in the module data section. All of these placeholders get
substituted with the same tag whenever an instance of the generic code is formed.
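The two substitutions described above (type placeholder and tag placeholder) can be sketched in a few lines of Python. This is a simplified model of the idea, assuming plain textual replacement; the function name instantiate is hypothetical and the real Forpedo may work differently.

```python
import re


def instantiate(template, label, tag, concrete):
    """Produce one concrete instance of a generic template.

    '@Label' becomes the concrete Fortran type; '<Label>' becomes the tag.
    Illustrative only; assumes simple textual substitution.
    """
    # \b stops '@WorldIdType' from matching inside a longer label.
    text = re.sub(r"@" + re.escape(label) + r"\b", lambda m: concrete, template)
    return text.replace("<" + label + ">", tag)


template = (
    "module HelloWorld<WorldIdType>\n"
    "  @WorldIdType :: worldId<WorldIdType>\n"
    "end module"
)

for tag, concrete in [("Int", "integer"), ("Real", "real")]:
    print(instantiate(template, "WorldIdType", tag, concrete))
```

The first pass of the loop prints a module named HelloWorldInt containing `integer :: worldIdInt`; the second prints the Real counterpart.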
Running Forpedo
You can download Forpedo at the Forpedo web page (link no longer available). It requires Python 2.4. Running it is simple enough: you pipe the Forpedo code into standard input, and Fortran 90 comes out on standard output:
forpedo.py < helloworld.f90t > helloworld.f90
For the example above, helloworld.f90 should look like this:
module HelloWorldInt
  integer :: worldIdInt
contains
  subroutine setIdInt(id)
    integer :: id
    worldIdInt = id
  end subroutine
  subroutine printInt()
    print *,'Hello World "',worldIdInt,'"'
  end subroutine
end module
module HelloWorldReal
  real :: worldIdReal
contains
  subroutine setIdReal(id)
    real :: id
    worldIdReal = id
  end subroutine
  subroutine printReal()
    print *,'Hello World "',worldIdReal,'"'
  end subroutine
end module
This code includes two instances of the generic Forpedo code. The Fortran code in each case is virtually identical, with only the generic type placeholders and tags having been replaced to produce compliant Fortran. The potential of generic programming to reduce code duplication should be fairly evident, even from this simple example: the Forpedo source is 13 lines long, against 22 lines of generated Fortran. Not only that, but to form another instance of the generic code for a different concrete data type, you need only add one line to the Forpedo code, which adds another 11 lines (a 50% increase) to the generated Fortran.
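The arithmetic behind these figures is easy to check. The short Python sketch below counts lines for the HelloWorld example, assuming one definetype line per concrete type and an 11-line module body (the counts come from the listings above; the function name line_counts is just for illustration).

```python
TEMPLATE_LINES = 11  # 'module ...' through 'end module' in the generic code


def line_counts(n_types):
    """Return (Forpedo lines, generated Fortran lines) for n concrete types."""
    forpedo = n_types + TEMPLATE_LINES   # one #definetype line per type
    fortran = n_types * TEMPLATE_LINES   # one full module per type
    return forpedo, fortran


two = line_counts(2)
three = line_counts(3)
print(two)    # (13, 22)
print(three)  # (14, 33)

# Growth in generated Fortran when a third type is added, in percent.
print(100 * (three[1] - two[1]) // two[1])  # 50
```

The Forpedo source grows by one line per additional type, while the generated Fortran grows by a whole module.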
To test the code, you can compile the helloworld.f90 file with the following main program:
program HelloWorld
  use HelloWorldInt
  use HelloWorldReal
  call setIdInt(3)
  call printInt()
  call setIdReal(3.0)
  call printReal()
end program
Running the resulting executable should produce output similar to the following (the exact spacing of list-directed output depends on the compiler):
Hello World " 3 "
Hello World " 3.000000 "
Run-Time Polymorphism
The protocol directive is used to define a polymorphic type with Forpedo. A protocol defines the subroutines and functions that a type must implement. Here is an example of a protocol declaration:
#protocol AnimalProtocol AnimalProtocolMod
#useblock
  use SomeModule
#enduseblock
#method makeSound
  type(AnimalProtocol), intent(in) :: self
#endmethod
#method increaseAgeInAnimalYears increase
  type(AnimalProtocol), intent(inout) :: self
  integer, intent(in) :: increase
#endmethod
#funcmethod increaseAgeAndReturnValue increase,returnVar
  type(AnimalProtocol), intent(inout) :: self
  integer, intent(in) :: increase
  integer :: returnVar
#endmethod
#conformingtype Dog DogMod
#conformingtype Cat CatMod
#endprotocol
This declares a protocol that will be contained in the module AnimalProtocolMod,
which will be generated by Forpedo. The Fortran type corresponding to the polymorphic
type will be AnimalProtocol.
The method/funcmethod/endmethod directives, which must appear in the protocol block, declare the interfaces of the subroutines and functions that conforming types must implement. In this case, the conforming types must have makeSound and increaseAgeInAnimalYears subroutines, and an increaseAgeAndReturnValue function.
The argument list for each routine is given on the method directive line after the method name. This list should not include the first argument, which is assumed to be the instance ‘self’ (equivalent to ‘this’ in C++ and Java). Note that the declaration of ‘self’ is nonetheless included in the method/funcmethod/endmethod block, so that you can assign attributes to it (e.g., intent(in)).
The types that conform to the protocol are given explicitly in the protocol block, using
the conformingtype directive. This directive requires the Fortran user-defined type that
conforms to the protocol, and the module that declares the type. Each conforming type
must be declared in a separate module.
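To summarize the structure of a protocol block, the following Python sketch reads the directives described above into a simple dictionary. It is a reading aid under my own assumptions, not Forpedo's parser: the function name parse_protocol and the dictionary keys are hypothetical, and it handles only the directives that appear in this example.

```python
def parse_protocol(source):
    """Collect the pieces of a #protocol block (illustrative sketch only)."""
    protocol = {"name": None, "module": None, "uses": [],
                "methods": [], "conforming": []}
    method = None
    in_use_block = False
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        words = line.split()
        key = words[0]
        if key == "#protocol":
            protocol["name"], protocol["module"] = words[1], words[2]
        elif key == "#useblock":
            in_use_block = True
        elif key == "#enduseblock":
            in_use_block = False
        elif key in ("#method", "#funcmethod"):
            # Arguments after the method name; 'self' is implied, not listed.
            args = [a for a in "".join(words[2:]).split(",") if a]
            method = {"name": words[1], "is_function": key == "#funcmethod",
                      "args": args, "declarations": []}
        elif key == "#endmethod":
            protocol["methods"].append(method)
            method = None
        elif key == "#conformingtype":
            protocol["conforming"].append((words[1], words[2]))
        elif key == "#endprotocol":
            break
        elif in_use_block:
            protocol["uses"].append(line)
        elif method is not None:
            method["declarations"].append(line)
    return protocol


source = """#protocol AnimalProtocol AnimalProtocolMod
#method makeSound
type(AnimalProtocol), intent(in) :: self
#endmethod
#funcmethod increaseAgeAndReturnValue increase,returnVar
type(AnimalProtocol), intent(inout) :: self
integer, intent(in) :: increase
integer :: returnVar
#endmethod
#conformingtype Dog DogMod
#conformingtype Cat CatMod
#endprotocol"""

info = parse_protocol(source)
print(info["name"], info["module"])
print([(m["name"], m["args"]) for m in info["methods"]])
print(info["conforming"])
```

The result lists two methods (one with no extra arguments, one function with `increase` and `returnVar`) and two conforming types, each paired with its module.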
The protocol above, having been run through Forpedo, can be used like this:
program Main
  use AnimalProtocolMod
  use DogMod
  use CatMod
  type (Dog), pointer :: d
  type (Cat), pointer :: c
  type (AnimalProtocol) :: p
  allocate(d,c)
  ! Assign protocol to Dog
  p = d
  ! Pass the protocol to a subroutine that knows nothing about the concrete type Dog
  call doStuffWithAnimal(p)
  ! Repeat for Cat. Results will be different, though subroutine call is the same.
  p = c
  call doStuffWithAnimal(p)
contains
  subroutine doStuffWithAnimal(a)
    type (AnimalProtocol) :: a
    call makeSound(a)
    call increaseAgeInAnimalYears(a, 2)
  end subroutine
end program
Note that the subroutine doStuffWithAnimal is able to call subroutines belonging to Dog and Cat without having any direct knowledge of those types. Information about the concrete type is stored by the protocol, and the correct subroutine is invoked via the AnimalProtocol type.
All branching required to select the correct subroutine is encapsulated in
the protocol, and generated by Forpedo, making the code easier to read
and extend. Adding a new conforming type to the protocol only requires
a single line to be added, and changes often do not need to be made to
existing code. For example, adding a type Tiger to the program, and making it
conform to the protocol, would not require any changes to doStuffWithAnimal.
This is not generally true in traditional procedural programs, which require
wholesale changes to the code, because the branching blocks are typically
distributed throughout the code base.
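The idea of keeping all the branching inside the protocol can be modeled in a few lines of Python. The sketch deliberately avoids Python's own dynamic dispatch and mimics what a Fortran 90 wrapper type would have to do by hand. I am assuming the general shape of the generated code here, not reproducing it; the names assign, make_sound, and the "Woof"/"Meow" return values are invented for the example.

```python
class Dog:
    """Stands in for the Fortran type Dog declared in DogMod."""


class Cat:
    """Stands in for the Fortran type Cat declared in CatMod."""


# Type-specific routines, as each conforming module would provide them.
def dog_make_sound(self):
    return "Woof"


def cat_make_sound(self):
    return "Meow"


class AnimalProtocol:
    """Wrapper that refers to at most one conforming object at a time."""

    def __init__(self):
        self.dog = None
        self.cat = None


def assign(protocol, animal):
    """Plays the role of 'p = d' or 'p = c' in the Fortran program."""
    protocol.dog = animal if isinstance(animal, Dog) else None
    protocol.cat = animal if isinstance(animal, Cat) else None


def make_sound(protocol):
    """All branching on the concrete type lives here, not in the caller."""
    if protocol.dog is not None:
        return dog_make_sound(protocol.dog)
    if protocol.cat is not None:
        return cat_make_sound(protocol.cat)
    raise ValueError("protocol has no conforming object assigned")


def do_stuff_with_animal(a):
    # Knows only about AnimalProtocol, never about Dog or Cat.
    return make_sound(a)


p = AnimalProtocol()
assign(p, Dog())
print(do_stuff_with_animal(p))  # Woof
assign(p, Cat())
print(do_stuff_with_animal(p))  # Meow
```

In this model, adding a Tiger means one more field and one more branch in the wrapper, which is the part Forpedo generates; do_stuff_with_animal stays untouched.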