Gfortran array descriptor update
See also LibgfortranAbiCleanup: when the ABI is changed, that cleanup should be done as well.
The SVN/GIT branch fortran-dev contains a first version of the new descriptor, cf. announcement.
(14th July 2014) fortran-dev has advanced a long way. The main prerequisite before merging with GCC 5 is to change over from array indexing to pointer arithmetic, so that all the stride-measure/element-size divisions go away. That said, the performance on the polyhedron suite is not too bad apart from fatigue2.f90, which has slowed down by a factor of 2 over trunk (124 s => 256 s) using -march=native -ffast-math -static-libgfortran -funroll-loops -O3. The geometric means are 49.0 s for trunk and 51.7 s for fortran-dev.
In some cases it would be nice to improve gfortran's array descriptors (cf. PR37577).
See also the discussion at comp.lang.fortran about base_address, base_address + offset, and base_address + virtual_origin. See also the argument for why virtual origins are a bad idea.
* See discussion about hooks in fortran@
Due to backward compatibility issues, the new descriptor should be designed carefully so that we do not have to change it often. Libgfortran backward compatibility can be handled with SymbolVersioning, but a more difficult problem will be existing user code (is it feasible at all?).
Desirable features that aren't available in the current descriptor:
* Stride in bytes rather than in units of sizeof(array_element_type). This is needed in order to efficiently support derived type components? PR ...? PR 40737, PR 38471, PR 53800
* Flags for allocated and associated status. This avoids ugly workarounds like allocating a 1-byte array for zero-sized allocations, see e.g. PR 35719, and is probably necessary for solving PR 35718.
* Store each dimension as a (lbound, stride, extent) triplet rather than the current (lbound, stride, ubound). With the former, ubound can be calculated as ubound = lbound + stride * extent, which is essentially free, whereas with the current setup extent must be calculated as extent = (ubound - lbound) / stride, where the division is quite expensive. See e.g. tables with instruction timings and throughput.
* Support rank 15 arrays (as planned for F 2008)
* Save coarray information. int corank might be enough.
* Save type information for (a) polymorphic data types but also (b) for run-time diagnostic.
* Support TR 29113 "Further Interoperability of Fortran with C", preferably as native format. See PR 37577. N1942 (FDTS as sent to SC22, 11 October 2012). F2018 draft contains TR 29113. For comparison, see Intel Fortran array descriptor (mixed language programming chapter), Pathscale array descriptor (Appendix D)
* Note that using the TR 29113 descriptor as the native one has implications for the procedure call interface as well. See ML message
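The byte-stride item in the list above can be illustrated with a small C sketch. It is not gfortran code: `record_t`, `byte_section_t`, `tag_section` and `pack_section` are hypothetical names invented for this illustration. The record has a 3-byte component, so the distance between consecutive `tag` components (the size of the whole record) is in general not a multiple of the element size, and only a stride in bytes can describe the section directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Fortran derived type. */
typedef struct {
    int  id;
    char tag[3];   /* comparable to CHARACTER(len=3) :: tag */
} record_t;

/* Hypothetical rank-1 section descriptor that uses a byte stride. */
typedef struct {
    void     *base_addr;  /* address of the first selected element */
    size_t    elem_len;   /* size of one selected element, in bytes */
    ptrdiff_t extent;     /* number of selected elements */
    ptrdiff_t sm;         /* byte distance between consecutive elements */
} byte_section_t;

/* Describe the section r(:)%tag of an array of n records. */
static byte_section_t tag_section(record_t *r, ptrdiff_t n)
{
    byte_section_t s;
    s.base_addr = r[0].tag;
    s.elem_len  = sizeof r[0].tag;
    s.extent    = n;
    s.sm        = (ptrdiff_t) sizeof(record_t);  /* step over a whole record */
    return s;
}

/* Copy the strided section into a contiguous buffer of
   extent * elem_len bytes. */
static void pack_section(const byte_section_t *s, void *dest)
{
    const char *src = (const char *) s->base_addr;
    char *out = (char *) dest;

    assert(s->sm >= (ptrdiff_t) s->elem_len);  /* elements must not overlap */
    for (ptrdiff_t i = 0; i < s->extent; i++)
        memcpy(out + (size_t) i * s->elem_len, src + i * s->sm, s->elem_len);
}
```

With a stride counted in elements, the same section would need a temporary copy or a separate scaling step, because the record size divided by 3 is usually not an integer.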
The following is one possible implementation of the descriptor according to the Fortran 2018 draft of 2017-12-28 (N2146). Note that the specification does not provide a header file, only textual descriptions of what it should contain. See also the current version in the fortran-dev branch.
#ifndef ISO_FORTRAN_BINDING_H
#define ISO_FORTRAN_BINDING_H

#include <stddef.h>

#define CFI_VERSION 0

/* N2137 says that CFI_index_t "is a typedef name for a standard signed
   integer type capable of representing the result of subtracting two
   pointers". In C89, this is ptrdiff_t. */
typedef ptrdiff_t CFI_index_t;

/* F2018 (N2146) says:
   CFI_attribute_t is a typedef name for a standard integer type capable of
   representing the values of the attribute codes.
   CFI_type_t is a typedef name for a standard integer type capable of
   representing the values for the supported type specifiers.
   CFI_rank_t is a typedef name for a standard integer type capable of
   representing the largest supported rank. */
typedef signed char CFI_type_t;
typedef unsigned char CFI_rank_t;
typedef unsigned char CFI_attribute_t;

typedef struct
{
  CFI_index_t lower_bound,
              extent,  /* = number of elements in this dimension. */
              sm;      /* stride multiplier; = pointer difference in bytes
                          between two consecutive elements. */
} CFI_dim_t;

/* According to N2146, "The first three members of the structure shall be
   base_addr, elem_len, and version in that order. The final member shall be
   dim. All other members shall be between version and dim, in any order.".
   That is, other struct members are allowed, but at least the ones below
   should be present. */
typedef struct
{
  void *base_addr;            /* base address of object */
  size_t elem_len;            /* length of one element, in bytes */
  int version;                /* Must be CFI_VERSION */
  CFI_rank_t rank;            /* object rank, 0 .. CFI_MAX_RANK */
  CFI_type_t type;            /* identifier for type of object */
  CFI_attribute_t attribute;  /* object attribute */
  /* Gfortran specific fields here, must be between the version and dim[]
     fields. */
  CFI_dim_t dim[];            /* dimension triples */
} CFI_cdesc_t;

/* Symbolic names for attributes of objects */
#define CFI_attribute_pointer 1
#define CFI_attribute_allocatable 2
#define CFI_attribute_other 3

/* Symbolic names for type identifiers */
#define CFI_type_signed_char 0
#define CFI_type_short 1
#define CFI_type_int 2
#define CFI_type_long 3
#define CFI_type_long_long 4
#define CFI_type_size_t 5
#define CFI_type_int8_t 6
#define CFI_type_int16_t 7
#define CFI_type_int32_t 8
#define CFI_type_int64_t 9
#define CFI_type_int_least8_t 10
#define CFI_type_int_least16_t 11
#define CFI_type_int_least32_t 12
#define CFI_type_int_least64_t 13
#define CFI_type_int_fast8_t 14
#define CFI_type_int_fast16_t 15
#define CFI_type_int_fast32_t 16
#define CFI_type_int_fast64_t 17
#define CFI_type_intmax_t 18
#define CFI_type_intptr_t 19
#define CFI_type_ptrdiff_t 20
#define CFI_type_float 21
#define CFI_type_double 22
#define CFI_type_long_double 23
#define CFI_type_float_Complex 24
#define CFI_type_double_Complex 25
#define CFI_type_long_double_Complex 26
#define CFI_type_Bool 27
#define CFI_type_char 28
#define CFI_type_cptr 29
#define CFI_type_cfunptr 30
#define CFI_type_struct 31
/* The value for CFI_type_other shall be negative and distinct from all
   other type specifiers. */
#define CFI_type_other (-2)

#endif
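To show how the fields of the header above fit together, here is a minimal C sketch that fills in a descriptor for a contiguous, column-major rank-2 array of double and computes an element address from subscripts. The typedefs are copied in reduced form so that the sketch compiles on its own. `describe_matrix` and `element_address` are hypothetical helpers written for this illustration, not part of gfortran or of the standard (the standard specifies a function `CFI_address` for the address computation). The zero lower bounds are an assumption made here for a nonpointer, nonallocatable object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reduced copies of the typedefs from the header, so this compiles alone. */
typedef ptrdiff_t CFI_index_t;
typedef struct { CFI_index_t lower_bound, extent, sm; } CFI_dim_t;
typedef struct {
    void *base_addr;
    size_t elem_len;
    int version;
    unsigned char rank;
    signed char type;
    unsigned char attribute;
    CFI_dim_t dim[];          /* flexible array member (C99) */
} CFI_cdesc_t;

/* Hypothetical helper: allocate and fill a descriptor for a contiguous
   column-major rows x cols array of double. The caller frees the result. */
static CFI_cdesc_t *describe_matrix(double *data, CFI_index_t rows,
                                    CFI_index_t cols)
{
    CFI_cdesc_t *d = malloc(sizeof *d + 2 * sizeof(CFI_dim_t));
    if (d == NULL)
        return NULL;
    d->base_addr = data;
    d->elem_len  = sizeof(double);
    d->version   = 0;   /* CFI_VERSION in the header */
    d->rank      = 2;
    d->type      = 22;  /* CFI_type_double in the header */
    d->attribute = 3;   /* CFI_attribute_other in the header */
    /* Assumption: lower bounds of 0; sm is a byte stride. */
    d->dim[0] = (CFI_dim_t){ 0, rows, (CFI_index_t) sizeof(double) };
    d->dim[1] = (CFI_dim_t){ 0, cols, rows * (CFI_index_t) sizeof(double) };
    return d;
}

/* Hypothetical helper: address of the element with the given subscripts,
   computed as base_addr + sum((subscript - lower_bound) * sm). */
static void *element_address(const CFI_cdesc_t *d,
                             const CFI_index_t subscripts[])
{
    char *p = d->base_addr;
    for (int i = 0; i < d->rank; i++) {
        CFI_index_t k = subscripts[i] - d->dim[i].lower_bound;
        assert(k >= 0 && k < d->dim[i].extent);  /* bounds check */
        p += k * d->dim[i].sm;
    }
    return p;
}
```

Because sm is already in bytes, the address computation needs no multiplication by elem_len and no division, which is the point of the pointer-arithmetic approach. The lower_bound subtraction only matters when the bounds are nonzero, as they can be for pointer and allocatable objects.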
The type is, e.g., c_int8_t, c_char, etc., or "TYPE" (presumably one should distinguish between sequence, bind(c), extensible, etc.). The Fortran descriptor is needed for character string lengths and for flags such as contiguous (cf. also F2008's contiguous attribute) or whether a pointer target is a pointer (and thus deallocatable). For Fortran 2008 one presumably also needs information about coarrays (e.g. number of co-ranks, number of contained ranks [for OpenMP/threads one can use the descriptor for co-ranks, for MPI not]). One might also want to store data for CLASS and derived-type length parameters. The Fortran descriptor (fdesc) should also be made extensible.