Interoperability

Motivations

One of the central features of the ETSF electronic-structure and simulation programs is their interoperability. It is essential that the output of one code (typically a calculation of the ground-state properties of a system) be seamlessly readable by other codes, which use it to calculate derived properties (typically optical or transport properties). Unfortunately, most electronic-structure packages store data in their own formats, which are typically incompatible with one another and change almost continuously to keep up with the capabilities of the code. A prerequisite for this kind of data exchange between applications is therefore standardization, that is, well-defined file format specifications.

The ETSF pioneered several standardization efforts in the field of electronic-structure calculations. The philosophy of those projects was to build as much as possible on existing file formats, such as XML or NetCDF, so that files can be read and written on a wide variety of platforms and from different programming languages (in particular Fortran 90, C, and Python).

Specifications for Input/Output of wavefunctions and related data

Widely used file format specifications are still lacking in the field of first-principles calculations of material properties. One of the objectives of the ETSF is precisely to specify such file formats for content that is relevant to our scientific activity in theoretical spectroscopy (wavefunctions, crystallographic data, densities and potentials, etc.). A set of such specifications was developed during the Nanoquanta Network of Excellence. Named ETSF I/O, these specifications include a detailed NetCDF description of the fields and content of a valid ETSF I/O file, and they have been implemented as the ETSF I/O Library.

Although the ETSF I/O specifications were a big step forward for data exchange between the ETSF codes, they had some limitations that hindered their wider acceptance. In 2016, the newly established CECAM Electronic Structure Library project decided to join efforts with the ETSF, the NOMAD Laboratory CoE and the EUSpec network to extend and modify the ETSF I/O file format. This new project is called the Electronic Structure Common Data Format (ESCDF). Based on HDF5, this file format aims to:

  • enable a platform-independent exchange of data between electronic-structure programs;
  • provide specifications which are both flexible and suitable for High-Performance Computing (HPC);
  • hide the gory details of the I/O implementation, in particular the way parallelism is handled;
  • facilitate, strengthen and extend interdisciplinary collaborations within and beyond the electronic-structure community.

As with the ETSF I/O format, tools for reading and writing data in the ESCDF are implemented in a library, libescdf, which is currently under development.

For reference purposes, the most recently published set of ETSF I/O specifications is provided here as a PDF document.