4. Tasks

4.1. National facility for archiving of climate modelling data

Objective: To facilitate fast and easy access to climate modelling data for all scientists involved in national climate projects.

The amount of data from observations, model experiments, and analysis will grow with increasing computer power, and it is important to have a facility in place that can deal with the enormous amount of data and allow easy searching of, and access to, relevant data and tools. There will also be a need for a searchable database of metadata, containing information about the data itself and any relevant references.

The facility will consist of disks and on-line tapes attached to the main national high performance computer, to be established at NTNU in Trondheim. Thus, the technical infrastructure provided by the high performance computing centre in Trondheim will be utilised. Rather than investing in new computers, disks, and tape-robots, NoSerC will increase the capacity of the disk subsystem and tape-robots in Trondheim. NoSerC will utilise the same archiving system as the installation in Trondheim, so no further investment will be necessary for this purpose.

A database will have to be established for the metadata. This database should be easily accessible through a web interface, and it should also be possible to access the data itself through this interface. Thus, it should also act as an interface to the archiving system.

The scientists will be able to store, inspect and retrieve data through the interface to the database/archiving system. In addition, the staff of the centre will ensure that the commonly used national and international data are available and updated. This includes relevant data from organizations of which DNMI is a member, such as ECMWF and WMO. Ideally, all data should be accessible to all scientists. However, some organizations may have restrictions on access to their data. This should be taken care of by the interface.
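
As an illustration only, the metadata catalogue and the access restrictions described above could be sketched as follows. This is a minimal sketch using Python's built-in sqlite3 module; the table layout, the column names and the functions create_catalogue, register and search are assumptions made for this example and are not part of the project plan.

```python
import sqlite3


def create_catalogue(conn):
    """Create a hypothetical metadata table (layout assumed for illustration)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS dataset (
            id           INTEGER PRIMARY KEY,
            name         TEXT NOT NULL UNIQUE,
            source       TEXT NOT NULL,      -- originating organization
            archive_path TEXT NOT NULL,      -- location in the archiving system
            reference    TEXT,               -- relevant publication, if any
            restricted   INTEGER NOT NULL DEFAULT 0
        )
        """
    )


def register(conn, name, source, archive_path, reference=None, restricted=False):
    """Store the metadata for one data set."""
    conn.execute(
        "INSERT INTO dataset (name, source, archive_path, reference, restricted) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, source, archive_path, reference, int(restricted)),
    )


def search(conn, keyword, user_is_member=False):
    """Return (name, archive_path) pairs matching keyword.

    Restricted data sets are listed only for users flagged as entitled
    to them, which is how the interface could honour access restrictions.
    """
    return conn.execute(
        "SELECT name, archive_path FROM dataset "
        "WHERE name LIKE ? AND (restricted = 0 OR ? = 1) "
        "ORDER BY name",
        (f"%{keyword}%", int(user_is_member)),
    ).fetchall()
```

A web interface would call search with the user's entitlement and use the returned archive_path to fetch the data from the archiving system.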

4.2. Formats and conversion routines for storage of climate modelling data

Objective: To store climate modelling data in standard formats, which facilitates exchange of data and easy conversions to and from formats used by climate models and pre/post-processing tools, and to provide the required conversion routines.

Past experience at DNMI and at international climate research centres demonstrates the long-term savings of adopting standard data formats. For instance, large amounts of data have been collected from various sources (internet sites) in conjunction with tasks PT1, PT2 and PT3 in the RegClim project (DNMI, PT1 and PT3). Because the data arrive in many different formats, preprocessing is not trivial and takes much time (up to several days for some data sets when a read routine must be written and then tested), since separate code must be written or modified to read each data set. To give efficient access to the data, the various data used in PT3 were converted to a common data format (Unidata's netCDF). A common data format has made analysis easy, as only one piece of code is needed to analyze the various data sets. Furthermore, the use of one format allows easy access to the data, even years after the data have been produced. This reduces the possibility of processing errors, and other users of the same format can read the data without further specifications (all they need to know about the data is stored in the file header).

A standard data format should be chosen on the principle that it is

  1. Portable (machine independent);
  2. Self-descriptive, i.e. all the information needed to read and use the data is stored in the header;
  3. Fast, i.e. it must provide direct access to separate items;
  4. Space-efficient;
  5. Suitable for large amounts of data.

It is also important to adopt common conventions (definition of time and spatial axes, missing data, etc.) and to take advantage of the strengths of the format (for instance, saving the data as 16-bit values together with an offset and a scaling factor).
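
The 16-bit storage mentioned above can be sketched as follows. This is a minimal sketch, not a prescribed implementation: the function names, the choice of -32768 as the missing-data flag and the use of None for missing input are assumptions made for this example. The names scale_factor and add_offset follow the attribute names commonly used with netCDF, where the unpacked value is the packed value times scale_factor plus add_offset.

```python
import struct

MISSING = -32768  # assumed flag: the most negative 16-bit integer marks missing data


def compute_packing(values):
    """Choose scale_factor and add_offset so the data span -32767..32767."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    add_offset = (hi + lo) / 2.0
    # 65534 intervals separate -32767 and 32767; constant data need no scaling
    scale_factor = (hi - lo) / 65534.0 if hi > lo else 1.0
    return scale_factor, add_offset


def pack(values, scale_factor, add_offset):
    """Convert floating-point values to 16-bit integers."""
    return [
        MISSING if v is None else int(round((v - add_offset) / scale_factor))
        for v in values
    ]


def unpack(packed, scale_factor, add_offset):
    """Recover approximate floating-point values from 16-bit integers."""
    return [
        None if p == MISSING else p * scale_factor + add_offset
        for p in packed
    ]


def to_bytes(packed):
    """Serialise as big-endian 16-bit integers: two bytes per value."""
    return struct.pack(f">{len(packed)}h", *packed)
```

Each value then occupies two bytes instead of the four or eight bytes of a floating-point number, at the cost of a rounding error of at most half of scale_factor. The scale and offset must be stored in the file header for the data to remain self-descriptive.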

The staff of the centre must work closely with the scientists at all involved Norwegian institutions to define an appropriate data format. In performing this task, they should take into account the formats used at international climate centres and other uses of the data, such as for other meteorological and environmental purposes.

When a standard has been defined, the centre must provide and maintain routines for conversion between this format and other commonly used formats (e.g. GRIB, FELT, HDF, netCDF). The routines should be based on the experience of and software developed by personnel at DNMI, and they should be made available to scientists both nationally and internationally. The centre must convert all historical data stored in the national facility for climate data (see task 4.1) to the standard format.
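
One possible way of organising such conversion routines is sketched below, under stated assumptions. Each format gets one reader into a common in-memory representation and one writer out of it, so that N formats need about 2N routines rather than one routine per pair of formats. The registry, the decorators and the two toy formats ("toy-text" and "toy-json") are hypothetical and chosen only to keep the example self-contained; they do not stand for GRIB, FELT, HDF or netCDF.

```python
import json

READERS = {}  # format name -> function(payload) -> common record (a dict)
WRITERS = {}  # format name -> function(record) -> payload


def reader(fmt):
    """Register a routine that reads fmt into the common representation."""
    def decorator(func):
        READERS[fmt] = func
        return func
    return decorator


def writer(fmt):
    """Register a routine that writes the common representation as fmt."""
    def decorator(func):
        WRITERS[fmt] = func
        return func
    return decorator


def convert(payload, source_format, target_format):
    """Convert via the common representation; fail clearly on unknown formats."""
    if source_format not in READERS:
        raise ValueError(f"no reader registered for {source_format!r}")
    if target_format not in WRITERS:
        raise ValueError(f"no writer registered for {target_format!r}")
    return WRITERS[target_format](READERS[source_format](payload))


@reader("toy-text")
def read_toy_text(payload):
    # Hypothetical format: "<name>:<value>,<value>,..."
    name, _, body = payload.partition(":")
    return {"name": name, "values": [float(x) for x in body.split(",")]}


@writer("toy-json")
def write_toy_json(record):
    # Hypothetical target format: the record serialised as JSON
    return json.dumps(record, sort_keys=True)
```

With this layout, supporting a new format means adding one reader and one writer, and every existing format becomes convertible to and from it.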

4.3. Porting and optimisation of climate models

Objective: To improve the computational efficiency of the commonly used climate models on the available high performance computers in Norway.

Access to sufficient high performance computing resources is critical for the climate modelling projects. For example, the project description for RegClim Phase II [3] states (Chapter 6):

"Nevertheless, it is very clear that considerable computer resources will have to be dedicated to this project. Otherwise, the aims of the project will have to be reformulated to a considerably more modest level."

Although RegClim and other climate modelling projects will have high priority when computer resources on the national high performance computers are allocated, there are several benefits of optimizing the climate models for these computers. This will

  • Reduce the elapsed time of each model simulation, thus giving results faster;
  • Improve the utilization of the expensive high performance computing equipment;
  • Allow the climate projects to carry out more simulations within their allocated quotas.

While some of the climate models are optimised for certain computers, there is a large potential for improvement. For example, an air-pollution model used by the EMEP project at DNMI was first parallelised. This gave close to perfect scalability [4], thus reducing the elapsed time by a factor nearly equal to the number of processors applied. Later on, the model was optimised for the CRAY T3E, and this improved the efficiency of the model by a factor of 3 - 4 [5]. Consequently, the project was able to increase the number of simulations significantly without increasing the required amount of computing resources.
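
The arithmetic behind this example can be sketched as follows. All numbers in the usage note are hypothetical and are not taken from [4] or [5]; the function and parameter names are assumptions made for this illustration. The sketch separates the two effects: parallelisation shortens the elapsed time of a run but leaves its total processor-hours roughly unchanged, whereas optimisation reduces the processor-hours themselves, which is what allows more simulations within the same quota.

```python
def elapsed_hours(serial_hours, n_procs, parallel_efficiency=1.0,
                  optimisation_factor=1.0):
    """Elapsed (wall-clock) time of one run.

    parallel_efficiency = 1.0 means perfect scalability;
    optimisation_factor is the speed-up gained by tuning the code.
    """
    return serial_hours / (n_procs * parallel_efficiency * optimisation_factor)


def simulations_within_quota(quota_cpu_hours, serial_hours, n_procs,
                             parallel_efficiency=1.0, optimisation_factor=1.0):
    """Number of complete runs that fit within a quota of processor-hours."""
    wall = elapsed_hours(serial_hours, n_procs, parallel_efficiency,
                         optimisation_factor)
    cpu_hours_per_run = wall * n_procs
    return int(quota_cpu_hours // cpu_hours_per_run)
```

For a hypothetical model needing 640 hours on one processor, 64 processors with perfect scalability give an elapsed time of 10 hours per run, and a quota of 6400 processor-hours allows 10 runs. An optimisation factor of 4 reduces the elapsed time to 2.5 hours and raises the number of runs within the same quota to 40.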

For the period 2000-2003, the NOTUR consortium will acquire several new high performance computers for use by Norwegian scientists. This will increase the amount of resources available. At the same time, there will be a need for porting of models to the new computers. This issue is discussed in the RegClim project description (chapter 6):

"In connection with the installation of new computer equipment in the High Performance Computing Programme in Norway, it is quite probable that problems will arise in parts of RegClim. Large program codes will have to be fitted to new hardware configurations. The project management group in RegClim has earlier expressed a wish that the resources are allocated to a single centre including a staff for user support. RegClim must be given priority if considerable extra work is needed to adjust program codes to new hardware in an optimal way. If necessary, extra funds should be allocated, and technical staff should be hired in RegClim."

The service centre should provide scientists with assistance for both porting and optimization of climate models. This work should be carried out in close co-operation with the NOTUR support personnel. The prioritisation of the climate models should be based on their importance for the climate projects, the amount of computer resources they require and the potential for improvement.

DNMI is a member of ECMWF and has access to computing resources on ECMWF's systems in Reading, UK. Norwegian scientists in the area of climate modelling may apply for resources on these systems. The service centre may provide assistance with applications and guidance on the use of the systems.

4.4. Library of data analysis tools

Objective: To make a set of data analysis tools easily available to all Norwegian scientists working with climate modelling.

The climate models produce large amounts of output data to be interpreted. For this purpose, the scientists use various visualization and statistical analysis packages. There are various packages that can be used, but few are tailored for large geophysical/geographical data. Many of these packages (e.g. Matlab, IDL, Ferret) require scripts to do specific jobs, and some may even require programs written in FORTRAN or C. The development of such scripts is time-consuming, and research communities will benefit from having access to a database of common data analysis tools. A large number of such scripts have been written at DNMI and could be used by other research communities. It is important that such scripts and codes are well documented, so they can be used easily by other scientists/institutions. Some analytical and graphical packages are freely available from different climate and computing centres and may be used for the type of analysis common to climate researchers.

The staff of the service centre should be acquainted with the various analytical tools (e.g. MetView, GrADS, Ferret, ncview, Matlab, IDL, PV-Wave) and be able to advise on matters concerning these. They should also know who is using the various tools, so that difficult questions can be forwarded. They should collect and maintain a library of the most commonly used tools, and provide assistance in writing scripts and in otherwise adapting these applications to the needs of the scientists. To the extent possible, the service centre should handle licences for commercial applications centrally, and thus reduce the cost of using these applications.


 
