NoSerC

Norwegian Service Centre for Climate Modelling

Project description 2003-2004





History:

Document   Date         Comments
Original   13.09.2002   Attachment to the application for 2003-2006
Revised    16.01.2003   Attachment to the revised application for 2003-2004

Project Summary

The Norwegian Service Centre for Climate Modelling (NoSerC) was established in late 2000. The centre is located at the Norwegian Meteorological Institute (met.no) and financed by the Research Council of Norway and met.no. The centre supports scientists involved in Norwegian climate modelling projects. The overall aim of the project is to facilitate efficient climate research and effect studies in Norway by providing technical assistance in the areas of data handling and analysis and computational efficiency of climate models.

Main tasks for the period 2003-2004:

  1. Upgrade and operate the facility for archiving of climate modelling data;
  2. Provide data for effect studies;
  3. Programming, porting and computational optimisation of climate models;
  4. Develop format conversion routines for climate modelling data;
  5. Provide access to large international datasets;
  6. Assist scientists in presentation and analysis of datasets.

In early 2001 the national facility for archiving of climate modelling data was established and attached to the main national high performance computer at NTNU. The capacity of the facility is now fully utilized. For the period 2003-2006, the scientists estimate that the demand for archiving capacity will increase by more than a factor of five.

Background

The Norwegian Service Centre for Climate Modelling (NoSerC) was established by the Norwegian Meteorological Institute (met.no) in October 2000 under contract 139804/720 with the Research Council of Norway. The overall aim of the centre is to facilitate efficient climate research in Norway, by providing technical assistance in the areas of data handling and analysis and computational efficiency of climate models.

For the period 2000-2002 the tasks of the project have been:

  1. National facility for archiving of climate modelling data
     Objective: To facilitate fast and easy access to climate data for all scientists involved in national climate modelling projects.
  2. Formats and conversion routines for storage of climate modelling data
     Objective: To store climate modelling data in standard formats, which facilitate exchange of data and easy conversions to and from formats used by climate models and pre/post-processing tools, and to provide the required conversion routines.
  3. Porting and optimisation of climate models
     Objective: To improve the computational efficiency of the commonly used climate models on the available high performance computers in Norway.
  4. Library of data analysis tools
     Objective: To make a set of data analysis tools easily available to all Norwegian scientists working with climate modelling.

The staff of the project is
  • Arild Burud
  • Egil Støren
  • Roar Skålin (project manager)

all at met.no in Oslo. A total of two man-years per year for 2001-2002 are included in the project, which implies that Burud and Støren are close to 100% engaged in the project. The steering committee consists of:

  • Dr. Bjørn Ådlandsvik, Institute for Marine Research - representing the Bjerknes collaboration, head
  • Prof. Trond Iversen, UiO - representing RegClim
  • Prof. Peter M. Haugan, UiB - representing NOClim
  • Post. Doc. Gunnar Myhre, UiO - representing Chemclim
  • Prof. Thor Erik Nordeng, met.no - representing met.no

Following a budget revision in 2001, the cost sharing of the project is (kNOK):


                    2000   2001   2002   Total
Research Council    1200    500    150    1850
met.no                80    910   1060    2050
Total               1280   1410   1210    3900

The budgeted use of the funds is (kNOK):

Acquisition and maintenance of the archiving facility:   1700
Personnel:                                               2080
Other operating costs:                                    120

Due to heavy use of the archiving facility, NOK 160 000 was moved from personnel cost to the archiving facility.

In this project description, we review the results obtained in 2000-2002 and propose aims, tasks, organisation and budget for NoSerC for the period 2003-2004.

Obtained results

National facility for archiving of climate modelling data

The national facility for archiving of climate modelling data consists of a disk system of 540 GB and a data migration system capable of handling up to 6 TB of data (12 TB including backup), provided that the necessary tapes are available. The system is attached to the high performance computing facility at NTNU, which is a part of the national supercomputing project NOTUR. This ensures fast access from all universities and met.no, and makes it easy to store and use data from the high performance computers at NTNU. The disks, the data migration system, the units for reading/writing data and the tapes have been acquired by NoSerC, while the tape robot belongs to the NOTUR project.

By the end of 2002, in excess of 5.5 TB was stored on the facility. Including backup, this requires tapes holding more than 11 TB of data. The growth has exceeded the budgeted figures for the NoSerC project, and new tapes have been added to the system in order to increase the capacity. The capacity limits of the system have now been reached. The storage system has been a success: climate researchers have had a straightforward way of archiving large amounts of data. The only problem has been a recent period of degraded performance and stability problems due to a lack of tapes.

The project has also invested personnel resources in advising users on how to utilize the storage system. The system is easy to use, but usage patterns have an important impact on performance. Therefore, general advice on usage in the form of web pages ( http://noserc.met.no/dmfuse/ ), as well as direct communication, has been given to users.

A database system with a web interface (MetaMod: http://metamod.met.no ) has been developed to provide an overview of, and search facilities for, the large gridded climate data files stored on the system. The database comprises metadata describing the files (data format, variables, time period, grid resolution, geographic projection etc.). Through the interface, users may copy selected files to work disks on the HPC machines, or by ftp to other machines. Metadata for some of the files on the storage system have been loaded into the database. The loading of metadata has unfortunately been delayed, due to difficulties in obtaining the necessary documentation of the data files.
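
As an illustration of the kind of metadata search described above, the following minimal sketch uses Python's built-in sqlite3 module. The table layout, column names and sample file paths are hypothetical and are not taken from MetaMod; they only mirror the metadata categories listed in the text (data format, variables, time period, grid resolution and projection).

```python
import sqlite3

def build_catalogue(records):
    """Create an in-memory catalogue of data files (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE datafile (
               path TEXT PRIMARY KEY,
               data_format TEXT,
               variables TEXT,        -- comma-separated variable names
               start_year INTEGER,
               end_year INTEGER,
               resolution_km REAL,
               projection TEXT)"""
    )
    db.executemany("INSERT INTO datafile VALUES (?, ?, ?, ?, ?, ?, ?)", records)
    return db

def find_files(db, variable, year):
    """Return paths of files that contain `variable` and cover `year`."""
    rows = db.execute(
        "SELECT path FROM datafile "
        "WHERE ',' || variables || ',' LIKE ? "
        "AND start_year <= ? AND end_year >= ? ORDER BY path",
        (f"%,{variable},%", year, year),
    )
    return [path for (path,) in rows]

# Hypothetical sample entries, for illustration only.
SAMPLE = [
    ("/archive/run_a/t2m_1980_1999.nc", "netCDF", "t2m,precip",
     1980, 1999, 55.0, "rotated_latlon"),
    ("/archive/run_b/slp_2030_2049.grb", "GRIB", "slp",
     2030, 2049, 55.0, "rotated_latlon"),
]

catalogue = build_catalogue(SAMPLE)
print(find_files(catalogue, "precip", 1990))
```

A search of this kind only works for files whose metadata have been registered, which is why the documentation of the data files matters for loading the database.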

Formats and conversion routines for storage of climate modelling data

Format conversion tools have been collected, developed and made available to the projects on the NOTUR computer gridur.ntnu.no (IRIX/SGI Origin 3800). The tools convert between the formats FELT (met.no internal format), GRIB and netCDF.

Tools for formatted (ASCII) output of data files in these formats have also been included. Many of the tools are also available for use on the Linux/Intel platform.

Documentation and user guides have been written and made available on the project web pages http://noserc.met.no/formkonv/ .

For the conversion of data files to the netCDF format a new tool was developed to simplify the conversion process. This tool has generated international attention and is now being considered by UCAR for incorporation in the Unidata Decoders Package.

Porting and optimization of climate models

NoSerC has assisted researchers in RegClim in optimizing a version of the CCM3 global climate model. This effort resulted in a significant improvement in performance: CPU time was reduced by about 25 percent for a typical run using 8 processors (for more details, see http://noserc.met.no/sgiopt/ ). Work is in progress on another version of this model, which includes additional program code developed internally by RegClim researchers.

Library of data analysis tools

Some work has also been done on making data analysis tools available to climate researchers, although the activity in this area has not been as large as anticipated. Two open source packages have been installed on the HPC facility at NOTUR/NTNU: GrADS and the statistical package R. Documentation and example scripts have been made available through the NoSerC web pages ( http://noserc.met.no/grtools/ ) and directly on the NOTUR/NTNU computers in Trondheim.

Other tasks

Throughout the project period, the web pages for the project ( http://noserc.met.no ) have been continuously updated, and now contain a large amount of useful information. The pages comprise, among other things, news, documentation of tools, usage advice on the storage system and advice on optimization.

Another ongoing task throughout the project period has been direct support to users.

Overall aim and main tasks for the period 2003-2004

The overall aim of the centre is to facilitate efficient climate research and effect studies in Norway, by providing technical assistance in IT related areas, such as:
  • Archiving of climate modelling data;
  • Data access, handling and format conversion;
  • Programming, computational efficiency and porting of climate models;
  • Analysis, presentation and visualization of data.
This aim will be reached through the tasks listed below. The detailed content of the tasks, and the emphasis to put on each task, must be worked out in close co-operation with the national research projects on climate modelling and effect studies:

  1. Upgrade and operate the facility for archiving climate modelling data
     Objective: To facilitate fast and easy access to climate data for all scientists involved in national climate modelling projects.

    The maximum capacity of the current storage facility has been reached. For the period 2003-2006, the scientists involved in the RegClim project estimate a requirement for data storage in excess of 20 TB. Furthermore, there must be capacity to store parts of the ERA-40 dataset (see below), and other climate projects will also have requirements for storage. Including backup, a conservative estimate for storage of climate modelling data during the period is 60 TB. We estimate that 20 TB should be available for the period 2003-2004.

    The data should be easily accessible from the high performance computers used by the scientists. Currently, these are the NOTUR computers at NTNU and UiB and, to a minor extent, the NOTUR computer at UiO. It is expected that a new high performance computing project will be initiated in 2004. However, the computers at NTNU and UiB were upgraded/installed in 2002, so it is likely that they will be in service at least until the end of 2004. Acquisitions of new computers may take place during 2004-2005.

    The investments in storage capacity should take advantage of investments already made by NoSerC and NOTUR, both in terms of acquisitions and competence in operations and use. New major investments should take advantage of the tendering process for new high performance computers. To cover the requirements for data storage in 2003-2004, we propose to extend the capacity by either upgrading the current NoSerC facility or adding a new low-cost file server and/or disk system. A tendering process will be used to decide the most cost-effective solution. The existing facility will be supported throughout 2004, even if a new file server is chosen for the upgrade.

    The NoSerC staff, in co-operation with NOTUR, should continue to give support on the storage facility. In particular, information and assistance in efficient use of the facility is required. Furthermore, the MetaMod web interface to the facility should be improved, and the climate modelling scientists should be encouraged to register datasets of interest to scientists involved in effect studies, as this will ease access to information and retrieval of data.

  2. Provide data for effect studies
     Objective: To make data produced by the national climate projects easily available for studies on impacts of climate change.

    According to the specific objectives of the KlimaProg action plan for 2003-2006, national climate projects are encouraged to produce results that are applicable for research on effects of climate change. Such results may be produced as large data sets describing the climate change under various preconditions. NoSerC will provide the storage, search, retrieval, adaptation and some presentation facilities for these data.

    The MetaMod application could be modified to serve as a Web interface for the data. Such solutions should be developed in close co-operation with the data providers (the national climate projects) and the users of such data (researchers on climate effects). The Web interfaces could, if desired, be incorporated into the Web portals belonging to the data providing projects.

    The details of this task will be discussed with the KlimaEffekter programme and the projects supported by this programme.

  3. Assist the climate projects in technical work on climate models (programming, computational efficiency and porting)
     Objective: To improve the quality, computational efficiency and availability of climate models used by the national climate research community.

    A substantial part of the computer related problems that must be solved by climate projects are related to large climate models. The NoSerC project has shown its value in the work on improving the computational efficiency of a model used by RegClim scientists. This kind of assistance will also be available for the climate projects in the forthcoming project period. Even better assistance could be achieved if NoSerC personnel were engaged at an earlier stage of the model development effort. If substantial extensions are planned on existing models, NoSerC personnel could also contribute to the initial design and programming work.

    Computer technology is rapidly changing, and new architectures are emerging as the best cost/performance alternatives. If climate research is to benefit from this development, software must be ported to new platforms. This is an area where NoSerC can offer assistance. One emerging technology which could be considered is clusters of industry standard computers based on open source operating systems (e.g. Intel/Linux clusters). At met.no one such system is already installed, running Hirlam and MM5; this system could be used as a test bed for porting climate models.

    Porting and optimisation of climate models will be an ongoing activity throughout the project period. However, this task will be of particular importance during migration to new high performance computers.

  4. Formats and conversion routines for storage of climate modelling data
     Objective: To provide conversion tools for climate modelling data between commonly used data formats.

    The format conversion tools provided in the first project period should be extended to include conversion to/from HDF and conversion from netCDF to GRIB, if desired by users.

    Additionally, all tools should be made available on the Linux/Intel platform, and the source code should be made available on the internet.

  5. Provide access to international climate data sets
     Objective: To provide efficient access to large international data sets used in climate research.

    Large climate data sets (ERA-40 among others) are available internationally, and are of interest for several national climate projects. A natural task for NoSerC will be to make some of these data easily available for national climate researchers.

    Although several national projects are interested in the same data sets, they may have diverging interests regarding which parts of a data set should be considered for easy national access. NoSerC has developed scripts for easy retrieval of data from the original ERA-40 source (ECMWF) on demand, thus reducing the storage requirement on the NoSerC facility to a minimum. Similar solutions can be implemented for other international data sets, provided they can be retrieved from the original source through a network of sufficient capacity.

  6. Assist the climate projects in solving specific problems related to data analysis, presentation and visualization
     Objective: To improve the quality and efficiency of national climate research projects in their use of computers to present and analyze large data sets.

    The scientists in climate research generally have much experience in using computer tools. However, as the NoSerC staff has computer technology as its main interest, climate projects could benefit from direct assistance from the NoSerC engineers on some tasks. The actual problems best suited for this kind of assistance have to be identified in close co-operation with individual climate projects. The work should be organized as small projects where the individual climate projects are in charge. This will ensure the best utilization of the results obtained. If several such small projects could be implemented, involving different climate projects, another benefit would be the transfer of experience gained in one climate project to other climate projects.

    The most relevant problem areas will probably be:

    • Making presentations of research results on the web. The MetaMod application developed by NoSerC could be extended or tailored for each project to facilitate automatic data retrieval and presentation on the web.
    • Visualization tools and techniques for presentation of results, such as 2D/3D visualization/animation of climate fields etc.
    • Use of tools for statistics and data analysis.

Organisation

We propose to continue the NoSerC project with a staff of employees from met.no. The core project team will consist of
  • Arild Burud, Oslo
  • Egil Støren, Oslo
  • Roar Skålin, Oslo, project manager
In order to improve the co-operation with the climate modelling groups in Bergen, we may also use one of met.no's IT employees in Bergen, Reinoud Bokhorst, for some of the tasks in the project. Furthermore, met.no and NOTUR personnel will be used for operation of the archiving system.

NoSerC should continue to have a steering committee (SC) representing the main user groups of the centre. The SC should be appointed by the Research Council. The current SC discussed the composition of the NoSerC SC in meeting 2/2001, and its view is reflected in the following paragraph of the minutes (translated from Norwegian):
"The steering committee made no further changes to the governance structure of the project, but it was proposed that the Programme Board should be advised that a possible "NoSerC 2" should have committee representatives from the institutions, not from the projects."

The project manager is responsible for

  • Co-ordinating the work of the centre;
  • Ensuring that the work follows the approved plan;
  • Ensuring that the centre maintains good co-operation with other Norwegian institutions involved in climate research, in order to serve all Norwegian scientists involved in this area;
  • Ensuring that the centre co-operates with similar centres internationally;
  • Reporting progress to the Research Council;
  • Ensuring that the deliverables are completed according to schedule;
  • Evaluating progress and proposing changes in the scope and organization of the centre.
The SC approves, and if required, changes the project plan for the centre. This includes prioritizing among the different tasks and, to the extent required, within a task. The project manager reports to the SC on the issues listed above, and the SC shall approve all reports to be sent to the Research Council.

In 2003 NoSerC will evaluate the long-term demand for its services, including the requirements for data storage and high performance computing equipment. The user groups will be involved in the evaluation, and we will consider the possibility of closer co-operation between NoSerC and the national high performance computing organisation. The different options for financing the NoSerC activity will be considered. The evaluation will be carried out in close co-operation with the Research Council.

Budget

National facility for archiving of climate modelling data

The data stored at the NoSerC facility are typically not accessed very often, although some of the data are input datasets. The cost of storing data that are not accessed often currently varies from NOK 50 to NOK 150 per GB. The first number represents the cost of an inexpensive Linux file server with slow disk access and no redundancy, or alternatively the cost of tapes that can be used in a data migration system. The second number reflects the total cost of a hierarchical storage system (disk, tape robot, software and tapes).

Using an estimate of NOK 85 per GB and taking into account maintenance of both the existing and the new storage equipment, the following budget will allow for 10 TB of new storage (numbers in kNOK):


                                     2003   2004
Investment                            850
Maintenance of existing equipment     150    150
Maintenance of new equipment           50    100

As discussed above, a tendering process will be used for the acquisition of the new equipment. We believe that this will maximise the capacity, given a limited budget. However, it will not be possible to reach the estimated requirement of 20 TB.
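
The relation between the investment and the new capacity can be checked with a short calculation. The sketch below uses only the figures quoted in this section (NOK 85 per GB, an investment of 850 kNOK and a requirement of 20 TB); the function names and the use of decimal units (1 TB = 1000 GB) are assumptions made for illustration.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes

def capacity_tb(investment_knok, nok_per_gb):
    """Storage in TB that an investment (in kNOK) buys at a given unit cost."""
    return investment_knok * 1000 / nok_per_gb / GB_PER_TB

def cost_knok(capacity_in_tb, nok_per_gb):
    """Investment in kNOK needed to buy a given capacity at a given unit cost."""
    return capacity_in_tb * GB_PER_TB * nok_per_gb / 1000

new_capacity = capacity_tb(850, 85)   # capacity bought by the 2003 investment
full_cost = cost_knok(20, 85)         # cost of 20 TB at the same unit cost

print(f"New capacity for 850 kNOK: {new_capacity:.0f} TB")
print(f"Cost of 20 TB at NOK 85 per GB: {full_cost:.0f} kNOK")
```

At the assumed unit cost the investment buys 10 TB, while buying 20 TB at the same unit cost would take about 1700 kNOK, excluding maintenance.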

It is too early to estimate the cost of storage for the period 2005-2006. We suggest that KlimaProg enter into a discussion with the coming high performance computing project on the configuration and use of new computers. This will clarify whether extra investments are required for storing climate modelling data, and if that is the case, how these investments can be taken into account in the tendering process for new computers. One should also be aware that cost of storage currently drops by a factor of two every 12 months, so any decision on investments beyond the next two years should be postponed.
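
To give a feeling for the effect of such a price trend, the sketch below extrapolates a unit cost that halves every 12 months. The starting value of NOK 85 per GB is the estimate used above; the assumption that the trend continues unchanged is purely illustrative and is not a forecast made in this proposal.

```python
def projected_cost_per_gb(cost_now, months_ahead, halving_months=12):
    """Unit cost after `months_ahead` months if the cost halves every `halving_months`."""
    return cost_now * 0.5 ** (months_ahead / halving_months)

# Illustrative extrapolation from the NOK 85 per GB estimate.
for months in (0, 12, 24):
    cost = projected_cost_per_gb(85, months)
    print(f"after {months:2d} months: NOK {cost:.2f} per GB")
```

Under this assumption the unit cost falls to a quarter of its starting value within two years, which is the reason for postponing investment decisions that reach beyond that horizon.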

Personnel and other operating costs

We propose an effort of 1.5 man-years per year for 2003 and 2004. Project management is included in this figure. The cost of the personnel, including indirect costs, is based on 1200 hours a year at an hourly rate of 1.6 per mille of annual salary. The resulting cost is NOK 648 000 per man-year.

For 2003, a contribution of NOK 250 000 from KlimaEffekter is included in the personnel budget. An application for this contribution for 2004 will be submitted in 2003, subject to the requirements of the scientific projects supported by KlimaEffekter. Without this contribution in 2004, the effort will be reduced accordingly.
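
The personnel figures can be reproduced from the numbers given above. In the sketch below, the annual salary is not stated in this document; it is derived from the quoted cost per man-year and should be read as an implied value only. The variable and function names are chosen for illustration.

```python
HOURS_PER_MAN_YEAR = 1200
HOURLY_RATE_FRACTION = 1.6 / 1000     # 1.6 per mille of annual salary per hour
COST_PER_MAN_YEAR_NOK = 648_000
MAN_YEARS_PER_YEAR = 1.5
KLIMAEFFEKTER_2003_KNOK = 250

def implied_annual_salary(cost_per_man_year, hours, rate_fraction):
    """Annual salary implied by cost = hours * rate_fraction * salary."""
    return cost_per_man_year / (hours * rate_fraction)

salary = implied_annual_salary(
    COST_PER_MAN_YEAR_NOK, HOURS_PER_MAN_YEAR, HOURLY_RATE_FRACTION
)
personnel_knok = MAN_YEARS_PER_YEAR * COST_PER_MAN_YEAR_NOK / 1000
remaining_knok = personnel_knok - KLIMAEFFEKTER_2003_KNOK

print(f"Implied annual salary: NOK {salary:,.0f}")
print(f"Personnel cost per year: {personnel_knok:.0f} kNOK")
print(f"Remaining after KlimaEffekter contribution: {remaining_knok:.0f} kNOK")
```

With these figures the implied annual salary is about NOK 337 500, the personnel cost for 1.5 man-years is 972 kNOK, and 722 kNOK remains after the KlimaEffekter contribution; the last two values agree with the personnel rows of the budget tables.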

The contribution from met.no in terms of software developed for data handling is not included in the budget. However, we anticipate similar contributions by other institutions served by the centre. No costs for the steering committee are included in the budget. It is expected that they cover their own costs.

Overall budget

The proposed budget for NoSerC is given in the tables below. The first table gives the overall budget. The next two show the contributions from the Research Council and met.no respectively. All numbers are in kNOK as of 2002.

Overall budget:


                                          2003    2004
Personnel costs and indirect costs         972   722+?
Equipment (acquisition and maintenance)   1050     250
Total                                     2022   972+?

Contributions from the Research Council:


                                          2003    2004
Personnel costs and indirect costs         250       ?
Equipment (acquisition and maintenance)    700     200
Total                                      950     200

Contributions from met.no:


                                          2003    2004
Personnel costs and indirect costs         722     722
Equipment (acquisition and maintenance)    350      50
Total                                     1072     772

Deliverables

Storage system and database

The project will extend the existing storage system. Improved search and data presentation facilities will be provided, in close co-operation with the climate projects.

Progress reports and Final report

Regular progress reports will be delivered yearly to the Research Council according to the Council's requirements. At the end of the project a final report will be delivered.

Task reports

Scientific and technical reports will be written for each task where appropriate, and delivered to the Research Council.

Internet

The centre will further develop its homepage ( http://noserc.met.no ) on the Web. The target audience for this page is the scientists involved in climate modelling.

 
