Norwegian Service Centre for Climate Modelling
Project description 2003-2004
Main tasks for the period 2003-2004:
The Norwegian Service Centre for Climate Modelling (NoSerC) was established by the Norwegian Meteorological Institute (met.no) in October 2000 under contract 139804/720 with the Research Council of Norway. The overall aim of the centre is to facilitate efficient climate research in Norway by providing technical assistance in data handling, data analysis and the computational efficiency of climate models.
For the period 2000-2002 the tasks of the project have been
all at met.no in Oslo. A total of two man-years per year for 2001-2002 are included in the project, which implies that Burud and Støren are close to 100% engaged in the project. The steering committee consists of:
Following a budget revision in 2001, the cost sharing of the project is (kNOK):
The budgeted use of the funds is (kNOK):
Due to heavy use of the archiving facility, NOK 160 000 was moved from personnel cost to the archiving facility.
In this project description, we review the
results obtained in 2000-2002 and propose aims, tasks, organisation and
budget for NoSerC for the period 2003-2004.
National facility for archiving of climate modelling data
The national facility for archiving of climate modelling data consists of a disk system of 540 GB and a data migration system capable of handling up to 6 TB of data (12 TB including backup) provided necessary tapes are available. The system is attached to the high performance computing facility at NTNU, which is a part of the national supercomputing project NOTUR. This ensures fast access from all universities and met.no, and it is easy to store and use data from the high performance computers at NTNU. The disks, the data migration system, the units for reading/writing data and tapes have been acquired by NoSerC, while the tape robot belongs to the NOTUR project.
By the end of 2002, more than 5.5 TB was stored on the facility. Including backup, this requires tapes holding more than 11 TB of data. The growth has exceeded the budgeted figures for the NoSerC project, and new tapes have been added to the system to increase the capacity. The capacity limit of the system has now been reached. The storage system has been a success: climate researchers have had a straightforward way of archiving large amounts of data. The only problem has been a recent period of degraded performance and instability caused by a lack of tapes.
The project has also invested personnel resources in advising users on how to utilize the storage system. The system is easy to use, but usage patterns have an important impact on performance. Therefore, general advice on usage has been given to users, both through web pages (http://noserc.met.no/dmfuse/) and through direct communication.
A database system with a web interface (MetaMod: http://metamod.met.no ) has been developed to give an overview and search facilities of large gridded climate data files stored on the system. The database comprises metadata describing the files (data format, variables, time period, grid resolution and geographic projection etc.). Through the interface, users may copy selected files to work disks on the HPC machines, or by ftp to other machines. Metadata for some of the files on the storage system have been loaded into the database. The loading of metadata has unfortunately been delayed, due to difficulties in obtaining necessary documentation of the data files.
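As a rough illustration of the kind of metadata record and search described above, the following Python sketch models one file entry and a simple filter. The class name `DatasetRecord`, its fields, the function `search` and the sample entries are assumptions made for illustration; they are not taken from MetaMod itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record for one gridded data file."""
    path: str
    data_format: str           # e.g. "netCDF", "GRIB" or "FELT"
    variables: tuple           # names of the variables stored in the file
    start: date                # first day covered by the file
    end: date                  # last day covered by the file
    grid_resolution_km: float
    projection: str


def search(records, variable=None, data_format=None, overlaps=None):
    """Return the records matching every criterion that is not None.

    `overlaps` is a (start, end) pair of dates; a record matches when its
    time period intersects that interval.
    """
    hits = []
    for rec in records:
        if variable is not None and variable not in rec.variables:
            continue
        if data_format is not None and rec.data_format != data_format:
            continue
        if overlaps is not None:
            lo, hi = overlaps
            if rec.end < lo or rec.start > hi:
                continue
        hits.append(rec)
    return hits


# Invented sample entries, for illustration only.
CATALOGUE = [
    DatasetRecord("run_a/t2m_1961_1990.nc", "netCDF", ("t2m",),
                  date(1961, 1, 1), date(1990, 12, 31), 55.0,
                  "polar stereographic"),
    DatasetRecord("run_a/precip_2030_2050.grb", "GRIB", ("precip",),
                  date(2030, 1, 1), date(2050, 12, 31), 55.0,
                  "polar stereographic"),
]

if __name__ == "__main__":
    for rec in search(CATALOGUE, variable="t2m"):
        print(rec.path)
```

A real catalogue would hold these records in a database and expose the filter through the web interface; the in-memory list here only shows which fields a query would need.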
Formats and conversion routines for storage of climate modelling data
Format conversion tools have been collected and developed and made available to the projects on the NOTUR computer gridur.ntnu.no (IRIX/SGI Origin 3800). The tools include conversion between the formats FELT (met.no internal format), GRIB and netCDF.
Tools for formatted (ASCII) output of data files in these formats have also been included. Many of the tools are also available for use on the Linux/Intel platform.
Documentation and user guides have been written and made available on the project web pages http://noserc.met.no/formkonv/ .
For the conversion of data files to the netCDF format a new tool was developed to simplify the conversion process. This tool has generated international attention and is now being considered by UCAR for incorporation in the Unidata Decoders Package.
Porting and optimization of climate models
NoSerC has assisted researchers in RegClim in optimizing a version of the CCM3 global climate model. This effort resulted in a significant improvement in performance: CPU time was reduced by about 25 percent for a typical run using 8 processors (for more details, see http://noserc.met.no/sgiopt/). Work is in progress on another version of this model, which includes some more program code developed internally by RegClim researchers.
Library of data analysis tools
Some work has also been done on making data analysis tools available to climate researchers, although the activity in this area has not been as large as anticipated. Two open source packages have been installed on the HPC facility at NOTUR/NTNU: GrADS and the statistical package R. Documentation and example scripts have been made available through the NoSerC web pages (http://noserc.met.no/grtools/) and directly on the NOTUR/NTNU computers in Trondheim.
Other tasks
Throughout the project period, the web pages for the project (http://noserc.met.no) have been continuously updated, and now contain a large amount of useful information. The pages comprise, among other things, news, documentation of tools, usage advice on the storage system and advice on optimization.
Another ongoing task throughout the project period has been direct support to users.
The overall aim of the centre is to facilitate efficient climate research and effect studies in Norway, by providing technical assistance in IT-related areas, such as:
Objective: To facilitate fast and easy access to climate data for all scientists involved in national climate modelling projects.
The maximum capacity of the current storage facility has been reached. For the period 2003-2006, the scientists involved in the RegClim project estimate a requirement for data storage in excess of 20 TB. Furthermore, there must be capacity to store parts of the ERA-40 dataset (see below) and other climate projects will have requirements for storage. Including backup, a conservative estimate for storage of climate modelling data during the period is 60 TB. We estimate that 20 TB should be available for the period 2003-2004.
The data should be easily accessible from the high performance computers used by the scientists. Currently, these are the NOTUR computers at NTNU and UiB and, to a minor extent, the NOTUR computer at UiO. It is expected that a new high performance computing project will be initiated in 2004. However, the computers at NTNU and UiB were upgraded or installed in 2002, so it is likely that they will be in service at least until the end of 2004. Acquisitions of new computers may take place during 2004-2005.
The investments in storage capacity should take advantage of investments already made by NoSerC and NOTUR, both in terms of acquisitions and competence in operations and use. New major investments should take advantage of the tendering process for new high performance computers. To cover the requirements for data storage in 2003-2004, we propose to extend the capacity by either upgrading the current NoSerC facility or adding a new low-cost file server and/or disk system. A tendering process will be used to decide the most cost-effective solution. The existing facility will be supported throughout 2004, even if a new file server is chosen for the upgrade.
The NoSerC staff, in co-operation with NOTUR, should continue to give support on the storage facility. In particular, information and assistance in efficient use of the facility is required. Furthermore, the MetaMod web-interface to the facility should be improved and the climate modelling scientists should be encouraged to register datasets of interest to scientists involved in effect studies, as this will ease access to information and retrieval of data.
Objective: To make data produced by the national climate projects easily available for studies on impacts of climate change.
According to the specific objectives of the KlimaProg action plan for 2003-2006, national climate projects are encouraged to produce results that are applicable for research on effects of climate change. Such results may be produced as large data sets describing the climate change under various preconditions. NoSerC will provide the storage, search, retrieval, adoption and some presentation facilities for these data.
The MetaMod application could be modified to serve as a Web interface for the data. Such solutions should be developed in close co-operation with the data providers (the national climate projects), and the users of such data (researchers on climate effects). The Web interfaces could, if desired, be incorporated into the Web portal belonging to the data providing projects.
The details of this task will be discussed with the KlimaEffekter programme and the projects supported by this programme.
Objective: To improve the quality, computational efficiency and availability of climate models used by the national climate research community.
A substantial part of the computer related problems that must be solved by climate projects are related to large climate models. The NoSerC project has shown its value in the work on improving the computational efficiency of a model used by RegClim scientists. This kind of assistance will also be available for the climate projects in the forthcoming project period. Even better assistance could be achieved if NoSerC personnel were engaged at an earlier stage of the model development effort. If substantial extensions are planned on existing models, NoSerC personnel could also contribute to the initial design and programming work.
Computer technology is changing rapidly, and new architectures are emerging as the best cost/performance alternatives. If climate research is to benefit from this development, software must be ported to new platforms. This is an area where NoSerC can offer assistance. One emerging technology that could be considered is clusters of industry-standard computers running open source operating systems (e.g. Intel/Linux clusters). One such system is already installed at met.no, running Hirlam and MM5; it could be used as a test bed for porting climate models.
Porting and optimisation of climate models will be an ongoing activity throughout the project period. However, this task will be of particular importance during migration to new high performance computers.
Objective: To provide conversion tools for climate modelling data between commonly used data formats.
The format conversion tools provided in the first project period should be extended to include conversion to/from HDF and conversion from netCDF to GRIB, if desired by users.
Additionally, all tools should be made available on the Linux/Intel platform, and the source code should be made available on the Internet.
Objective: To provide efficient access to large international data sets used in climate research.
Large climate data sets (ERA-40 among others) are available internationally, and are of interest for several national climate projects. A natural task for NoSerC will be to make some of these data easily available for national climate researchers.
Although several national projects are interested in the same data sets, they may have diverging interests regarding which parts of a data set should be considered for easy national access. NoSerC has developed scripts for easy retrieval of data from the original ERA-40 source (ECMWF) on demand, thus reducing the storage requirement on the NoSerC facility to a minimum. Similar solutions can be implemented for other international data sets, provided they can be retrieved from the original source through a network of sufficient capacity.
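The on-demand approach can be pictured as a small local cache placed in front of the remote archive. The Python sketch below is not the NoSerC retrieval scripts; the class `OnDemandCache`, the `fetch` callback and the file name in the demonstration are hypothetical, and the actual transfer from the remote source is deliberately left to the caller.

```python
import tempfile
from pathlib import Path
from typing import Callable


class OnDemandCache:
    """Keep local copies only of the files that users have actually requested."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]):
        # `fetch` is a caller-supplied function that retrieves one named
        # file from the remote source and returns its contents.
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def get(self, name: str) -> Path:
        """Return a local path for `name`, retrieving the file on first use."""
        local = self.cache_dir / name
        if not local.exists():
            local.write_bytes(self.fetch(name))
        return local

    def evict(self, name: str) -> None:
        """Delete the local copy to free space; it can be fetched again later."""
        (self.cache_dir / name).unlink(missing_ok=True)


if __name__ == "__main__":
    requests = []

    def fake_fetch(name: str) -> bytes:
        # Stand-in for a real transfer; records each remote request.
        requests.append(name)
        return b"placeholder contents"

    with tempfile.TemporaryDirectory() as tmp:
        cache = OnDemandCache(Path(tmp), fake_fetch)
        cache.get("sample_field.grb")
        cache.get("sample_field.grb")   # served from the local copy
        print(len(requests))
```

Only requested files occupy local disk, and evicted files remain retrievable from the source, which is why this pattern depends on sufficient network capacity.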
Objective: To improve the quality and efficiency of national climate research projects in their use of computers to present and analyze large data sets.
Climate researchers generally have extensive experience in using computer tools. However, as computer technology is the main field of the NoSerC staff, climate projects could benefit from direct assistance from the NoSerC engineers on some tasks. The problems best suited for this kind of assistance have to be identified in close co-operation with the individual climate projects. The work should be organized as small projects where the individual climate projects are in charge. This will ensure the best utilization of the results obtained. If several such small projects could be implemented, involving different climate projects, another benefit would be the transfer of experience gained in one climate project to other climate projects.
The most relevant problem areas will probably be:
We propose to continue the NoSerC project with a staff of employees from met.no. The core project team will consist of
NoSerC should continue to have a steering committee (SC) representing the main user groups of the centre. The SC should be appointed by the Research Council. The current SC discussed the composition of the NoSerC SC in meeting 2/2001, and the minutes of that meeting reflect its view in the following paragraph (translated from Norwegian):
"The steering committee made no further changes to the project's governance structure, but it was proposed that the Programme Board should be advised that a possible "NoSerC 2" should have committee representatives from the institutions, not from the projects."
The project manager is responsible for
In 2003 NoSerC will evaluate the long-term demand for its services,
including the requirements for data storage and high performance
computing equipment. The user groups will be involved in the evaluation,
and we will consider the possibility of closer co-operation between
NoSerC and the national high performance computing organisation. The
different options for financing the NoSerC activity will be considered.
The evaluation will be carried out in close co-operation with the
National facility for archiving of climate modelling data
The data stored at the NoSerC facility are typically not accessed very often, although some of the data are input datasets. The cost of storing infrequently accessed data currently varies from NOK 50 to NOK 150 per GB. The first number represents the cost of an inexpensive Linux file server with slow disk access and no redundancy, or alternatively the cost of tapes that can be used in a data migration system. The second number reflects the total cost of a hierarchical storage system (disk, tape robot, software and tapes).
Using an estimate of NOK 85 per GB and taking into account maintenance of both the existing and the new storage equipment, the following budget will allow for 10 TB of new storage (numbers in kNOK):
As discussed above, a tendering process will be used for the acquisition of the new equipment. We believe this will maximise the capacity, given a limited budget. However, it will not be possible to reach the estimated requirement of 20 TB.
It is too early to estimate the cost of storage for the period 2005-2006. We suggest that KlimaProg enter into a discussion with the coming high performance computing project on the configuration and use of new computers. This will clarify whether extra investments are required for storing climate modelling data, and if that is the case, how these investments can be taken into account in the tendering process for new computers. One should also be aware that cost of storage currently drops by a factor of two every 12 months, so any decision on investments beyond the next two years should be postponed.
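The storage cost reasoning above can be checked with a few lines of arithmetic. The Python sketch below uses the figures quoted in this section (NOK 85 per GB and a halving of price every 12 months); the function names, the decimal conversion of 1 TB = 1000 GB and the exclusion of maintenance costs are assumptions made for illustration.

```python
GB_PER_TB = 1000  # assumption: decimal units (1 TB = 1000 GB)


def storage_cost_knok(capacity_tb: float, nok_per_gb: float) -> float:
    """Purchase cost in kNOK for `capacity_tb` of storage, excluding maintenance."""
    return capacity_tb * GB_PER_TB * nok_per_gb / 1000.0


def projected_price(nok_per_gb: float, months: float,
                    halving_months: float = 12.0) -> float:
    """Price per GB after `months`, if the price halves every `halving_months`."""
    return nok_per_gb * 0.5 ** (months / halving_months)


if __name__ == "__main__":
    print(storage_cost_knok(10, 85))   # 10 TB at NOK 85 per GB
    print(storage_cost_knok(20, 85))   # the full 20 TB estimate at the same rate
    print(projected_price(85, 24))     # price per GB two years later
```

Under these assumptions, 10 TB costs 850 kNOK before maintenance and 20 TB twice that. If prices keep halving each year, the same capacity costs a quarter as much after two years, which is the argument for postponing longer-term investment decisions.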
Personnel and other operating costs
We propose an effort of 1.5 man-years per year for 2003 and 2004. Project management is included in this figure. The cost of the personnel, including indirect costs, is based on 1200 hours per year at an hourly rate of 1.6 per mille of annual salary. The resulting cost is NOK 648 000 per man-year.
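The personnel figure can be reproduced as follows. The Python sketch below takes the hours, the rate and the cost per man-year from the text; the annual salary it derives (about NOK 337 500) is not stated in the source and is only what the other three figures imply. The function and constant names are hypothetical.

```python
HOURS_PER_MAN_YEAR = 1200
HOURLY_RATE_FRACTION = 0.0016      # 1.6 per mille of annual salary per hour
COST_PER_MAN_YEAR_NOK = 648_000    # figure quoted in the text
MAN_YEARS_PER_YEAR = 1.5           # proposed effort for 2003 and 2004


def cost_per_man_year(annual_salary_nok: float) -> float:
    """Cost of one man-year, including indirect costs, for a given salary."""
    return HOURS_PER_MAN_YEAR * HOURLY_RATE_FRACTION * annual_salary_nok


def implied_annual_salary(cost_nok: float = COST_PER_MAN_YEAR_NOK) -> float:
    """Annual salary implied by a given cost per man-year (derived, not quoted)."""
    return cost_nok / (HOURS_PER_MAN_YEAR * HOURLY_RATE_FRACTION)


def yearly_personnel_cost(man_years: float = MAN_YEARS_PER_YEAR) -> float:
    """Total personnel cost per year for the proposed effort."""
    return man_years * COST_PER_MAN_YEAR_NOK


if __name__ == "__main__":
    print(round(implied_annual_salary()))
    print(round(cost_per_man_year(implied_annual_salary())))
    print(round(yearly_personnel_cost()))
```

With 1.5 man-years per year, the personnel cost comes to NOK 972 000 per year before any external contributions are taken into account.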
For 2003, a contribution of NOK 250 000 from KlimaEffekter is included in the personnel budget. An application for this contribution for 2004 will be submitted in 2003, subject to the requirements of the scientific projects supported by KlimaEffekter. Without this contribution in 2004, the effort will be reduced accordingly.
The contribution from met.no in terms of software developed for data handling is not included in the budget. However, we anticipate similar contributions by other institutions served by the centre. No costs for the steering committee are included in the budget. It is expected that they cover their own costs.
Overall budget
The proposed budget for NoSerC is given in the tables below. The first table gives the overall budget. The next two show the contributions from the Research Council and met.no respectively. All numbers are in kNOK as of 2002.
Contributions from the Research Council:
Contributions from met.no:
Storage system and database
The project will extend the existing storage system. Improved search and data presentation facilities will be provided, in close co-operation with the climate projects.
Progress reports and Final report
Regular progress reports will be delivered yearly to the Research Council according to the Council's requirements. At the end of the project a final report will be delivered.
Task reports
Scientific and technical reports will be written for each task where appropriate, and delivered to the Research Council.
Internet
The centre will further develop its homepage (http://noserc.met.no) on the Web. The target audience for this page is the scientists involved in climate modelling.