Global Ring Network for Advanced Applications Development GLORIAD

Investigation of Distributed Environmental Archives System (IDEAS) (Project in progress) (Atmospheric Science)

Investigation of Distributed Environmental Archives System (IDEAS)

The Investigation of Distributed Environmental Archives System (IDEAS) project, initially named Environmental Scenario Generator (ESG), began in support of the U.S. Department of Defense (DoD) Modeling and Simulation (M&S) office. The DoD M&S Master Plan states that the next generation of M&S programs will require an integrated, authoritative representation of the natural environment. Here the natural environment includes elements from multiple domains such as space, oceans, terrestrial weather and terrain. The capability exists today to model a highly realistic environment on a wide range of scales. Systems such as the Master Environmental Library (MEL), NVDS and others can search for environmental data sets distributed across the network, but they cannot search for specific “scenarios” (sets of conditions within the archived data).

Imagine, for example, that the end user doesn’t need arbitrary terrestrial weather data covering Florida but rather needs an example of a typical Florida spring storm. IDEAS was developed to address this problem. Because the functionality of IDEAS mimics that typically performed by a human expert, it was natural to turn to the field of artificial intelligence in our search for a solution. Another prime requirement of the IDEAS design was to allow the user to query the archives in human linguistic terms.

Fuzzy logic is a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth: truth values between "completely true" and "completely false". It was introduced by Dr. Lotfi Zadeh of UC Berkeley in the 1960s as a means to model the uncertainty of natural language. Some of the major advantages of a fuzzy-based approach are: it allows a more realistic (natural) definition of sets; it handles boundaries and intersections between sets more gracefully; and it provides more human-like searching than a classical approach. The IDEAS architecture relies heavily on a Java-based fuzzy logic engine to perform searching and analysis for the user.
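
To make the idea of partial truth concrete, here is a minimal sketch in Python (the IDEAS engine itself is Java-based and its internals are not described here). It assumes simple linear ramp membership functions and the standard Zadeh operators (min for AND, max for OR). The linguistic terms `strong_wind` and `heavy_rain`, their thresholds, and the observation values are hypothetical illustrations, not IDEAS definitions.

```python
def ramp_up(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Standard Zadeh operators on truth values in [0, 1].
def fuzzy_and(*degrees):
    return min(degrees)


def fuzzy_or(*degrees):
    return max(degrees)


def fuzzy_not(degree):
    return 1.0 - degree


# Hypothetical linguistic terms; the thresholds are illustrative only.
def strong_wind(speed_ms):
    return ramp_up(speed_ms, 8.0, 20.0)


def heavy_rain(rate_mm_h):
    return ramp_up(rate_mm_h, 2.0, 10.0)


def storm_score(speed_ms, rate_mm_h):
    """Degree to which an observation matches 'strong wind AND heavy rain'."""
    return fuzzy_and(strong_wind(speed_ms), heavy_rain(rate_mm_h))


# Made-up observations: (label, wind speed in m/s, rain rate in mm/h).
observations = [
    ("calm, dry", 3.0, 0.0),
    ("breezy showers", 11.0, 6.0),
    ("severe storm", 25.0, 15.0),
]
for label, wind, rain in observations:
    print(f"{label}: {storm_score(wind, rain):.2f}")
```

Unlike a Boolean filter, the "breezy showers" case is neither accepted nor rejected outright; it receives an intermediate score, so candidates can be ranked by how well they match the query.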

The mission of IDEAS is fundamentally to help a user distill the vast amount of available data down to a manageable amount of information. The researcher can use IDEAS to find out if and when a particular type of event occurs in a region, how often it might occur and what the trend has been for a given time period. This could be used, for example, in modeling communication, trafficability or emergency services in response to environmental events. Beyond this, however, IDEAS has applications in data quality control, data classification and even forecasting. The increasing data volumes expected in the future demand different techniques to handle them, and the IDEAS framework presented below is one proven method for a data manager to do so.
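
As a rough illustration of the "how often" and "what trend" questions, the following Python sketch counts candidate events per year above a score threshold and fits a least-squares slope to the yearly counts. This is only one plausible way to summarize search results; the function names, the 0.7 threshold and the sample values are assumptions made for the example, not part of IDEAS.

```python
from collections import Counter


def events_per_year(scored_events, threshold=0.7):
    """Count events whose fuzzy score meets the threshold, grouped by year."""
    counts = Counter()
    for year, score in scored_events:
        if score >= threshold:
            counts[year] += 1
    return counts


def linear_trend(counts, first_year, last_year):
    """Least-squares slope of the yearly counts (change in events per year)."""
    years = list(range(first_year, last_year + 1))
    if len(years) < 2:
        raise ValueError("need at least two years to estimate a trend")
    values = [counts.get(year, 0) for year in years]
    mean_x = sum(years) / len(years)
    mean_y = sum(values) / len(values)
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(years, values))
    return sxy / sxx


# Invented (year, fuzzy score) pairs, used purely for illustration.
sample = [
    (2000, 0.90), (2000, 0.40),
    (2001, 0.80), (2001, 0.75),
    (2002, 0.95), (2002, 0.80), (2002, 0.71),
]
counts = events_per_year(sample)
print(dict(counts))
print(linear_trend(counts, 2000, 2002))
```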

IDEAS is designed in the distributed N-tier pattern, which may be roughly divided into User Interface, Services, and Data Sources layers. The IDEAS User Interface is implemented as a web site that interacts with the Services and Data Sources and generates dynamic content using JavaServer Pages (JSP) technology. Part of the interaction with the user, such as verification of web forms and animated plots of environmental data, is done in the client browser using Java applets and JavaScript. IDEAS Services may be seen as a set of Java servlets and JAX-RPC web services that are activated at different steps of the system workflow and perform data discovery, collection, mining and modeling, visualization and mapping, encoding and delivery to the end user. IDEAS data sources are a set of environmental databases and repositories of transient environmental data files, interfaced to the system through web services, all of which conform to the simple IDEAS Data API standard.
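
The benefit of having every data source conform to one interface can be sketched as follows. The actual IDEAS Data API is not specified in this text, so the `DataSource` class, its `parameters` and `fetch` methods, and the in-memory stand-in are all hypothetical; the sketch only shows that a service written against a common interface can query any number of sources in the same way.

```python
from abc import ABC, abstractmethod
from datetime import datetime


class DataSource(ABC):
    """Hypothetical uniform interface; not the real IDEAS Data API."""

    @abstractmethod
    def parameters(self):
        """Return the names of the parameters this source can serve."""

    @abstractmethod
    def fetch(self, parameter, start, end):
        """Return (time, value) pairs with start <= time < end."""


class InMemorySource(DataSource):
    """Toy source backed by a dict, standing in for a remote web service."""

    def __init__(self, series):
        self._series = series  # {parameter: [(datetime, value), ...]}

    def parameters(self):
        return sorted(self._series)

    def fetch(self, parameter, start, end):
        return [(t, v) for t, v in self._series[parameter] if start <= t < end]


def collect(sources, parameter, start, end):
    """Query every source offering `parameter` through the same two calls."""
    rows = []
    for source in sources:
        if parameter in source.parameters():
            rows.extend(source.fetch(parameter, start, end))
    return sorted(rows)


# Invented sample values for illustration.
weather = InMemorySource({"wind_speed": [(datetime(2000, 1, 1), 5.0),
                                         (datetime(2000, 1, 2), 12.0)]})
space = InMemorySource({"kp_index": [(datetime(2000, 1, 1), 3.0)]})
print(collect([weather, space], "wind_speed",
              datetime(2000, 1, 1), datetime(2000, 1, 2)))
```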

The IDEAS system is deployed as a hierarchy of several layers of parallel computer clusters, with a cluster of application servers running IDEAS Services at the top, connected to a set of database clusters serving as separate IDEAS data sources. For example, the NCEP/NCAR Reanalysis Project data source, with global-coverage meteorological data for 1950-2003, is a cluster of 5 parallel database servers (we call them local database servers, LDS), each loaded with 10 years of data. The NGDC Space Physics Interactive Data Resource (SPIDR), with space weather data for 1933-2003, is a cluster of 6 parallel database servers, each storing a subject-specific database, e.g. geomagnetic variations or GOES satellite measurements. The IDEAS Data API services for each of the two data sources reside in a separate web service container (we call it a global database server, GDS) in the cluster of IDEAS application servers. To data mine an environmental scenario that includes a combination of meteorological and space weather factors, we may run a parallel search on 2 application servers, with Fuzzy Engines requesting data separately from the NCEP/NCAR and SPIDR GDSs, and at the end merge the fuzzy scores of the candidate events on the IDEAS application server dedicated to the User Interface.
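
The parallel search and final merge described above might look roughly like this in Python. The two search functions are stubs standing in for Fuzzy Engines querying the NCEP/NCAR and SPIDR GDSs; the candidate identifiers and scores are invented placeholders, and combining scores with `min` (fuzzy AND) is an assumption, since the text does not say which merge rule IDEAS applies.

```python
from concurrent.futures import ThreadPoolExecutor


def search_weather():
    """Stub for a fuzzy search against the meteorological source."""
    return {"candidate-1": 0.60, "candidate-2": 0.90, "candidate-3": 0.30}


def search_space_weather():
    """Stub for a fuzzy search against the space weather source."""
    return {"candidate-1": 0.95, "candidate-2": 0.70, "candidate-4": 0.80}


def parallel_search(searches):
    """Run each search concurrently and return results in submission order."""
    with ThreadPoolExecutor(max_workers=len(searches)) as pool:
        futures = [pool.submit(search) for search in searches]
        return [future.result() for future in futures]


def merge_scores(*score_maps):
    """Keep candidates found by every search; combine scores with fuzzy AND."""
    common = set(score_maps[0]).intersection(*score_maps[1:])
    merged = {key: min(scores[key] for scores in score_maps) for key in common}
    # Best match first; ties broken by candidate identifier.
    return sorted(merged.items(), key=lambda item: (-item[1], item[0]))


results = parallel_search([search_weather, search_space_weather])
for candidate, score in merge_scores(*results):
    print(candidate, score)
```

Only candidates returned by both searches survive the merge, which matches a scenario defined as a combination of meteorological and space weather conditions.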

The project has two key development centers: one at the National Geophysical Data Center (NGDC) in Boulder, CO, USA, technical lead Eric KIHN, Eric.A.Kihn@noaa.gov; the other at the Center of Geophysical Data Studies, Russian Acad. Sci. (CGDS) in Moscow, Russia, technical lead Mikhail ZHIZHIN, jjn@wdcb.ru. These organizations also host IDEAS network nodes at http://ideas.ngdc.noaa.gov and http://clust1.wdcb.ru/ideas.

In the framework of the GLORIAD project, IDEAS web services will be implemented as Grid Globus OGSA (later WSRF) data sources and mining services, and the huge archives of environmental data will be mirrored and synchronized between the participating countries using the GridFTP service. The network bandwidth provided by GLORIAD will be used for the exchange of global weather nowcasts and forecasts, and the high-performance computational clusters in GLORIAD may be used for high-resolution modeling of environmental events using data mined by IDEAS as their initial boundary conditions.

IDEAS Developers Team:
Eric Kihn, NGDC NOAA, USA
Rob Redmon, NGDC NOAA, USA
Richard Sequig, NRL, USA
Dr. Mikhail Zhizhin, Institute of Physics of the Earth and Geophysical Center, Russian Acad. Sci.

©GLORIAD-US Team