As the volume of resources involved and the power of computational Grids have increased, there is a corresponding and urgent need to employ Grid technologies in the problem solving environment (PSE) domain, in order to improve the usability and performance of PSEs. The PSE is a newly emerging and active area of science and technology in eScience. PSEs provide innovative computational facilities that make it easy to incorporate novel solution methods for a target class of problems in Grid environments, across distributed and heterogeneous resources, collaborative environments and so on. In our research, we have developed a framework for a Grid-enabled PSE for text categorization (named PSE-TC) using related Grid technologies. This Grid-enabled PSE supports building the text classifier service, classifying texts, defining the workflow, selecting the Grid nodes for each service, and reporting the execution status through the web portal. Our main aim is to use this PSE to visualize the text categorization process, steer it, and modify it while a task is executing. Through this PSE, users can easily launch their own service module and compare it with the existing modules; most importantly, they can merge their own service module into the PSE smoothly. Our research makes it possible for specialists in text categorization to use the latest Grid technology and for new researchers to understand the text categorization process. Meanwhile, the particular features of text categorization pose challenges to Grid technology, such as transferring large data volumes, selecting the optimal execution path, fault tolerance mechanisms, and quality-of-service assurance.
The PSE for text categorization consists of four main components: the Web Portal Service, the Construct Text Classifier Service, the Classify Text Proceeding Service and the Workflow Management Service (see Fig. 1).
In our Grid-enabled PSE, the text categorization process focuses on two components, "Construct Text Classifier" and "Classify Text". The whole process, from inputting new texts to obtaining the classification results, is divided into several independent modules, which are assigned to specific services offered by Grid nodes.
To solve a text categorization problem, we process the texts in two stages: constructing the text classifier, and then classifying the texts. First, the users log on to the Web Portal, define a workflow and input the training texts into "Construct Text Classifier". After the text classifier is constructed, the users input new texts into "Classify Text". Finally, "Classify Text" classifies the texts using the text classifier built in the first stage and produces the final classification results.
The Workflow Management Service records information about the workflows, such as the service names, the related Grid nodes and the execution efficiency. When users define a workflow, they consult the Workflow Management Service as a reference.
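The two stages described above can be illustrated with a minimal Python sketch. It is not the PSE-TC implementation: the function names `construct_text_classifier` and `classify_text` are hypothetical, both stages run locally rather than as Grid services, and the word-count scoring is only a stand-in for whatever classification method a real service module would provide.

```python
from collections import Counter, defaultdict


def construct_text_classifier(training_texts):
    """Stage 1 (hypothetical): build a classifier from (text, label) pairs.

    The "classifier" is just per-label word counts, a placeholder for
    the model a real Construct Text Classifier service would build.
    """
    word_counts = defaultdict(Counter)
    for text, label in training_texts:
        word_counts[label].update(text.lower().split())
    return dict(word_counts)


def classify_text(classifier, new_texts):
    """Stage 2 (hypothetical): label each new text with the stage-1 classifier."""
    results = []
    for text in new_texts:
        words = text.lower().split()
        # Score each label by how often its training texts used these words.
        scores = {
            label: sum(counts[w] for w in words)
            for label, counts in classifier.items()
        }
        results.append(max(scores, key=scores.get))
    return results


# Illustrative data only; not taken from the project.
training = [
    ("grid nodes schedule compute jobs", "computing"),
    ("the team won the football match", "sports"),
]
classifier = construct_text_classifier(training)
labels = classify_text(classifier, ["compute jobs on grid", "football team"])
print(labels)
```

The point of the split is that the classifier returned by the first stage is the only thing the second stage needs, which is what would allow the two stages to run as separate services on different Grid nodes.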
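As a rough illustration of the kind of information such a service might hold, the following Python sketch keeps one record per service and Grid node and lets a user look up the node that has run a service fastest so far. All names (`ServiceRecord`, `WorkflowRegistry`, `record_run`, `fastest_node`) are hypothetical, and measuring "execution efficiency" as mean run time in seconds is an assumption made for this example, not a detail from the project.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """One service on one Grid node, with its past run times (assumed metric)."""
    service_name: str
    grid_node: str
    run_times: list = field(default_factory=list)  # seconds per past run

    def mean_run_time(self):
        return sum(self.run_times) / len(self.run_times)


class WorkflowRegistry:
    """Hypothetical in-memory stand-in for the Workflow Management Service."""

    def __init__(self):
        self._records = {}

    def record_run(self, service_name, grid_node, seconds):
        # Create the record on first use, then append the new timing.
        key = (service_name, grid_node)
        if key not in self._records:
            self._records[key] = ServiceRecord(service_name, grid_node)
        self._records[key].run_times.append(seconds)

    def fastest_node(self, service_name):
        """Return the node with the lowest mean run time, or None if unknown."""
        candidates = [
            r for r in self._records.values() if r.service_name == service_name
        ]
        if not candidates:
            return None
        return min(candidates, key=ServiceRecord.mean_run_time).grid_node


# Illustrative node names and timings only.
registry = WorkflowRegistry()
registry.record_run("Construct Text Classifier", "node-a", 12.0)
registry.record_run("Construct Text Classifier", "node-b", 8.0)
registry.record_run("Construct Text Classifier", "node-a", 10.0)
best = registry.fastest_node("Construct Text Classifier")
print(best)
```

A user defining a new workflow could consult such a lookup when choosing which Grid node to assign to each service.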
| Wu Zhang | Professor | Shanghai University, Department of Computer Science and Technology |
| Suge Wang | Associate Professor, Ph.D. candidate | Shanghai University, Department of Computer Science and Technology; Shanxi University, Department of Mathematics |
| Jian Mei | Ph.D. candidate | Shanghai University, Department of Computer Science and Technology |
| Xiaobin Zhang | Ph.D. candidate | Shanghai University, Department of Computer Science and Technology |
| Jiang Xie | Ph.D. candidate | Shanghai University, Department of Computer Science and Technology |
| Shigeo Kawata | Professor | Computational Science & Engineering, Utsunomiya University, Graduate School of Engineering |
| Mo Mu | Professor | Department of Mathematics, Hong Kong University of Science and Technology |