ICEIS 2009 Abstracts


Area 1 - Databases and Information Systems Integration

Full Papers
Paper Nr: 40
Title:

MIDAS: A middleware for information systems with QoS concerns

Authors:

Luis Fernando Orleans and Geraldo Zimbrão

Abstract: One of the most difficult tasks in the design of information systems is controlling the behaviour of the back-end storage engine, usually a relational database. As the load on the database increases, issued transactions take longer to execute, mainly because of the high number of locks required to provide isolation and concurrency. In this paper we present MIDAS, a middleware designed to manage the behaviour of database servers, focusing primarily on guaranteeing transaction execution within a specified amount of time (deadline). MIDAS was developed for Java applications that connect to storage engines through JDBC. It provides a transparent QoS layer and can be adopted with very few code modifications. All transactions issued by the application are captured and forced to pass through an Admission Control (AC) mechanism. To accomplish such QoS constraints, we propose a novel AC strategy, called 2-Phase Admission Control (2PAC), that minimizes the number of transactions exceeding the established maximum time by accepting only those transactions that are not expected to miss their deadlines. We also implemented an enhancement over 2PAC, called diffserv, which gives priority to small transactions and can be adopted when such transactions are infrequent.
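
To make the admission-control idea concrete, the following Java sketch shows how a deadline check in the spirit of 2PAC might wrap a JDBC unit of work. The class, method names and simple load model are hypothetical illustrations, not the actual MIDAS implementation.

    import java.sql.Connection;
    import java.sql.SQLException;

    public class AdmissionController {
        private final long deadlineMillis;   // QoS deadline per transaction
        private double loadFactor = 1.0;     // grows with admitted transactions (toy model)

        public AdmissionController(long deadlineMillis) {
            this.deadlineMillis = deadlineMillis;
        }

        // Admit only if the predicted runtime under the current load fits the deadline.
        public synchronized boolean tryAdmit(long baseCostMillis) {
            long predicted = (long) (baseCostMillis * loadFactor);
            if (predicted > deadlineMillis) {
                return false;     // reject: transaction is expected to miss its deadline
            }
            loadFactor += 0.05;   // crude assumption: each admitted transaction adds contention
            return true;
        }

        public synchronized void release() {
            loadFactor = Math.max(1.0, loadFactor - 0.05);
        }

        // Runs a JDBC unit of work behind the admission check.
        public void execute(Connection con, long baseCostMillis, SqlWork work) throws SQLException {
            if (!tryAdmit(baseCostMillis)) {
                throw new SQLException("Rejected by admission control: predicted deadline miss");
            }
            try {
                con.setAutoCommit(false);
                work.run(con);
                con.commit();
            } catch (SQLException e) {
                con.rollback();
                throw e;
            } finally {
                release();
            }
        }

        public interface SqlWork { void run(Connection con) throws SQLException; }
    }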

Paper Nr: 108
Title:

INSTANCE-BASED OWL SCHEMA MATCHING

Authors:

Luiz André P. Paes Leme, Marco A. Casanova, Karin Koogan Breitman and Antonio Furtado

Abstract: Schema matching is a fundamental issue in many database applications, such as query mediation and data warehousing. It becomes a difficult challenge when different vocabularies are used to refer to the same real-world concepts. In this context, a convenient approach, sometimes called extensional, instance-based or semantic, is to detect how the same real world objects are represented in different databases and to use the information thus obtained to match the schemas. This paper describes an instance-based schema matching technique for OWL schemas. The technique is based on similarity functions and is backed up by experimental results with real data downloaded from data sources found on the Web.
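
As one plausible instantiation of the instance-based idea, the Java sketch below scores two schema properties by the Jaccard overlap of the instance values observed for them; the similarity functions actually used in the paper are not reproduced here, and all names are illustrative.

    import java.util.HashSet;
    import java.util.Set;

    public final class InstanceSimilarity {
        // similarity(A, B) = |A intersect B| / |A union B| over observed instance values
        public static double jaccard(Set<String> valuesOfA, Set<String> valuesOfB) {
            Set<String> intersection = new HashSet<>(valuesOfA);
            intersection.retainAll(valuesOfB);
            Set<String> union = new HashSet<>(valuesOfA);
            union.addAll(valuesOfB);
            return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
        }

        // Two properties from different OWL schemas are proposed as a match when the
        // overlap of their observed instance values exceeds a threshold.
        public static boolean matches(Set<String> a, Set<String> b, double threshold) {
            return jaccard(a, b) >= threshold;
        }
    }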

Paper Nr: 123
Title:

The Integrative Role of IT in Product and Process Innovation: Growth and Productivity outcomes for manufacturing SMEs

Authors:

Louis Raymond, Anne-Marie Croteau and Francois Bergeron

Abstract: The assimilation of IT for business process integration plays an integrative role by providing an organization with the ability to exploit innovation opportunities with the purpose of increasing its growth and productivity. Based on survey data obtained from 309 Canadian manufacturing SMEs, this study aims at a deeper understanding of the assimilation of IT for business process integration with regard to product and process innovation. The first objective is to identify the effect of the assimilation of IT for business process integration on growth and productivity. The second objective is to verify whether the assimilation of IT for business process integration varies among low, medium and high-tech SMEs. Results indicate that the assimilation of IT for business process integration depends upon the type of innovation. It also varies with the technological intensity of the firms. The assimilation of IT for business process integration has two effects: it increases the growth of manufacturing SMEs by enabling product innovation, but it decreases their productivity by impeding process innovation.

Paper Nr: 141
Title:

Vectorizing Instance-Based Integration Processes

Authors:

Matthias Boehm, Wolfgang Lehner, Dirk Habich, Uwe Wloka and Steffen Preissler

Abstract: The inefficiency of integration processes, as an abstraction of workflow-based integration tasks, is often caused by low resource utilization and significant waiting times for external systems. Due to the increasing use of integration processes within IT infrastructures, throughput optimization has a high influence on the overall performance of such an infrastructure. In the area of computational engineering, low resource utilization is addressed with vectorization techniques. In this paper, we introduce the concept of vectorization in the context of integration processes in order to achieve a higher degree of parallelism, while transactional behavior and serialized execution are ensured. Our evaluation shows that the message throughput can be significantly increased.

Paper Nr: 142
Title:

Invisible Deployment of Integration Processes

Authors:

Matthias Boehm, Dirk Habich, Wolfgang Lehner and Uwe Wloka

Abstract: Due to the changing scope of data management towards the management of heterogeneous and distributed systems and applications, integration processes gain in importance. This is particularly true for those processes used as abstractions of workflow-based integration tasks; these are widely applied in practice. In such scenarios, a typical IT infrastructure comprises multiple integration systems with overlapping functionalities. The major problems in this area are high development effort, low portability and inefficiency. Therefore, in this paper, we introduce the vision of invisible deployment that addresses the virtualization of multiple, heterogeneous, physical integration systems into a single logical integration system. This vision comprises several challenging issues in the fields of deployment aspects as well as runtime aspects. Here, we describe those challenges, discuss possible solutions and present a detailed system architecture for that approach. As a result, the development effort can be reduced and the portability as well as the performance can be improved significantly.

Paper Nr: 148
Title:

Customizing Enterprise Software as a Service Applications: Back-end Extension in a multi-tenancy Environment

Authors:

Jürgen Müller, Jens Krueger, Sebastian Enderlein, Marco Helmich and Alexander Zeier

Abstract: Since the emergence of Salesforce.com, more and more business applications have moved towards Software as a Service. In order to target Small and Medium-sized Enterprises, platform providers need to lower their operational costs and establish an ecosystem of partners who customize their generic solution to push their products into spot markets. This paper categorizes customization options, identifies the cornerstones of a customizable, multi-tenancy-aware infrastructure, proposes a framework that encapsulates multi-tenancy, and introduces a technique for partner back-end customizations with regard to a given real-world scenario.

Paper Nr: 193
Title:

Pattern-based Refactoring of Legacy Software Systems

Authors:

Sascha Hunold, Björn Krellner, Thomas Rauber, Thomas Reichel and Gudula Rünger

Abstract: Rearchitecting large software systems becomes more and more complex after years of development and a growing code base. Nonetheless, a constant adaptation of software in production is needed to cope with new requirements. Thus, refactoring legacy code requires tool support to help developers perform this demanding task. Since the code base of legacy software systems is far beyond the size that developers can handle manually, we present an approach to perform refactoring tasks automatically. In the pattern-based transformation, the abstract syntax tree of a legacy software system is scanned for a particular software pattern. If the pattern is found, it is automatically substituted by a target pattern. In particular, we focus on software refactorings that move methods or groups of methods and dependent member variables. The main objective of this refactoring is to reduce the number of dependencies within a software architecture, which leads to a less coupled architecture. We demonstrate the effectiveness of our approach in a case study.

Paper Nr: 196
Title:

Natural and Multi-layered Approach to Detect Changes in Tree-based Textual Documents

Authors:

Angelo Di Iorio, Michele Schirinzi, Carlo Marchetti and Fabio Vitali

Abstract: Several efficient and very powerful algorithms exist for detecting changes in tree-based textual documents, such as those encoded in XML. An important aspect is still underestimated in their design and implementation: the quality of the output, in terms of readability, clearness and accuracy for human users. Such a requirement is particularly relevant when diff-ing literary documents, such as books, reports, reviews, acts, and so on. This paper introduces the concept of 'naturalness' in diff-ing tree-based textual documents, and discusses a new extensible set of changes which can and should be detected. A naturalness-based algorithm is presented, as well as its application to diff-ing XML-encoded legislative documents. The algorithm, called JNDiff, proved to detect significantly better matches (since new operations are recognized) and to be very efficient.

Paper Nr: 198
Title:

CrimsonHex: A Service Oriented Repository of Specialised Learning Objects

Authors:

José P. Leal and Ricardo Queirós

Abstract: The cornerstone of the interoperability of eLearning systems is the standard definition of learning objects. Nevertheless, for some domains this standard is insufficient to fully describe all the assets, especially when they are used as input for other eLearning services. On the other hand, a standard definition of learning objects is not enough to ensure interoperability among eLearning systems; they must also use a standard API to exchange learning objects. This paper presents the design and implementation of a service oriented repository of learning objects called crimsonHex. This repository is fully compliant with the existing interoperability standards and supports new definitions of learning objects for specialized domains. We illustrate this feature with the definition of programming problems as learning objects and their validation by the repository. This repository is also prepared to store usage data on learning objects to tailor the presentation order and adapt it to learner profiles.

Paper Nr: 213
Title:

A SCALABLE PARAMETRIC-RBAC ARCHITECTURE FOR THE PROPAGATION OF A MULTI-MODALITY, MULTI-RESOURCE INFORMATICS SYSTEM

Authors:

Remo Mueller, Guo-Qiang Zhang and Van Anh Tran

Abstract: We present a scalable architecture called X-MIMI for the propagation of MIMI (Multi-modality, Multi-resource, Informatics Infrastructure System) to the biomedical research community. MIMI is a web-based system for managing the latest instruments and resources used by clinical and translational investigators. To deploy MIMI broadly, X-MIMI utilizes a parametric Role-Based Access Control model to decentralize the management of user-role assignment, facilitating the deployment and system administration in a flexible manner that minimizes operational overhead. We use Formal Concept Analysis to specify the semantics of roles according to their permissions, resulting in a lattice hierarchy that dictates the cascades of RBAC authority. Additional components of the architecture are based on the Model-View-Controller pattern, implemented in Ruby-on-Rails. The X-MIMI architecture provides a uniform setup interface for centers and facilities, as well as a set of seamlessly integrated scientific and administrative functionalities in a Web 2.0 environment.

Paper Nr: 228
Title:

MINABLE DATA WAREHOUSE

Authors:

Jai Kang, James Kang and David Morgan

Abstract: Data warehouses have been widely used in various settings such as large corporations and public institutions. These systems contain large and rich datasets that are often used by several data mining techniques to discover interesting patterns. However, before data mining techniques can be applied to data warehouses, arduous and convoluted preprocessing must be completed. Thus, we propose a minable data warehouse that integrates the preprocessing stage of a data mining technique within the cleansing and transformation process of a data warehouse. This framework allows data mining techniques to be applied without any additional preprocessing steps. We present our proposed framework using a synthetically generated dataset and a classical data mining technique called Apriori to discover association rules within instant messaging datasets.
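
For a flavor of the mining step such a warehouse would feed, here is a minimal Java sketch of Apriori's support-counting test over pre-cleansed baskets; the data layout and names are invented for illustration.

    import java.util.*;

    public class AprioriSupport {
        // Counts how many baskets contain every item of the candidate itemset.
        static int support(List<Set<String>> baskets, Set<String> candidate) {
            int count = 0;
            for (Set<String> basket : baskets) {
                if (basket.containsAll(candidate)) count++;
            }
            return count;
        }

        public static void main(String[] args) {
            List<Set<String>> baskets = List.of(
                    Set.of("milk", "bread"), Set.of("milk", "bread", "eggs"), Set.of("bread"));
            Set<String> candidate = Set.of("milk", "bread");
            int minSupport = 2;
            // Apriori keeps a candidate only if its support reaches the threshold;
            // larger candidates are then built solely from surviving smaller ones.
            System.out.println(support(baskets, candidate) >= minSupport); // true
        }
    }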

Paper Nr: 236
Title:

A Step Forward in Semi-Automatic Metamodel Matching: Algorithms and Tool

Authors:

José de Sousa, Denivaldo Lopes, Zair Abdelouahab, Daniela B. Claro and José de Sousa Jr

Abstract: In recent years the complexity of producing software systems has increased due to the continuous evolution of the requirements, the creation of new technologies and integration with legacy systems. When complexity increases, the phases of software development, maintenance and evolution become more difficult to deal with, i.e. they become more error-prone. Recently, Model Driven Architecture (MDA) has made the management of this complexity possible thanks to models and the transformation of Platform-Independent Models (PIMs) into Platform-Specific Models (PSMs). However, the manual creation of transformation definitions is a programming activity that is error-prone precisely because it is a manual task. In the MDA context, the solution is to provide semi-automatic creation of a mapping specification that can be used to generate transformation definitions in a specific transformation language. In this paper, we present an algorithm to match metamodels, along with enhancements to the MT4MDE and SAMT4MDE tool that implement this matching algorithm.

Paper Nr: 238
Title:

A Study of Indexing Strategies for Hybrid Data Spaces

Authors:

Sakti Pramanik, Qiang Zhu and Gang Qian

Abstract: Different indexing techniques have been proposed to index either the continuous data space (CDS) or the non-ordered discrete data space (NDDS). However, modern database applications sometimes require indexing the hybrid data space (HDS), which involves both continuous and non-ordered discrete subspaces. In this paper, the structure and heuristics of the ND-tree, which is a recently-proposed indexing technique for NDDSs, are first extended to the HDS. A novel power value adjustment strategy is then used to make the continuous and discrete dimensions comparable and controllable in the HDS. An estimation model is developed to predict the box query performance of the hybrid indexing. Our experimental results show that the original ND-tree's heuristics are effective in supporting efficient box queries in the hybrid data space, and could be further improved with our proposed strategies to address the unique characteristics of the HDS.
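
To illustrate how continuous and non-ordered discrete dimensions can be combined in one measure, the Java sketch below uses a tunable exponent p in the spirit of a power-value adjustment; the paper's actual strategy is more elaborate, so treat this as a toy model with invented names.

    public final class HybridDistance {
        // Continuous values are assumed normalized to [0,1]; discrete dimensions
        // contribute 0 on a match and 1 on a mismatch (no ordering is assumed).
        public static double distance(double[] cont1, double[] cont2,
                                      String[] disc1, String[] disc2, double p) {
            double sum = 0.0;
            for (int i = 0; i < cont1.length; i++) {
                sum += Math.pow(Math.abs(cont1[i] - cont2[i]), p);  // continuous part
            }
            for (int i = 0; i < disc1.length; i++) {
                sum += disc1[i].equals(disc2[i]) ? 0.0 : 1.0;       // discrete part
            }
            return Math.pow(sum, 1.0 / p);  // p tunes how the two subspaces trade off
        }
    }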

Paper Nr: 299
Title:

Relaxing XML Preference Queries for Cooperative Retrieval

Authors:

SungRan Cho and Wolf-Tilo Balke

Abstract: Today XML is an essential technology for knowledge management within enterprises and dissemination of data over the Web. Therefore the efficient evaluation of XML queries has been thoroughly researched. But given the ever-growing amount of information available in different sources, querying also becomes more complex. In contrast to simple exact-match retrieval, approximate matches become far more appropriate over collections of complex XML documents. Only recently has approximate XML query processing been proposed, where structure and value are subject to necessary relaxations. All the possible query relaxations determined by the user's preferences are generated in such a way that predicates are progressively relaxed until a suitable set of best possible results is retrieved. In this paper we present a novel framework for developing preference relaxations of the query, permitting additional flexibility in order to fulfill a user's wishes. Additionally, we design IPX, an interface for XML preference query processing that enables users to express and formulate complex preferences, and provides a first solution for the aspects of XML preference query processing that allow preference querying and returning ranked answers.

Paper Nr: 461
Title:

DEXIN - An Extensible Framework for Distributed XQuery over Heterogeneous Data Sources

Authors:

Muhammad I. Ali, Schahram Dustdar, Reinhard Pichler and Hong-Linh Truong

Abstract: In the Web environment, rich, diverse sources of heterogeneous and distributed data are ubiquitous. In fact, even the information characterizing a single entity - like, for example, the information related to a Web service - is normally scattered over various data sources using various languages such as XML, RDF, and OWL. Hence, there is a strong need for Web applications to handle queries over heterogeneous, autonomous, and distributed data sources. However, existing techniques do not provide sufficient support for this task. In this paper we present DeXIN, an extensible framework for providing integrated access over heterogeneous, autonomous, and distributed web data sources, which can be utilized for data integration in modern Web applications and Service Oriented Architecture. DeXIN extends the XQuery language by supporting the specification and execution of SPARQL queries inside XQuery, thus facilitating the querying of data modeled in XML, RDF, and OWL. DeXIN facilitates data integration in distributed Web and Service Oriented environments by avoiding the transfer of large amounts of data to a central server for centralized data integration and by avoiding the transformation of huge amounts of data into a common format for integrated access. We also present typical application scenarios and report on experiments with DeXIN. These experiments demonstrate the ease of use and the good performance of our framework.

Paper Nr: 496
Title:

Dimensional Templates in Data Warehouses - Automating the Multidimensional Design of Data Warehouse Prototypes

Authors:

Rui Oliveira, Fátima Rodrigues, Paulo Martins and João P. Moura

Abstract: Prototypes are valuable tools in Data Warehouse (DW) projects. DW prototypes can help end-users to get an accurate preview of a future DW system, along with its advantages and constraints. However, DW prototypes have considerably smaller development time windows when compared to complete DW projects. This puts additional pressure on achieving the expected prototypes' high quality standards, especially in the highly time-consuming multidimensional design, where only a thin margin for harmful, unreflected decisions exists. Some devised methods for automating DW multidimensional design can be used to accelerate this stage, yet they are more suitable for full DW projects than for prototypes, due to the effort, cost and expertise they require. This paper proposes the semi-automation of DW multidimensional designs using templates. We believe this approach better fits the development speed and cost constraints of DW prototyping, since templates are pre-built, highly adaptable and highly reusable solutions.

Paper Nr: 560
Title:

MULTIVIEWS COMPONENTS FOR USER-AWARE WEB SERVICES

Authors:

Bouchra El Asri, Adil Kenzi, Mahmoud Nassar, Abdelaziz Kriouile and Abdelaziz Barrahmoune

Abstract: Component-based software (CBS) intends to meet the need for reusability and productivity; Web service technologies allow interoperability. This work addresses the development of CBS using Web service technologies. Undeniably, a web service may interact with several types of web service clients. The central problem is, therefore, how to handle the multidimensional aspect of web service clients' needs and requirements. To tackle this problem, we propose the concept of the multiview component as a first-class modeling entity that allows the capture of the various needs of web service clients by separating their concerns. In this paper, we propose a model driven approach for the development of user-aware web services on the basis of the multiview component concept. We describe how a multiview-component-based PIM is transformed into two PSMs for the purpose of automatically generating both the user-aware web service description and its implementation. We specify transformations as a collection of transformation rules implemented using ATL as a model transformation language.

Paper Nr: 575
Title:

KNOWLEDGE BASED QUERY PROCESSING IN LARGE SCALE VIRTUAL ORGANIZATIONS

Authors:

Alexandra Pomares, Claudia Roncancio, José Abasolo and María Villamil

Abstract: This work concerns query processing to support data sharing in large scale Virtual Organizations (VOs). Characterization of VOs' data sharing contexts reflects the coexistence of factors like source overlapping, uncertain data location, and fuzzy copies in dynamic large scale environments that hinder query processing. Existing results on distributed query evaluation are useful for VOs, but there is no appropriate solution combining the high semantic level and dynamic large scale environments required by VOs. This paper proposes a characterization of VO data sources, called Data Profile, and a query processing strategy (called QPro2e) for large scale VOs with complex data profiles. QPro2e uses an evolving distributed knowledge base describing data sources' roles w.r.t. shared domain concepts. It allows the identification of logical data source clusters which improve query evaluation in the presence of a very large number of data sources.

Paper Nr: 597
Title:

Applying Recommendation Technology in OLAP Systems

Authors:

Houssem Jerbi, Olivier Teste, Gilles Zurfluh and Franck Ravat

Abstract: OLAP systems, offering a large multidimensional information space, cannot rely solely on standard navigation; they need to apply recommendations to make the analysis process easy and to help users quickly find relevant data for decision-making. In this paper, we propose a recommendation methodology that aims at assisting the user during his decision-support analysis. The system helps the user in querying multidimensional data and exposes him to the most interesting patterns, i.e. it provides the user with anticipatory as well as alternative decision-support data. We provide a preference-based approach to apply such a methodology.

Paper Nr: 606
Title:

CLASSIFICATION AND PREDICTION OF SOFTWARE COST THROUGH FUZZY DECISION TREES

Authors:

Efi Papatheocharous and Andreas Andreou

Abstract: This work addresses the issue of software effort prediction via fuzzy decision trees generated using historical project data samples. Moreover, the effect that various numerical and nominal project characteristics used as predictors have on software development effort is investigated utilizing the classification rules extracted. The approach attempts to achieve successful classification of past project data into homogeneous clusters so as to provide accurate and reliable cost estimates within each cluster. Two algorithms, namely CHAID and CART, are applied on approximately 1000 empirical software cost data records. The data first passed through analysis and pre-processing activities and were then used for generating fuzzy decision tree instances. An evaluation is then performed based on the prediction accuracy of the classification rules produced. Even though the experimentation follows a heuristic approach, the trees built were found to fit the data quite successfully, while the predicted effort values approximate the actual effort well. Therefore, the model proposed may be used for future cost predictions and better allocation and control of project resources.

Paper Nr: 615
Title:

s-OLAP: a System for Supporting Approximate OLAP Query Evaluation on Very Large Data Warehouses via Probabilistic Synopses

Authors:

Alfredo Cuzzocrea

Abstract: In this paper, we propose s-OLAP, a multi-user middleware system for supporting approximate range aggregate queries on data cubes. The application scenario of s-OLAP is a networked and heterogeneous very large data warehousing environment where applying traditional algorithms for processing OLAP queries is too expensive and inconvenient because of the size of the multidimensional data and the computational cost needed to access and process them. s-OLAP relies on intelligent data representation and processing techniques, among them: (i) the amenity of exploiting the Karhunen-Loeve Transform (KLT) for obtaining dimensionality reduction of data cubes, and (ii) the definition of a probabilistic framework that allows us to provide a rigorous theoretical basis for ensuring probabilistic guarantees over the degree of approximation of the retrieved answers, which is a critical point in the context of approximate query answering techniques.

Short Papers
Paper Nr: 112
Title:

EXPERIENCES OF ERP USE IN SMALL ENTERPRISES

Authors:

Päivi Iskanius, Matti Möttönen and Raija Halonen

Abstract: This paper investigates the role of Enterprise Resource Planning (ERP) systems in the context of small and medium-sized enterprises (SMEs). The paper reports on research findings from a case study that has been conducted in 14 SMEs operating in steel manufacturing and woodworking. By dividing the enterprises into three groups (medium-sized, small, and micro enterprises), this study provides a richer understanding of size-related issues in the motivations, risks and challenges of ERP adoption.

Paper Nr: 124
Title:

BUSINESS INTELLIGENCE BASED ON A WI-FI REAL TIME POSITIONING ENGINE - A Practical Application in a Major Retail Company

Authors:

Vasco Vinhas, Pedro Abreu and Pedro Mendes

Abstract: Collecting relevant data to perform business intelligence on a real-time basis has always been a crucial objective for managers responsible for economic activities in large spaces. Following this emergent need, the authors propose a platform that performs data gathering and analysis on the location of people and assets by automatic means. The developed system is retail-business oriented and has a fairly distributed architecture. It couples the core elements of a real-time Wi-Fi based location system with a set of functional views that make explicit the information available for each tracked entity, such as the path taken through the space and demographic concentration patterns. Tests were conducted in a real production environment as the outcome of a partnership with a major player in the retail sector, and the results were completely satisfactory, with the managers confirming the relevance of the provided knowledge.

Paper Nr: 126
Title:

DIRECTED ACYCLIC GRAPHS AND DISJOINT CHAINS

Authors:

Yangjun Chen

Abstract: The problem of decomposing a DAG (directed acyclic graph) into a set of disjoint chains has many applications in data engineering. One of them is the compression of transitive closures to support reachability queries on whether a given node v in a directed graph G is reachable from another node u through a path in G. Recently, an interesting algorithm was proposed by Chen et al. [Y. Chen and Y. Chen, An Efficient Algorithm for Answering Graph Reachability Queries, Proceedings of ICDE, 2008, pp. 893 - 902], which claims to be able to decompose G into a minimal set of disjoint chains in O(n² + bn) time, where n is the number of the nodes of G, and b is G's width, defined to be the size of a largest node subset U of G such that for every pair of nodes u, v ∈ U, there does not exist a path from u to v or from v to u. However, in some cases, it fails to do so. In this paper, we analyze this algorithm and show the problem. More importantly, a new algorithm is discussed, which can always find a minimal set of disjoint chains in the same time complexity as Chen's.
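
For intuition, the following Java sketch peels disjoint chains off a DAG greedily. It always produces a valid decomposition into disjoint chains covering all nodes, but not necessarily a minimal one; the algorithms analyzed in the paper achieve minimality within the stated complexity.

    import java.util.*;

    public class ChainPeeling {
        // adj.get(u) lists the successors of node u; nodes are 0..n-1 in topological order.
        public static List<List<Integer>> decompose(List<List<Integer>> adj) {
            int n = adj.size();
            boolean[] used = new boolean[n];
            List<List<Integer>> chains = new ArrayList<>();
            for (int start = 0; start < n; start++) {
                if (used[start]) continue;
                List<Integer> chain = new ArrayList<>();
                int u = start;
                while (u != -1) {               // extend the chain along unused successors
                    used[u] = true;
                    chain.add(u);
                    int next = -1;
                    for (int v : adj.get(u)) {
                        if (!used[v]) { next = v; break; }
                    }
                    u = next;
                }
                chains.add(chain);              // each node ends up in exactly one chain
            }
            return chains;
        }
    }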

Paper Nr: 152
Title:

AN OBJECT MODEL FOR THE MANAGEMENT OF DIGITAL IMAGES

Authors:

Souheil Khaddaj and Andreas Hoppe

Abstract: With digital image volumes rising dramatically, there exists an important and urgent need for novel techniques and mechanisms that provide efficient storage and retrieval facilities for the voluminous data generated daily. It is already widely accepted that the use of data abstraction in object oriented modelling enables real world objects to be well represented in information systems. In this work we are particularly interested in the use of object oriented techniques for the management of digital images. Object orientation is well suited for such systems, which require the ability to handle content of multiple types. This paper aims to investigate a conceptual model, based on object versioning techniques, which will represent the semantics in order to allow the continuity and pattern of changes of images to be determined over time.

Paper Nr: 155
Title:

A MapReduce framework for change propagation in geographic databases

Authors:

Mario Vacca, Ferdinando Di Martino and Giuseppe Polese

Abstract: Updating a schema is a very important activity which occurs naturally during the life cycle of database systems, due to different causes. A challenging problem arising when a schema evolves is the change propagation problem, i.e. updating the database ground instances to make them consistent with the evolved schema. Spatial datasets, a stored representation of geographical areas, are VLDBs, so the change propagation process, which involves an enormous mass of data among geographically distributed nodes, is very expensive and calls for efficient processing. Moreover, the problem of designing languages and tools for spatial dataset change propagation is relevant, given the shortage of tools for schema evolution and, in particular, the limitations of those for spatial datasets. In this paper, we take into account both efficiency and these limitations, and we propose an instance update language, based on Google's efficient and popular MapReduce programming paradigm, which allows a wide category of schema changes to be performed in parallel. A system embodying the language has been implemented.
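
As a sketch of the idea, the plain-Java fragment below applies one map-only schema change (renaming an attribute) to instance records in MapReduce style; the record layout and names are invented, and a real deployment would run the map function over dataset partitions on distributed nodes.

    import java.util.*;
    import java.util.stream.Collectors;

    public class RenamePropagation {
        // map: rewrite each record independently; this is what distributes trivially
        static Map<String, String> map(Map<String, String> record) {
            Map<String, String> out = new HashMap<>(record);
            if (out.containsKey("owner")) {
                out.put("proprietor", out.remove("owner"));  // the schema change to propagate
            }
            return out;
        }

        public static void main(String[] args) {
            List<Map<String, String>> instances = List.of(
                    new HashMap<>(Map.of("id", "1", "owner", "Rossi")),
                    new HashMap<>(Map.of("id", "2", "owner", "Bianchi")));
            // each mapper task would process one partition of the (very large) dataset
            List<Map<String, String>> updated =
                    instances.stream().map(RenamePropagation::map).collect(Collectors.toList());
            System.out.println(updated);  // records now carry "proprietor" instead of "owner"
        }
    }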

Paper Nr: 161
Title:

Establishing Trust Networks based on Data Quality Criteria for Selecting Data Suppliers

Authors:

Ricardo Pérez-Castillo, Ismael Caballero, Eugenio Verbo, Ignacio G. Rodríguez De Guzmán, Macario Polo and Mario Piattini

Abstract: Nowadays, organizations may have Web portals tailoring several websites where a wide variety of information is integrated. These portals are typically composed of a set of Web applications and services that interchange data among themselves. In this setting, there is no way to find out how the quality of the interchanged data is going to evolve. A framework is proposed for establishing trust networks based on the Data Quality (DQ) levels of the interchanged data. We consider two kinds of DQ: inherent DQ and pragmatic DQ. The decision about selecting the most suitable data supplier is based on the estimation of the best expected pragmatic DQ levels. In addition, an example is presented to illustrate the framework's operation.

Paper Nr: 168
Title:

Algorithms for Efficient Top-k Spatial Preference Query Execution in a Heterogeneous Distributed Environment

Authors:

Marcin Gorawski and Kamil Dowlaszewicz

Abstract: Top-k spatial preference queries allow searching for objects on the basis of their neighbourhoods' character. They find k objects whose neighbouring objects satisfy the query conditions to the greatest extent. The execution of these queries is complex and lengthy, as it requires numerous accesses to index structures and data. Existing algorithms therefore employ various optimization techniques. These algorithms assume, however, that all data sets required to execute the query are aggregated in one location. In reality, data is often distributed over remote nodes, for example data accumulated by different organizations. This motivated the development of an algorithm capable of efficiently executing the queries in a heterogeneous distributed environment. The paper describes the specifics of operating in such an environment, presents the developed algorithm, describes the mechanisms it employs and discusses the results of the conducted experiments.

Paper Nr: 185
Title:

AN INFORMATION SYSTEM FOR THE MANAGEMENT OF CHANGES DURING THE DESIGN OF BUILDING PROJECTS

Authors:

Essam Zaneldin

Abstract: Design is an important stage in a project's life cycle with the greatest impact on the overall performance and cost. For several reasons, changes introduced by design participants are inevitable. Despite the importance of coordinating these changes among the different participants during the design stage, current practice exhibits severe information transfer problems. Since corrections to finalized designs, or even to designs at late stages in the process, are extremely costly, it is cheaper to spend the effort on managing changes and producing highly coordinated and easily constructible designs. To support this objective, this paper presents an information system with a built-in database for representing design information, including design rationale and the history of changes, to support the management of changes during the design of building projects. The components of the system are discussed and possible future extensions to the present study are presented. This research is expected to help engineering and design-build firms to effectively manage design changes and produce better coordinated and constructible designs with less cost and time.

Paper Nr: 189
Title:

EFFICIENT SYSTEM INTEGRATION USING SEMANTIC REQUIREMENTS AND CAPABILITY MODELS: An approach for integrating heterogeneous Business Services

Authors:

Thomas Moser, Richard Mordinyi, Alexander Mikula and Stefan Biffl

Abstract: Business system designers want to integrate heterogeneous legacy systems to provide flexible business services cheaper and faster. Unfortunately, modern integration technologies represent important integration knowledge only implicitly, making solutions harder to understand, verify, and maintain. In this paper we propose a data-driven approach, "Semantically-Enabled Externalization of Knowledge" (SEEK), that explicitly models the semantics of integration requirements and capabilities, and data transformations between heterogeneous legacy systems. The goal of SEEK is to make the systems integration process more efficient by providing tool support for quality assurance (QA) steps and the generation of system configurations. Based on use cases from industry partners, we compare the SEEK approach with UML-based modeling. In the evaluation context, SEEK was found to be more effective at making expert knowledge on system requirements and capabilities available for more efficient tool support and reuse.

Paper Nr: 197
Title:

SEMANTIC FRAMEWORK FOR INFORMATION INTEGRATION Using Service-Oriented Analysis and Design

Authors:

Prima Gustiene, Irina Peltomaa and Heli Helaakoski

Abstract: Today's dynamic markets demand from companies new ways of thinking, the adaptation of new technologies and more flexible production. These business drivers can be met effectively and efficiently only if people and enterprise resources, such as information systems, collaborate. The gap between organizational business aspects and information technology makes it difficult for companies to reach their goals. Information systems have an increasingly important role in the realization of business process demands, which leads to a demand for close interaction and understanding between organizational and technical components. This is critical for enterprise interoperability, where the semantic integration of information and technology is the prerequisite for successful collaboration. The paper presents a new semantic framework for a better quality of semantic interoperability.

Paper Nr: 207
Title:

Injecting Semantics into Event-Driven Architectures

Authors:

Juergen Dunkel, Alberto Fernández, Ruben Ortiz and Sascha Ossowski

Abstract: Event-driven architectures (EDA) have been proposed as a new architectural paradigm for event-based systems to process complex event streams. However, EDA have not yet reached the maturity of well-established software architectures because methodologies, models and standards are still missing. Despite the fact that EDA-based systems are essentially built on events, there is a lack of a general event modeling approach. In this paper we put forward a semantic approach to event modeling that is expressive enough to cover a broad variety of domains. Our approach is based on semantically rich event models using ontologies that allow the representation of structural properties of event types and constraints between them. Then, we argue in favour of a declarative approach to complex event processing that draws upon well-established rule languages such as JESS and integrates the structural event model. We illustrate the adequacy of our approach with a prototype for an event-based road traffic management system.

Paper Nr: 230
Title:

C3: A METAMODEL FOR ARCHITECTURE DESCRIPTION LANGUAGE BASED ON FIRST-ORDER CONNECTOR TYPES

Authors:

Abdelkrim Amirat

Abstract: To provide hierarchical descriptions from different software architectural viewpoints, we need more than one abstraction hierarchy and connection mechanisms to support the interactions among components. These mechanisms also support the refinement and traceability of architectural elements through the different levels of each hierarchy. Current methods and tools provide poor support for the challenge posed by developing systems using hierarchical descriptions. This paper describes an architecture-centric approach allowing the user to describe the logical architecture view, from which a physical architecture view is generated automatically for all application instances of the logical architecture.

Paper Nr: 243
Title:

QUERY MELTING: A NEW PARADIGM FOR GIS MULTIPLE QUERY OPTIMIZATION

Authors:

Haifa E. Elariss, Darrel Greenhill and Souheil Khaddaj

Abstract: Recently, non-expert mobile-user applications have been developed to query Geographic Information Systems (GIS), particularly Location Based Services, where users ask questions related to their position, whether they are moving (dynamic) or not (static). A new Iconic Visual Query Language (IVQL) has been developed to handle proximity analysis queries that find k-nearest-neighbours and objects within a buffer area. Each operator in an IVQL query corresponds to an execution plan to be evaluated by the GIS server. Since commonalities exist between the execution plans, the same operations are executed many times, leading to slow results. Hence, the need arises to develop a multi-user dynamic complex query optimizer that handles commonalities and processes the queries faster, especially given the large scale of mobile users. We present a new query processor, a generic optimization framework for GIS, and a middleware which employs the new Query Melting paradigm (QM), based on the sharing paradigm and a push-down optimization strategy. QM is implemented through a new Melting-Ruler strategy that works at the low level, melts repetitions in plans to share spatial areas, temporal intervals, objects, intermediate results, maps, user locations, and functions, and then re-orders them to get time-cost effective results. It is illustrated using a sample tourist GIS system.

Paper Nr: 261
Title:

Modeling Web Documents as Objects for Automatic Web Content Extraction

Authors:

Estella Annoni and C. I. Ezeife

Abstract: Traditionally, mining web page contents involves modeling their contents to discover the underlying knowledge. Data extraction proposals represent web data in a formal structure, such as database structures specific to application domains. Those models fail to capture the full diversity of web data structures, which can be composed of different types of content and can also be unstructured. In fact, with these proposals, it is not possible to focus on a given type of content, to work on data of different structures, or to mine data from different application domains, as required to efficiently mine a given content type or web documents from different domains. Moreover, since web pages are designed to be understood by users, this paper considers the modeling of web document presentations, expressed through HTML tag attributes, as useful for efficient web content mining. Hence, this paper provides a general framework composed of an object-oriented web data model based on HTML tags and algorithms for extracting web content and web presentation objects from any given web document. From the HTML code of a web document, web objects are extracted for mining, regardless of the domain.

Paper Nr: 286
Title:

TOWARD A QUALITY MODEL FOR CBSE - Conceptual Model Proposal

Authors:

María R. Ramírez, Luis E. Mendoza, Maryoly Ortega, Maria Angelica Perez de Ovalles, Kenyer Domínguez and Anna Grimán

Abstract: In this paper, which is part of ongoing research, we analyze the conceptual elements behind Component-Based Software Engineering (CBSE) and propose a model that will support its quality evaluation. The proposed conceptual model integrates the product perspective, a view that includes components and Component-Based Software (CBS), as well as the process perspective, a view that represents the component and CBS development life cycle. The model was developed under a systemic approach that allows for assessing and improving the products and processes involved in CBSE. Future actions include proposing metrics to operationalize the model and validating them through a case study. Applying the model will allow studying the behavior of each perspective and the relationships among them.

Paper Nr: 298
Title:

OPTIMIZATION OF SPARQL BY USING CORESPARQL

Authors:

Jinghua Groppe, Sven Groppe and Jan Kolbaum

Abstract: SPARQL is becoming an important query language for RDF data. Query optimization to speed up query processing has been an important research topic for all query languages. In order to optimize SPARQL queries, we suggest a core fragment of the SPARQL language, which we call coreSPARQL. coreSPARQL has the same expressive power as SPARQL, but eliminates redundant language constructs of SPARQL. SPARQL engines and optimization approaches will benefit from using coreSPARQL, because fewer cases need to be considered when processing coreSPARQL queries and the coreSPARQL syntax is machine-friendly. In this paper, we present an approach to automatically transforming SPARQL to coreSPARQL, and develop a set of rewriting rules to optimize coreSPARQL queries. Our experimental results show that our optimization of SPARQL speeds up RDF querying.
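
To give the flavor of such a rewriting, the Java sketch below expands SPARQL's predicate-object list shorthand (?s :p ?o ; :q ?v) into explicit triple patterns with a repeated subject. This is only one illustrative normalization of standard SPARQL syntactic sugar; the actual coreSPARQL rule set is broader and is not reproduced here.

    import java.util.*;

    public class PredicateObjectListExpansion {
        record Triple(String s, String p, String o) {}

        // One subject with several (predicate, object) pairs, as written with ';'.
        static List<Triple> expand(String subject, List<String[]> predicateObjectPairs) {
            List<Triple> triples = new ArrayList<>();
            for (String[] po : predicateObjectPairs) {
                triples.add(new Triple(subject, po[0], po[1]));  // repeat the subject
            }
            return triples;
        }

        public static void main(String[] args) {
            // ?s :p ?o ; :q ?v   becomes   ?s :p ?o . ?s :q ?v
            System.out.println(expand("?s",
                    List.of(new String[]{":p", "?o"}, new String[]{":q", "?v"})));
        }
    }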

Paper Nr: 311
Title:

FedDW: A Tool for Querying Federations of Data Warehouses

Authors:

Stefan Berger and Michael Schrefl

Abstract: Recently, Federated Data Warehouses – collections of autonomous and heterogeneous Data Marts – have become increasingly attractive as they enable the exchange of business information across organization boundaries. The advantage of federated architectures is that users may access the global, mediated schema with OLAP applications, while the Data Marts need not be changed and retain full autonomy. Although the underlying concepts are mature, tool support for Federated DWs has been poor so far. This paper presents the prototype of the "FedDW" Query Tool that supports distributed query processing in federations of ROLAP Data Marts. It acts as a middleware component that reformulates user queries according to semantic correspondences between the autonomous Data Marts. We explain FedDW's architecture, demonstrate a use-case and describe our implementation. We regard our proof-of-concept prototype as a first step towards the development of industrial-strength query tools for DW federations.

Paper Nr: 318
Title:

A User-centric and Semantic-driven Query Rewriting Over Proteomics XML Sources

Authors:

Hassan Badir, Kunale Kudagba and Omar El BEQQALI

Abstract: Querying and sharing Web proteomics data is not an easy task. Given that several data sources can be used to answer the same sub-goals of a global query, we can obviously have many candidate rewritings. The user query is formulated using concepts and properties related to proteomics research (a domain ontology). Semantic mappings describe the contents of the underlying sources. In this paper, we propose a characterization of the query rewriting problem using semantic mappings as an associated hypergraph. Hence, the generation of candidate rewritings can be formulated as the discovery of the minimal transversals of a hypergraph. We exploit and adapt algorithms available in hypergraph theory to find all candidate rewritings for a query answering problem. In future work, relevant criteria could help to determine optimal and qualitative rewritings, according to user needs and source performance.

Paper Nr: 323
Title:

A PSO-BASED RESOURCE SCHEDULING ALGORITHM FOR PARALLEL QUERY PROCESSING ON GRIDS

Authors:

J. P. C., Gilberto Martinez-Luna and Nareli Cruz-Cortes

Abstract: The accelerated development of Grid computing has positioned it as a promising next-generation computing platform. Grid computing involves resource management, task scheduling, security problems, information management and so on. In the context of database query processing, existing parallelisation techniques cannot operate well in Grid environments because of the way they select machines and allocate queries. This is due to the geographic distribution of resources that are owned by different organizations. The owners of these resources have different usage or access policies and cost models, and varying loads and availability, which makes efficient scheduling algorithm design and implementation a big challenge. In this paper, a heuristic approach based on a particle swarm optimization algorithm is adopted to solve the parallel query scheduling problem in a Grid environment.
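
For reference, a compact Java sketch of the PSO core applied to query-to-machine assignment is shown below; the fitness function (makespan with unit costs), the decoding of positions, and the parameter values are simplified stand-ins, not the paper's actual design.

    import java.util.Random;

    public class PsoScheduler {
        static final Random RND = new Random(42);

        // Toy fitness: makespan when unit-cost sub-query i runs on machine decode(pos[i]).
        static double fitness(double[] pos, int machines) {
            int[] load = new int[machines];
            for (double p : pos) load[decode(p, machines)]++;
            int max = 0;
            for (int l : load) max = Math.max(max, l);
            return max;
        }

        static int decode(double p, int machines) {
            return ((int) Math.floor(Math.abs(p))) % machines;  // continuous -> machine index
        }

        public static void main(String[] args) {
            int particles = 20, dims = 8, machines = 4, iters = 100;
            double w = 0.7, c1 = 1.5, c2 = 1.5;          // inertia and attraction weights
            double[][] x = new double[particles][dims];  // positions
            double[][] v = new double[particles][dims];  // velocities
            double[][] pb = new double[particles][dims]; // personal best positions
            double[] pbFit = new double[particles];
            double[] gb = new double[dims];              // global best position
            double gbFit = Double.MAX_VALUE;

            for (int i = 0; i < particles; i++) {
                for (int d = 0; d < dims; d++) x[i][d] = RND.nextDouble() * machines;
                pb[i] = x[i].clone();
                pbFit[i] = fitness(x[i], machines);
                if (pbFit[i] < gbFit) { gbFit = pbFit[i]; gb = x[i].clone(); }
            }
            for (int t = 0; t < iters; t++) {
                for (int i = 0; i < particles; i++) {
                    for (int d = 0; d < dims; d++) {     // standard PSO velocity update
                        v[i][d] = w * v[i][d]
                                + c1 * RND.nextDouble() * (pb[i][d] - x[i][d])
                                + c2 * RND.nextDouble() * (gb[d] - x[i][d]);
                        x[i][d] += v[i][d];
                    }
                    double f = fitness(x[i], machines);
                    if (f < pbFit[i]) { pbFit[i] = f; pb[i] = x[i].clone(); }
                    if (f < gbFit)    { gbFit = f;    gb = x[i].clone(); }
                }
            }
            System.out.println("best makespan: " + gbFit);
        }
    }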

Paper Nr: 359
Title:

APPLYING INFORMATION RETRIEVAL FOR MARKET BASKET RECOMMENDER SYSTEMS

Authors:

Tapio Pitkäranta

Abstract: Coded data sets form the basis for many well-known applications, from healthcare prospective payment systems to recommender systems in online shopping. Previous studies on coded data sets have introduced methods for the analysis of rather small data sets. This study proposes applying information retrieval methods to enable high-performance analysis of data masses that scale beyond traditional approaches. An essential component in today's data warehouses, into which coded data sets are collected, is a database management system (DBMS). This study presents experimental results on how information retrieval indexes scale and outperform common database schemas with a leading commercial DBMS engine in the analysis of coded data sets. The results show that flexible analysis of hundreds of millions of coded data sets is possible with regular desktop hardware.
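
As a sketch of the IR approach, the Java fragment below builds an inverted index from codes to record ids and answers a conjunctive code query by posting-list intersection; the names and data are invented for illustration.

    import java.util.*;

    public class CodeIndex {
        private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

        public void add(int recordId, Collection<String> codes) {
            for (String code : codes) {
                postings.computeIfAbsent(code, k -> new TreeSet<>()).add(recordId);
            }
        }

        // Records containing ALL given codes = intersection of their posting lists.
        public SortedSet<Integer> containingAll(String... codes) {
            SortedSet<Integer> result = null;
            for (String code : codes) {
                SortedSet<Integer> list = postings.getOrDefault(code, new TreeSet<>());
                if (result == null) result = new TreeSet<>(list);
                else result.retainAll(list);
            }
            return result == null ? new TreeSet<>() : result;
        }

        public static void main(String[] args) {
            CodeIndex idx = new CodeIndex();
            idx.add(1, List.of("A01", "B20"));
            idx.add(2, List.of("A01"));
            idx.add(3, List.of("A01", "B20", "C07"));
            System.out.println(idx.containingAll("A01", "B20")); // [1, 3]
        }
    }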

Paper Nr: 372
Title:

SYMBOLIC EXECUTION FOR DYNAMIC, EVOLUTIONARY TEST DATA GENERATION

Authors:

Anastasis Sofokleous, Andreas Andreou and Antonis Kouras

Abstract: This paper combines the advantages of symbolic execution with search-based testing to automatically produce test data for Java programs. A framework is proposed comprising two systems which collaborate to generate test data. The first system is a program analyser capable of performing dynamic and static program analysis. The program analyser creates the control flow graph of the source code under test and uses a symbolic transformation to simplify the graph and generate paths as independent control flow graphs. The second system is a test data generator that aims to create a set of test cases covering each path. The implementation details of the framework, as well as the relevant experiments carried out on a number of Java programs, are presented. The experimental results demonstrate the efficiency and efficacy of the framework and show that it can outperform related approaches.

Paper Nr: 375
Title:

A BIT-SELECTOR TECHNIQUE FOR PERFORMANCE OPTIMIZATION OF DECISION-SUPPORT QUERIES

Authors:

Ricardo Santos and Jorge Bernardino

Abstract: As data warehouses grow into the multi-terabyte range, adequate performance for decision support queries remains challenging for database query processors. A large number of wide-ranging techniques have been used in research to overcome this problem. Bit-based techniques such as bitmap indexes and bitmap join indexes are generally accepted as standard practice for optimizing data warehouses. These techniques are very promising due to their relatively low overhead and fast bitwise operations. In this paper, we propose a new technique which performs optimized row selection for decision support queries by introducing a bit-based attribute into the fact table. This attribute's value for each row is set according to the row's relevance for processing each decision support query, using bitwise operations. Simply inserting a new column into the fact table's structure and using bitwise operations to perform row selection makes it a simple and practical technique which is easy to implement in any Database Management System. The experimental results, using the TPC-H benchmark, demonstrate that it is an efficient optimization method which significantly improves query performance.
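
A minimal Java sketch of the bit-selector idea follows: each fact row carries an extra integer column in which bit k is set when the row is relevant to query k, so row selection becomes one bitwise AND per row. The schema and values are illustrative, not the paper's exact design.

    public class BitSelector {
        // fact table fragment: (rowId, measure, selector)
        static final long[][] FACT = {
                {1, 500, 0b011},   // relevant to queries 0 and 1
                {2, 120, 0b100},   // relevant to query 2 only
                {3, 900, 0b001},   // relevant to query 0 only
        };

        static long sumForQuery(int queryBit) {
            long mask = 1L << queryBit, sum = 0;
            for (long[] row : FACT) {
                if ((row[2] & mask) != 0) sum += row[1];  // bitwise row selection
            }
            return sum;
        }

        public static void main(String[] args) {
            System.out.println(sumForQuery(0)); // 1400: rows 1 and 3
            System.out.println(sumForQuery(2)); // 120 : row 2
        }
    }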

Paper Nr: 396
Title:

A DOMAIN SPECIFIC LANGUAGE FOR THE I* FRAMEWORK

Authors:

João Araújo, Vasco Amaral, Carlos Nunes and Carla Silva

Abstract: The i* framework proposes a goal-oriented analysis method for requirements engineering. It is a systematic approach to discovering and structuring requirements at the organizational level, where nonfunctional requirements and their relations are specified. A Domain Specific Language (DSL) has the purpose of specifying and modeling concepts in some domain, and has several advantages over general-purpose languages, such as allowing a solution to be expressed in the desired language and at the desired abstraction level. In order to create such a DSL, it is normally necessary to start by specifying its syntax by means of a metamodel, which is given as input to the language workbenches that generate the corresponding editors. With a proper editor for the language we can specify models with the proposed notation. This paper presents a DSL for the i* framework, with the purpose of handling the complexity and scalability of its concrete models by introducing some innovations into the i* framework metamodel, such as mechanisms that help manage model scalability.

Paper Nr: 460
Title:

EXTENDING THE UML-GEOFRAME DATA MODEL FOR CONCEPTUAL MODELING OF NETWORK APPLICATIONS

Authors:

Sergio Stempliuc, Jugurta Lisboa-Filho, Karla Borges and Marcus Andrade

Abstract: This paper presents an extension of the UML-GeoFrame data model that includes a set of new constructors to allow the definition of conceptual schemas for spatial database applications whose elements' relationships form a network. It also discusses how the GeoFrame conceptual framework is changed by the inclusion of new metaclasses and the corresponding stereotypes related to network elements. The extension proposed in this paper is evaluated using a class diagram for a water distribution company.

Paper Nr: 478
Title:

INTEGRATION METHOD AMONG BSC, CMMI AND SIX SIGMA USING GQM TO SUPPORT MEASUREMENT DEFINITION (MIBCIS)

Authors:

Leonardo Romeu, Jorge Audy and Andressa Covatti

Abstract: The software quality area has produced various studies and surveys on different fronts, concerning both products and processes. There are many initiatives in the area of software process improvement, which might often be conflicting within an organization. If we observe some of the existing models and methodologies in the market, the CMMI Model and the Six Sigma Methodology stand out for being complementary. While CMMI focuses on the organization and on process management, and Six Sigma focuses on the client and on financial results, both highlight the importance of the data produced for decision making. This study presents a method for the integrated implementation of the CMMI Model and the Six Sigma Methodology in process improvement programs, supported by measurement and assessment techniques such as the Balanced Scorecard (BSC) and the Goal-Question-Metric (GQM) approach.

Paper Nr: 484
Title:

gisEIEL: An Open Source GIS for Territorial Management

Authors:

Pedro A. González, Miguel Lorenzo, Miguel R. Luaces, José Ignacio Lamas Fonte and David Trillo

Abstract: The provincial government of A Coruña, in Spain, has been working over the last years on the construction of a geographic information system for the management of its territory. The results of this work are three software products: WebEIEL, gisEIEL and the ideAC node. WebEIEL is the web application that publishes the information on the Internet. gisEIEL is the desktop application used by the provincial government and the municipalities to create, query, visualize, analyze and update the information in the system. Finally, the ideAC node is a spatial data infrastructure that uses international standards to publish the information as part of the Spanish spatial data infrastructure. In this paper, we describe the functionality and the architecture of the system, and we present the problems that we had to face during its development and the solutions that we applied.

Paper Nr: 492
Title:

ASSESSING WORKFLOW MANAGEMENT SYSTEMS: A QUANTITATIVE ANALYSIS OF A WORKFLOW EVALUATION MODEL

Authors:

Stephan Poelmans and Hajo A. Reijers

Abstract: Despite the enormous interest in workflow management systems and their widespread adoption by industry, few research studies are available that empirically assess the effectiveness and acceptance of this technology. Our work aims precisely at providing such insights, and this paper presents some of our preliminary quantitative findings. Using a theory-based workflow success model, we have studied the impact of operational workflow technologies on end-users in terms of perceived usefulness, end-user satisfaction and perceived organisational benefits. A survey instrument was used to gather a sample of 246 end-users from two different organizations. Our findings show that the considered workflow applications are generally accepted and positively evaluated. Using partial least squares analysis, the success model was well supported, making it a useful instrument for evaluating future workflow projects.

Paper Nr: 504
Title:

EFFICIENT COMMUNITY MANAGEMENT IN AN INDUSTRIAL DESIGN ENGINEERING WIKI

Authors:

Regine Vroom, Adrie Kooijman and Raymond Jelierse

Abstract: Industrial design engineers draw on a wide variety of research fields when making decisions that will eventually have significant impact on their designs. Obviously, designers cannot master every field, so they are often looking for a simple set of rules of thumb on a particular subject. For this reason a wiki has been set up: www.wikid.eu. Whilst Wikipedia already offers a lot of this information, there is a distinct difference between WikID and Wikipedia: Wikipedia aims to be an encyclopaedia, and therefore tries to be as complete as possible. WikID aims to be a design tool. It offers information in a compact manner tailored to its user group, industrial designers. The main subjects of this paper are the research on how to create an efficient structure for the WikID community and the creation of a tool for managing the community. With the new functionality for managing group memberships and viewing information on users, it will be easier to maintain the community. This will also help in creating a better community which will be more inviting to participate in, provided that the assumptions made in this area hold true.

Paper Nr: 509
Title:

Is the application of aspect-oriented software development beneficial? First experimental results

Authors:

Sebastian Kleinschmager and Stefan Hanenberg

Abstract: Aspect-oriented software development is an approach which addresses the construction of software artifacts that traditional software engineering constructs fail to modularize: the so-called crosscutting concerns. However, although aspect-orientation claims to permit a better modularization of crosscutting concerns, it is still not clear whether the application of aspect-oriented constructs has a measurable, positive impact on the construction of software artifacts. This paper addresses this issue with an empirical study which compares the specification of crosscutting concerns using traditional composition techniques and aspect-oriented composition techniques, using the object-oriented programming language Java and the aspect-oriented programming language AspectJ.

Paper Nr: 528
Title:

INTRODUCING REAL-TIME BUSINESS CASE DATABASE - An approach to improve system maintenance of complex application landscapes

Authors:

Oliver Daute

Abstract: While the maintenance of single systems is under control nowadays, new challenges arise from the use of linked-up software applications to implement business scenarios. Numerous business processes exchange data across complex application landscapes, using various applications to compute data. The technology underneath has to provide a stable environment while maintaining diverse software, databases and operating system components. The challenge is to keep the application environment under control at any given time. The goal is to avoid incidents to business processes and to sustain the application landscape through smaller and larger changes. For the maintenance of complex environments, information about process run-states is indispensable, for example when parts of a system environment must be restored. This paper introduces the Real-Time Business Case Database (RT-BCDB) to control business processes and improve maintenance activities in complex application landscapes. It is a concept for gaining more transparency and visibility of business process activities. RT-BCDB continuously stores information about business cases and their run-states. Service frameworks such as IT Service Management (ITIL) can benefit from RT-BCDB as well.

Paper Nr: 535
Title:

AN EXTENSION OF ONTOLOGY BASED DATABASES TO HANDLE PREFERENCES

Authors:

Dilek Tapucu, Yamine Ait-ameur, Stéphane Jean and Murat Osman UNALIR

Abstract: Ontologies have been defined to make the semantics of data explicit. With the emergence of the Semantic Web, the amount of available ontological data (or instances) has increased. To manage such data, Ontology Based DataBases (OBDBs), which store ontologies and their instance data in the same repository, have been proposed. These databases are associated with exploitation languages supporting description, querying, etc. of both ontologies and data. However, queries usually return a large amount of data that must be sorted in order to find the relevant items. Moreover, few current approaches consider user preferences when querying. Yet this problem is fundamental for many applications, especially in the e-commerce domain. In this paper, we first propose an extension of an existing OBDB, called OntoDB, through an extension of its ontology model in order to support the semantic description of preferences. Secondly, an extension of an ontology-based query language, called OntoQL, defined on OntoDB for querying ontological data with preferences, is presented. Finally, an implementation of the proposed extensions is described.

Paper Nr: 544
Title:

A USER-DRIVEN AND A SEMANTIC-BASED ONTOLOGY MAPPING EVOLUTION APPROACH

Authors:

Hélio Martins and Nuno Silva

Abstract: Systems or software agents do not always agree on the information being shared, justifying the use of distinct ontologies for the same domain. For achieving interoperability, declarative mappings are used as a basis for exchanging information between systems. However, in dynamic environments like the Web and the Semantic Web, ontologies constantly evolve, potentially leading to invalid ontology mappings. This paper presents two approaches for managing ontology mapping evolution: a user-centric approach, in which the user defines the mapping evolution strategies to be applied automatically by the system, and a semantic-based approach, in which the ontology's evolution logs are exploited to capture the semantics of changes, which are then adapted to and applied in the ontology mapping evolution process.

Paper Nr: 562
Title:

A SERVICE-BASED APPROACH FOR DATA INTEGRATION BASED ON BUSINESS PROCESS MODELS

Authors:

Hesley Py, Lucia Castro, Fernanda Baião and Asterio Tanaka

Abstract: Business-IT alignment is gaining importance in enterprises and is already considered essential for efficiently achieving enterprise goals. This has led organizations to follow Enterprise Architecture approaches, with the Information Architecture as one of their pillars. Information architecture aims at providing an integrated and holistic view of the business information, and this requires applying a data integration approach. However, despite several works on data integration research, the problem is far from being solved. Highly heterogeneous computing environments present new challenges, such as distinct DBMSs, distinct data models, distinct schemas and distinct semantics, all in the same scenario. On the other hand, new issues in the enterprise environment, such as the emergence of BPM and SOA approaches, contribute to a new solution for the problem. This paper presents a service-based approach for data integration, in which the services are derived from the organization's business process models. The proposed approach comprises a framework of different types of services (data services, concept services), a method for identifying data integration services from process models, and a metaschema needed for the automation and customization of the proposed approach in a specific organization. We focus on handling heterogeneities with regard to different DBMSs and differences among data models, schemas and semantics.

Paper Nr: 567
Title:

AUTOMATIC DERIVATION OF SPRING-OSGI BASED WEB ENTERPRISE APPLICATIONS

Authors:

Elder Cirilo, Uirá Kulesza and Carlos J. Pereira de Lucena

Abstract: Component-based technologies (CBTs) are nowadays widely adopted in the development of different kinds of applications. They provide functionalities to facilitate the management of application components and their different configurations. Spring and OSGi are two relevant examples of mainstream CBTs. In this paper, we explore the use of Spring/OSGi technologies in the context of automatic product derivation. We illustrate, through a typical web-based enterprise application: (i) how different models of a feature-based product derivation tool can be automatically generated based on the configuration files of Spring and OSGi, and Java annotations; and (ii) how the different abstractions provided by these CBTs can be related to a feature model with the aim of automatically deriving a Spring/OSGi-based application or product line.

Paper Nr: 585
Title:

Grounding and Making Sense of Agile Software Development

Authors:

Mark Woodman and Aboubakr Moteleb

Abstract: The paper explores areas of strategic frameworks for sense-making, knowledge management and Grounded Theory methodologies to offer a rationalization of some aspects of agile software development. In a variety of projects where knowledge management forms part of the solution, we have begun to see activities and principles that closely correspond to many aspects of the wide family of agile development methods. We offer reflections on why, as a community, we are attracted to agile methods, and consider why they work.

Paper Nr: 633
Title:

KEYMANTIC: A KEYWORD-BASED SEARCH ENGINE USING STRUCTURAL KNOWLEDGE

Authors:

Francesco Guerra, Mirko Orsini, Claudio Sartori, Sonia Bergamaschi and Antonio Sala

Abstract: Traditional techniques for query formulation require knowledge of the database contents, i.e. which data are stored in the data source and how they are represented. In this paper, we discuss the development of a keyword-based search engine for structured data sources. The idea is to couple the ease of use and flexibility of keyword-based search with metadata extracted from data schemata and extensional knowledge, which together constitute a semantic network of knowledge. By translating keywords into SQL statements, we will develop a search engine that is effective, semantic-based, and applicable also when instances are not continuously available, such as in integrated data sources or in data sources extracted from the deep web.

Paper Nr: 641
Title:

ESPACE: Web-scale Integration One Step At A Time

Authors:

Kajal Claypool, Jeremy Mineweaser, Dan Van Hook, Elke Rundensteiner and Michael Scarito

Abstract: This paper presents ESpace, a community collaboration infrastructure that provides a social, collaborative network allowing its users to harness collective knowledge to promote communities of interest and expertise within the enterprise. ESpace is a prototype for a pay-as-you-go integration framework that supports loosely to tightly integrated resources within the same infrastructure, where loose integration means pulling resources on the web together based on the tag meta-information associated with them. This is but the first step in enabling web-scale pay-as-you-go integration by providing fine-grained analysis and integrating substructures within resources, achieving tighter integration for select resources at the user's behest.

Paper Nr: 52
Title:

GENERIC APPROACH TO AUTOMATIC INDEX UPDATING IN OODBMS

Authors:

Tomasz Kowalski, Radosław Adamus, Kamil Kuliberda and Jacek Wiślicki

Abstract: In this paper, we describe a robust approach to the problem of automatic index updating, i.e. maintaining consistency between data and indices. Introducing object-oriented notions (classes, inheritance, polymorphism, class methods, etc.) in databases allows defining more complex selection predicates; nevertheless, in order to facilitate the selection process through indices, index updating requires substantial revising. Inadequate index maintenance can lead to serious errors in query processing, as has been shown using the example of the Oracle 11g ORDBMS. The authors' work is based on the Stack-Based Architecture (SBA) and has been implemented and tested in the ODRA (Object Database for Rapid Applications development) OODBMS prototype.

Paper Nr: 127
Title:

Semi-Supervised Information Extraction from Variable-Length Web-Page Lists

Authors:

Daniel Nikovski, Alan Esenther and Akihiro Baba

Abstract: We propose two methods for constructing automated programs for extraction of information from a class of web pages that are very common and of high practical significance --- variable-length lists of records with identical structure. Whereas most existing methods would require multiple example instances of the target web page in order to be able to construct extraction rules, our algorithms require only a single example instance. The first method analyzes the document object model (DOM) tree of the web page to identify repeatable structure that includes all of the specified data fields of interest. The second method provides an interactive way of discovering the list node of the DOM tree by visualizing the correspondence between portions of XPath expressions and visual elements in the web page. Both methods construct extraction rules in the form of XPath expressions, facilitating ease of deployment and integration with other information systems.
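
To make the first method concrete, the sketch below (not the authors' implementation) finds a DOM node whose children all share an identical tag structure and prints its XPath; it assumes the lxml library and an invented example page.

    # Locate a candidate "list node": a node whose children share one tag structure.
    from lxml import html

    def shape(node):
        # Coarse structural signature of a subtree: its tag plus its child tags.
        return (node.tag, tuple(child.tag for child in node))

    def find_list_nodes(root, min_repeats=3):
        return [node for node in root.iter()
                if len(node) >= min_repeats
                and len({shape(child) for child in node}) == 1]

    doc = html.fromstring(
        '<ul id="r"><li><a href="/a">A</a></li>'
        '<li><a href="/b">B</a></li><li><a href="/c">C</a></li></ul>')
    for node in find_list_nodes(doc):
        print(doc.getroottree().getpath(node))   # XPath of the repeating list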

Paper Nr: 157
Title:

TOWARDS A COMMON PUBLIC SERVICE INFRASTRUCTURE FOR SWISS UNIVERSITIES

Authors:

Florian Schnabel, Uwe Heck and Eva Bucherer

Abstract: Due to the Bologna Declaration and the corresponding procedures of performance management and output-based funding, universities are undergoing organisational changes both within and across institutions. The need for an appropriate organisational structure and for efficient and effective processes makes support through corresponding IT essential. The IT environment of Swiss universities is currently dominated by a high level of decentralisation and a high degree of proprietary solutions. Economies of scale through joint development or shared services remain untapped. The increasingly essential integration of applications to support university-internal or cross-organizational processes is also hindered. In this paper we propose an approach for a comprehensive service-oriented architecture for Swiss universities to overcome the current situation and to cope with organizational and technical challenges. We further present an application scenario revealing how Swiss universities would benefit from the proposed architecture.

Paper Nr: 200
Title:

AN ARCHITECTURE FOR THE RAPID DEVELOPMENT OF XML-BASED WEB APPLICATIONS

Authors:

José P. Leal and Jorge B. Gonçalves

Abstract: Our research goal is the generation of working web applications from high-level specifications. Based on our experience in using XML transformations for that purpose, we applied this approach to the rapid development of database management applications. The result is an architecture that defines a web application as a set of XML transformations and generates these transformations using second-order transformations from a database schema. We used the Model-View-Controller architectural pattern to assign different roles to transformations, and defined a pipeline of transformations to process an HTTP request. The definition of these transformations is based on a correspondence between data-oriented XML Schema definitions and the Entity-Relationship model. Using this correspondence, we were able to produce transformations that implement database operations, form interface generators and application controllers, as well as the second-order transformations that produce all of them. This paper also includes a description of a RAD system following this architecture that allowed us to perform a critical evaluation of this proposal.

Paper Nr: 211
Title:

ASSESSING DATABASES IN .NET: COMPARING APPROACHES

Authors:

Daniela da Cruz and Pedro R. Henriques

Abstract: Language-Integrated Query (LINQ) recently appeared as the new query language of the .NET framework. This query language, an extension to C# and Visual Basic, allows query expressions to benefit from features previously available only to imperative code: rich metadata, IntelliSense, compile-time syntax checking, and static typing. In this paper, we compare the methods provided by .NET to query databases (LINQ, SQL and Objects). This comparison is made both in terms of performance and in terms of the approach used. To guide the comparison, a running example is used.

Paper Nr: 246
Title:

Automatic Detection of Duplicated Attributes in Ontology

Authors:

Irina Astrova and Arne Koschel

Abstract: Semantic heterogeneity is the ambiguous interpretation of terms describing the meaning of data in heterogeneous data sources such as databases. This is a well-known problem in data integration. A recent solution to this problem is to use ontologies, which is called ontology-based data integration. However, ontologies can contain duplicated attributes, which can lead to improper integration results. This paper proposes a novel approach that analyzes a workload of queries over an ontology to automatically calculate (semantic) distances between attributes, which are then used for duplicate detection.

Paper Nr: 268
Title:

Efficiently Locating Web Services Using A Sequence-based Schema Matching Approach

Authors:

Alsayed Algergawy, Gunter Saake and Eike Schallehn

Abstract: Locating desired Web services has become a challenging research problem due to the vast number of Web services available within organizations and on the Web. This necessitates the development of flexible, effective, and efficient Web service discovery frameworks, in which both the semantic description and the structure information of Web services should be exploited in an efficient manner. This paper presents a flexible and efficient service discovery approach based on the use of the Prüfer encoding method to construct a one-to-one correspondence between Web services and sequence representations. In this paper, we describe and experimentally evaluate our Web service discovery approach.
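
For readers unfamiliar with the encoding, the sketch below shows classical Prüfer encoding of a labeled tree into a sequence; it illustrates only the underlying trick, not the paper's service-specific variant.

    # Prüfer encoding: repeatedly remove the smallest-labeled leaf and record
    # its neighbor; a labeled tree on n nodes maps to a unique (n-2)-sequence.
    from collections import defaultdict

    def prufer_sequence(edges, n):
        adj = defaultdict(set)
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        seq = []
        for _ in range(n - 2):
            leaf = min(node for node in adj if len(adj[node]) == 1)
            seq.append(next(iter(adj[leaf])))     # record the leaf's neighbor
            adj[seq[-1]].discard(leaf)
            del adj[leaf]
        return seq

    # The 4-node star centered at node 1 encodes to [1, 1].
    print(prufer_sequence([(1, 2), (1, 3), (1, 4)], 4))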

Paper Nr: 328
Title:

A FLEXIBLE EVENT-CONDITION-ACTION (ECA) RULE PROCESSING MECHANISM BASED ON A DYNAMICALLY RECONFIGURABLE STRUCTURE

Authors:

Xiang Li, Ying Qiao and Hongan Wang

Abstract: Adding and deleting Event-Condition-Action (ECA) rules, i.e. modifying the processing structures of an active database, is expected to happen on-the-fly and to cause minimal impact on the existing processing of ECA rules. In this paper, we present a flexible ECA rule processing mechanism for active databases. It uses a dynamically reconfigurable structure, called the unit-mail graph (UMG), and a middleware, called the Unit Modification and Management Layer (UMML), to localize the impact of adding and deleting ECA rules so as to support on-the-fly rule modification. The ECA rule processing mechanism can continue to work while the user adds or deletes rules. This enables the active database to react to external events arriving at the system during rule modification. We also use a smart home environment to evaluate our work.

Paper Nr: 477
Title:

AN MDA APPROACH FOR OBJECT-RELATIONAL MAPPING

Authors:

Catalin Strimbei and Marin Fotache

Abstract: This paper reviews several emergent approaches that attempt to capitalize on the SQL data “engineering” standard in current object-relational mapping methodologies. As a particular contribution, we discuss a slightly different OR mapping approach, based on ORDBMS extension mechanisms that allow new data structures to be published as Abstract Data Types (ADTs).

Paper Nr: 483
Title:

INNOVATIVE PROCESS EXECUTION IN SERVICE-ORIENTED ENVIRONMENTS

Authors:

Dirk Habich, Wolfgang Lehner, Steffen Preissler and Hannes Voigt

Abstract: Today's information systems are often built on the foundation of service-oriented environments. Although the fundamental purpose of an information system is the processing of data and information, the service-oriented architecture (SOA) does not treat data as a first-class citizen. Current SOA technologies support neither the explicit modeling of data flows in common business process modeling languages (such as BPMN) nor the usage of specialized data transformation and propagation technologies (for instance, ETL tools) on the process execution layer (BPEL). In this paper, we introduce our data-aware approach from both the execution perspective and the modeling perspective of business processes.

Paper Nr: 630
Title:

TISM, a Tool for Information Systems Management

Authors:

António Trigo, João Barroso and João Varajão

Abstract: The complexity of Information Technology and Information Systems within organizations keeps growing rapidly. As a result, the work of the Chief Information Officer is becoming increasingly difficult, since he has to manage multiple technologies and perform several activities of different natures. In this position paper, we propose the development of a new tool for Chief Information Officers, which will systematize and aggregate information on the enterprise Information Systems Function.

Area 2 - Artificial Intelligence and Decision Support Systems

Full Papers
Paper Nr: 85
Title:

A Self-Learning System for Object Categorization

Authors:

Danil Prokhorov

Abstract: We propose a learning system for object categorization which utilizes information from multiple sensors. The system learns not only prior to its deployment, in a supervised mode, but also in a self-learning mode. A competition-based neural network learning algorithm is used to distinguish between representations of different categories. We illustrate the application of the system with an example of image categorization. A radar guides the selection of candidate images provided by the camera for subsequent analysis by our learning method. Radar information is coupled with navigational information for improved localization of objects during self-learning.

Paper Nr: 88
Title:

A SELF-TUNING OF MEMBERSHIP FUNCTIONS FOR MEDICAL DIAGNOSIS

Authors:

Nuanwan Soonthornphisaj

Abstract: In this paper, a self-tuning of membership functions for fuzzy logic is proposed for medical diagnosis. Our algorithm uses a decision tree as a tool to generate three kinds of membership functions: triangular, bell-shaped and Gaussian. The system can automatically select the form of membership function that provides the best classification result. The advantage of our system is that it does not need an expert to create membership functions for each feature; instead, the system creates various membership functions using a learning algorithm that learns from the training set. In some domains, the user can provide prior knowledge that can be used to enhance the performance of the classifier. However, in the medical domain, we found that some diseases are difficult to diagnose. This would not be a problem if the disease had been completely explored in the medical literature. In order to rule out a patient, we need a domain expert to provide the membership functions for the many attributes obtained from laboratory tests. Since such diseases have not been completely explored, the membership functions provided by the expert might be biased and lead to poor classification performance. The performance of our proposed algorithm has been investigated on two medical data sets. The experimental results show that our approach can effectively enhance classification performance compared to neural networks and traditional fuzzy logic.

Paper Nr: 128
Title:

INSOLVENCY PREDICTION OF IRISH COMPANIES USING BACKPROPAGATION AND FUZZY ARTMAP NEURAL NETWORKS

Authors:

Anatoli Nachev, Borislav Stoyanov and Seamus Hill

Abstract: This study explores experimentally the potential of BPNNs and Fuzzy ARTMAP neural networks to predict the insolvency of Irish firms. We used financial information for Irish companies over a period of six years, preprocessed appropriately for use with neural networks. Prediction results show that, with certain network parameters, the Fuzzy ARTMAP model outperforms BPNN. It also outperforms self-organising feature maps, as reported by other studies that use the same dataset. The accuracy of the predictions was validated by ROC analysis, AUC metrics, and leave-one-out cross-validation.

Paper Nr: 135
Title:

FREQUENT SUBGRAPH-BASED APPROACH FOR CLASSIFYING VIETNAMESE TEXT DOCUMENTS

Authors:

Tu H. Nguyen and Kiem Hoang

Abstract: In this paper we present a simple approach for Vietnamese text classification without word segmentation, based on frequent subgraph mining techniques. A graph-based model, instead of the traditional vector-based model, is used for document representation. The classification model employs structural patterns (subgraphs) and the Dice measure of similarity to identify the class of a document. The method is evaluated on a Vietnamese data set for classification accuracy. Results show that it can outperform the k-NN algorithm (based on vector and hybrid document representations) in terms of accuracy and classification time.
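
As a minimal illustration of the similarity step (the subgraph labels below are invented for the example), the Dice measure scores a document against a class by the overlap of their frequent subgraph sets:

    # Dice similarity between two sets of mined subgraphs.
    def dice(a, b):
        a, b = set(a), set(b)
        return 2 * len(a & b) / (len(a) + len(b)) if a or b else 0.0

    doc_subgraphs = {"g1", "g2", "g3"}           # subgraphs found in the document
    class_subgraphs = {"g1", "g2", "g4"}         # frequent subgraphs of a class
    print(dice(doc_subgraphs, class_subgraphs))  # 0.666...; assign the best-scoring class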

Paper Nr: 173
Title:

RANDOM PROJECTION ENSEMBLE CLASSIFIERS

Authors:

Alon Schclar and Lior Rokach

Abstract: We introduce a novel ensemble model based on random projections. The contribution of using random projections is two-fold. First, the randomness provides the diversity required for the construction of an ensemble model. Second, random projections embed the original set into a space of lower dimension while preserving the dataset's geometrical structure up to a given distortion. This reduces the computational complexity of model construction as well as the complexity of classification. Furthermore, dimensionality reduction removes noisy features from the data and represents the information inherent in the raw data using a small number of features. The noise removal increases the accuracy of the classifier. The proposed scheme was tested using WEKA-based procedures applied to 16 benchmark datasets from the UCI repository.
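
A minimal sketch of the scheme's structure, assuming scikit-learn decision trees as base learners and integer class labels (the original experiments used WEKA):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    class RandomProjectionEnsemble:
        # Each member sees its own Gaussian random projection of the data;
        # the randomness supplies the diversity, the projection the speed-up.
        def __init__(self, n_members=10, target_dim=8, seed=0):
            self.n_members, self.target_dim = n_members, target_dim
            self.rng = np.random.default_rng(seed)

        def fit(self, X, y):
            d = X.shape[1]
            self.P = [self.rng.normal(size=(d, self.target_dim)) / np.sqrt(self.target_dim)
                      for _ in range(self.n_members)]
            self.members = [DecisionTreeClassifier().fit(X @ P, y) for P in self.P]
            return self

        def predict(self, X):
            votes = np.stack([m.predict(X @ P) for m, P in zip(self.members, self.P)])
            # Majority vote over the members' predictions (labels must be ints).
            return np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)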

Paper Nr: 188
Title:

KNOWLEDGE REUSE IN DATA MINING PROJECTS AND ITS PRACTICAL APPLICATION

Authors:

Rodrigo Cunha, Paulo Adeodato and Silvio Romero De Lemos Meira

Abstract: The objective of this paper is to provide an integrated environment for knowledge reuse in KDD, preventing the recurrence of known errors and reinforcing project successes based on previous experience. It combines methodologies from project management, data warehousing, data mining and knowledge representation. Unlike purely algorithmic papers, this one focuses on performance metrics used for managerial purposes, such as the time taken for solution development and the number of files not automatically managed, while preserving equivalent performance on the technical solution quality metrics. This environment has been validated with metadata collected from previous KDD projects developed and deployed for real-world applications by the development team members. The case study carried out on actual contracted projects has shown that this environment assesses the risk of failure for new projects, controls and documents the entire KDD project development process, and helps in understanding the conditions that lead KDD projects to success or failure.

Paper Nr: 231
Title:

Enhancing Text Clustering Performance Using Semantic Similarity

Authors:

Walaa Gad and Mohamed Kamel

Abstract: Clustering text documents can be challenging due to the complex linguistic properties of text. Most clustering techniques are based on the traditional bag-of-words document representation. In such a representation, ambiguity, synonymy and semantic similarities may not be captured using traditional text mining techniques that are based on word and/or phrase frequencies in the text. In this paper, we propose a semantic-similarity-based model to capture the semantics of the text. The proposed model, in conjunction with a lexical ontology, solves the synonym and hypernym problems. It utilizes WordNet as an ontology and uses the adapted Lesk algorithm to examine and extract the relationships between terms. The model reflects these relationships through semantic weights added to the term frequency weight to represent the semantic similarity between terms. Experiments using the proposed model in text clustering are conducted. The obtained results show promising performance improvements compared to the traditional vector space model as well as other existing methods that include semantic similarity measures in text clustering.
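
The sketch below illustrates the flavor of such semantic weighting; the combination rule and the use of only the first synset are our simplifications, not the paper's exact adapted-Lesk procedure. It assumes NLTK with the WordNet corpus installed.

    # Boost a term's frequency weight by its best WordNet similarity to the
    # other terms in the document (Wu-Palmer similarity as a stand-in).
    from nltk.corpus import wordnet as wn

    def semantic_weight(term, others):
        syns = wn.synsets(term)
        if not syns:
            return 0.0
        sims = [syns[0].wup_similarity(s) or 0.0
                for other in others for s in wn.synsets(other)]
        return max(sims, default=0.0)

    tf = {"car": 3, "automobile": 1, "bank": 2}
    weighted = {t: f * (1.0 + semantic_weight(t, [u for u in tf if u != t]))
                for t, f in tf.items()}
    print(weighted)   # "car" and "automobile" reinforce each other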

Paper Nr: 233
Title:

Stereo Matching Using Synchronous Hopfield Neural Network

Authors:

Te-Hsiu Sun

Abstract: Deriving depth information has been an important issue in computer vision. In this area, stereo vision is an important technique for 3D information acquisition. This paper presents a scanline-based stereo matching technique using synchronous Hopfield neural networks (SHNN). Feature points are extracted and selected using the Sobel operator and a user-defined threshold for a pair of scanned images. Then, the scanline-based stereo matching problem is formulated as an optimization task in which an energy function, including dissimilarity, continuity, disparity and uniqueness mapping properties, is minimized. Finally, incorrect matches are eliminated by applying a false-target removal rule. The proposed method is verified in an experiment using several commonly used stereo images. The experimental results show that the proposed method effectively solves the stereo matching problem and is applicable to various areas.

Paper Nr: 245
Title:

Monotonic Monitoring of Discrete-Event Systems with Uncertain Temporal Observations

Authors:

Marina Zanella and Gianfranco Lamperti

Abstract: In discrete-event system monitoring, the observation is fragmented over time and a set of candidate diagnoses is output at the reception of each fragment (so as to allow for possible control and recovery actions). When the observation is uncertain (typically, a DAG with partial temporal ordering) a problem arises about the significance of the monitoring output: two sets of diagnoses, relevant to two consecutive observation fragments, may be unrelated to one another, and, even worse, they may be unrelated to the actual diagnosis. To cope with this problem, the notion of monotonic monitoring is introduced, which is supported by specific constraints on the fragmentation of the uncertain temporal observation, leading to the notion of stratification. The paper shows that only under stratified observations can significant monitoring results be guaranteed.

Paper Nr: 250
Title:

A SERVICE COMPOSITION FRAMEWORK FOR DECISION MAKING UNDER UNCERTAINTY

Authors:

Malak Al-Nory, Alexander Brodsky and Hadon Nash

Abstract: Proposed and developed is a service composition framework for decision-making under uncertainty, which is applicable to stochastic optimization of supply chains. Also developed is a library of modeling components which include Scenario, Random Environment, and Stochastic Service. Service models are classes in the Java programming language extended with decision variables, assertions, and business objective constructs. The constructor of a stochastic service formulates a recourse stochastic program and finds the optimal instantiation of real values into the service initial and corrective decision variables leading to the optimal business objective. The optimization is not done by repeated simulation runs, but rather by automatic compilation of the simulation model in Java into a mathematical programming model in AMPL and solving it using an external solver.

Paper Nr: 295
Title:

A MULTI-CRITERIA RESOURCE SELECTION METHOD FOR SOFTWARE PROJECTS USING FUZZY LOGIC

Authors:

Daniel A. Callegari and Ricardo M. Bastos

Abstract: When planning a software project, we must assign resources to tasks. Resource selection is a fundamental step in resource allocation, since we first need to find the most suitable candidates for each task before deciding who will actually perform it. In order to rank available resources, we have to evaluate their skills and define the corresponding selection criteria for the tasks. Although many approaches represent skill levels by means of ordinal scales and define selection criteria using binary operations, these choices imply some limitations: pure mathematical approaches are difficult to model and suffer from a partial loss of meaning in terms of knowledge representation. Fuzzy Logic, as an extension to classical sets and logic, uses linguistic variables and a continuous range of truth values for decisions and set membership. It allows handling the inherent uncertainties in this process while hiding the complexity from the final user. In this paper we show how Fuzzy Logic can be applied to the resource selection problem. A prototype was built to demonstrate and evaluate the results.

Paper Nr: 382
Title:

AN OPTIMIZED HYBRID KOHONEN NEURAL NETWORK FOR AMBIGUITY DETECTION IN CLUSTER ANALYSIS USING SIMULATED ANNEALING

Authors:

Ehsan Mohebi

Abstract: One of the popular tools in the exploratory phase of data mining and pattern recognition is the Kohonen Self-Organizing Map (SOM). The SOM maps the input space onto a 2-dimensional grid and forms clusters. Recent experiments have shown that, to capture the ambiguity involved in cluster analysis, it is not necessary to have crisp boundaries in some clustering operations. In this paper, to overcome this ambiguity, a combination of rough set theory and simulated annealing is proposed and applied to the output grid of the SOM. Experiments show that the proposed two-stage algorithm, which first uses the SOM to produce the prototypes and then applies rough sets and SA to assign the overlapped data to the true clusters they belong to, outperforms the corresponding crisp clustering algorithms (i.e. I-SOM) and reduces the errors.

Paper Nr: 441
Title:

INTERACTIVE QUALITY ANALYSIS IN THE AUTOMOTIVE INDUSTRY: Concept and Design of an Interactive, Web-based Data Mining Application

Authors:

Steffen Fritzsche, Markus Mueller and Carsten Lanquillon

Abstract: In this paper we present an interactive, web-based data mining application that supports quality analysis in the automotive industry. Our tool is designed to help automotive engineers in their task of identifying the root cause of quality issues. Knowing what exactly caused a problem and identifying vehicles that are most likely to be affected by the issue, helps in planning and implementing effective service actions. We show how data mining can be applied in the given application domain, point out the key role of interactivity and propose an appropriate software architecture.

Paper Nr: 466
Title:

NARFO ALGORITHM: MINING NON-REDUNDANT AND GENERALIZED ASSOCIATION RULES BASED ON FUZZY ONTOLOGIES

Authors:

Rafael G. Miani, Marilde Terezinha Prado Santos, Cristiane A. Yaguinuma and Mauro Biajiz

Abstract: Traditional approaches for mining generalized association rules are based only on database contents and focus on exact matches among items. However, in many applications, the use of some background knowledge, such as ontologies, can enhance the discovery process and generate semantically richer rules. This paper therefore proposes the NARFO algorithm, a new algorithm for mining non-redundant and generalized association rules based on fuzzy ontologies. A fuzzy ontology is used as background knowledge to aid the discovery process and the generation of rules. One contribution of this work is the generalization of non-frequent itemsets, which helps to extract important and meaningful knowledge. The NARFO algorithm also contributes at the post-processing stage with its generalization and redundancy treatment. Our experiments showed that the number of rules was reduced considerably, without redundancy, achieving a 49.45% reduction for a very low minimum support value (0.05) in comparison with the XSSDM algorithm.

Paper Nr: 516
Title:

AUTOMATED CONSTRUCTION OF PROCESS GOAL TREES FROM EPC-MODELS TO FACILITATE EXTRACTION OF PROCESS PATTERNS

Authors:

Andreas Bögl, Michael Schrefl, Gustav Pomberger and Norbert Weber

Abstract: A system that enables reuse of process solutions should be able to retrieve “common” or “best practice” pattern solutions (common modelling practices) from existing process descriptions for a certain business goal. A manual extraction of common modelling practices is labour-intensive, tedious and cumbersome. This paper presents an approach for an automated extraction of process goals from Event-driven Process Chains (EPC) and its annotation to EPC functions and events. In order to facilitate goal reasoning for the identification of common modelling practices an algorithm (G-Tree-Construction) is proposed that constructs a hierarchical goal tree.

Short Papers
Paper Nr: 58
Title:

AUTOMATIC INFORMATION PROCESSING AND UNDERSTANDING IN COGNITIVE BUSINESS SYSTEMS

Authors:

Lidia Ogiela, Ryszard Tadeusiewicz and Marek Ogiela

Abstract: This paper brings the concept of a new generation of information systems, automatic understanding systems (AUS), to the attention of the computer science community as a new possibility for systems analysis and design. The novelty of this idea lies in applying the method of automatic understanding, previously used in medical image analysis, classification and interpretation, to the more general and needed area of systems analysis. The AUS approach is, in essence, different from other approaches such as those based on neural networks, pattern analysis, image interpretation or machine learning. AUS enables the determination of the meaning of the analysed data, both numeric and descriptive. The cognitive methods on which the AUS concept and construct are based have roots in the psychological and neurophysiological processes of understanding and describing analysed data as they take place in the human brain.

Paper Nr: 60
Title:

Detecting domestic violence

Authors:

Paul Elzinga, Guido Dedene, Stijn Viaene and Jonas Poelmans

Abstract: Over 90% of the case data from police inquiries is stored as unstructured text in police databases. We use the combination of Formal Concept Analysis and Emergent Self Organizing Maps for exploring a dataset of unstructured police reports out of the Amsterdam-Amstelland police region in the Netherlands. In this paper, we specifically aim at making the reader familiar with how we used these two tools for browsing the dataset and how we discovered useful patterns for labelling cases as domestic or as non-domestic violence.

Paper Nr: 68
Title:

USING QUALITY COSTS IN A MULTI-AGENT SYSTEM FOR AN AIRLINE OPERATIONS CONTROL

Authors:

Antonio M. Castro and Eugénio Oliveira

Abstract: The Airline Operations Control Centre (AOCC) tries to solve unexpected problems that might occur during airline operations. Problems related to aircraft, crewmembers and passengers are common, and the actions towards the solution of these problems are usually known as operations recovery. Usually, the AOCC tries to minimize operational costs while satisfying all the required rules. In this paper we present the implementation of a distributed Multi-Agent System (MAS) representing the existing roles in an AOCC. This MAS has several specialized software agents implementing different algorithms, competing to find the best solution for each problem; the solutions account not only for operational costs but also for quality costs, so that passenger satisfaction can be considered in the final decision. We present a real case study where a crew recovery problem is solved. We show that it is possible to find valid solutions with better passenger satisfaction and, under certain conditions, without significantly increasing operational costs.

Paper Nr: 120
Title:

Frequency Assignment Optimization using the Swarm Intelligence Multi-agent Based Algorithm (SIMBA)

Authors:

Grant B. O'Reilly

Abstract: The swarm intelligence multi-agent based algorithm (SIMBA) is presented in this paper. SIMBA utilizes swarm intelligence and a multi-agent system (MAS) to optimize the frequency assignment problem (FAP), considering both local and global (i.e. collective) solutions in the optimization process. Stigmergy single cell optimization (SSCO) is also used by the individual agents in SIMBA. SSCO enables the agents to recognize interference patterns in the frequency assignment structure being optimized and to augment it with frequency selections that minimize interference. The changing configurations of the frequency assignment structure act as a source of information that aids the agents when making further decisions. Due to the increasing demand for cellular communication services and the limited available frequency spectrum, optimal frequency assignment is necessary. SIMBA was used to optimize the fixed-spectrum frequency assignment problem (FS-FAP) in cellular radio networks. The results produced by SIMBA were benchmarked against the COST 259 Siemens scenarios. The frequency assignment solutions produced by SIMBA were also implemented in a commercial cellular radio network, and the results are presented.
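
For orientation, a greedy baseline for the fixed-spectrum problem is sketched below; SIMBA's agents iteratively improve assignments of exactly this shape. The cells, frequencies and interference weights are invented for the example.

    # Greedy FS-FAP baseline: each cell takes the frequency that currently
    # adds the least co-channel interference with its neighbors.
    def greedy_fap(cells, frequencies, interference):
        assignment = {}
        for cell in cells:
            def cost(f):
                return sum(w for (a, b), w in interference.items()
                           if cell in (a, b)
                           and assignment.get(b if a == cell else a) == f)
            assignment[cell] = min(frequencies, key=cost)
        return assignment

    inter = {("c1", "c2"): 5.0, ("c2", "c3"): 3.0, ("c1", "c3"): 1.0}
    print(greedy_fap(["c1", "c2", "c3"], [1, 2], inter))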

Paper Nr: 121
Title:

A new heuristic function in Ant-Miner approach

Authors:

Urszula Boryczka and Jan Kozak

Abstract: In this paper, a novel rule discovery system utilizing Ant Colony Optimization (ACO) is presented. ACO is a metaheuristic inspired by the behavior of real ants, which search for optimal solutions by considering both local heuristics and previous knowledge, observed through pheromone changes. In our approach, we aim to ensure the good performance of Ant-Miner by applying new versions of the heuristic function in the main rule. We emphasize the role of the heuristic function by analyzing the influence of different versions of these functions on the performance of Ant-Miner. The comparative study is carried out using five data sets from the UCI Machine Learning repository.

Paper Nr: 129
Title:

FORMULATING ASPECTS OF PAYPAL IN THE LOGIC FRAMEWORK OF GBMF

Authors:

Min Li and Christopher Hogger

Abstract: Logic-based modelling methods can benefit business organizations by supporting the construction of models that offer flexible knowledge representation backed by correct and effective inference. How best to apply logic-based formalization to informal/semi-formal business modelling remains a continuing research issue. In this paper, we formulate aspects of the general business specification of PayPal in logic programming by applying the logic-based GBMF, a declarative, context-independent, implementable and highly expressive framework for modelling high-level aspects of business. In particular, we introduce the primary PayPal business concepts and relations; specify simple but essential PayPal business processes associated with a knowledge base; and set core business rules and controls to simulate the PayPal case in a fully automatic manner. This modelling method offers the advantages of general-purpose expressiveness and well-understood execution regimes, avoiding the need for a special-purpose engine supporting a specialized modelling language.

Paper Nr: 133
Title:

AN AGENT-BASED SYSTEM FOR HEALTHCARE PROCESS MANAGEMENT

Authors:

Bian Wu, Maggie M. Wang and Hongmin Yun

Abstract: An effective approach to healthcare process management is the key to the delivery of high-quality healthcare services. An agent-based and process-oriented system is presented in this study to facilitate dynamic and interactive processes in the healthcare environment. The system is developed in three layers: the agent layer for healthcare process management, the database layer for the maintenance of medical records and knowledge, and the interface layer for human-computer interaction. The treatment of primary open-angle glaucoma is used as an example to demonstrate the effectiveness of the approach.

Paper Nr: 136
Title:

AOI BASED NEUROFUZZY SYSTEM TO EVALUATE SOLDER JOINT QUALITY

Authors:

Girolamo Fornarelli, Gioacchino Brunetti, Domenico Maiullari, Giuseppe Acciani and Antonio Giaquinto

Abstract: Surface Mount Technology is extensively used in the production of Printed Circuit Boards due to the high density of electronic device integration. In such a production process, several defects can occur in the final electronic components, compromising their correct operation. In this paper, a neurofuzzy solution for processing information from an automatic optical inspection system is proposed. The designed solution provides a Quality Index for a solder joint by reproducing the modus operandi of an expert and making it automatic. Moreover, the solution presents some attractive advantages: a complex acquisition system is not needed, reducing equipment costs and shifting the assessment of a solder joint onto the fuzzy components. Finally, the typically low computational cost of fuzzy systems can satisfy the tight time constraints of in-line inspection in some industrial production processes.

Paper Nr: 145
Title:

AN ORDER CLUSTERING SYSTEM USING ART2 NEURAL NETWORK AND PARTICLE SWARM OPTIMIZATION METHODN

Authors:

R. J. Kuo, T. W. Huang, M. J. Wang and Tung-Lai Hu

Abstract: Setting up a surface mount technology (SMT) production system is quite time-consuming for industrial personal computers (PCs) because of the high level of customization. Therefore, this study proposes a novel two-stage clustering algorithm for grouping orders together before scheduling, in order to reduce the SMT setup time. The first stage uses the adaptive resonance theory 2 (ART2) neural network to find the number of clusters and then feeds the results to the second stage, which uses the particle swarm K-means optimization (PSKO) algorithm. An internationally well-known industrial PC manufacturer provided the evaluation data. The results show that the proposed clustering method outperforms three other clustering algorithms. Through order clustering, scheduling products belonging to the same cluster together can reduce the production time and the machine idle time.

Paper Nr: 162
Title:

Using UML Class Diagram as a Knowledge Engineering Tool

Authors:

Thomas Raimbault, Stéphane Loiseau and David Genest

Abstract: The UML class diagram is the de facto standard, including in Knowledge Engineering, for modeling the structural knowledge of systems. Attaching importance to visual representation, and building on previous work in which we gave a logically defined extension of the UML class diagram to represent queries and constraints in the UML visual environment, we present here how to use the conceptual graphs model to answer queries and to check constraints in concrete terms.

Paper Nr: 167
Title:

K-ANNOTATIONS, An Approach for Conceptual Knowledge Implementation using Metadata Annotations

Authors:

Eduardo S. Estima de Castro, Roberto Tom Price and Mara Abel

Abstract: A number of Knowledge Engineering methodologies have been proposed during the last decades. These methodologies use different languages for knowledge modelling. As most of these languages are based on logic, knowledge models defined using them cannot be easily converted to the Object-Oriented (OO) paradigm. This poses a relevant problem for the development phase of knowledge system (KS) projects: several complex knowledge systems are developed using OO languages. So, even if the conceptual model can be modelled using the logical paradigm, it is important to provide a standard knowledge representation within the OO paradigm. This paper introduces k-annotations, an approach for conceptual knowledge implementation using metadata annotations and the aspect-oriented paradigm. The proposed approach allows the development of the conceptual model using the OO paradigm and establishes a standard path to implement this model. The main goal of the approach is to provide ways to reuse both the knowledge design and the related programming code of the model based on a single model representation.

Paper Nr: 170
Title:

ANT PAGERANK ALGORITHM

Authors:

Mahmoud Z. Abdo, Manal Ahmed Ismail and Mohamed Ebraheem Eladawy

Abstract: The amount of global information on the World Wide Web is growing at an incredible rate, and search engines return millions of results. The ranking of pages in search engines is therefore very important. One of the basic ranking algorithms is the PageRank algorithm. This paper proposes an enhancement of the PageRank algorithm, based on the ant algorithm, to speed up the computational process. On average, the technique yields about 7.5 out of ten pages relevant to the query topic, and the total time is reduced by 19.9%.
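
For context, the baseline computation being accelerated is the PageRank power iteration; a compact version is sketched below (the ant-based enhancement itself is not reproduced, and the link matrix is invented):

    import numpy as np

    def pagerank(adj, damping=0.85, tol=1e-9):
        # adj[i, j] = 1 if page i links to page j.
        n = adj.shape[0]
        out = adj.sum(axis=1, keepdims=True)
        out[out == 0] = 1                   # crude guard against dangling pages
        M = (adj / out).T                   # column-stochastic transition matrix
        r = np.full(n, 1.0 / n)
        while True:
            r_next = (1 - damping) / n + damping * (M @ r)
            if np.abs(r_next - r).sum() < tol:
                return r_next
            r = r_next

    links = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [0, 1, 0]], dtype=float)
    print(pagerank(links))                  # stationary importance of each page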

Paper Nr: 176
Title:

STUDY ON IMAGE CLASSIFICATION BASED ON SVM AND THE FUSION OF MULTIPLE FEATURES

Authors:

Dequan Zheng, Tiejun Zhao, Sheng Li and Yufeng Li

Abstract: In this paper, an adaptive feature-weight-adjusted image categorization algorithm is proposed, based on SVM and the fusion of multiple features. First, a classifier for each feature is constructed separately and the weight coefficient of each feature is learned automatically from the training data set; finally, a composite classifier is constructed by combining the separate classifiers with the corresponding weight coefficients. The experimental results show that our scheme improves the performance of image categorization and has adaptive ability compared with the general approach. Moreover, the scheme has a certain robustness because it avoids the impact of the varying dimensions of each feature.

Paper Nr: 183
Title:

A FUZZY-GUIDED GENETIC ALGORITHM FOR QUALITY ENHANCEMENT IN THE SUPPLY CHAIN

Authors:

Cassandra Tang and H.C.W Lau

Abstract: In response to globalization and fierce competition, manufacturers increasingly face the challenge of demanding customers who seek high-quality, low-cost products, which implicitly calls for improving product quality in a cost-effective way. Traditional methods focus on the optimization of specific processes for quality enhancement instead of emphasizing organizational collaboration to ensure qualitative performance. This paper introduces an artificial intelligence (AI) approach to attain quality enhancement by automating the selection of process parameters within the supply chain. The originality of this research lies in providing an optimal configuration of process parameters along the supply chain and delivering qualified outputs to raise customer satisfaction.

Paper Nr: 186
Title:

OPTIMUM DCT COMPRESSION OF MEDICAL IMAGES USING NEURAL NETWORKS

Authors:

Adnan Khashman and Kamil Dimililer

Abstract: Medical imaging requires the storage of large quantities of digitized data; efficient storage and transmission of medical images in telemedicine is therefore of utmost importance. Due to constrained bandwidth and storage capacity, a medical image must be compressed before transmission or storage. An ideal image compression system must yield high-quality compressed images with a high compression ratio; this can be achieved using DCT-based image compression. However, the contents of the image affect the choice of an optimum compression ratio. In this paper, a neural network is trained to relate x-ray image contents to their optimum compression ratio. Once trained, the optimum DCT compression ratio of an x-ray image can be chosen by presenting the image to the network. Experimental results suggest that our proposed system can be used efficiently to compress x-rays while maintaining high image quality.
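
A sketch of the compression step at a given ratio follows; in the paper's system, the trained network supplies the ratio per image, which here is just a parameter (scipy's DCT routines and a random stand-in image are assumed):

    import numpy as np
    from scipy.fftpack import dct, idct

    def dct2(x):  return dct(dct(x, axis=0, norm='ortho'), axis=1, norm='ortho')
    def idct2(x): return idct(idct(x, axis=0, norm='ortho'), axis=1, norm='ortho')

    def compress(image, keep=0.10):
        # Zero all but the largest `keep` fraction of DCT coefficients.
        coeffs = dct2(image.astype(float))
        thresh = np.quantile(np.abs(coeffs), 1.0 - keep)
        coeffs[np.abs(coeffs) < thresh] = 0.0
        return idct2(coeffs)

    img = np.random.rand(64, 64)            # stand-in for an x-ray block
    rec = compress(img, keep=0.10)
    print(np.mean((img - rec) ** 2))        # reconstruction error at this ratio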

Paper Nr: 210
Title:

A Mining Framework to detect non-technical losses in Power Utilities

Authors:

Felix Biscarri, Ignacio Monedero, Carlos León de Mora, Juan Ignacio Guerrero, Jesús Biscarri and Rocío Millán

Abstract: This paper deals with the characterization of customers of power companies in order to detect Non-Technical Losses (NTL) in consumption. A new framework is presented to find relevant knowledge about the particular characteristics of electric power customers. The authors use two innovative statistical estimators to weight the variability and the trend of customer consumption. The final classification model is presented as a rule set, based on discovering association rules in the data. The work is illustrated by a case study on a real database.
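
The sketch below conveys the flavor of such estimators with two textbook stand-ins, the coefficient of variation for variability and a least-squares slope for trend; the paper's own estimators differ, and the consumption figures are invented:

    import numpy as np

    def variability(kwh):
        return np.std(kwh) / np.mean(kwh)        # coefficient of variation

    def trend(kwh):
        slope, _ = np.polyfit(np.arange(len(kwh)), kwh, 1)
        return slope                             # kWh change per period

    monthly = np.array([310.0, 295.0, 60.0, 55.0, 58.0, 62.0])  # suspicious drop
    print(variability(monthly), trend(monthly))  # high variability, negative trend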

Paper Nr: 216
Title:

Intelligent Surveillance for Trajectory Analysis

Authors:

Javier Alonso Albusac Jiménez, José Jesus Castro-Schez, Lorenzo M. López-López, David Vallejo and Luis Jiménez Linares

Abstract: Recently, there has been growing interest in the development and deployment of intelligent surveillance systems capable of detecting and analyzing simple and complex events that take place in scenes monitored by cameras. Within this context, the use of expert knowledge may offer a realistic solution when designing a surveillance system. In this paper, we briefly describe the architecture of an intelligent surveillance system based on normality components and expert knowledge. These components specify how a certain object must ideally behave according to one concept. A specific normality component that analyzes the trajectories followed by objects is studied in depth in order to analyze behaviors in an outdoor environment. The analysis of trajectories in the surveillance context is an interesting issue because any moving object always has a goal in an environment, and it usually moves towards one destination to achieve it.

Paper Nr: 234
Title:

USING GRA FOR 2D INVARIANT OBJECT RECOGNITION

Authors:

Te-Hsiu Sun, C.H. Tang, J.C. Liu and Fang-Chih Tien

Abstract: Invariant features are vital to the domain of pattern recognition. This research develops a vision-based invariant recognizer for 2D objects. We present a recognition method that adopts the KRA invariant feature extractor and uses grey relational analysis (GRA). The feature extraction derives translation-, rotation-, and scaling-free features from the sequential boundary, described by its K-curvature. Our work represents the object profile with the K-curvature to obtain the position-invariant property; the transformation of autocorrelation then ensures the orientation-invariant property. Experiments also reveal that the proposed method, with either the GRA or MD matching method, offers distinctiveness and effectiveness for part recognition.
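
A compact rendering of the K-curvature idea is sketched below (our reading of the descriptor, not the authors' code): the angle at each boundary point between the vectors to its k-th predecessor and successor, which is invariant to translation and rotation:

    import numpy as np

    def k_curvature(boundary, k=3):
        pts = np.asarray(boundary, dtype=float)
        n = len(pts)
        angles = np.empty(n)
        for i in range(n):
            v1 = pts[(i - k) % n] - pts[i]
            v2 = pts[(i + k) % n] - pts[i]
            c = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
            angles[i] = np.arccos(np.clip(c, -1.0, 1.0))
        return angles    # per-point curvature profile of the closed contour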

Paper Nr: 244
Title:

AN INVESTIGATION INTO DYNAMIC CUSTOMER REQUIREMENT USING COMPUTATIONAL INTELLIGENCE

Authors:

Yih T. Chong and Chun-Hsien Chen

Abstract: The twenty-first century is marked by the fast evolution of customer tastes and needs. Research has shown that customer requirements can vary in the time between product conceptualisation and market introduction. In markets characterized by fast-changing consumer needs, the products generated might often not fit the consumer needs as the companies originally expected. This paper advocates the proactive management and analysis of dynamic customer requirements in a bid to lower the risk inherent in developing products for fast-shifting markets. A customer requirements analysis and forecast (CRAF) system that can address this issue is introduced in this paper. Computational intelligence methodologies, viz. artificial immune systems and artificial neural networks, are found to be potential techniques for handling and analysing dynamic customer requirements. The investigation aims to support product development functions in the pursuit of generating products for near-future markets.

Paper Nr: 247
Title:

The Role of Data Mining Techniques in Emergency Management

Authors:

Ning Chen and An Chen

Abstract: Emergency management is becoming more and more attractive in both theory and practice due to the frequently occurring incidents in the world. The objective of emergency management is to make optimal decisions to decrease or diminish harm caused by incidents. Nowadays the overwhelming amount of information leads to a great need of effective data analysis for the purpose of well informed decision. The potential of data mining has been demonstrated through the success of decision-making module in present-day emergency management systems. In this paper, we review advanced data mining techniques applied in emergency management and indicate some promising future research directions.

Paper Nr: 266
Title:

A decision support system for multi-plant assembly sequence planning using a pso approach

Authors:

Yuan-Jye Tseng, Feng-Yi Huang and Jian-Yu Chen

Abstract: In a multi-plant collaborative manufacturing system in a global logistics chain, a product can be manufactured and assembled at different plants located at various locations. In this research, a decision support system for multi-plant assembly sequence planning is presented. The multi-plant assembly sequence planning model integrates two tasks, assembly sequence planning and plant assignment. In assembly sequence planning, the components and assembly operations are sequenced according to the operational constraints and precedence constraints to achieve assembly cost objectives. In plant assignment, the components and assembly operations are assigned to the suitable plants under the constraints of plant capabilities to achieve multi-plant cost objectives. A particle swarm optimization (PSO) solution approach is presented by encoding a particle using a position matrix defined by the numbers of components and plants. The PSO algorithm simultaneously performs assembly sequence planning and plant assignment with an objective of minimizing the total of assembly operational costs and multi-plant costs. The main contribution lies in the new multi-plant assembly sequence planning model and the new PSO solution method. The test results show that the presented method is feasible and efficient for solving the multi-plant assembly sequence planning problem. In this paper, an example product is tested and illustrated.

Paper Nr: 294
Title:

TERM WEIGHTING: NOVEL FUZZY LOGIC BASED METHOD VS. CLASSICAL TF-IDF METHOD FOR WEB INFORMATION EXTRACTION

Authors:

Jorge Ropero, Ariel Gomez, Alejandro Carrasco Muñoz and Carlos León de Mora

Abstract: Solving the term weighting problem is one of the most important tasks in Information Retrieval and Information Extraction. Typically, the TF-IDF method has been widely used to determine the weight of a term. In this paper, we propose a novel alternative method based on fuzzy logic. The main advantage of the proposed method is that it obtains better results, especially in terms of extracting not only the most suitable information but also related information. This method will be used in the design of a Web Intelligent Agent which will soon start to work on the University of Seville web page.
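For reference, the classical TF-IDF baseline the paper compares against can be stated in a few lines. This is the standard formulation, not the authors' code or their fuzzy-logic alternative.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Classical TF-IDF weights for a list of tokenized documents.
    Returns one {term: weight} dict per document."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))   # document frequency
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (tf[t] / len(doc)) * math.log(n / df[t])
                        for t in tf})
    return weights
```

For example, `tf_idf([["term", "weighting"], ["term", "fuzzy"]])` gives "term" a weight of zero (it occurs in every document) while the discriminating tokens receive positive weights.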

Paper Nr: 297
Title:

DECISION SUPPORT SYSTEM FOR CLASSIFICATION OF NATURAL RISK IN MARITIME CONSTRUCTION

Authors:

Marco García, Andrés Alonso Quintanilla, Amelia Bilbao Terol and Alfredo Alguero

Abstract: The objective of this paper is the prevention of workplace hazards in maritime works – ports, drilling and others – that may arise from the natural surroundings: tides, wind, visibility, rain and so on. On the basis of both historical and predicted data for certain variables, a system has been designed that uses data mining techniques to provide prior decision-making support as to whether to execute a given work task on a particular day. The system also yields a numerical evaluation of the risk of performing the activity according to the additional circumstances affecting it: the number of workers and the machinery involved, the estimated monetary cost of an accident and so on. The computer tool used as a framework is powerful and versatile, allowing the user to define the activities involved in the work, each with the variables and types deemed suitable. Each variable can be fed data directly by the user or automatically from text files, tables, chromatic maps on web sites, data loggers, etc. The tool can also define alternative models for risk prognosis based on functional formulas of considerable complexity.

Paper Nr: 300
Title:

Building tailored ontologies from very large knowledge resources

Authors:

Victoria Nebot and Rafael Berlanga Llavori

Abstract: Nowadays, very large domain knowledge resources are being developed in domains like Biomedicine. Users and applications can benefit enormously from these repositories in very different tasks, such as visualization, vocabulary homogenizing and classification. However, due to their large size and lack of formal semantics, they cannot be properly managed and exploited. Instead, it is necessary to derive small and useful logic-based ontologies from these large knowledge resources so that they become manageable and the user can benefit from the encoded semantics. In this work we present a novel framework for efficiently indexing and generating ontologies according to user requirements. Moreover, the generated ontologies are encoded using OWL logic-based axioms so that they are provided with reasoning capabilities. The framework relies on an interval labeling scheme that efficiently manages the transitive relationships present in the domain knowledge resources. We have evaluated the proposed framework over the Unified Medical Language System (UMLS). Results show very good performance and scalability, demonstrating the applicability of the proposed framework in real scenarios.

Paper Nr: 315
Title:

A PROJECTION-BASED HYBRID SEQUENTIAL PATTERNS MINING ALGORITHM

Authors:

Chichang Jou

Abstract: Sequential pattern mining finds frequently occurring patterns of item sequences from the serial orders of items in a transaction database. The sets of frequent hybrid sequential patterns obtained by previous research are either incomplete or do not scale with growing database sizes. We design and implement a Projection-based Hybrid Sequential PAttern Mining algorithm, PHSPAM, to remedy these problems. PHSPAM first builds a supplemented frequent one-sequence itemset to collect items that may appear in frequent hybrid sequential patterns. The mining procedure is then performed recursively, in the pattern-growth manner, calculating the support of patterns through projected position arrays, projected support arrays, and projected databases. We compare the results and performance of PHSPAM with those of other hybrid sequential pattern mining algorithms, GFP2 and CHSPAM.
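The abstract does not spell out PHSPAM's projected arrays, so the sketch below illustrates only the general projection-based, pattern-growth recursion (in the style of PrefixSpan) on which such algorithms build; sequences are simplified to lists of single items.

```python
from collections import defaultdict

def prefixspan(db, min_sup, prefix=None, patterns=None):
    """Grow frequent sequential patterns recursively from projected
    databases. db is a list of item sequences (lists)."""
    if prefix is None:
        prefix, patterns = [], []
    counts = defaultdict(int)                  # support of each item in db
    for seq in db:
        for item in set(seq):
            counts[item] += 1
    for item, sup in counts.items():
        if sup < min_sup:
            continue
        new_prefix = prefix + [item]
        patterns.append((new_prefix, sup))
        # project: keep the suffix after the first occurrence of the item
        projected = []
        for seq in db:
            if item in seq:
                suffix = seq[seq.index(item) + 1:]
                if suffix:
                    projected.append(suffix)
        prefixspan(projected, min_sup, new_prefix, patterns)
    return patterns
```

For instance, `prefixspan([["a", "b", "c"], ["a", "c"], ["b", "c"]], min_sup=2)` yields the frequent patterns ["a"], ["a", "c"], ["b"], ["b", "c"] and ["c"] with their supports.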

Paper Nr: 316
Title:

The Signing of a Professional Athlete: Reducing Uncertainty with a Weighted Mean Hemimetric for Phi-Fuzzy Subsets

Authors:

Julio Rojas-Mora and Jaime Gil-Lafuente

Abstract: In this paper we present a tool to help reduce the uncertainty present in the decision-making process associated with the selection and hiring of a professional athlete. A weighted mean hemimetric for Phi-fuzzy subsets with trapezoidal fuzzy numbers (TrFN) as their elements allows candidates to be compared with the “ideal” player that the technical body of a team believes should be hired.
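The paper's exact hemimetric is not given in the abstract; the following sketch shows one plausible construction under that reading: an asymmetric, coordinate-wise gap between trapezoidal fuzzy numbers, averaged over attributes with user-supplied weights. All names are illustrative.

```python
def trfn_shortfall(x, y):
    """One-sided (asymmetric) gap between trapezoidal fuzzy numbers,
    each given as (a, b, c, d): how far x falls short of y.
    d(x, x) = 0 and the triangle inequality hold, but d is not symmetric,
    which is what makes it a hemimetric rather than a metric."""
    return sum(max(0.0, yi - xi) for xi, yi in zip(x, y)) / 4.0

def weighted_mean_hemimetric(candidate, ideal, weights):
    """Weighted mean, over attributes, of the shortfalls between a
    candidate's TrFN ratings and the 'ideal' player's TrFN requirements."""
    total = sum(weights)
    return sum(w * trfn_shortfall(c, i)
               for w, c, i in zip(weights, candidate, ideal)) / total
```

A candidate whose ratings dominate the ideal profile on every attribute would score zero; larger values mean a larger shortfall relative to the ideal player.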

Paper Nr: 342
Title:

Graph Structure Learning for Task Ordering

Authors:

Yiming Yang, Abhimanyu Lad, Henry Shu, Bryan Kisiel, Chad Cumby, Rayid Ghani and Katharina Probst

Abstract: In many practical applications, multiple interrelated tasks must be accomplished sequentially through user interaction with retrieval, classification and recommendation systems. The ordering of the tasks may have a significant impact on the overall utility (or performance) of the systems; hence optimal ordering of tasks is desirable. However, manual specification of optimal ordering is often difficult when task dependencies are complex, and exhaustive search for the optimal order is computationally intractable when the number of tasks is large. We propose a novel approach to this problem by using a directed graph to represent partial-order preferences among task pairs, and using link analysis (HITS and PageRank) over the graph as a heuristic to order tasks based on how important they are in reinforcing and propagating the ordering preference. These strategies allow us to find near-optimal solutions with efficient computation, scalable to large applications. We conducted a comparative evaluation of the proposed approach on a form-filling application involving a large collection of business proposals from the Accenture Consulting & Technology Company, using SVM classifiers to recommend keywords, collaborators, customers, technical categories and other related fillers for multiple fields in each proposal. With the proposed approach we obtained near-optimal task orders that improved the utility of the recommendation system by 27% in macro-averaged F1, and 13% in micro-averaged F1, compared to the results obtained using arbitrarily chosen orders, and that were competitive against the best order suggested by domain experts.
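As a sketch of the link-analysis step, here is plain power-iteration PageRank over a task-preference digraph. The edge orientation and the final "rank by score" step are assumptions of this illustration, not the paper's exact heuristic.

```python
def pagerank_order(pref, damping=0.85, iters=100):
    """Power-iteration PageRank over a directed task-preference graph.
    pref maps each task to the tasks it is preferred to precede.
    Returns the tasks sorted by score (highest first)."""
    nodes = sorted(set(pref) | {v for vs in pref.values() for v in vs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        nxt = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = pref.get(u, [])
            targets = out if out else nodes    # spread dangling mass evenly
            share = rank[u] / len(targets)
            for v in targets:
                nxt[v] += damping * share
        rank = nxt
    return sorted(nodes, key=rank.get, reverse=True)
```

For example, `pagerank_order({"keywords": ["collaborators"], "collaborators": ["customers"]})` ranks the three hypothetical form-filling tasks by how much ordering preference flows into them.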

Paper Nr: 361
Title:

Key Characteristics in Selecting Software Tools for Knowledge Management

Authors:

Hanlie Smuts, Alta Van der Merwe and Marianne Loock

Abstract: The shift to knowledge as the primary source of value results in the new economy being led by those who manage knowledge effectively. Today’s organisations are creating and leveraging knowledge, data and information at an unprecedented pace – a phenomenon that makes the use of technology not an option, but a necessity. Software tools in knowledge management are a collection of technologies and are not necessarily acquired as a single software solution. Furthermore, these knowledge management software tools have the advantage of using the organisation’s existing information technology infrastructure. Organisations and business decision makers spend a great deal of resources and make significant investments in the latest technology, systems and infrastructure to support knowledge management. It is imperative that these investments are validated properly, made wisely and that the most appropriate technologies and software tools are selected or combined to facilitate knowledge management. In this paper, we propose a set of characteristics that should support decision makers in the selection of software tools for knowledge management. These characteristics were derived from both in-depth interviews and existing theory in publications.

Paper Nr: 387
Title:

Towards a Semantic System for Managing Clinical Processes

Authors:

Massimo Ruffolo

Abstract: Managing risks is a high-priority theme for health care professionals and providers. A promising approach for reducing risks, and enhancing patient safety, is the definition of process-oriented healthcare information systems. In this area a number of approaches to representing and managing medical knowledge and clinical processes are available, but no currently available system provides an integrated approach to both declarative and procedural medical knowledge. Furthermore, little attention is paid to systems that make it possible to manage and prevent risks and errors. This work describes a system aimed at supporting a semantic, process-centered vision of healthcare practices. The system is founded on an ontology-based clinical knowledge representation framework that allows both medical knowledge and clinical processes to be represented and managed in a unified way. The system provides functionalities for: (i) creating ontologies of clinical processes that can be queried and explored in a semantic fashion; (ii) expressing error and risk rules (by ad hoc reasoning tasks) that can be used for monitored process execution; (iii) executing clinical processes and acquiring clinical process instances by means of either workflow enactment or dynamic workflow composition; (iv) monitoring clinical processes during execution by running reasoning tasks; (v) analyzing acquired clinical process schemas and instances by semantic querying. The proposed system provides decision support capabilities able to enhance risk control and patient safety. System features are described using an example clinical process regarding the care of breast neoplasm.

Paper Nr: 393
Title:

Mining Patterns in the Presence of Domain Knowledge

Authors:

Cláudia Antunes

Abstract: One of the main difficulties of pattern mining is dealing with items of a different nature in the same itemset, which can occur in any domain except basket analysis. Indeed, if we consider the analysis of any transactional database composed of several entities and relationships, it is easy to see that the equality function may be different for each element, which makes the identification of frequent patterns difficult. This situation is just one example of the need to use domain knowledge to manage the discovery process, but several others, no less important, can be enumerated, such as the need to consider patterns at higher levels of abstraction or the ability to deal with structured data. In this paper, we show how the Onto4AR framework can be used to overcome these situations in a natural way, illustrating its use in the analysis of two distinct case studies. In the first, exploring a cinematographic dataset, we capture patterns that characterize kinds of movies in accordance with the actors present in their casts and their roles. In the second, identifying molecular fragments, we find structured patterns, including chains, rings and stars.

Paper Nr: 469
Title:

USER-DRIVEN ASSOCIATION RULE MINING USING A LOCAL ALGORITHM

Authors:

Marinica Claudia, Andrei Olaru and Fabrice Guillet

Abstract: One of the main issues in the process of Knowledge Discovery in Databases is the mining of association rules. Although a great variety of pattern mining algorithms have been designed for this purpose, their main problem lies in the large number of extracted rules, which need to be filtered in a post-processing step, resulting in fewer but more interesting results. In this paper we suggest a new algorithm that allows the user to explore the rule space locally and incrementally. The user's interests and preferences are represented by means of a newly proposed formalism, the Rule Schemas. The method has been successfully tested on the database provided by Nantes Habitat.

Paper Nr: 491
Title:

Monitoring Cooperative Business Contracts in an Institutional Environment

Authors:

Henrique Lopes Cardoso and Eugénio Oliveira

Abstract: The automation of B2B processes is currently a hot research topic. In particular, multi-agent systems have been used to address this arena, where agents can represent enterprises in an interaction environment, automating tasks such as contract negotiation and enactment. Contract monitoring tools become more important as the level of automation of business relationships increases. When business is seen as a joint activity that aims at pursuing a common goal, the successful execution of the contract benefits all involved parties, and thus each of them should try to facilitate the compliance of their partners. Taking these concerns into account, and inspecting international legislation on trade procedures, in this paper we present an approach to modelling contractual obligations: obligations are directed from bearers to counterparties and have flexible deadlines. We formalize the semantics of such obligations using temporal logic, and we provide rules that allow them to be monitored. The proposed implementation is based on a rule-based forward-chaining production system.
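A minimal, hypothetical sketch of monitoring directed obligations with deadlines, in the spirit of the rule-based forward-chaining implementation the paper describes; the class fields and state names are illustrative, not the authors' rule base.

```python
from dataclasses import dataclass

@dataclass
class Obligation:
    """A directed obligation: the bearer owes the counterparty a fact
    (e.g. a delivery) by a deadline."""
    bearer: str
    counterparty: str
    fact: str
    deadline: float
    state: str = "active"          # active -> fulfilled | violated

def monitor(obligations, events, now):
    """Forward-chaining style state update; events are (time, fact) pairs
    observed in the contract environment."""
    facts = {f for t, f in events if t <= now}
    for ob in obligations:
        if ob.state == "active":
            if ob.fact in facts:
                ob.state = "fulfilled"
            elif now > ob.deadline:
                ob.state = "violated"   # counterparty may now claim remedies
    return obligations
```

A flexible deadline could be modelled by re-checking violations only after the counterparty declares the delay unacceptable, rather than immediately at the deadline.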

Paper Nr: 494
Title:

A SIMULATION-BASED METHODOLOGY TO ASSIST DECISION-MAKERS IN REAL VEHICLE ROUTING PROBLEMS

Authors:

Angel A. Juan, Javier Faulin, Daniel Riera, David M. i and Josep Jorba

Abstract: The aim of this work is to present a simulation-based algorithm that not only provides a competitive solution for instances of the Capacitated Vehicle Routing Problem (CVRP), but is also able to efficiently generate a full database of alternative good solutions with different characteristics. These characteristics are related to solution properties such as route attractiveness, load balancing, non-tangible costs, fuzzy preferences, etc. This double-goal approach can be especially interesting for the decision-maker, who can use the algorithm to construct a database of solutions and then query it to obtain those feasible solutions that better fit his/her utility function without incurring a severe increase in costs. In order to provide high-quality solutions, our algorithm combines a classical CVRP heuristic, the Clarke and Wright savings method, with Monte Carlo simulation using state-of-the-art random number generators. The resulting algorithm is tested against some well-known benchmarks, and the results obtained so far are promising enough to encourage future developments and improvements of the algorithm and its applications in real-life scenarios.
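A compact sketch of the Clarke and Wright savings heuristic with a light Monte Carlo twist (occasionally picking a runner-up saving instead of the best one), to illustrate the kind of combination the paper describes; the randomization scheme here is an assumption, not the authors' exact algorithm.

```python
import math, random

def clarke_wright_mc(coords, demand, capacity, bias=0.2, seed=0):
    """Savings heuristic for the CVRP, lightly randomized.
    coords[0] is the depot; demand[0] is ignored."""
    rng = random.Random(seed)
    n = len(coords)
    dist = lambda i, j: math.dist(coords[i], coords[j])
    routes = [[i] for i in range(1, n)]            # one route per customer
    where = {r[0]: r for r in routes}              # customer -> its route
    savings = sorted(((dist(0, i) + dist(0, j) - dist(i, j), i, j)
                      for i in range(1, n) for j in range(i + 1, n)),
                     reverse=True)
    load = lambda r: sum(demand[c] for c in r)
    while savings:
        # Monte Carlo step: sometimes take a runner-up saving
        k = rng.randrange(min(3, len(savings))) if rng.random() < bias else 0
        _, i, j = savings.pop(k)
        ra, rb = where[i], where[j]
        if ra is rb or load(ra) + load(rb) > capacity:
            continue
        if ra[0] == i:                             # put i at ra's tail
            ra.reverse()
        if rb[-1] == j:                            # put j at rb's head
            rb.reverse()
        if ra[-1] != i or rb[0] != j:              # i or j is interior; skip
            continue
        ra.extend(rb)                              # merge the two routes
        for c in rb:
            where[c] = ra
        routes.remove(rb)
    return routes
```

Re-running with different seeds yields the population of distinct good solutions from which the paper's solution database could be filled.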

Paper Nr: 511
Title:

A LOGIC PROGRAMMING FRAMEWORK FOR LEARNING BY IMITATION

Authors:

Nicola Di Mauro, Teresa M. Basile, Grazia Bombini, Stefano Ferilli and Floriana Esposito

Abstract: Humans use imitation as a mechanism for acquiring knowledge, i.e. they use instructions and/or demonstrations provided by other humans. In this paper we propose a logic programming framework for learning by imitation, in order to make an agent able to learn from relational demonstrations. In particular, demonstrations are received in an incremental way and used as training examples while the agent interacts in a stochastic environment. This logical framework makes it possible to represent domain-specific knowledge as well as to compactly and declaratively represent complex relational processes. The framework has been implemented and validated with experiments in simulated agent domains.

Paper Nr: 523
Title:

I.M.P.A.K.T.: an innovative, semantic-based skill management system exploiting standard SQL

Authors:

Eufemia Tinelli, Antonio Cascone, Michele Ruta, Tommaso Di Noia, Eugenio Di Sciascio and Francesco M. Donini

Abstract: The paper presents I.M.P.A.K.T. (Information Management and Processing with the Aid of Knowledge-based Technologies), a semantic-enabled platform for skills and talent management. In spite of fully exploiting recent advances in semantic technologies, the proposed system relies only on standard SQL queries. Distinguishing features include: the possibility to express both strict requirements and preferences in the requested profile, a logic-based ranking of retrieved candidates, and the explanation of ranking results.

Paper Nr: 524
Title:

TOWARDS A UNIFIED STRATEGY FOR THE PRE-PROCESSING STEP IN DATA MINING

Authors:

Camelia Lemnaru and Rodica Potolea

Abstract: Data-related issues represent the main obstacle to obtaining a high-quality data mining process. Existing strategies for preprocessing the available data usually focus on a single aspect, such as incompleteness, dimensionality, or filtering out “harmful” attributes. In this paper we propose a unified methodology for data preprocessing which considers several aspects at the same time. The novelty of the approach consists in enhancing the data imputation step with information from the feature selection step, and performing both operations jointly, as two phases of the same activity. The methodology performs data imputation only on the attributes which are optimal for the class (from the feature selection point of view). Imputation is performed using machine learning methods. When imputing values for a given attribute, the optimal subset of features for that attribute is considered. The methodology is not restricted to the use of a particular technique, but can be applied using any existing data imputation and feature selection methods.
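A minimal sketch of the joint idea: impute a missing attribute from the feature subset selected as optimal for it, using a simple learner (here a 1-nearest-neighbour, purely illustrative; the methodology leaves both the imputation and the selection methods pluggable).

```python
import math

def knn_impute(rows, target, predictors):
    """Fill missing values (None) of column `target` using the value of the
    nearest complete row over the chosen predictor columns.
    Assumes the predictor columns are numeric and complete in every row;
    `predictors` would come from a prior feature selection step."""
    complete = [r for r in rows if r[target] is not None]

    def dist(a, b):
        return math.sqrt(sum((a[p] - b[p]) ** 2 for p in predictors))

    for r in rows:
        if r[target] is None:
            nearest = min(complete, key=lambda c: dist(r, c))
            r[target] = nearest[target]
    return rows
```

Here each row is a dict; in the paper's scheme, `predictors` for a given target attribute would be exactly the feature subset judged optimal for predicting that attribute.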

Paper Nr: 541
Title:

Semantic Argumentation in Dynamic Environments

Authors:

Jörn Sprado and Björn Gottfried

Abstract: Decision support systems play a crucial role when controversial points of view are to be considered in order to make decisions. In this paper we outline a framework for argumentation and decision support. This framework defines arguments which refer to conceptual descriptions of the given state of affairs. Based on their meaning, and on preferences that adopt specific viewpoints, it is possible to determine consistent positions depending on those viewpoints. We investigate our approach by examining soccer games, since many observed spatiotemporal behaviours in soccer can be interpreted differently. Hence, the soccer domain is particularly suitable for investigating spatiotemporal decision support systems.

Paper Nr: 555
Title:

Hybrid Optimization Technique for Artificial Neural Networks Design

Authors:

Cleber Zanchettin and Teresa B. Ludermir

Abstract: In this paper a global and local optimization method is presented. The method is based on the integration of the heuristics Simulated Annealing, Tabu Search and Genetic Algorithms with Backpropagation. The performance of the method is investigated in the optimization of Multi-layer Perceptron artificial neural network architecture and weights. The heuristics perform the search in a constructive way, based on the pruning of irrelevant connections among the network nodes. Experiments demonstrate that the method can also be used for relevant feature selection. Experiments are performed with four classification datasets and one prediction dataset.
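As an illustration of one ingredient of the hybrid, here is a bare simulated-annealing loop over a flat weight vector; the paper's method combines this kind of search with tabu search, a genetic algorithm and backpropagation, so this is a sketch of a single component, not the full technique. The `loss` callback (e.g. training error of the network with these weights) is a placeholder.

```python
import math, random

def anneal_weights(loss, w0, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Simulated annealing (minimization) over a flat weight vector."""
    rng = random.Random(seed)
    w = w0[:]
    lw = lb = loss(w)                    # current and best losses
    best = w[:]
    t = t0
    for _ in range(steps):
        cand = [x + rng.gauss(0, 0.1) for x in w]   # Gaussian perturbation
        lc = loss(cand)
        # accept improvements always, worsenings with Boltzmann probability
        if lc < lw or rng.random() < math.exp((lw - lc) / t):
            w, lw = cand, lc
            if lc < lb:
                best, lb = cand[:], lc
        t *= cooling                      # geometric cooling schedule
    return best, lb
```

In the constructive-plus-pruning setting the abstract describes, the move operator would also toggle connections on and off rather than only perturbing weight values.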

Paper Nr: 576
Title:

Estimating Greenhouse Gas Emissions Using Computational Intelligence

Authors:

Pedro G. Coelho, Joaquim Pinto Rodrigues, Luiz Biondi Neto and João B. Soares de Mello

Abstract: This paper proposes a Neuro-Fuzzy Intelligent System – ANFIS (Adaptive Network based Fuzzy Inference System) – for the annual forecast of greenhouse gas emissions (GHE) into the atmosphere. The purpose of this work is to apply a Neuro-Fuzzy System to annual GHE forecasting based on existing emissions data covering the last 37 years in Brazil. Such emissions concern tCO2 (tons of carbon dioxide) resulting from fossil fuel consumption for energy purposes, as well as emissions related to changes in land use, obtained from deforestation indexes. Economic and population growth indexes have been considered too. The system modelling took into account the definition of the input parameters for the forecast of GHE measured in tons of CO2. Three input variables have been used to estimate total tCO2 emissions one year ahead. The ANFIS Neuro-Fuzzy Intelligent System is a hybrid system that adds learning capability to a fuzzy inference system in order to model non-linear and complex processes in a vague information environment. The results indicate that the Neuro-Fuzzy System produces consistent estimates, validated by actual test data.

Paper Nr: 629
Title:

INTEGRATING AGENTS WITH CONNECTIONIST SYSTEMS TO EXTRACT NEGOTIATION ROUTINES

Authors:

Marisa Masvoula, Panagiotis Kanellis and Drakoulis Martakos

Abstract: Routinization is a technique of knowledge exploitation based on the repetition of acts. When applied to negotiations, it results in the substitution of parts of processes, or even whole processes, relieving negotiators of significant deliberation and decision-making effort. Although it has an important impact on negotiators, the risk of establishing ineffective routines is evident. In this paper we discuss weaknesses and limitations and propose a generic framework to address them. We consider routines as evolving processes and take two orientations. The first concerns a communicative dimension that allows for external evaluation of the applied routines; the second concerns strengthening the system core with an evolving structure that adjusts to routine changes and flexibly incorporates new knowledge.

Paper Nr: 22
Title:

A NEW CASE-BASED APPROXIMATE REASONING BASED ON SPMF IN LINGUISTIC APPROXIMATION

Authors:

Dae-Young Choi and I. K. RA

Abstract: A new case-based approximate reasoning (CBAR) method based on standardized parametric membership functions (SPMF) in linguistic approximation is proposed. Linguistic case indexing and retrieval based on SPMF are suggested. This provides an efficient mechanism for linguistic approximation within linear time complexity, and can thus improve the speed of linguistic approximation relative to previous methods. From an engineering viewpoint, this may be a valuable advantage.

Paper Nr: 37
Title:

SOCIAL ROBOTS, MORAL EMOTIONS

Authors:

Ana R. Delgado

Abstract: The affective revolution in Psychology has produced enough knowledge to implement abilities of emotional recognition and expression in robots. However, the emotional prototypes are still very basic, almost caricature-like. If the goal is to construct robots that respond flexibly, fulfilling market demands from different countries while respecting the moral values implicit in the social behavior of their inhabitants, then these robots will have to be programmed according to detailed descriptions of the emotional experiences that are considered relevant in the interaction context in which the robot is going to be put to work (e.g., assisting people with cognitive or motor disabilities). The advantages of this approach are illustrated with an empirical study on contempt, the seventh basic emotion in Ekman's theory and one of the “rediscovered” moral emotions in Haidt's New Synthesis. A phenomenological analysis of the experience of contempt in 48 Spanish subjects shows the structure and some variations (prejudiced, self-serving, and altruistic) of this emotion. Quantitative information was later obtained with the help of blind coders. Some spontaneous facial expressions that sometimes accompany self-reports are also shown. Finally, some future directions at the intersection of Robotics and Psychology are presented (e.g., gender differences in social behavior).

Paper Nr: 46
Title:

METHODS AND TOOLS FOR MODELLING REASONING IN DIAGNOSTIC SYSTEMS

Authors:

Alexander Eremeev and Vadim Vagin

Abstract: Methods of case-based reasoning for solving problems of real-time diagnostics and forecasting in intelligent decision support systems (IDSS) are considered. Special attention is paid to a case library structure for real-time IDSS and to the application of this type of reasoning to the diagnostics of complex object states. The problem of finding the best current measurement points in model-based device diagnostics using Assumption-based Truth Maintenance Systems (ATMS) is also examined. New heuristic approaches for choosing current measurement points on the basis of supporting and inconsistent environments are presented. This work was supported by the Russian Foundation for Basic Research (projects No 08-01-00437 and No 08-07-00212).

Paper Nr: 76
Title:

STRATEGIES FOR ROUTE PLANNING ON CATASTROPHE ENVIRONMENTS

Authors:

Pedro Abreu and Pedro Mendes

Abstract: The concept of multi-agent systems (MAS) appeared when computer science researchers needed to solve problems involving the simulation of real environments with several participants (agents). Solving these problems requires a coordination process between agents and, in some cases, negotiation. Such is the case of a catastrophe scenario that requires intervention to minimize the consequences, for instance a fire. In this particular case the agents (firemen) must have a good coordination process in order to reach their fire-fighting positions as fast as they can. The main goal of this project is to create an optimal strategy for calculating the best path to the fire-fighting position. Tests were conducted on an existing simulator platform, Pyrosim. Three factors play an important role: wind (intensity and direction), ground topology and vegetation variety. In the end the results were quite satisfactory, mainly with respect to the agents' main objective. The A* algorithm proved to be feasible for this particular problem, and the coordination process between agents was implemented successfully. In the future this project may have its agents ported to the BDI concept.
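For reference, the A* algorithm the agents use for path planning, in textbook form; wind, topology and vegetation would enter through the `cost` and `heuristic` callbacks, which are placeholders here.

```python
import heapq, itertools

def a_star(start, goal, neighbors, cost, heuristic):
    """Textbook A* search. neighbors(n) yields successor nodes,
    cost(a, b) is the edge cost, heuristic(n, goal) must not overestimate
    the remaining cost for the returned path to be optimal."""
    tie = itertools.count()            # tie-breaker so nodes never compare
    frontier = [(heuristic(start, goal), next(tie), 0.0, start, None)]
    parent_of = {}                     # node -> predecessor once expanded
    g = {start: 0.0}                   # best known cost from start
    while frontier:
        _, _, gc, node, parent = heapq.heappop(frontier)
        if node in parent_of:
            continue                   # already expanded via a cheaper path
        parent_of[node] = parent
        if node == goal:               # reconstruct start -> goal path
            path = []
            while node is not None:
                path.append(node)
                node = parent_of[node]
            return path[::-1]
        for nxt in neighbors(node):
            ng = gc + cost(node, nxt)
            if ng < g.get(nxt, float("inf")):
                g[nxt] = ng
                heapq.heappush(frontier,
                               (ng + heuristic(nxt, goal), next(tie),
                                ng, nxt, node))
    return None                        # goal unreachable
```

On a terrain grid, `cost` could penalize moving into cells with dense vegetation or against the wind, which is how the three factors the abstract lists would shape the chosen route.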

Paper Nr: 78
Title:

AGENT-BASED MODELING AND SIMULATION OF RESOURCE ALLOCATION IN ENGINEERING CHANGE MANAGEMENT

Authors:

Young Moon and Bochao Wang

Abstract: An engineering change (EC) refers to a modification of products and components, including purchased parts or even supplies, after the product design is finished and released to the market. While any company involved in product development has to deal with engineering changes, the area of engineering change management has not received much attention from the research community, partly because of its complexity and the lack of appropriate research tools. In this paper, we present preliminary research results on modeling the engineering change management (ECM) process using an agent-based modeling and simulation technique. The aim of the research reported in this paper is to study optimal strategies of resource allocation for a company dealing with two kinds of ECs: "necessary ECs" and "initialized ECs." We discuss results from these simulation models to illustrate some insights into ECM, and present several research directions arising from these results.

Paper Nr: 82
Title:

Evaluating Generalized Association Rules Combining Objective and Subjective Measures and Visualization

Authors:

Magaly L. Fujimoto, Veronica Oliveira de Carvalho and Solange Rezende

Abstract: From the user's point of view, many problems can be found during the post-processing of association rules, since a large number of patterns can be obtained, which complicates the comprehension and identification of interesting knowledge. This paper therefore proposes an approach to improve knowledge comprehensibility and to facilitate the identification of interesting generalized association rules during evaluation. This aid is realized by combining objective and subjective measures with information visualization techniques, implemented in a system called RulEE-GARVis.

Paper Nr: 100
Title:

A TOOL FOR MEASURING INDIVIDUAL INFORMATION COMPETENCY ON AN ENTERPRISE INFORMATION SYSTEM

Authors:

Chui Y. Yoon, In S. Lee and Byung C. Shin

Abstract: This study presents a tool that can efficiently measure individual information competency in executing given tasks on an enterprise information system. The measurement items are extracted from the major components of a general competency. Through factor analysis and reliability analysis, a 14-item tool is proposed to comprehensively measure individual information capability. The tool's applicability and utility are confirmed by applying it to measuring the information competency of individuals in an enterprise.

Paper Nr: 114
Title:

DESIGN A REVERSE LOGISTICS INFORMATION SYSTEM WITH RFID

Authors:

Ka M. Lee and Wilsern Tan

Abstract: Recently, reverse logistics management has become an integral part of the business cycle. This is mainly due to the need to be environmentally friendly and the urgent need to reuse scarce resources. Traditionally, reverse logistics activities have been a cost center for most businesses, without generating extra revenue. However, due to the recent increase in commodity and energy prices, reverse logistics management could eventually become a cost-saving method. In this research, we propose using Radio Frequency Identification (RFID) technology to better optimize and streamline reverse logistics operations. Using RFID, we try to eliminate some of the unknowns in the reverse logistics flow that make reverse logistics models complicated. Furthermore, a genetic algorithm is used to optimize the placement of initial collection centers so as to cover the largest population possible, in order to reduce logistics costs and provide convenience to end users. This study is based largely on a literature review of past work; experiments are also conducted on RFID hardware to test its suitability. The significance of this paper is in adopting ubiquitous RFID technology and genetic algorithms for reverse logistics so as to obtain an economic reverse logistics network.
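A tiny genetic-algorithm sketch for the collection-center placement subproblem: choose k sites maximizing the population covered within a service radius. The fitness function and operators are illustrative; the paper's cost model is richer.

```python
import random

def ga_place_centers(sites, pop, k, radius, gens=200, n=60, seed=0):
    """Pick k collection-center sites (indices into `sites`, a list of
    (x, y) points) to maximize covered population. `pop` is a list of
    ((x, y), inhabitants) pairs."""
    rng = random.Random(seed)
    r2 = radius ** 2

    def covered(ch):                       # fitness: population within radius
        total = 0
        for (x, y), p in pop:
            if any((x - sites[s][0]) ** 2 + (y - sites[s][1]) ** 2 <= r2
                   for s in ch):
                total += p
        return total

    popn = [rng.sample(range(len(sites)), k) for _ in range(n)]
    for _ in range(gens):
        popn.sort(key=covered, reverse=True)
        elite = popn[: n // 2]             # truncation selection
        children = []
        while len(children) < n - len(elite):
            a, b = rng.sample(elite, 2)
            # crossover: half of a's genes, filled up from b, duplicates removed
            child = list(dict.fromkeys(a[: k // 2] + b))[:k]
            if rng.random() < 0.2:         # mutation: swap in an unused site
                unused = [s for s in range(len(sites)) if s not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        popn = elite + children
    return max(popn, key=covered)
```

In the paper's setting the fitness would also fold in logistics costs, with RFID supplying the return-flow data that makes those costs estimable.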

Paper Nr: 117
Title:

SIHC: A Stable Incremental Hierarchical Clustering Algorithm

Authors:

Ibai Gurrutxaga, Olatz Arbelaitz, José Ignacio Martín, Javier Muguerza, Jesús María Pérez and Iñigo Perona

Abstract: SAHN is a widely used agglomerative hierarchical clustering method. Nevertheless, it is not an incremental algorithm and is therefore not suitable for many real application areas where not all data is available at the beginning of the process. Some authors have proposed incremental variants of SAHN whose goal was to obtain the same results in incremental environments. This approach is not practical, since it must frequently rebuild the hierarchy, or a big part of it, and often leads to completely different structures. The human user of such an application cannot assimilate such drastic changes and loses confidence in the algorithm. We propose a novel algorithm, called SIHC, that updates SAHN hierarchies with minor changes to the previous structures. This property makes it suitable for real environments. Results on 11 synthetic and 6 real datasets show that SIHC builds high-quality clustering hierarchies. This quality level is similar to, and sometimes better than, SAHN's. Moreover, the computational complexity of SIHC is lower than SAHN's.

Paper Nr: 118
Title:

MODELING AND SIMULATION FOR DECISION SUPPORT IN SOFTWARE PROJECT WORKFORCE MANAGEMENT

Authors:

Bernardo G. Ambrósio, José Luis Braga, Moisés Resende-Filho and Jugurta Lisboa-Filho

Abstract: This paper presents and discusses the construction of a system dynamics model focusing on key managerial decision variables related to workforce management during requirements extraction in software development projects. Our model establishes the relationships among those variables, making it possible to analyze and better understand their mutual influences. Simulations conducted with the model made it possible to verify and foresee the consequences of risk factors (e.g. staff turnover and high requirements volatility) on the quality and cost of work. Three scenarios (optimistic, baseline and pessimistic) are set up using data from previous studies and data collected in a software development company.
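A minimal stock-and-flow sketch in the spirit of a system dynamics model of this kind: a workforce stock drained by turnover, and a requirements backlog fed by volatility and drained by work done. All parameter names and values are illustrative, not the authors' model.

```python
def simulate_workforce(weeks=52, dt=1.0, hire_rate=0.5, quit_frac=0.01,
                       productivity=2.0, volatility=1.0):
    """Euler-integrated stocks: `workforce` (people) and `backlog`
    (requirements still to extract). Returns the weekly trajectory."""
    workforce, backlog = 10.0, 200.0
    history = []
    for _ in range(int(weeks / dt)):
        quits = quit_frac * workforce          # turnover outflow
        workforce += (hire_rate - quits) * dt  # hiring inflow minus quits
        work_done = productivity * workforce   # requirements extracted/week
        backlog += (volatility - work_done) * dt
        backlog = max(backlog, 0.0)            # backlog cannot go negative
        history.append((workforce, backlog))
    return history
```

Raising `quit_frac` or `volatility` and re-running reproduces, qualitatively, the pessimistic-scenario pattern of a backlog that drains more slowly at higher cost.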

Paper Nr: 166
Title:

STATISTICAL DECISIONS IN PRESENCE OF IMPRECISELY REPORTED ATTRIBUTE DATA

Authors:

Olgierd Hryniewicz

Abstract: The paper presents a new methodology for making statistical decisions when data is reported in an imprecise way. Such situations happen very frequently when quality features are evaluated by humans. We demonstrate that traditional models, based either on the multinomial distribution or on predefined linguistic variables, may be insufficient for making correct decisions. Our model, which uses the concept of the possibility distribution, allows the separation of stochastic randomness from fuzzy imprecision, and provides a decision-maker with more information about the phenomenon of interest.

Paper Nr: 171
Title:

An Agent-based Architecture for Cancer Staging

Authors:

José Machado, António Abelha, Miguel Miranda, José Neves and Manuel F. Santos

Abstract: Cancer staging is the process by which physicians evaluate the spread of cancer. This is important because, in a good cancer staging system, the stage of disease helps to determine prognosis and assists in selecting therapies. A combination of physical examination, blood tests, and medical imaging is used to determine the clinical stage; if tissue is obtained via biopsy or surgery, examination of the tissue under a microscope can provide pathologic staging. On the other hand, good patient education may help to reduce health service costs and improve the quality of life of people with chronic or terminal conditions. In this paper we describe a theoretically based framework for the provision of computer-based information on cancer patients,