ICEIS 2007 Abstracts


Area 1 - Databases and Information Systems Integration

Full Papers
Paper Nr: 116
Title:

TRANSFORMATION OF LEGACY BUSINESS SOFTWARE INTO CLIENT-SERVER ARCHITECTURES

Authors:

Thomas Rauber and Gudula Rünger

Abstract: Business software systems in use contain specific knowledge that is essential for the enterprise using them, and such software has often grown over many years. However, it is difficult to adapt these systems to rapidly changing hardware and software technologies. This so-called legacy problem becomes extremely cost-intensive when a change in the software or the hardware platform is required, due to a change in the enterprise's business processes or in hardware technology. A common problem in business software is therefore the cost-effective analysis, documentation, and transformation of existing systems. In this paper, we concentrate on the transformation issue and propose an incremental process for transforming monolithic business software into client-server architectures. The internal logical structure of the software system is used to create software components in a flexible way. The transformation process is supported by a transformation toolset which preserves correctness and functionality.

Paper Nr: 121
Title:

INFORMATION SYSTEMS INTEGRATION DURING MERGERS: INTEGRATION MODES TYPOLOGY AND INTEGRATION PATHS

Authors:

Gerald Brunetto

Abstract: Today, Information Systems (IS) integration is one of the major success factors in mergers and acquisitions. This article draws on two case studies of firms that carried out more than 10 mergers and acquisitions between 1990 and 2000. It shows the importance of a structured approach to understanding the IS integration process, one that uses organizational configurations to define the possible IS integration modes. We thus show the importance of organizational, strategic and technological contingencies in the choice of an integration mode.

Paper Nr: 277
Title:

XML SCHEMA STRUCTURAL EQUIVALENCE

Authors:

Angela Duta, Ken Barker and Reda Alhajj

Abstract: The Xequiv algorithm determines when two XML schemas are equivalent based on their structural organization. It calculates the percentage to which one schema is included in another by considering the cardinality of each leaf node and its interconnection with other leaf nodes that are part of a sequence or choice structure. Xequiv is based on the Reduction Algorithm (Duta et al., 2006), which focuses on the leaf nodes and eliminates intermediate levels in the XML tree.

Paper Nr: 433
Title:

DETERMINING THE COSTS OF ERP IMPLEMENTATION

Authors:

Rob Kusters, Fred Heemstra and Arjan Jonker

Abstract: The key question of the research reported here is 'which factors influence Enterprise Resource Planning (ERP) implementation costs'. A 'theoretical' answer to this question has been designed by studying the sparsely available literature on ERP implementation costs, and adding to this relevant items from the related fields of software cost estimation, COTS implementation cost estimation, and ERP implementation critical success factors. This result has been compared with empirical data that have been obtained from two large corporations. The combined result is a first attempt to define ERP implementation cost drivers.

Paper Nr: 455
Title:

STATISTICS API: DBMS-INDEPENDENT ACCESS AND MANAGEMENT OF DBMS STATISTICS IN HETEROGENEOUS ENVIRONMENTS

Authors:

Tobias Kraft and Bernhard Mitschang

Abstract: Many of today’s applications access not a single database but a multitude of databases running on different DBMSs. Federation technology is used to integrate these databases and to offer a single query interface through which users can run queries accessing tables stored in different remote databases. The optimizer of the federated DBMS therefore has to decide what portion of a query should be processed by the federated DBMS itself and what portion should be executed at the remote systems. To do so, it has to retrieve cost estimates for query fragments from the remote databases. The response of these databases typically contains cost and cardinality estimates but no statistics about the data they store. However, statistics are optimization-critical information and the crucial factor for any kind of decision making in the optimizer of the federated DBMS. When this information is not available, optimization has to rely on imprecise heuristics, mostly based on default selectivities. To fill this gap, we propose Statistics API, a Java interface that provides DBMS-independent access to statistics data stored in databases running on different DBMSs. Statistics API also defines the data structures used for the statistics data returned by or passed to the interface. We have implemented this interface for the three prevailing commercial DBMSs: IBM DB2, Oracle and Microsoft SQL Server. These implementations are available under the terms of the GNU Lesser General Public License (LGPL). This paper introduces the interface, i.e. the methods and data structures of the Statistics API, and discusses some details of the three interface implementations.
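A minimal sketch of what such a DBMS-independent statistics interface could look like (all class, method and field names here are hypothetical illustrations, not the actual Statistics API, and the sketch uses Python rather than the paper's Java for brevity):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class ColumnStatistics:
    # statistics an optimizer typically needs for selectivity estimation
    distinct_values: int
    null_count: int

@dataclass
class TableStatistics:
    cardinality: int   # number of rows
    columns: dict      # column name -> ColumnStatistics

class StatisticsProvider(ABC):
    """DBMS-independent access to optimizer statistics."""
    @abstractmethod
    def get_table_statistics(self, schema: str, table: str) -> TableStatistics: ...

class InMemoryStatisticsProvider(StatisticsProvider):
    """Stand-in for a DB2/Oracle/SQL Server backed implementation."""
    def __init__(self, stats):
        self._stats = stats
    def get_table_statistics(self, schema, table):
        return self._stats[(schema, table)]

provider = InMemoryStatisticsProvider({
    ("SALES", "ORDERS"): TableStatistics(
        cardinality=1_000_000,
        columns={"CUSTOMER_ID": ColumnStatistics(50_000, 0)},
    )
})
stats = provider.get_table_statistics("SALES", "ORDERS")
# a federated optimizer could now estimate equi-join selectivity with
# the classic 1 / distinct-values heuristic instead of a blind default
selectivity = 1 / stats.columns["CUSTOMER_ID"].distinct_values
```

The point of the interface layer is that the federated optimizer programs against `StatisticsProvider` only; each concrete DBMS implementation hides how the statistics are actually fetched.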

Paper Nr: 456
Title:

DYNAMIC COMMIT TREE MANAGEMENT FOR SERVICE ORIENTED ARCHITECTURES

Authors:

Stefan Böttcher and Sebastian Obermeier

Abstract: Whenever Service Oriented Architectures make use of Web service transactions and atomic processing of these transactions is required, atomic commit protocols are used for this purpose. Compared to traditional client-server architectures, atomicity for Web services and Web service composition is much more challenging since, in many cases, the sub-transactions belonging to a global transaction are not known in advance. In this contribution, we present a dynamic commit tree that guarantees atomicity for transactions that invoke sub-transactions dynamically during the commit protocol’s execution. Furthermore, our commit tree allows the identification of obsolete sub-transactions that occur when sub-transactions are aborted and restarted.

Paper Nr: 459
Title:

A VIRTUALIZATION APPROACH FOR REUSING MIDDLEWARE ADAPTERS

Authors:

Ralf Wagner and Bernhard Mitschang

Abstract: Middleware systems use adapters to integrate remote systems and to provide uniform access to them. Different middleware platforms use different adapter technologies, e.g. the J2EE platform uses J2EE connectors, while federated database systems based on the SQL standard use SQL wrappers. However, a middleware platform cannot use the adapters of a different middleware platform; e.g. a J2EE application server cannot use an SQL wrapper. Even if an SQL wrapper exists for a remote system that is to be integrated by a J2EE application server, a separate J2EE connector for that remote system has to be written. Tasks like this occur over and over again and require investing additional resources where existing IT infrastructure should be reused. We therefore propose an approach that makes it possible to reuse existing adapters. Reuse is achieved by means of a virtualization tier that can handle adapters of different types and provides uniform access to them. This enables middleware platforms to use each other's adapters and thereby avoids the costly task of writing new adapters.
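The idea of a virtualization tier that presents heterogeneous adapters behind one uniform interface can be sketched roughly as follows (a hypothetical Python illustration; all class and method names are invented, and the paper's actual J2EE-connector and SQL-wrapper mechanics are far more involved):

```python
class UniformAdapter:
    """Uniform interface the virtualization tier exposes to any middleware."""
    def lookup(self, request):
        raise NotImplementedError

class SqlWrapperAdapter(UniformAdapter):
    """Reuses an existing SQL-wrapper-style component behind the uniform interface."""
    def __init__(self, wrapper):
        self._wrapper = wrapper
    def lookup(self, request):
        return self._wrapper.execute_query(request)

class ConnectorAdapter(UniformAdapter):
    """Reuses a J2EE-connector-style component behind the same interface."""
    def __init__(self, connector):
        self._connector = connector
    def lookup(self, request):
        return self._connector.call(request)

# stand-ins for existing, mutually incompatible adapters
class LegacySqlWrapper:
    def execute_query(self, q):
        return f"rows for {q}"

class LegacyConnector:
    def call(self, q):
        return f"records for {q}"

# the virtualization tier: middleware sees only UniformAdapter,
# so both legacy adapter types are reusable without rewriting them
tier = [SqlWrapperAdapter(LegacySqlWrapper()), ConnectorAdapter(LegacyConnector())]
results = [adapter.lookup("customers") for adapter in tier]
```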

Paper Nr: 462
Title:

XML INDEX COMPRESSION BY DTD SUBTRACTION

Authors:

Stefan Böttcher, Rita Hartel and Niklas Klein

Abstract: Whenever XML is used as a format to exchange large amounts of data, or even for data streams, the verbosity of XML is one of the bottlenecks. While compression of XML data seems to be a way out, it is essential for a variety of applications that the compression result can be queried efficiently. Furthermore, for efficient path query evaluation an index is desirable, which usually requires an additional data structure. For this purpose, we have developed a compression technique that uses structure information found in the DTD to perform a structure-preserving compression of XML data, and that provides a compressed index allowing efficient search in the compressed data. Our evaluation shows that compression factors close to those of gzip are possible, while the structural part of XML files can be compressed even better.
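The underlying idea, that structure fully determined by the DTD need not be stored, can be illustrated with a toy round-trip (a loose illustration only; the paper's actual compression and index structures are considerably more sophisticated):

```python
# toy "DTD subtraction": structure that the DTD fixes deterministically is
# not stored; only the variable parts (repetition counts, text) are kept.
dtd = {"book": ["title", "author*"]}   # hypothetical mini-DTD: one title, any number of authors

def compress(doc):
    # doc describes one <book>: {"title": str, "author": [str, ...]}
    # emit only the author count plus the text values; element names
    # and their order are recoverable from the DTD alone
    return (len(doc["author"]), doc["title"], *doc["author"])

def decompress(data):
    n, title, *authors = data
    assert len(authors) == n
    return {"title": title, "author": list(authors)}

doc = {"title": "ICEIS 2007", "author": ["Rauber", "Ruenger"]}
roundtrip = decompress(compress(doc))
```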

Paper Nr: 520
Title:

ONE-TO-MANY DATA TRANSFORMATION OPERATIONS - Optimization and Execution on an RDBMS

Authors:

Paulo Carreira, Helena Galhardas, João Pereira and Andrzej Wichert

Abstract: The optimization capabilities of RDBMSs make them attractive for executing data transformations that support ETL, data cleaning and integration activities. However, despite the fact that many useful data transformations can be expressed as relational queries, an important class of data transformations that produces several output tuples for a single input tuple cannot be expressed in that way. To address this limitation a new operator, named data mapper, has been proposed as an extension of Relational Algebra for expressing one-to-many data transformations. In this paper we study the feasibility of implementing the mapper operator as a primitive operator on an RDBMS. Data transformations expressed as combinations of standard relational operators and mappers can be optimized resulting in interesting performance gains.
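In Python terms, a one-to-many mapper is an operator whose per-tuple function may emit several output tuples, which a single standard relational operator cannot express (a conceptual sketch only, not the RDBMS-level primitive the paper studies):

```python
def mapper(relation, fn):
    """One-to-many 'data mapper': apply fn to each input tuple;
    fn yields zero or more output tuples."""
    for row in relation:
        yield from fn(row)

# example transformation: unfold a row holding quarterly values
# into one output row per quarter
def unfold(row):
    name, q1, q2, q3, q4 = row
    for quarter, value in enumerate((q1, q2, q3, q4), start=1):
        yield (name, quarter, value)

rows = [("widgets", 10, 20, 30, 40)]
out = list(mapper(rows, unfold))
# → [("widgets", 1, 10), ("widgets", 2, 20), ("widgets", 3, 30), ("widgets", 4, 40)]
```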

Paper Nr: 528
Title:

MODELING DIMENSIONS IN THE XDW MODEL - A LVM-Driven Approach

Authors:

R. Rajugan, Elizabeth Chang and Tharam S. Dillon

Abstract: Since the introduction of the eXtensible Markup Language (XML), XML repositories have gained a foothold in many global (and government) organizations, where e-Commerce and e-Business models have matured in handling daily transactional data among heterogeneous information systems. As a result, the amount of data available for the enterprise decision-making process is increasing exponentially and is being stored and/or communicated in XML. This presents an interesting challenge: to investigate models, frameworks and techniques for organizing and analyzing such voluminous, yet distributed, XML documents for business intelligence in the form of XML warehouse repositories and XML marts. In our previous work, we proposed a Layered View Model (LVM) driven conceptual modeling framework for the design and development of an XML Document Warehouse (XDW) model, with emphasis on conceptual and logical semantics, and presented a view-driven framework to conceptually model and deploy meaningful XML FACT repositories in the XDW model. In this paper, we look at hierarchical dimensions and the theoretical semantics used to design, specify and define dimensions over an XML FACT repository in the XDW model. One of the unique properties of this LVM-driven approach is that dimensions are treated as first-class citizens of the XDW conceptual model. To illustrate our concepts, we use a real-world case study: a logically grouped, geographically dispersed XDW model in the context of a global logistics and cold-storage company.

Paper Nr: 748
Title:

USING AN INDEX OF PRECOMPUTED JOINS IN ORDER TO SPEED UP SPARQL PROCESSING

Authors:

Sven Groppe, Jinghua Groppe and Volker Linnemann

Abstract: SparQL is a query language developed by the W3C for querying data sets in RDF, which represent directed graphs. Many freely available or commercial products already support SparQL processing. Current index-based optimizations integrated in these products typically construct indices on the subject, predicate and object of an RDF triple (a single datum of the RDF data) in order to speed up the execution of SparQL queries. To query the directed graph of RDF data, SparQL queries typically contain many joins over a set of triples. We propose to construct and use an index of precomputed joins, taking advantage of the homogeneous structure of RDF data. Furthermore, we present experimental results which demonstrate the achievable speed-up factors for SparQL processing.
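The flavour of such a precomputed join index can be shown on a tiny RDF graph (a hedged sketch: the paper's index construction and the RDF regularities it exploits are more elaborate than this same-subject example):

```python
from collections import defaultdict

triples = [
    ("alice", "knows", "bob"),
    ("alice", "age", "30"),
    ("bob", "age", "25"),
]

# conventional single-triple index on the subject
by_subject = defaultdict(list)
for t in triples:
    by_subject[t[0]].append(t)

# precomputed join index for the frequent "same subject" join pattern:
# for each subject, all (p1, o1, p2, o2) combinations are materialized
# once, so a query such as  ?x knows ?y . ?x age ?a  needs no join at
# query time, only a scan of the index
join_index = defaultdict(list)
for s, ts in by_subject.items():
    for (_, p1, o1) in ts:
        for (_, p2, o2) in ts:
            if (p1, o1) != (p2, o2):
                join_index[s].append((p1, o1, p2, o2))

# evaluate  ?x knows ?y . ?x age ?a  by lookup instead of join
results = [(s, o1, o2)
           for s, combos in join_index.items()
           for (p1, o1, p2, o2) in combos
           if p1 == "knows" and p2 == "age"]
```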

Paper Nr: 929
Title:

MONITORING WEB DATA SOURCES USING TEMPORAL PROPERTIES AS AN EXTERNAL RESOURCE OF A DATA WAREHOUSE

Authors:

Francisco Araque, Alberto Salguero and Cecilia Delgado Negrete

Abstract: Flexibility to react to rapidly changing conditions in the environment has become a key factor for the economic success of any company, and the WWW has become an important source of information for this purpose. Nowadays most important enterprises have incorporated Data Warehouse (DW) technology, integrating information retrieved from different sources, including the WWW. The quality of the data provided to decision makers depends on the capability of the DW system to convey, in a reasonable time, the changes made at the data sources from the sources to the data marts. Using the data arrival properties of the underlying information sources, the DW administrator can derive more appropriate rules and check the consistency of user requirements more accurately. In this paper we present an algorithm for data integration that depends on the temporal characteristics of the data sources, and an architecture for monitoring web sources in order to obtain their temporal properties. In addition, we show an example from the tourism area where data integrated into the DW can be used to schedule personalized travel as a value-added service for electronic commerce.

Short Papers
Paper Nr: 33
Title:

FROM DATABASE TO DATAWAREHOUSE - A Design Quality Evaluation

Authors:

Maurizio Pighin and Lucio Ieronutti

Abstract: Data warehousing provides tools and techniques for collecting, integrating and storing large amounts of transactional data extracted from operational databases, with the aim of deriving accurate management information that can be effectively used to support decision processes. However, the choice of which attributes to treat as dimensions and which as measures heavily influences the effectiveness of a data warehouse. Since this is not a trivial task, especially for databases with a large number of tables and attributes, an expert is often required to correctly select the most suitable attributes and assign them the correct roles. In this paper, we propose a methodology based on the analysis of statistical and syntactical aspects that can be effectively used (i) during the data warehouse design process, to support the selection of database tables and attributes, and (ii) afterwards, to evaluate the quality of data warehouse design choices. We also present the results of an experiment demonstrating the effectiveness of our methodology.

Paper Nr: 63
Title:

PTSM: A PORTLET SELECTION MODEL

Authors:

Mª Ángeles Moraga, Coral Calero, Mario Piattini and Oscar Díaz

Abstract: The use of Web portals continues to rise, showing their importance in the current information society. The success of a portal depends on customers using it and returning to it. Nowadays it is very easy for users to change from one portal to another, so improving and assessing portal quality is a must. Hence, an appropriate quality model should be available to measure and drive portal development. Specifically, this work focuses on portlet-based portals. Portlets are web components and can be thought of as COTS components in a Web setting. This paper presents a portlet selection model that guides the portal developer in choosing the best portlet, among a set of portlets with similar functions for specified tasks and user objectives, in accordance with five quality measures, namely functionality, reliability, usability, efficiency and reusability, and three other characteristics not related to quality but important for carrying out the selection.

Paper Nr: 128
Title:

ENTERPRISE INFORMATION SEARCH SYSTEMS FOR HETEROGENEOUS CONTENT REPOSITORIES

Authors:

Trieu Chieu, Shyh-kwei Chen and Shiwa Fu

Abstract: In larger enterprises, business documents are typically stored in disparate, autonomous content repositories in various formats. Efficient search and retrieval mechanisms are needed to deal with the heterogeneity and complexity of this environment. This paper presents a general architecture and two industrial implementations of a service-based information system that performs search over Lotus Notes databases and data sources with Web service interfaces. The first implementation is based on a federated database system that maps the various schemas of the sources into a common interface and aggregates information from its native locations. This implementation offers the advantages of scalability and access to real-time information. The second is based on a one-index enterprise-scale search engine that crawls, parses and indexes the document contents from the sources. This latter implementation offers the ability to score the relevance ranking of documents and to eliminate duplicates in search results. The relative merits and limitations of both implementations are presented.

Paper Nr: 129
Title:

A FRAMEWORK FOR SUPPORTING KNOWLEDGE WORK PROCESSES

Authors:

Weidong Pan, Igor Hawryszkiewycz and Dongbai Xue

Abstract: Improving knowledge work processes has become increasingly important for modern enterprises to maintain a competitive position in today's information society. This paper proposes a way to improve knowledge work processes through supportive services. A framework for supporting knowledge work processes is presented in which the best practices of knowledge work processes, developed by process organizers or derived from successful applications, are described and stored in a database; according to these descriptions, software agents dynamically organize supportive services that guide process participants through the process steps towards the efficient completion of a process. The paper provides an overview of the method and explores the development of the main components of the framework.

Paper Nr: 201
Title:

A METHOD FOR EARLY CORRESPONDENCE DISCOVERY USING INSTANCE DATA

Authors:

Indrakshi Ray and C. J. Michael Geisterfer

Abstract: Most of the research on database integration has focused on matching schema-level information to determine the correspondences between data concepts in the component databases. Such research relies on the availability of schema experts, schema documentation, and well-designed schemas, items that are often not available. We propose a method of initial instance-based correspondence discovery that greatly reduces the manual effort involved in current integration processes. The gains are possible because the method uses only instance data, a body of database knowledge that is always available, to make its initial discoveries.

Paper Nr: 246
Title:

UNASSUMING VIEW-SIZE ESTIMATION TECHNIQUES IN OLAP - An Experimental Comparison

Authors:

Kamel Aouiche and Daniel Lemire

Abstract: Even if storage were infinite, a data warehouse could not materialize all possible views due to running time and update requirements. Therefore, it is necessary to estimate the size of views quickly, accurately, and reliably. Many available techniques make particular statistical assumptions and their error can be quite large. Unassuming techniques exist, but they typically assume independent hashing, for which there is no known practical implementation. We adapt an unassuming estimator due to Gibbons and Tirthapura: its theoretical bounds do not make impractical assumptions. We compare this technique experimentally with stochastic probabilistic counting, LOGLOG probabilistic counting, and multifractal statistical models. Our experiments show that we can reliably and accurately (within 10%, 19 times out of 20) estimate view sizes over large data sets (1.5 GB) within minutes, using almost no memory. However, only GIBBONS-TIRTHAPURA provides universally tight estimates irrespective of the size of the view. For large views, probabilistic counting has a small edge in accuracy, whereas the competing sampling-based method (multifractal) we tested is an order of magnitude faster but can sometimes provide poor estimates (relative error of 100%). In our tests, LOGLOG probabilistic counting is not competitive. Experimental validation on the US Census 1990 data set and on the Transaction Processing Performance Council (TPC-H) data set is provided.
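As a flavour of hash-based "unassuming" estimation, here is a k-minimum-values sketch; note this is not the Gibbons-Tirthapura algorithm itself, only a related estimator built on the same idea of hashing values to pseudo-uniform points and reasoning about their spacing:

```python
import hashlib

def estimate_distinct(values, k=256):
    """Estimate the number of distinct values (e.g. a view size) in one
    pass, using a k-minimum-values sketch over a hash function."""
    def h(v):
        # map each value to a pseudo-uniform point in [0, 1)
        digest = hashlib.sha1(repr(v).encode()).digest()
        return int.from_bytes(digest[:8], "big") / 2**64
    mins = sorted({h(v) for v in values})[:k]
    if len(mins) < k:
        return len(mins)          # fewer than k distinct values seen: exact
    # if the k-th smallest of n uniform points lies at x, then n ~ (k-1)/x
    return int((k - 1) / mins[-1])

# 100,000 rows but only 1,000 distinct values; tiny memory footprint
est = estimate_distinct((i % 1000 for i in range(100_000)), k=64)
```

With k = 64 the relative error is roughly 1/sqrt(k), around 12%; larger k tightens the estimate at the cost of memory.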

Paper Nr: 254
Title:

IMPLEMENTING SPATIAL DATA WAREHOUSE HIERARCHIES IN OBJECT-RELATIONAL DBMSs

Authors:

Elzbieta Malinowski and Esteban Zimányi

Abstract: Spatial Data Warehouses (SDWs) allow the analysis of historical data represented in space, supporting the decision-making process. SDW applications require a multidimensional view of data that includes dimensions with hierarchies and facts with associated measures. Hierarchies are particularly important since, by traversing them, users can analyze both detailed and aggregated measures. To better represent users’ requirements for SDWs, a conceptual model with spatial support should be used. Afterwards, the conceptual schema is translated into logical and physical schemas. However, semantics can be lost during this translation. In this paper, we present the translation of spatial hierarchies from conceptual to physical schemas, represented in the MultiDimER model and Oracle 10g Spatial, respectively. Further, to ensure semantic equivalence between the conceptual and physical schemas, integrity constraints are exemplified, mainly using triggers.

Paper Nr: 387
Title:

SECURE KNOWLEDGE EXCHANGE BY POLICY ALGEBRA AND ERML

Authors:

Steve Barker and Paul Douglas

Abstract: In this paper, we demonstrate how role-based access control policies may be used for secure forms of knowledge module exchange in an open, distributed environment. To this end, we define an algebra that a security administrator may use to define compositions and decompositions of shared information sources, and we describe a markup language for facilitating secure information exchange among heterogeneous information systems. We also describe an implementation of our approach and give some performance measures, which offer evidence of the feasibility of our proposal.

Paper Nr: 392
Title:

MAINTENANCE COST OF A SOFTWARE DESIGN - A Value-Based Approach

Authors:

Daniel Cabrero Moreno, Javier Garzás and Mario Piattini

Abstract: Alternative valid software design solutions can respond to the same software product requirements. In addition, a great part of the success of a software project depends on the selected software design. However, there are few methods to quantify how much value each design strategy will add, and hence very little time is spent choosing the best design option. This paper presents a new approach to estimate and quantify how profitable it is to improve a design solution. This is achieved by estimating the maintenance cost of a software project using two main variables: the probability of change of each design artifact, and the cost associated with each change. Two techniques are proposed to support this approach: COCM (Change-Oriented Configuration Management) and CORT (Change-Oriented Requirement Tracing).

Paper Nr: 399
Title:

THE CHALLENGES FACING GLOBAL ERP SYSTEMS IMPLEMENTATIONS

Authors:

Paul Hawking, Andrew Stein and Susan Foster

Abstract: Large global companies are increasingly looking to information systems to standardize business processes and enhance decision making across their operations in different countries. In particular, these companies are implementing enterprise resource planning (ERP) systems to provide this standardization. This paper reviews the literature on the use of ERP systems to support global operations. There are many technological and cultural challenges facing these implementations; however, a major challenge faced by companies is the balance between centralization and localization.

Paper Nr: 403
Title:

KNOWLEDGE-MASHUPS AS NEXT GENERATION WEB-BASED SYSTEMS - Converging Systems Via Self-explaining Services

Authors:

Thomas Bopp, Birger Kühnel, Thorsten Hampel, Christian Prpitsch and Frank Lützenkirchen

Abstract: Webservice-based architectures face new challenges in terms of the convergence of systems. Using the example of a web service integration of a digital repository/library, group knowledge management systems, and learning management systems, this contribution shows the new potential of flexible, descriptive web services. Digital libraries are understood in their key role as searching, structuring, and archiving instances of digital media, and they actively provide services in this sense. The goal of this article is to introduce services suitable for everyday use for coupling different classes of systems. Conceptually, the requirements of a possible standard in the area of convergence of knowledge management, digital libraries, and learning management systems are discussed. The results are publish and search services with negotiation capabilities and a low barrier to adoption.

Paper Nr: 437
Title:

EVIE - AN EVENT BROKERING LANGUAGE FOR THE COMPOSITION OF COLLABORATIVE BUSINESS PROCESSES

Authors:

Tony O’Hagan, Shazia Sadiq and Wasim Sadiq

Abstract: Technologies that facilitate the management of collaborative processes are high on the agenda for enterprise software developers. One of the greatest difficulties in this respect is achieving a streamlined pipeline from business modeling to execution infrastructures. In this paper we present Evie - an approach for rapid design and deployment of event driven collaborative processes based on significant language extensions to Java that are characterized by abstract and succinct constructs. The new language is positioned within an overall framework that provides a bridge between a high level modeling tool and the underlying deployment environment.

Paper Nr: 445
Title:

EXTRACTION AND TRANSFORMATION OF DATA FROM SEMI-STRUCTURED TEXT FILES USING A DECLARATIVE APPROACH

Authors:

Ricardo Raminhos and João Moura-Pires

Abstract: The World Wide Web is a major source of textual information in human-readable, semi-structured formats, referring to multiple domains, some of them highly complex. Traditional ETL approaches, which rely on developing specific source code for each data source and on repeated interactions between domain and computer-science experts, become an inadequate solution: time-consuming and prone to error. This paper presents a novel approach to ETL based on its decomposition into two phases: ETD (Extraction, Transformation and Data Delivery) and IL (Integration and Loading). The ETD proposal is supported by a declarative language for expressing ETD statements and a graphical application for interacting with the domain expert. Applying ETD requires mainly domain expertise, while computer-science expertise is centered on the IL phase, linking the processed data to target system models and enabling a clearer separation of concerns. The paper presents how ETD has been integrated, tested and validated in a space-domain project, currently operational at the European Space Agency for the Galileo Mission.

Paper Nr: 446
Title:

A DOCUMENT REPOSITORY ARCHITECTURE FOR HETEROGENEOUS BUSINESS INFORMATION MANAGEMENT

Authors:

Mohamed Mbarki, Chantal Soulé-Dupuy and Nathalie Vallès-Parlangeau

Abstract: As part of business memories, document repositories should provide solutions ensuring flexible and efficient use of dematerialized information content. While the fields of repository modeling, document integration and interrogation have independently attracted a huge amount of attention, few works have tried to propose a general architecture for document repository management. We therefore propose a repository architecture based on the integration of complementary modules ensuring efficient storage of fragmented digital documents and flexible exploitation of the fragments. This paper also presents an implementation of such a document repository architecture.

Paper Nr: 447
Title:

OLAP AGGREGATION FUNCTION FOR TEXTUAL DATA WAREHOUSE

Authors:

Franck Ravat, Olivier Teste and Ronan Tournier

Abstract: For more than a decade, OLAP and multidimensional analysis have generated methodologies, tools and resource management systems for the analysis of numeric data. With the growing availability of semi-structured data, there is a need to incorporate text-rich document data in a data warehouse and to provide adapted multidimensional analysis. This paper presents a new aggregation function for keywords, allowing the aggregation of textual data in OLAP environments just as traditional arithmetic functions do for numeric data. The AVG_KW function uses an ontology to aggregate keywords into a more general keyword.
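The principle of ontology-based keyword aggregation can be sketched as finding the most specific common ancestor in a keyword hierarchy (a toy illustration; the ontology and the actual AVG_KW definition in the paper may differ from this simplification):

```python
# toy keyword ontology: child -> parent
parent = {
    "OLAP": "data analysis",
    "data mining": "data analysis",
    "data analysis": "computer science",
    "XML": "data formats",
    "data formats": "computer science",
}

def ancestors(term):
    # the term itself plus its chain of parents up to the root
    chain = [term]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def avg_kw(keywords):
    # aggregate keywords into their most specific shared ancestor
    common = set(ancestors(keywords[0]))
    for kw in keywords[1:]:
        common &= set(ancestors(kw))
    # among the shared ancestors, pick the deepest (most specific) term
    return max(common, key=lambda t: len(ancestors(t)), default=None)

agg = avg_kw(["OLAP", "data mining"])   # → "data analysis"
```

Closely related keywords aggregate to a near term ("data analysis"), while unrelated ones fall back to a generic ancestor, mirroring how AVG on numbers loses detail as it summarizes.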

Paper Nr: 448
Title:

EXTENSIBLE METADATA REPOSITORY FOR INFORMATION SYSTEMS AND ENTERPRISE APPLICATIONS

Authors:

Ricardo Ferreira and João Moura-Pires

Abstract: Today’s Information Systems and Enterprise Applications require extensive use of metadata. In Information Systems, metadata helps in integrating and modeling their various components and computational processes, while in enterprises metadata can describe business and management models and human or physical resources, among others. This paper presents a lightweight, no-cost, extensible Metadata Repository solution for such cases, relying on XML and related technologies to store, validate, query and transform metadata, addressing common operational concerns such as availability and security while providing easy integration. The feasibility and applicability of the solution is demonstrated by a case study in which an implementation is running in an operational state.

Paper Nr: 487
Title:

DISTRIBUTED APPROACH OF CONTINUOUS QUERIES WITH KNN JOIN PROCESSING IN SPATIAL DATA WAREHOUSE

Authors:

Marcin Gorawski and Wojciech Gębczyk

Abstract: This paper describes the realization of a distributed approach to continuous queries with kNN join processing in a spatial telemetric data warehouse. Due to the distribution of the developed system, new structural members were distinguished: the mobile object simulator, the kNN join processing service and the query manager. Distributed tasks communicate using Java RMI. A kNN (k nearest neighbours) join matches every point from one dataset with its k nearest neighbours in the other dataset. In our approach we use the Gorder method, a block nested loop join algorithm that exploits sorting, join scheduling and distance computation filtering to reduce CPU and I/O usage.
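A kNN join can be written naively as a nested loop over both datasets; the sketch below shows only this baseline idea (the Gorder method adds sorting, join scheduling and distance-computation filtering on top, none of which is reproduced here):

```python
import heapq

def knn_join(R, S, k):
    """Join every point in R with its k nearest neighbours in S
    (naive nested-loop baseline, squared Euclidean distance)."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return {r: heapq.nsmallest(k, S, key=lambda s: dist2(r, s)) for r in R}

R = [(0, 0), (10, 10)]
S = [(1, 0), (0, 2), (9, 9), (5, 5)]
out = knn_join(R, S, k=2)
# (0, 0)  → nearest are (1, 0) then (0, 2)
# (10, 10) → nearest are (9, 9) then (5, 5)
```

The naive version costs O(|R| x |S|) distance computations, which is exactly what block-based methods such as Gorder aim to reduce.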
Download
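For illustration only, here is a brute-force kNN join. The Gorder method from the abstract is far more sophisticated, using sorting, join scheduling and distance computation filtering to cut CPU and I/O cost, but the input/output contract is the same: each point of one dataset is paired with its k nearest neighbours in the other.

```python
# Naive kNN join: for each point in R, sort all of S by distance and keep
# the k closest. Gorder avoids exactly this all-pairs distance work.
import math

def knn_join(R, S, k):
    """Join every point in R with its k nearest neighbours in S."""
    result = {}
    for r in R:
        by_dist = sorted(S, key=lambda s: math.dist(r, s))
        result[r] = by_dist[:k]
    return result

R = [(0.0, 0.0), (5.0, 5.0)]
S = [(0.0, 1.0), (1.0, 1.0), (4.0, 4.0), (9.0, 9.0)]
pairs = knn_join(R, S, k=2)
print(pairs[(0.0, 0.0)])  # [(0.0, 1.0), (1.0, 1.0)]
```

The brute-force version costs O(|R| · |S| log |S|); block nested loop algorithms such as Gorder process R and S in blocks and prune whole blocks whose distance bounds cannot contribute a nearest neighbour.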

Paper Nr: 498
Title:

REVISITING THE OLAP INTERACTION TO COPE WITH SPATIAL DATA AND SPATIAL DATA ANALYSIS

Authors:

Rosa Matias and João Moura-Pires

Abstract: In this paper we propose a new interface for spatial OLAP systems. Spatial data is related to space and has a complex and specific nature, bringing challenges to OLAP environments; humans understand spatial data only through maps. We propose a new spatial OLAP environment composed of the following elements: a map, a support table and a detail table, all with synchronized granularity. We also extend OLAP operations to perform spatial analysis, for instance spatial drill-down, spatial drill-up and spatial slice. We take special care with the spatial slice, where we identify two main groups of operations: spatial-semantic slice and spatial-geometric slice.
Download

Paper Nr: 508
Title:

AN EXTENSIBLE RULE TRANSFORMATION MODEL FOR XQUERY OPTIMIZATION - Rules Pattern for XQuery Tree Graph View

Authors:

Nicolas Travers and Tuyêt-Trâm Dang-Ngoc

Abstract: Efficient evaluation of XML query languages has become a crucial issue for XML exchange and integration. Tree Patterns (Sihem et al., 2002; Jagadish et al., 2001; Chen et al., 2003) are now well accepted for representing XML queries, and a model called TGV (Travers, 2006; Travers et al., 2006; Travers et al., 2007c) has extended the Tree Pattern representation to make it more intuitive, respect the full XQuery specification and support manipulation, optimization and evaluation. Optimization requires a search strategy, which consists of generating equivalent execution plans using extensible rules and estimating the cost of each plan to find the best one. We propose the specification of extensible rules that can be used in heterogeneous environments, supporting XML and manipulating Tree Patterns.
Download

Paper Nr: 736
Title:

AN INFORMATION SYSTEMS AUDITOR’S PROFILE

Authors:

Mariana Carroll and Alta Van Der Merwe

Abstract: The increasing dependence of businesses upon Information Systems (IS) in the last few decades has resulted in many concerns regarding auditing. Traditional IS auditing changed from auditing ‘around the computer’ to auditing through and with the computer. Technology is changing rapidly and so is the profession of IS auditing. As IS auditing is dependent on Information Technology (IT), it is essential that an IS auditor possesses IT and auditing knowledge to bridge the gap between the IT and auditing professions. In this paper we reflect on the auditor’s profile in this changing domain: we first define the roles and responsibilities expected of IS auditors; describe the basic IT and audit knowledge required of IS auditors, based on the roles and responsibilities identified; describe the soft skills required of IS auditors to successfully perform an IS audit assignment; define the main types of IS audit tools and techniques used most often to assist IS auditors in executing IS audit roles and responsibilities; and lastly propose the IS auditor’s profile.
Download

Paper Nr: 742
Title:

ON CORRECTNESS CRITERIA FOR WORKFLOW EXCEPTION HANDLING POLICIES

Authors:

Belinda Carter and Maria Orlowska

Abstract: Exception handling during the execution of workflow processes is a frequently addressed topic in the literature. Exception handling policies describe the desired response to exception events with respect to the current state of the process instance in execution. In this paper, we present insights into the definition and verification of such policies for handling asynchronous, expected exceptions. In particular, we demonstrate that the definition of exception handling policies is not a trivial exercise in the context of complex processes, and, while different approaches to defining and enforcing exception handling policies have been proposed, the issue of verification of the policies has not yet been addressed. The main contribution of this paper is a set of correctness criteria which we envisage could form the foundation of a complete verification solution for exception handling policies.
Download

Paper Nr: 745
Title:

PROBLEMS WITH NON-OPEN DATA STANDARDS IN SWEDISH MUNICIPALS - When Integrating and Adopting Systems

Authors:

Benneth Christiansson and Fredrik Svensson

Abstract: Governments world-wide are applying information and communication technology in order to meet a broad range of citizen and organizational needs. When planning systems integration, the choice should lead to the software that best suits the organizational needs, taking into account price, quality, ease of use, support, reliability, security and other characteristics considered important. This paper is based on experiences from the KOMpiere project, which aims at modifying the open-source-licensed ERP system Compiere for use in Swedish municipalities. The overall goal of the project is to support and enhance the use of open source licensed software in the Swedish public sector and thereby enable municipalities to lower their IT-related costs and gain strategic control over their own IT environment. We discovered that at least some Swedish municipalities do not have free access to the data they are appointed to govern and protect: by using non-open data standards, the software vendors have excluded the municipalities from using their own data freely, thereby denying Swedish municipalities an open market. In this paper we suggest the creation and usage of XML-based ODS for all systems in Swedish municipalities.
Download

Paper Nr: 760
Title:

CHANGE MANAGEMENT IN DATA INTEGRATION SYSTEMS

Authors:

Rahee Ghurbhurn, Philippe Beaune and Hugues Solignac

Abstract: In this paper, we present a flexible architecture allowing applications and functional users to access heterogeneous distributed data sources. Our proposition is based on a multi-agent architecture and a domain knowledge model. The objective of such an architecture is to introduce some flexibility into the information system architecture, both in terms of the ease of adding or removing existing or new applications and in terms of the ease of retrieving knowledge without having to know the underlying data source structures. We propose to model the domain knowledge with the help of one or several ontologies and to use a multi-agent architecture to maintain such a representation and to perform data retrieval tasks. The proposed architecture acts as a single point of entry to existing data sources, hiding their heterogeneity and allowing users and applications to retrieve data without being hindered by changes in these data sources.
Download

Paper Nr: 761
Title:

RELEVANT VALUES: NEW METADATA TO PROVIDE INSIGHT ON ATTRIBUTE VALUES AT SCHEMA LEVEL

Authors:

Sonia Bergamaschi, Mirko Orsini, Francesco Guerra and Claudio Sartori

Abstract: Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge. In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the “relevant values”, extracted from the attribute values. Relevant values enrich schemata with domain knowledge; moreover, they can be exploited by a user in the interactive process of creating/refining a query. The technique, fully implemented in a prototype, is automatic, independent of the attribute domain, and based on data mining clustering techniques and the semantics emerging from data values. It is parameterized with various metrics for similarity measures and is a viable tool for dealing with frequently changing sources, as in the Semantic Web context.
Download
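As a loose illustration of the idea (the paper uses data mining clustering techniques; this sketch substitutes simple frequency counting, and all names are assumed), "relevant values" can be thought of as a handful of representative values that summarize an attribute's extension at schema level:

```python
# Assumed, simplified stand-in for relevant-value extraction: summarize an
# attribute's extension with its most frequent values. The real technique
# clusters values and exploits emerging semantics instead.
from collections import Counter

def relevant_values(values, top=3):
    """Return the most frequent attribute values as schema-level metadata."""
    return [v for v, _ in Counter(values).most_common(top)]

colours = ["red", "blue", "red", "green", "red", "blue", "yellow"]
print(relevant_values(colours))  # ['red', 'blue', 'green']
```

A query tool could show such a list next to the attribute name so a user refining a query sees typical values without scanning the source.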

Paper Nr: 777
Title:

ACTIVITY WAREHOUSE: DATA MANAGEMENT FOR BUSINESS ACTIVITY MONITORING

Authors:

Oscar Mangisengi, Mario Pichler, Dagmar Auer, Dirk Draheim and Hildegard Rumetshofer

Abstract: Nowadays, checkpoint data collected from the activities of business process transactions has become an important data resource for business analysts and decision-makers, supporting tactical decisions in general and strategic decisions in particular. In the context of business process-oriented applications, business activity monitoring (BAM) systems, which are predicted to play a major role in the future business-intelligence area, are the most visible answer to these current business needs. In this paper we present an approach to derive an activity warehouse model from the BAM requirements. The implementation shows that data stored in the activity warehouse can efficiently monitor business processes in real time and provide better real-time visibility of the business process.
Download

Paper Nr: 780
Title:

LEGACY SYSTEM EVOLUTION - A Comparative Study of Modernisation and Replacement Initiation Factors

Authors:

Irja Kankaanpää, Päivi Tiihonen, Jarmo J. Ahonen, Jussi Koskinen, Tero Tilus and Henna Sivula

Abstract: Decisions regarding information system evolution strategy become topical as the organization’s information systems age and approach the end of their life cycle. An interview study was conducted in order to compare factors influencing the initiation of modernization and replacement. System age, obsolete technology and high operation or maintenance costs were identified as triggers for both modernization and replacement projects. The results show that the most prevalent individual reason for initiating modernization is business development. Common initiation factors for replacement projects were the end of vendor support and the system’s inability to respond to the organization’s business needs.
Download

Paper Nr: 806
Title:

SOFTWARE COST ESTIMATION USING ARTIFICIAL NEURAL NETWORKS WITH INPUTS SELECTION

Authors:

Efi Papatheocharous and Andreas Andreou

Abstract: Software development is an intractable, multifaceted process encountering deep, inherent difficulties. Especially when trying to produce accurate and reliable software cost estimates, these difficulties are amplified due to the high level of complexity and uniqueness of the software process. This paper addresses the issue of estimating the cost of software development by identifying the need for countable entities that affect software cost and using them with artificial neural networks to establish a reliable estimation method. Input Sensitivity Analysis (ISA) is performed on predictive models of the Desharnais and ISBSG datasets aiming at identifying any correlation present between important cost parameters at the input level and development effort (output). The degree to which the input parameters define the evolution of effort is then investigated and the selected attributes are employed to establish accurate prediction of software cost in the early phases of the software development life-cycle.
Download

Paper Nr: 814
Title:

DQXSD: AN XML SCHEMA FOR DATA QUALITY - An XSD for Supporting Data Quality in XML

Authors:

Eugenio Verbo, Ismael Caballero and Mario Piattini

Abstract: Traditionally, data quality management has focused mainly on the data source and the data target. Increasingly, the processing needed to obtain a data product requires raw data that is typically distributed among different data sources. However, if data quality is not preserved during transmission, the resulting data product and the consequent information will not be of much value. It is necessary to improve exchange methods and means to obtain a better information process. This paper focuses on that issue, proposing a new approach for assuring and transmitting data quality during the interchange. Using XML and related technologies, a document structure that treats data quality as a main topic is defined. The resulting schema is verified using several measures and compared to the data source.
Download

Paper Nr: 824
Title:

ENABLING CSCW SYSTEMS TO AUTOMATICALLY BIND EXTERNAL KNOWLEDGE BASES

Authors:

Thomas Bopp, Jonas Schulte and Thorsten Hampel

Abstract: The usage of CSCW systems for teaching, training, and research collaboration increases constantly, as they offer a time- and place-independent, as well as cost-effective, platform. The user’s search should not be restricted to local material; in fact, users benefit from different search environments, such as digital libraries, to find appropriate working material. Searching and further processing of documents currently imply a media break, since the search cannot be invoked directly from within current CSCW systems. This paper presents the first prototype of a CSCW system which enables users to search external sources without a media break. To support arbitrary search environments, no restrictions on data formats or search functionalities are allowed. Hence we have enhanced search environments with self-description capabilities in order to realize automatic binding of search environments in CSCW systems. By search environments we mean any system offering searchable knowledge bases, such as digital libraries or the CSCW system itself. Furthermore, our concept supports local search and searching in different external sources at the same time.
Download

Paper Nr: 825
Title:

EXPOSING WORKFLOWS TO LOAD BURSTS

Authors:

Dmytro Dyachuk and Ralph Deters

Abstract: Well-defined, loosely coupled services are the basic building blocks of the service-oriented design-integration paradigm. Services are computational elements that expose functionality (e.g. legacy applications) in a platform-independent manner and can be described, published, discovered, orchestrated and consumed across language, platform and organizational borders. Using service-orientation (SO), it is fairly easy to expose existing applications/resources and to aggregate them into novel services called composite services (CS). This aggregation is achieved by defining a workflow that orchestrates the underlying services in a manner consistent with the desired functionality. Since CS can aggregate atomic and other composite services, they foster the development of service layers and the reuse of already existing functionality. But by defining workflows, existing services are put into novel contexts and exposed to different workloads, which in turn can result in unexpected behaviors. This paper examines the behavior of sequential workflows that experience short-lived load bursts. Using workflows of varying length, the paper reports on the transformations that loads experience as they are processed by providers.
Download

Paper Nr: 834
Title:

DOING THINGS RIGHT OR DOING THE RIGHT THINGS? - Proposing a Documentation Scheme for Small to Medium Enterprises

Authors:

Josephina Antoniou, Panagiotis Germanakos and Andreas Andreou

Abstract: Preserving a system’s initial functionality and performance is indeed one of the major problems nowadays, due to the rapid increase and continuous change of customer demands. Hence, it is crucial to carry out a research analysis to identify whether documentation, the most reliable means of preserving a software system’s quality over the years, is properly created, updated and used in Small to Medium Enterprises (SMEs) operating in small EU markets, focusing both on the development process and on maintenance activities. The main objective of this paper is therefore to propose the minimum documentation set required to satisfy both Software Engineering principles and the practical needs of SMEs, by comparing literature suggestions with empirical findings. In further support of our suggested documentation set, we present and discuss the results of a small survey conducted in nine IT-oriented SMEs in Cyprus and Greece.
Download

Paper Nr: 843
Title:

OOPUS - A Production Planning Information System to Assure High Delivery Reliability Under Short-term Demand Changes and Production Disturbances

Authors:

Wilhelm Dangelmaier, Tobias Rust, Thomas Hermanowski, Daniel Brüggemann, Daniel Kaschula, Andre Döring and Thorsten Timm

Abstract: Batch-sizing and scheduling is the central decision problem in the area of production planning. A special challenge in this context is handling the large amount of data within an adequate time interval; appropriate techniques are required to aggregate this data and present it clearly. This paper presents a new approach that integrates a production planning table, visualized as a Gantt chart, with a cumulative quantity table for maximum information transparency in production planning. The solution discussed is realized in OOPUS, an object-oriented tool for planning and control, which has become the leading production planning system in two motor assembly plants of an international automobile manufacturer.
Download

Paper Nr: 858
Title:

USING FUZZY DATACUBES IN THE STUDY OF TRADING STRATEGIES

Authors:

Miguel Delgado, J. F. Núñez Negrillo, Eva Gibaja and Carlos Molina Fernández

Abstract: A fuzzy multidimensional model can be used for exploratory analysis, modeling complex concepts that are very difficult to capture with crisp models. Some problems, such as the edge problem, can be reduced using this approach. Hiding the complexity of the fuzzy logic is important in this situation. In this paper we present an application of a fuzzy multidimensional model, which uses a two-layer representation to hide the complexity from the user, to the study of trading strategies.
Download

Paper Nr: 864
Title:

STAH-TREE - Hybrid Index for Spatio Temporal Aggregation

Authors:

Marcin Gorawski and Michał Faruga

Abstract: This paper presents a new index that stores spatio-temporal data and provides efficient algorithms for processing range and time aggregation queries whose results are precise values, not approximations. In addition, the technique allows detailed information to be reached when required. Spatio-temporal data is defined as static spatial objects with non-spatial attributes changing over time. A range aggregation query computes an aggregation over the set of spatial objects that fall into a query window; its temporal extension allows additional time constraints to be defined. The index name, STAH-tree, stands for Spatio-Temporal Aggregation Hybrid Tree. The STAH-tree is based on two well-known indexing techniques: the R-tree and aR-tree for storing spatial data, and the MVB-tree for storing non-spatial attribute values. These techniques were extended with new functionality and adapted to work together. A cost model for node accesses was also developed.
Download

Paper Nr: 867
Title:

USING SEMANTIC WEB AND SERVICE ORIENTED TECHNOLOGIES TO BUILD LOOSELY COUPLED SYSTEMS - SWOAT – A Service and Semantic Web Oriented Architecture Technology

Authors:

Bruno Caires and Jorge Cardoso

Abstract: The creation of loosely coupled and flexible applications has been a challenge faced by most organizations. This has been important because organization systems need to quickly respond and adapt to changes that occur in the business environment. In order to address these key issues, we implemented SWOAT, a ‘Service and Semantic Web Oriented Architecture Technology’ based middleware. Our system uses ontologies to semantically describe and formalize the information model of the organization, providing a global and integrated view over a set of database systems. It also allows interoperability with several systems using Web Services. Using ontologies and Web services, clients remain loosely coupled from data sources. As a result, data structures can be changed and moved without having to change all clients, internal or external to the organization.
Download

Paper Nr: 892
Title:

A XML-BASED QUALITY MODEL FOR WEB SERVICES CERTIFICATION

Authors:

José Jorge L. Dias Jr., J. Adson O. G. da Cunha, Alexandre Alvaro, Roberto S. M. de Barros and Silvio Romero De Lemos Meira

Abstract: The Internet has made possible the development of software as services, consumed on demand and developed by third parties. In this sense, a quality model is necessary to enable evaluation and, consequently, reuse of services by consumers. Accordingly, this paper proposes a quality model based on the ISO 9126 standard, defining a set of attributes and metrics for an effective evaluation of Web services. An XML-based representation model was created to support this quality model, and a security schema was proposed to guarantee the integrity and authenticity of the model.
Download

Paper Nr: 897
Title:

PREFERENCE RULES IN DATABASE QUERYING

Authors:

Sergio Greco, Cristian Molinaro and Francesco Parisi

Abstract: The paper proposes the use of preferences for querying databases. In expressing queries it is natural to express preferences among the tuples belonging to the answer. In commercial DBMSs this can be done, for instance, by ordering the tuples in the result. The paper presents a different proposal, based on similar approaches deeply investigated in the artificial intelligence field, where preferences are used to restrict the result of queries posed over databases. In our proposal, a query over a database DB is a triple (q, P, Φ), where q denotes the output relation, P is a Datalog program (or an SQL query) used to compute the result, and Φ is a set of preference rules used to introduce preferences on the computed tuples. Tuples which are “dominated” by other tuples do not belong to the result and cannot be used to infer other tuples. A new stratified semantics is presented in which the program P is partitioned into strata and the preference rules associated with each stratum of P are divided into layers; the result of a query is computed one stratum at a time, applying the preference rules one layer at a time. We show that our technique is sound and that the complexity of computing queries with preference rules is still polynomial.
Download
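The dominance idea can be sketched as follows; the sample preference rule and all names here are assumed for illustration, not taken from the paper's formalism: tuples dominated by another tuple under a preference rule are removed from the result.

```python
# Illustrative dominance filtering: keep only tuples that no other tuple
# dominates under a given preference rule (assumed example, not the
# paper's stratified Datalog semantics).

def prefer_cheaper_same_city(a, b):
    """Sample rule: a dominates b if they share a city and a is cheaper."""
    return a["city"] == b["city"] and a["price"] < b["price"]

def undominated(tuples, dominates):
    """Keep only tuples not dominated by any other tuple."""
    return [t for t in tuples
            if not any(dominates(u, t) for u in tuples if u is not t)]

hotels = [
    {"name": "A", "city": "Rome", "price": 90},
    {"name": "B", "city": "Rome", "price": 120},
    {"name": "C", "city": "Pisa", "price": 80},
]
best = undominated(hotels, prefer_cheaper_same_city)
print([h["name"] for h in best])  # ['A', 'C']
```

In the paper's semantics this filtering is interleaved with evaluation, stratum by stratum and layer by layer, so that dominated tuples also cannot be used to derive further tuples.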

Paper Nr: 903
Title:

DIMENSION HIERARCHIES UPDATES IN DATA WAREHOUSES - A User-driven Approach

Authors:

Cécile Favre, Fadila Bentayeb and Omar Boussaid

Abstract: We designed a data warehouse for the French bank LCL to meet users’ needs regarding marketing operation decisions. However, the nature of the users’ work implies that their requirements change often. In this paper, we propose an original and global approach to achieve user-driven model evolution that provides answers to personalized analysis needs. We developed a prototype called WEDrik (data Warehouse Evolution Driven by Knowledge) within the Oracle 10g DBMS and applied our approach to banking data from LCL.
Download

Paper Nr: 924
Title:

TRANSACTION SERVICE COMPOSITION - A Study of Compatibility Related Issues

Authors:

Anna-Brith Arntsen and Randi Karlsen

Abstract: Different application domains have varying transactional requirements. Such requirements must be met by applying adaptability and flexibility within transaction processing environments. ReflecTS is such an environment, providing flexible transaction processing through the ability to select and dynamically compose a transaction service suitable for each particular transaction execution. A transaction service (TS) can be seen as a composition of a transaction manager (TM) and a number of involved resource managers (RMs). Dynamic transaction service composition raises the need to examine Vertical Compatibility between the components in a TS. In this work, we present a novel approach to service composition that evaluates Vertical Compatibility between a TM and RMs, covering both Property and Communication compatibility.
Download

Paper Nr: 926
Title:

SEMANTIC ORCHESTRATION MERGING - Towards Composition of Overlapping Orchestrations

Authors:

Clémentine Nemo, Mireille Blay, Michel Riveill and Günter Kniesel

Abstract: Service-oriented architectures foster the evolution of enterprise information systems by supporting loose coupling and easy composition of services. Unfortunately, current approaches to service composition are inapplicable to services that share subservices or data. In this paper, we define overlapping orchestrations, analyze the problems that they pose to existing composition approaches and propose orchestration merging, a novel, interactive approach to the composition of overlapping orchestrations based on their semantics.
Download

Paper Nr: 971
Title:

WFESelector - A Tool for Comparing and Selecting Workflow Engines

Authors:

Karim Baina

Abstract: The task of selecting a workflow engine is becoming more and more complex and risky. For this reason, organizations require a broad and clear vision of which workflow engines are, and will continue to be, suitable for changing requirements. This paper presents a workflow engine comparison model to analyze, compare, and select business process management modeling and enactment engines (workflow engines, or WFEs) according to user-specific requirements. After describing the underlying model itself, we present its implementation in our multi-criteria workflow engine comparison and selection prototype, WFESelector. The latter offers two scenarios for selecting relevant WFEs: dynamically expressing multi-criteria queries over a WFE evaluation database, or browsing the whole WFE classification through a reporting, aggregation-based dashboard. WFESelector is then used to assess criteria satisfaction on a large number of open source workflow engines (as many as 35).
Download

Paper Nr: 981
Title:

PIN: A PARTITIONING & INDEXING OPTIMIZATION METHOD FOR OLAP

Authors:

Ricardo Santos and Jorge Bernardino

Abstract: Optimizing the performance of OLAP queries in relational data warehouses (DW) has always been a major research issue. Various techniques can be used to achieve this, such as data partitioning, indexing, data aggregation, data sampling, and redefinition of database (DB) schemas, among others. In this paper we present a simple and easy-to-implement method which links partitioning and indexing, based on the features present in predefined major decision-making queries, to efficiently optimize a data warehouse’s performance. The method is evaluated using the TPC-H benchmark, comparing it with standard partitioning and indexing techniques and demonstrating its efficiency in single-user and multiple simultaneous user scenarios.
Download

Paper Nr: 984
Title:

SIMPLIFIED QUERY CONSTRUCTION - (Queries Made as Easy as Possible)

Authors:

Brad Arshinoff, Damon Ratcliffe, Martin Saetre, Reda Alhajj and Tansel Ozyer

Abstract: QMAEP (Queries Made as Easy as Possible) is intended to be a system that greatly simplifies the process of query construction for statisticians and researchers. This document focuses on the usability of the database query language and deals with visual representations of the query process, in particular the select query. Methods of integrating simple graphical user interfaces (GUIs) for building queries into pre-existing database forms are explored to give users an intuitive method for query construction. This paper explores data mining as it pertains to clinical research, with emphasis on simplifying the extraction of data from complex databases so as to accommodate analysis using statistical software such as SASS, QMath and MS Excel.
Download

Paper Nr: 1050
Title:

THE CONCEPTUAL FRAMEWORK FOR BUSINESS PROCESS INNOVATION - Towards a Research Program on Global Supply Chain Intelligence

Authors:

Charles Møller

Abstract: Industrial supply chains today are globally scattered, and nearly all organizations rely on their Enterprise Information Systems (EIS) for the integration and coordination of their activities. In this context, innovation in a global supply chain must be driven by advanced information technology. This paper proposes a research program on Global Supply Chain Intelligence. The paper argues that a conceptual framework for Business Process Innovation (BPI) is required to approach innovations in a global supply chain. A research proposal based on five interrelated topics is derived from the framework. The research program is intended to establish the conceptual framework for BPI, to develop it further, and to apply it in a global supply chain context.
Download

Paper Nr: 1059
Title:

MEDIATION FRAMEWORK FOR ENTERPRISE INFORMATION SYSTEM INFRASTRUCTURES: - Application-driven Approach

Authors:

Leonid Kalinichenko, Dmitry Briukhov, Dmitry Martynov, Nikolay Skvortsov and Sergey Stupnikov

Abstract: This position paper provides a short summary of the results obtained so far on an application-driven approach to mediation-based EIS development. This approach has significant advantages over the conventional, information-source-driven approach. Basic methods for the application-driven approach are discussed, including methods for synthesizing canonical information models that unify the languages of various kinds of heterogeneous information sources in one extensible model, methods for identifying sources relevant to an application and registering them at the mediator applying GLAV techniques, as well as methods for reconciling ontological contexts. The methodology of EIS application development according to the approach is briefly discussed, emphasizing the importance of a mediator consolidation phase by the respective community, of formulating application problems in the canonical model, and of rewriting them into requests to the registered information sources. The technique presented is planned to be used in various EIS and information systems.
Download

Paper Nr: 1065
Title:

AN INSERTION STRATEGY FOR A TWO-DIMENSIONAL SPATIAL ACCESS METHOD

Authors:

Wendy Osborn and Ken Barker

Abstract: This paper presents the 2DR-tree, a novel approach for accessing spatial data. The 2DR-tree uses nodes that are the same dimensionality as the data space. All spatial relationships between objects are preserved. A validity rule ensures that every node preserves the spatial relationships among its objects. The proposed insertion strategy adds a new object by recursively partitioning the space occupied by a set of objects. A performance evaluation shows the advantages of the 2DR-tree and identifies issues for future consideration.
Download

Paper Nr: 104
Title:

A NEW LOOK INTO DATA WAREHOUSE MODELLING

Authors:

Nikolay Nikolov

Abstract: The dominating paradigm of data warehouse design is the star schema (Kimball, 1996). For years, the main debate within the scientific community has been not whether this paradigm is really the only way but, rather, about its details (e.g. “to snowflake or not to snowflake” – Kimball et al., 1998). Shifting the emphasis of the discourse entirely within the star schema paradigm prevents the search for better alternatives. We argue that the star schema paradigm is an artifact of the transactional perspective and does not account for the analytic perspective. The most popular formalized method for deriving the star schema (Golfarelli et al., 1998) underlines just that by taking only the entity-relationship model (ERM) as input. Although this design approach follows the natural data and work flow, it does not necessarily offer the best performance. The main thrust of our argument is that the query model should be used on a par with the ERM as a starting point in the data warehouse design process. The rationale is that the end design should reflect not just the structure inherent in the data model, but also that of the expected workload. Such an approach results in a schema which may look very different from the traditional star schema, but the performance improvement it can achieve justifies going off the beaten track.
Download

Paper Nr: 214
Title:

A DATABASE INTEGRATION SYSTEM BASED ON GLOBAL VIEW GENERATION

Authors:

Uchang Park and Ramon Lawrence

Abstract: Database integration is a common and growing challenge with the proliferation of database systems, data warehouses, data marts, and other OLAP systems in organizations. Although there are many methods of sharing data between databases, true interoperability of database systems requires capturing, comparing, and merging the semantics of each system. In this work, we present a database integration system that improves on the database federation architecture by allowing domain administrators to simply and efficiently capture database semantics. The semantic information is combined using a tool for producing a global view. Building the global view is the bottleneck in integration because there are few tools that support its construction, and these tools often require sophisticated knowledge and experience to operate properly. The technique and tool presented are simple and powerful enough to be used by all database administrators, yet expressive enough to support the majority of integration queries.
Download

Paper Nr: 258
Title:

TEXT ANALYTICS AND DATA ACCESS AS SERVICES - A Case Study in Transforming a Legacy Client-server Text Analytics Workbench and Framework to SOA

Authors:

E. Maximilien, Ying Chen, Ana Lelescu, James Rhodes, Jeffrey Kreulen and Scott Spangler

Abstract: As business information is made available via the intranet and Internet, there is a growing need to quickly analyze the resulting mountain of information to infer business insights; for instance, analyzing a company’s patent database against another’s to find the patents that are cross-licensable. IBM Research’s Business Insight Workbench (BIW) is a text mining and analytics tool that allows end-users to explore, understand, and analyze business information in order to arrive at such insights. However, the first incarnation of BIW used a thick-client architecture with a database back-end. While very successful, the architecture limited the tool’s flexibility, scalability, and deployment. In this paper we discuss our initial experiences in converting BIW to a modern Service-Oriented Architecture. We also provide some insights into our design choices and outline some lessons learned.
Download

Paper Nr: 401
Title:

INCENTIVES AND OBSTACLES IN IMPLEMENTING INTER-ORGANISATIONAL INTEROPERABILITY

Authors:

Raija Halonen and Veikko Halonen

Abstract: This paper explores the incentives and obstac