ICEIS 2022 Abstracts

Area 1 - Databases and Information Systems Integration

Full Papers

Paper Nr:	68
Title:	An Iterated Local Search for a Pharmaceutical Storage Location Assignment Problem with Product-cell Incompatibility and Isolation Constraints
Authors:	Nilson F. M. Mendes, Beatrice Bolsi and Manuel Iori
Abstract:	In healthcare supply chain, centralised warehouses are used to store large amounts of products close to hospitals and pharmacies in order to avoid shortages and reduce storage costs. To reach these objectives, the warehouses need to have efficient order retrieval and dispatch procedures, as well as a storage allocation policy able to guarantee the safe keeping of items. Considering this scenario, we present a Storage Location Assignment Problem with Product-Cell Incompatibility and Isolation Constraints, that models the targets and restrictions of a storage policy in a pharmaceutical product warehouse. In this problem, we aim to minimise the total distance travelled by the order pickers to recover all products required in a set of orders. We propose an Iterated Local Search algorithm to solve the problem, and present numerical experiments based on simulated data. The results show a relevant improvement with respect to a greedy full turnover procedure commonly adopted in real life operations.
Download

Paper Nr:	83
Title:	FAIR Principles and Big Data: A Software Reference Architecture for Open Science
Authors:	João P. C. Castro, Lucas M. F. Romero, Anderson C. Carniel and Cristina D. Aguiar
Abstract:	Open Science pursues the assurance of free availability and usability of every digital outcome originated from scientific research, such as scientific publications, data, and methodologies. It motivated the emergence of the FAIR Principles, which introduce a set of requirements that contemporary data sharing repositories must adopt to provide findability, accessibility, interoperability, and reusability. However, implementing a FAIR-compliant repository has become a core problem due to two main factors. First, there is a significant complexity related to fulfilling the requirements since they demand the management of research data and metadata. Second, the repository must be designed to support the inherent big data complexity of volume, variety, and velocity. In this paper, we propose a novel FAIR-compliant software reference architecture to store, process, and query massive volumes of scientific data and metadata. We also introduce a generic metadata warehouse model to handle the repository metadata and support analytical query processing, providing different perspectives of data insights. We show the applicability of the architecture through a case study in the context of a real-world dataset of COVID-19 Brazilian patients, detailing different types of queries and highlighting their importance to big data analytics.
Download

Paper Nr:	86
Title:	Where Is the Internet of Health Things Data?
Authors:	Evilasio Costa Junior, Rossana M. C. Andrade, Amanda D. P. Venceslau, Pedro Almir M. Oliveira, Ismayle S. Santos and Breno S. Oliveira
Abstract:	The advent of Internet of Things (IoT) and the smart objects popularization have boosted the data generation in many areas. Data have then become increasingly valuable as they can be used to “teach” machines to perform the most varied tasks. Health is among the areas that have benefited from such data, because there is, for example, a need for solutions that optimize the cost-benefit ratio of health systems. In this scenario, the Internet of Health Things (IoHT) uses smart sensors to collect patient data and intelligent algorithms to process this data for improving patient Quality of Life. However, researchers and practitioners have faced difficulties in finding and using public health care data sensor repositories. Therefore, we conducted a systematic multivocal review of IoHT databases to identify and characterize the existing datasets. We also bring as a contribution of this paper a set of guidelines about how new IoHT data repositories can be structured.
Download

Paper Nr:	155
Title:	An Application of the Analytic Hierarchy Process to the Evaluation of Companies’ Data Maturity
Authors:	Simone Malacaria, Andrea De Mauro, Marco Greco and Michele Grimaldi
Abstract:	The study reports the data maturity evaluation on a sample of Italian firms of different sectors and sizes, retrieved through an online assessment made by 261 professionals and entrepreneurs operating in the data domain. The paper's objective is to derive the relative importance of the critical factors to impact successful big data initiatives, according to organization reality and manager perspective. The questionnaire was distributed among IT professionals and decision-makers in Italy using the LinkedIn platform. The assessment was divided into two sections: the 1st one contained the assessment of 8 critical success factors for big data, whereas the 2nd section assessed weights based on an application of the analytic hierarchy process. The result of this process is a scoring system that includes the characteristics a company "must-have" to become data-oriented and make data-driven decisions. The application of the weights allows giving more importance to the domains that managers think are more important in a data-driven company. Respondents agreed to the importance of integrated architecture, data-friendly corporate culture, and integrated organization domains. Once the results consider the weights from the AHP, data friendliness becomes the most sought-after characteristic. The findings provide direction for further development of the assessment system.
Download

Paper Nr:	172
Title:	A Data Quality Management Framework to Support Delivery and Consultancy of CRM Platforms
Authors:	Renee Albrecht, Sietse Overbeek and Inge van de Weerd
Abstract:	CRM platforms heavily depend on high-quality data, where poor-quality data can negatively influence its adoption. Additionally, these platforms are increasingly interconnected and complex to meet growing needs of customers. Hence, delivery and consultancy of CRM platforms becomes highly complex. In this study, we propose a CRM data quality management framework that supports CRM delivery and consultancy firms to improve data quality management practices within their projects. The framework should also improve data quality within CRM solutions for their clients. We extract best practices for CRM data quality management by means of a literature study on data quality definition and measurement, data quality challenges, and data quality management methods. In a case study at an IT consultancy company, we investigate how CRM delivery and consultancy projects can benefit from the incorporation of data quality management practices. The design of the framework is validated by means of confirmatory focus groups and a questionnaire. The results translate into a framework that provides a high-level overview of data quality management practices incorporated in CRM delivery and consultancy projects. It includes the following components: Client profiling, project definition, preparation, migration/integration, data quality definition, assessment, and improvement.
Download

Short Papers

Paper Nr:	4
Title:	An Extensible Framework for Data Reliability Assessment
Authors:	Óscar Oliveira and Bruno Oliveira
Abstract:	Data Warehouse (DW) and Data Lake (DL) systems are mature and widely used technologies to integrate data for supporting decision-making. They support organizations to explore their operational data that can be used to take competitive advantages. However, the amount of data generated by humans in the last 20 years increased exponentially. As a result, the traditional data quality problems that can compromise the use of analytical systems, assume a higher relevance due to the massive amounts and heterogeneous formats of the data. In this paper, an approach for dealing with data quality is described. Using a case study, quality metrics are identified to define a reliability indicator, allowing the identification of poor-quality records and their impact on the data used to support enterprise analytics.
Download

Paper Nr:	15
Title:	Exploring User-centered Requirements Validation and Verification Techniques in a Social Inclusion Context
Authors:	Emille Catarine Rodrigues Cançado, Ian Nery Bandeira, Pedro Henrique Teixeira Costa, José Fortes Neto, Daniel Carvalho Moreira, Luís Amaral and Edna Dias Canedo
Abstract:	The activities that comprise a requirements engineering process involve elicitation, modeling, validation, and verification of requirements, and these activities tend to be more communication and interaction-intensive than others during the software development process. This paper presents an experience report on requirements validation and verification techniques applied to a mobile application project developed for the Brazilian prison system’s former inmates, aiming to support them in their resocialization process. Besides, it presents the decisions we made in agreement with the project stakeholders to guarantee the end-users data privacy. Our results show that even with the Covid-19 pandemic and social isolation restrictions, it was possible to apply the requirements validation techniques. Furthermore, the mobile application’s acceptance tests with both stakeholders and the end-users demonstrate that the developers duly followed the privacy guidelines. Finally, all privacy requirements comply with the stakeholders and the application’s end-users needs and are under the Brazilian General Data Protection Law (LGPD).
Download

Paper Nr:	38
Title:	DMISTA: Conceptual Data Model for Interactions in Support Ticket Administration
Authors:	Christian Mertens and Andreas Nürnberger
Abstract:	Changing business models and dynamic markets in the globally connected world results in more and more complex system environments. The IT service infrastructure as enabler of innovative business models has to support these innovations by providing agile methods to quickly adapt to new use-cases. This underlines the need to manage the digitized environment systematically in order to foster efficiency. IT Service Management (ITSM) as a discipline evolved and now provides the framework to orchestrate the complexity in Information Technology. The activities, processes, and capabilities to maintain the portfolio are served by individuals, who interact with each other. There is an emphasized need for identifying, acquiring, organizing, storing, retrieving, and analyzing data related to human interaction processes to support finally the business processes. This paper proposes a conceptual data model to capture information about human interactions during support ticket administration (DMISTA). The presented model-structure and -requirements allow for efficient selection of appropriate data for various data science use-cases to understand and optimize business processes. The DMISTA supports different types of relationships (based on causality, joint cases, and joint activities) to enable efficient processing of specific analysis methods. The applicability of the model is shown based on a typical use-case.
Download

Paper Nr:	39
Title:	An Evolutionary Algorithm for Task Scheduling in Crowdsourced Software Development
Authors:	Razieh Saremi, Hardik Yardik, Julian Togelius, Ye Yang and Guenther Ruhe
Abstract:	The complexity of software tasks and the uncertainty of crowd developer behaviors make it challenging to plan crowdsourced software development (CSD) projects. In a competitive crowdsourcing marketplace, competition for shared worker resources from multiple simultaneously open tasks adds another layer of uncertainty to potential outcomes of software crowdsourcing. These factors lead to the need for supporting CSD managers with automated scheduling to improve the visibility and predictability of crowdsourcing processes and outcomes. To that end, this paper proposes an evolutionary algorithm-based task scheduling method for crowdsourced software development. The proposed evolutionary scheduling method uses a multiobjective genetic algorithm to recommend optimal task start date. The method uses three fitness functions, based on project duration, task similarity, and task failure prediction, respectively. The task failure fitness function uses a neural network to predict the probability of task failure with respect to a specific task start date. The proposed method then recommends the best tasks’ start dates for the project as a whole and each individual task so as to achieve the lowest project failure ratio. Experimental results on 4 projects demonstrate that the proposed method has the potential to reduce project duration by a factor of 33-78%.
Download

Paper Nr:	44
Title:	Semantic Metadata Requirements for Data Warehousing from a Dimensional Modeling Perspective
Authors:	Susanna E. S. Campher
Abstract:	The era of big data has brought on new challenges to data warehousing. Emerging architectural paradigms such as data fabric, data mesh, lakehouse and logical data warehouse are promoted as solutions to big data analytics challenges. However, such hybrid environments, aimed at offering universal data platforms for analytics, have schemas that tend to grow in size and complexity and become more dynamic and decentralized, having a drastic impact on data management. Data integrity, consistency and clear meaning are compromised in large architectures where traditional (relational) database principles do not apply. This paper proposes an investigation into semantic metadata solutions in modern data warehousing from a (logical) dimensional modeling perspective. The primary goal is to determine which metadata and types of semantics are required to support automated dimensionalization as it is assumed to be a good approach to integrate data with different modalities. A secondary goal is finding a suitable model to represent such metadata and semantics for both human and computer interpretability and use. The proposal includes a description of the research problem, an outline of the objectives, the state of the art, the methdology and assumptions, the exepected outcome and current stage of the research.
Download

Paper Nr:	46
Title:	A Methodology for Aligning Process Model Abstraction Levels and Stakeholder Needs
Authors:	Dennis G. J. C. Maneschijn, Rob H. Bemthuis, Faiza A. Bukhsh and Maria-Eugenia Iacob
Abstract:	Process mining derives knowledge of the execution of processes through analyzing behavior as observed from real-life events. While benefits of process mining are widely acknowledged, finding an adequate level of detail at which a mined process model is suitable for a specific stakeholder is still an ongoing challenge. Process models can be mined at different levels of abstraction, often resulting in either highly complex or highly abstract process models. This may have an important impact on the comprehensibility of the process model, which can also differ from the perspective of a particular stakeholder. To address this problem from a stakeholder-centric perspective, we propose a methodology for determining an appropriate level of process model abstraction. To this end, we use quantitative metrics on process models as well as a qualitative evaluation by using a technology acceptance model (TAM). A logistics case study involving the fuzzy process mining discovery algorithm shows initial evidence that the use of appropriate abstraction levels is key when considering the needs of various stakeholders.
Download

Paper Nr:	60
Title:	Formal Concept Analysis Applied to a Longitudinal Study of COVID-19
Authors:	Paulo Lana, Cristiane Nobre, Luis Zarate and Mark Song
Abstract:	The COVID-19 pandemic, and consequently the difficulty of obtaining feedback on the effectiveness of contamination prevention methods, has caused an increased need to produce a relevant and consistent analysis from collected data. Through Formal Concept Analysis, applying the triadic approach, called Triadic Concept Analysis (TCA), it is possible to evaluate the correlation between prevention measures and the number of contaminated people by performing concept extraction and implication rules. The advantage of using this method is the possibility of correlating the waves, which allows us to explain and understand the evolution of the data over the collection waves, helping us draw a more assertive conclusion from the data analyzed. This paper uses the data collected from the 2020 National Population Survey of Nigeria to depict how Nigerian society’s essential and everyday behaviors impacted the evolution of the COVID-19 pandemic in that country. The results obtained from this research can assist governments, and public entities in developing better public policies to combat highly infectious diseases. Furthermore, it provides practical evidence of how TCA can be applied, bringing benefits to different areas and fields of science.
Download

Paper Nr:	69
Title:	Application of Formal Concept Analysis and Data Mining to Characterize Infant Mortality in Two Regions of the State of Minas Gerais
Authors:	Deivid Santos, Cristiane Nobre, Luis Zarate and Mark Song
Abstract:	Infant mortality is characterized by the death of children under one year, a problem that affects a large part of the world population. This article applies the Formal Concept Analysis (FCA), a mathematical technique used in data analysis to characterize infant mortality in two regions of Minas Gerais state - Brazil: Belo Horizonte and Vale do Jequitinhonha. The Metropolitan Region of Belo Horizonte has the best human development rate, and Vale do Jequitinhonha has the worst social equality. The relationships between attributes and victims are identified through association rules and implications.
Download

Paper Nr:	72
Title:	Blending Topic-based Embeddings and Cosine Similarity for Open Data Discovery
Authors:	Maria Helena Franciscatto, Marcos Didonet Del Fabro, Luis Carlos Erpen de Bona, Celio Trois and Hegler Tissot
Abstract:	Source discovery aims to facilitate the search for specific information, whose access can be complex and dependent on several distributed data sources. These challenges are often observed in Open Data, where users experience lack of support and difficulty in finding what they need. In this context, Source Discovery tasks could enable the retrieval of a data source most likely to contain the desired information, facilitating Open Data access and transparency. This work presents an approach that blends Latent Dirichlet Allocation (LDA), Word2Vec, and Cosine Similarity for discovering the best open data source given a user query, supported by joint union of the methods’ semantic and syntactic capabilities. Our approach was evaluated on its ability to discover, among eight candidates, the right source for a set of queries. Three rounds of experiments were conducted, alternating the number of data sources and test questions. In all rounds, our approach showed superior results when compared with the baseline methods separately, reaching a classification accuracy above 93%, even when all candidate sources had similar content.
Download

Paper Nr:	75
Title:	ICT Development and Food Consumption: An Impact of Online Food Delivery Services
Authors:	Gunawan
Abstract:	The online food delivery (OFD) service has grown globally. The growth of OFD depends on the ICT development where the online business could grow in a region. This study departs from the question, "does ICT development impact the food consumption of society?" The answer is likely to provide evidence for the ICT impact on a new issue: food consumption. In the Indonesian context, the study objectives are: (1) to investigate the pattern of ICT development among provinces in Indonesia; (2) to investigate the pattern of food consumption indicators among provinces; (3) to cluster provinces based on ICT development and food consumption; (4) to deploy a predictive model into another dataset. This study takes place in Indonesia, where the OFD revenue was projected about $800 million by 2021. This secondary and quantitative research adopted a data mining approach by analyzing data of ICT development and food consumption among Indonesian provinces. The clustering analysis indicated that provinces with higher ICT development have higher food consumption. The result is likely to explain the impact of the OFD growth. Managers of OFD platforms might use the finding to decide which provinces to focus on for their marketing strategy. As a prominent actor for ICT development, the government might use the result to formulate a better plan to improve ICT access. This study suggests that the government and OFD platforms promote healthy food eating to improve public health. The use of official statistics and data mining approach provides this research with generalized findings at the country level. Further studies are needed to obtain a generalization of the results beyond Indonesia.
Download

Paper Nr:	80
Title:	Entity-relationship Modeling Tools and DSLs: Is It Still Possible to Advance the State of the Art from Observations in Practice?
Authors:	Jonnathan Lopes, Maicon Bernardino, Fábio Basso and Elder Rodrigues
Abstract:	The variety of database system technologies that became available in recent years makes difficult the selection of tools for modeling entity-relationship (ER). The published mapping studies on this topic date back to 2000, thus outdated and limited to guide designers towards the recent innovations selection for the design and implementation of databases. In this sense, we contribute with an overview of the recent innovations through a systematic literature mapping complemented by research in the gray literature. This paper scopes ten (10) primary studies focused on Domain-Specific Languages (DSL) and identifies fifty-five (55) tools already applied in industry and academia for ER modeling at the conceptual, logical, and physical level. Hence, as a significant increment to existing mapping studies, this presents the state-of-the-art and practice for ER modeling, including its characterization and research gaps.
Download

Paper Nr:	109
Title:	Towards Unlocking the Potential of the Internet of Things for the Skilled Crafts
Authors:	André Pomp, Andreas Burgdorf, Alexander Paulus and Tobias Meisen
Abstract:	The Internet of Things (IoT) enables companies to develop new digital business models or optimize existing processes through digitalization. Since value creation in the skilled crafts is determined by the manufacturing of material products and the provision of associated services, it is predestined for the use of IoT technologies. While these services are increasingly finding their way into the consumer market via industrial providers, the local skilled crafts with its small and medium-sized businesses lacks the knowledge to assess the potential and to adequately develop and operate IoT solutions. Our aim is to develop a manufacturer-independent platform that enables those businesses to implement IoT solutions independently. The platform is intended to support craftsmen in identifying suitable IoT use cases and resulting business models for their trades, products or services. For that, it will provide an overview of the components (e.g., sensors) required for the respective use cases. Based on this, the use cases can be set up without special know-how with the help of the platform, which also collects and manages the accruing data. Each craft business can identify new use cases and make them available to the other users of the platform, thus creating a cross-trade solution from the skilled crafts for the skilled crafts. Potential IoT use cases and their technical requirements were already identified in a pre-study in collaborative hackathons and will be evaluated during the development of the IoT Crafts Portal. The results are intended to assist craft businesses from a wide range of trades to identify and implement new business models in such a way that they can be integrated into the existing processes of the businesses. At the same time, there will be a transfer of knowledge between the craft businesses themselves, since each can use the experience of the other business or offer its own insights.
Download

Paper Nr:	111
Title:	A Platform to Generate FAIR Data for COVID-19 Clinical Research in Brazil
Authors:	Vânia Borges, Natalia Queiroz de Oliveira, Henrique F. Rodrigues, Maria Luiza Machado Campos and Giseli Rabello Lopes
Abstract:	The COVID-19 pandemic and the global actions to address it have highlighted the importance of clinical care data for more detailed studies of the virus and its effects. Extracting and processing such data, in terms of confidentiality issues, is a challenge. In addition, the mechanisms necessary for their publication are aimed at reuse in research to better understand the effects of this pandemic or other viral outbreaks. This paper describes a modular, scalable, distributed, and flexible platform, based on a generic architecture, to promote the publication of FAIR clinical research data. This platform collects heterogeneous data from Electronic Health Records, transforms these data into interconnected and interoperable (meta)data that are processable by software agents, and publishes them through technological solutions such as repositories and FAIR Data Point.
Download

Paper Nr:	120
Title:	Analyzing the Determinant Characteristics for a Good Performance at ENADE Brazilian Exam Stratified by Teaching Modality: Face-to-face versus Online
Authors:	Eric Gondran, Giancarlo Lucca, Rafael Berri, Helida Santos and Eduardo N. Borges
Abstract:	The National Student Performance Exam (ENADE) annually evaluates different Brazilian higher education courses. This exam considers both face-to-face and distance learning courses. Distance learning is growing increasingly, especially during the coronavirus (COVID-19) pandemic. This study applies different techniques for selecting ENADE 2018 database characteristics, like information gain, gain rate, symmetric uncertainty, Pearson correlation, and relief F. The objective of the work is to discover which personal and socioeconomic characteristics are decisive for the student’s performance at ENADE, whether the student is in the context of Distance Education or face-to-face. It can be concluded, among other results, that: the father’s level of education directly influences performance; the higher the income, the better the performance; and white students have better performance than black and brown-skinned ones. Thus, the results obtained in this study may initiate analyzes of public policies towards improving performance at ENADE.
Download

Paper Nr:	141
Title:	Steiner Tree-based Collaborative Learning Group Formation in Trust Networks
Authors:	Yifeng Zhou, Shichao Lin and Qi Zhao
Abstract:	Group formation is one of the key problems for collaborative learning, i.e., how to allocate agents (learners) to appropriate groups in order to improve the learning utility of the system. Previous works often focus on investigating the potential factors that may influence the agent’s learning utility from the perspective of intrinsic attributes of agents; however, the structural attributes of groups are rarely considered. Considering that trust is an important interactive and cognitive attribute in collaborative learning, which can influence not only the incentive of learners collaborating in a group but also the promotion of skills of agents in knowledge sharing, this paper studies the collaborative learning group formation problem in trust networks. We propose a Steiner tree-based group formation algorithm, which first allocates appropriate agents to groups as initiators by considering the skill mastery and the strength of trust in the groups to guarantee the opportunities for skill promotion and then select followers by searching locally in the trust network. Through experiments based on real-world network datasets, we validate the performance of the proposed algorithm by comparing to several benchmarks, e.g., the graph partitioning-based group formation algorithm and the simulated annealing-based group formation algorithm.
Download

Paper Nr:	145
Title:	Integration of the Autonomous Open Data Prediction Framework in ERP Systems
Authors:	Janis Peksa and Janis Grabis
Abstract:	Enterprise resource planning (ERP) systems are large modular enterprise applications that are designed to execute the majority of enterprise business processes with a focus on transaction processing. Business processes, on the other hand, frequently necessitate complex decision-making. If data processing logic requires complex analytical calculations and domain specific knowledge, is is considered as complex. To externalize the analytical calculations and decouple them from the core ERP system, this paper elaborates an integration framework fererred as to Autonomous Open Data Prediction Framework (AODPF). The AODPF provides advanced prediction capabilities to ERP systems. It uses data integration and processing as well as best model selection functions to generate predictions passed to the ERP system for decision-making purposes. The framework is experimentally evaluated by prediction road conditions for the case of winter road maintenance. The utility of the framework is evaluated in the expert survey.
Download

Paper Nr:	156
Title:	Integrity: An Object-relational Framework for Data Security
Authors:	Elder Costa, João Pedro Lorenzo de Siqueira, Carlos Eduardo Pantoja and Nilson Mori Lazarin
Abstract:	Considering the recent laws that discuss how user data must be collected, treated, stored, and protected, designing and developing projects considering the information security in application systems is necessary. Given that cryptographic functions available in database management systems are limited to some data types, this work proposes an object-relational framework to add data security using a process to mask data in the persistence layer of a layered application.
Download

Paper Nr:	179
Title:	Machine Learning Performance on Predicting Banking Term Deposit
Authors:	Nguyen Minh Tuan
Abstract:	With the expansion of epidemic diseases and after the crises of the economy in the world, choosing financial deposits for many purposes is very helpful. To identify a customer whether deposit or not, based on the information given to analyze and predict, it is becoming increasingly difficult for banks to identify whether customer that is right for them. Many banks will be reconfigured beyond recognition to attract customers, while others are facing a shortage drawing customers to maintain the business as a corollary of advances in particular. To serve customers with the information needed to select a suitable deposit in such a rapidly evolving and competitive arena requires more than merely following one’s passion. We assert such information may be derived by analyzing some descriptions using deep neural network models, a novel approach to identifying the descriptions about age, job, marital, education, default, balance, housing, loan, contact, day, month, duration, campaigns, pdays, previous, outcome, deposit (y) in choosing an appropriate deposit customer. There have been some researchers written about this prediction but they just focused on algorithms models instead of concentrating on deep machine learning. In this paper, we will muster up algorithms using the models on deep machine learning with Long-Short Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional Long-Short Term Memory (BiLSTM), Bidirectional Gated Recurrent Unit (BiGRU), and Simple Recurrent Neuron Network (SimpleRNN). The result will suggest suitable customers based on the information given. The results showed that Gated Recurrent Unit (GRU) reaches the best accuracy with 90.08% at epoch 50th, and the following is the Bidirectional Long-Short Term Memory (BiLSTM) model with 90.05% at epoch 50th. The results will be helpful for the banks to confirm whether the customers could deposit or not.
Download

Paper Nr:	195
Title:	Social Bots Detection: A Method based on a Sentiment Lexicon Learned from Messages
Authors:	Samir de O. Ramos, Ronaldo R. Goldschmidt and Alex de V. Garcia
Abstract:	The use of bots on social networks for malicious purposes has grown significantly in recent years. Among the last generation techniques used in the automatic detection of social bots, are those that take into account the sentiment existing in the messages propagated on the network. This information is calculated based on sentiment lexicons with content manually annotated and, hence, susceptible to subjectivity. In addition, words are analyzed in isolation, without taking into account the context in which they are inserted, which may not be sufficient to express the sentiment existing in the sentence. With these limitations, this work raises the hypothesis that the automatic detection of social bots that considers the sentiment characteristics of the words of the messages can be improved if these characteristics were previously learned by machines from the data, instead of using manually annotated lexicons. To verify such hypothesis, this work proposes a method that detects bots based on Sentiment-Specific Word Embedding (SSWE), a lexicon of sentiment learned by a homonymous recurrent neural network, trained in a large volume of messages. Preliminary experiments carried out with data from Twitter have generated evidence that suggests the adequacy of the proposed method and confirms the raised hypothesis.
Download

Paper Nr:	202
Title:	Performance of Raspberry Pi during Blockchain Execution using Proof of Authority Consensus
Authors:	J. A. Guerra, J. I. Guerrero, A. Gallardo, D. F. Larios and C. León
Abstract:	Raspberry Pi is one of the most popular devices for research in many different fields. It is proposed to analyse its performance as a lightweight blockchain node. This could enable the Raspberry Pi to execute other tasks and the same time, like data acquisition or working as an Internet of Things node, without losing performance. To achieve this, a specific consensus protocol is used to light the processing load. This testbed is evaluated in several benchmarks, whose results clarify the limits of this device as a lightweight blockchain node.
Download

Paper Nr:	23
Title:	Data Sharing for Fraud Detection in Insurance: Challenges and Possibilities
Authors:	Carl Christophe Louis Søilen-Knutsen and Bjørnar Tessem
Abstract:	Digital development has opened up new tools to enable innovation, one of the options being data sharing among businesses. This paper addresses data sharing in the insurance industry and its innovation potential through a case study from a Norwegian data sharing project. The goal of the studied project is to achieve cross-company data sharing and with that enabling more efficient insurance fraud detection. We look at what requirements need to be fulfilled for data sharing to be implemented and what kind of challenges such a data sharing project meets. We analyse interview data from project participants and systematize their opinions and impressions regarding possibilities and challenges for data sharing. The case shows that data sharing among competitors in the insurance industry is hard to realise, very much due to the lack of trust in how the others will use data, but also due to competition laws and other regulations.
Download

Paper Nr:	37
Title:	Results from using an Automl Tool for Error Analysis in Manufacturing
Authors:	Alexander Gerling, Oliver Kamper, Christian Seiffer, Holger Ziekow, Ulf Schreier, Andreas Hess and Djaffar Ould Abdeslam
Abstract:	Machine learning (ML) is increasingly used by various user groups to analyze product errors with data recorded during production. Quality engineers and production engineers as well as data scientists are the main users of ML in this area. Finding a product error is not a trivial task due to the complexity of today’s production processes. Products have often many features to check and they are tested in various stages in the production line. ML is a promising technology to analyze production errors. However, a key challenge for applying ML in quality management is the usability of ML tools and the incorporation of domain knowledge for non-experts. In this paper, we show results from using our AutoML tool for manufacturing. This tool makes the use of domain knowledge in combination with ML easy to use for non-experts. We present findings obtained with this approach along with five sample cases with different products and production lines. Within these cases, we discuss the occurred error origins that were found and show the benefit of a supporting AutoML tool.
Download

Paper Nr:	88
Title:	Virus Spread Modeling and Simulation: A Behavioral Parameters Approach and Its Application to Covid-19
Authors:	Alfredo Cuzzocrea and Edoardo Fadda
Abstract:	How a virus spread on a network is a really important topic and even more important is to classify the danger of a virus. With this goal in mind, we investigate the characteristics that define the most deadly virus. Moreover, we aim to provide a simplified discrete-time simulation, described by few parameters, as a straightforward alternative to more complex models of diseases diffusion. The simulation is used to model the spread of the infection, and the obtained results are then analyzed to understand how the virus’ behavior varies by changing its characteristics and the network topology.
Download

Paper Nr:	100
Title:	Multimedia Indexing and Retrieval: Optimized Combination of Low-level and High-level Features
Authors:	Mohamed Hamroun, Henri Nicolas and Benoit Crespin
Abstract:	Nowadays, the number of theoretical studies that deal with classification and machine learning from a general point of view, without focusing on a particular application, remains very low. Although they address real problems such as the combination of visual (low-level) and semantic (high-level) descriptors, these studies do not provide a general approach that gives satisfying results in all cases. However, the implementation of a general approach will not go without asking the following questions: (i) How to model the combination of the information produced by both low-level and high-level features? (ii) How to assess the robustness of a given method on different applications ? We try in this study to address these questions that remain open-ended and challenging. We proposes a new semantic video search engine called “SIRI”. It combines 3 subsystems based on the optimized combination of low-level and high-level features to improve the accuracy of data retrieval. Performance analysis shows that our SIRI system can raise the average accuracy metrics from 92% to 100% for the Beach category, and from 91% to 100% for the Mountain category over the ISE system using Corel dataset. Moreover, SIRI improves the average accuracy by 99% compared to 95% for the ISE. In fact, our system improves indexing for different concepts compared to both VINAS and VISEN systems. For example, the value of the traffic concept rises from 0.3 to 0.5 with SIRI. It is positively reflected on the search result using the TRECVID 2015 dataset, and increases the average accuracy by 98.41% compared to 85% for VINAS and 88% for VISEN.
Download

Paper Nr:	110
Title:	Analysis of Aeronautical Mobile Airport Communication System
Authors:	Kristina Kovacikova, Andrej Novak and Alena Novak Sedlackova
Abstract:	Air traffic is doubling every 15 years. Aeronautical technologies are changing and developing every year and with it the global Air navigation systems needs to adapt to the increased air traffic, to the move of more than one hundred thousand commercial flights daily and this number is expected to increase in the future. Increased flights in early 2000s, caused the saturation in the Air Traffic Management communications capacity that uses the VHF data link provided by International Telecommunication Union in Europe and in the United States. The situation created a need for new research to find new communication systems to help release the pressure, and that can eventually replace the current aeronautical communication system. It led to the use of Aeronautical Airport Communication System. The aim of this paper is to analyse the Aeronautical Airport Communications System.
Download

Paper Nr:	117
Title:	Data Ingestion from a Data Lake: The Case of Document-oriented NoSQL Databases
Authors:	Fatma Abdelhedi, Rym Jemmali and Gilles Zurfluh
Abstract:	Nowadays, there is a growing need to collect and analyze data from different databases. Our work is part of a medical application that must allow health professionals to analyze complex data for decision making. We propose mechanisms to extract data from a data lake and store them in a NoSQL data warehouse. This will allow us to perform, in a second time, decisional analysis facilitated by the features offered by NoSQL systems (richness of data structures, query language, access performances). In this paper, we present a process to ingest data from a Data Lake into a warehouse. The ingestion consists in (1) transferring NoSQL DBs extracted from the Data Lake into a single NoSQL DB (the warehouse), (2) merging so-called "similar" classes, and (3) converting the links into references between objects. An experiment has been performed for a medical application.
Download

Paper Nr:	201
Title:	DL-CNN: Double Layered Convolutional Neural Networks
Authors:	Lixin Fu and Rohith Rangineni
Abstract:	We studied the traditional convolutional neural networks and developed a new model that used double layers instead of only one. In our example of this model, we used five convolutional layers and four fully connected layers. The dataset has four thousand human face images of two classes, one of them being open eyes and the other closed eyes. In this project, we dissected the original source code of the standard package into several components and changed some of the core parts to improve accuracy. In addition to using both the current layer and the prior layer to compute the next layer, we also explored whether to skip the current layer. We changed the original convolution window formula. A multiplication bias instead of originally adding bias to the linear combination was also proposed. Though it is hard to explain the rationale, the results of multiplication bias are better in our example. For our new double layer model, our simulation results showed that the accuracy was increased from 60% to 95%.
Download

Area 2 - Artificial Intelligence and Decision Support Systems

Full Papers

Paper Nr:	19
Title:	Feature Selection with Hybrid Bio-inspired Approach for Classifying Multi-idiom Social Media Sentiment Analysis
Authors:	Luís Marcello Moraes Silva, Carlos Roberto Valêncio, Geraldo Francisco Donegá Zafalon and Angelo Cesar Columbini
Abstract:	Social media sentiment analysis consists on extracting information from users’ comments. It can assist the decision-making process of companies, aid public health and security and even identify intentions and opinions about candidates in elections. However, such data come from an environment with big data characteristics, which can make traditional and manual analysis impracticable because of the high dimensionality. The implications on the analysis are high computational cost and low quality of results. Up to date research focuses on how to analyse feelings of users with machine learning and inspired by nature methods. To analyse such data effectively, a feature selection through cuckoo search and genetic algorithm is proposed. Machine learning with lexical analysis has become an attractive alternative to overcome this challenge. This paper aims to present a hybrid bio-inspired approach to realize feature selection and improve sentiment classification quality. The scientific contribution is the improvement of a classification model considering pre-processing of the data with different languages and contexts. The results prove that the developed method enriches the predictive model. There is an improvement of around 13% in accuracy with a 45% average usage of attributes related to traditional analysis.
Download

Paper Nr:	34
Title:	MLOps: Practices, Maturity Models, Roles, Tools, and Challenges – A Systematic Literature Review
Authors:	Anderson Lima, Luciano Monteiro and Ana Paula Furtado
Abstract:	Context: The development of machine learning solutions has increased significantly due to the advancement of technology based on artificial intelligence. MLOps have emerged as an approach to minimizing efforts and improving integration between those who are in the process of deploying the models in the production environment. Objective: This paper undertakes a systematic literature review in order to identify practices, standards, roles, maturity models, challenges, and tools related to MLOps. Method: The study is founded on an automatic search method of selected digital libraries that applies selection and quality criteria to identify suitable papers that underpin the research. Results: The search initially found 1,905 articles of which 30 papers were selected for analysis. This analysis led to findings that made it possible to achieve the objectives of the research. Conclusion: The results allowed us to conclude that MLOps is still in its initial stage, and to recognize that there is an opportunity to undertake further academic studies that will prompt organizations to adopt MLOps practices.
Download

Paper Nr:	64
Title:	Gideon-TS: Efficient Exploration and Labeling of Multivariate Industrial Sensor Data
Authors:	Tristan Langer, Viktor Welbers and Tobias Meisen
Abstract:	Modern digitization in industrial production requires the acquisition of process data that is subsequently used in analysis and optimization scenarios. For this purpose, the use of machine learning methods has become more and more established in recent years. However, training advanced machine learning models from scratch requires a lot of labeled data. The creation of such labeled data is a major challenge for many companies, as the generation process cannot be fully automated and is therefore very time-consuming and expensive. Thus, the need for corresponding software tools to label complex data streams, such as sensor data, is steadily increasing. Existing contributions are not designed for handling large datasets and forms common for industrial applications, and offer little support for the labeling of large data volumes. For this reason, we introduce Gideon-TS — an interactive labeling tool for sensor data that is tailored to the needs of industrial use. Gideon-TS can integrate time series datasets in multiple modalities (univariate, multivariate, samples, with and without timestamp) and remains performant even with large datasets. We also present an approach to semi-automatic labeling that reduces the time needed to label large volumes of data. We evaluated Gideon-TS on an industrial exemplary use case by conducting performance tests and a user study to show that it is suitable for labeling large datasets and significantly reduces labeling time compared to traditional labeling methods.
Download

Paper Nr:	76
Title:	A Structured Literature Review on the Application of Machine Learning in Retail
Authors:	Marek Hütsch and Tobias Wulfert
Abstract:	Machine learning (ML) has the potential to take on a variety of routine and non-routine tasks in brick-and-mortar and e-commerce. Many tasks previously executed manually are susceptible to computerization involving ML. Although procedure models for the introduction of ML across industries exist, it needs to be determined for which tasks in retail ML can be implemented. Hence, we conducted a structured literature review involving 225 research papers to derive possible ML application areas in retail along with the structure of a well-established information systems architecture. In total, we identified 20 application areas for ML in retail that mainly address decision-oriented and economic-operative tasks. We organized the application areas in a framework for practitioners and researchers to determine an appropriate ML usage in retail. Our analysis further revealed that while ML applications in offline retail focus on the article, in e-commerce the customer is pivotal for application areas of ML.
Download

Paper Nr:	77
Title:	Predicting Mortality Risk among Elderly Inpatients with Pneumonia: A Machine Learning Approach
Authors:	Victor Monteiro Silva, Damires Yluska De Souza Fernandes and Alex Sandro Da Cunha Rêgo
Abstract:	Community-acquired Pneumonia (CAP) is a serious respiratory infection that can cause life-threatening risk in people of different ages, especially in elderly inpatients. Regarding this age group, mortality rates by CAP still can reach 30% of all respiratory causes of death. In this work, we propose a machine learning approach to predict mortality risk among elderly inpatients with CAP. The approach uses real world data of elderly people with CAP from a hospital in Brazil, collected from 2018 to 2021. Based on patients data as learning features, our approach is able not only to classify patients at risk of mortality during hospitalization, but also to estimate the probability concerning the prediction. Some classification models have been examined and, among them, the best performance in terms of Area under ROC Curve (AUC) value has been achieved by the Logistic Regression (LR) classifier (AUC=0.81). Accomplished results show that the presented approach outperforms CURB-65 score as baseline in terms of both AUC values and probability of patient death. Besides, our approach is able to output probabilities ranging from 50 to 99% w.r.t. positive classification, i.e., patients that may come to death. A statistical test confirms that the presented approach outperforms the baseline provided by the CURB-65.
Download

Paper Nr:	95
Title:	Text Classification in the Brazilian Legal Domain
Authors:	Gustavo M. C. Coelho, Alimed Celecia, Jefferson de Sousa, Melissa Cavaliere, Maria Julia Lima, Ana Mangeth, Isabella Frajhof, Cesar Cury and Marco Casanova
Abstract:	Text classification is a popular Natural Language Processing task that aims at predicting the categorical values associated with textual instances. One of the relevant application fields for this task is the legal domain, which involves a high volume of unstructured textual documents. This paper proposes a new model for the task of classifying legal opinions related to consumer complaints according to the moral damage value. The proposed model, named MuDEC (Multi-step Document Embedding-Based Classifier), combines Doc2vec and SVM for feature extraction and classification, respectively. To optimize the classification performance, the model uses a combination of methods, such as oversampling for imbalanced datasets, clustering for the identification of textual patterns, and dimensionality reduction for complexity control. For performance evaluation, a 6-class dataset of 193 legal opinions related to consumer complaints was created in which each instance was manually labeled according to its moral damage value. A 10-fold stratified cross-validation resampling procedure was used to evaluate different models. The results demonstrated that, under this experimental setup, MuDEC outperforms baseline models by a significant margin, achieving 78.7% of accuracy, compared to 61.1% for a SIF classifier and 65.2% for a C-LSTM classifier.
Download

Paper Nr:	113
Title:	Global Spare Parts Exploitation Costs Optymization and Its Reduction to Rectangular Knapsack Problem
Authors:	Andrzej Chmielowiec, Leszek Klich, Weronika Woś and Adam Błachowicz
Abstract:	Maintenance costs represent a significant portion of the budget in large enterprises. Unplanned downtime and breakdowns have a very negative impact on the financial result. Minimizing their number is today a huge challenge for systems supporting production management and maintenance processes. Due to intensive research in the field of reliability today, we know a lot about the optimal use of a single spare part. Unfortunately, the analytical methods used for this purpose do not work well in the case of simultaneous analysis of the entire production line. The article proposes a combination of analytical methods and discrete optimization methods for global management of spare parts operating costs. The purpose of the presented algorithm is to support decision-making that minimize the global cost of maintaining reliability in the entire production company.
Download

Paper Nr:	125
Title:	AUDIO-MC: A General Framework for Multi-context Audio Classification
Authors:	Lucas B. Sena, Francisco D. B. S. Praciano, Iago C. Chaves, Felipe T. Brito, Eduardo Rodrigues Duarte Neto, Jose Maria Monteiro and Javam C. Machado
Abstract:	Audio classification is an important research topic in pattern recognition and has been widely used in several domains, such as sentiment analysis, speech emotion recognition, environment sound classification and sound events detection. It consists in predicting a piece of audio signal into one of the pre-defined semantic classes. In recent years, researchers have been applied convolution neural networks to tackle audio pattern recognition problems. However, these approaches are commonly designed for specific purposes. In this case, machine learning practitioners, who do not have specialist knowledge in audio classification, may find it hard to select a proper approach for different audio contexts. In this paper we propose AUDIO-MC, a general framework for multi-context audio classification. The main goal of this work is to ease the adoption of audio classifiers for general machine learning practitioners, who do not have audio analysis experience. Experimental results show that our framework achieves better or similar performance when compared to single-context audio classification techniques. AUDIO-MC framework shows an accuracy of over 80% for all analyzed contexts. In particular, the highest achieved accuracies are 90.60%, 93.21% and 98.10% over RAVDESS, ESC-50 and URBAN datasets, respectively.
Download

Paper Nr:	144
Title:	Towards Image Captioning for the Portuguese Language: Evaluation on a Translated Dataset
Authors:	João Gondim, Daniela Barreiro Claro and Marlo Souza
Abstract:	Automatic describing an image comprehends the representation from the scene elements to generate a concise natural language description. Few resources, particularly annotated datasets for the Portuguese language, discourage the development of new methods in languages other than English. Thus, we propose a new image captioning method for the Portuguese language. We provide an analysis empowered by an encoder-decoder model with an attention mechanism when employing a multimodal dataset translated into Portuguese. Our findings suggest that: 1) the original and translated datasets are pretty similar considering the measure achievements; 2) the translation approach includes some dirty sentence formulations that disturb our model for the Portuguese language.
Download

Paper Nr:	174
Title:	Analysis of Incremental Learning and Windowing to Handle Combined Dataset Shifts on Binary Classification for Product Failure Prediction
Authors:	Marco Spieß, Peter Reimann, Christian Weber and Bernhard Mitschang
Abstract:	Dataset Shifts (DSS) are known to cause poor predictive performance in supervised machine learning tasks. We present a challenging binary classification task for a real-world use case of product failure prediction. The target is to predict whether a product, e. g., a truck may fail during the warranty period. However, building a satisfactory classifier is difficult, because the characteristics of underlying training data entail two kinds of DSS. First, the distribution of product configurations may change over time, leading to a covariate shift. Second, products gradually fail at different points in time, so that the labels in training data may change, which may a concept shift. Further, both DSS show a trade-off relationship, i. e., addressing one of them may imply negative impacts on the other one. We discuss the results of an experimental study to investigate how different approaches to addressing DSS perform when they are faced with both a covariate and a concept shift. Thereby, we prove that existing approaches, e. g., incremental learning and windowing, especially suffer from the trade-off between both DSS. Nevertheless, we come up with a solution for a data-driven classifier, that yields better results than a baseline solution that does not address DSS.
Download

Paper Nr:	192
Title:	Toward Cloud Manufacturing: A Decision Guidance Framework for Markets of Virtual Things
Authors:	Xu Han and Alexander Brodsky
Abstract:	In the value creation chain today, entrepreneurs have been faced with stiff hindrance in turning their innovative ideas into marketable products due to the manufacturing-entrepreneurship disconnect in terms of accessibility, predictability and agility. Toward bridging this gap, in this paper we develop a formal mathematical framework for markets of virtual things: parameterized products and services that can be searched, composed and optimized. The proposed framework formalizes the notions of virtual product and service designs and customer-facing specs as well as requirements’ specs. Based on these formal concepts, the framework also formalizes the notions of search for and composition of virtual products and services that (1) are mutually consistent with the requirement specs, (2) are Pareto-optimal in terms of customer-facing metrics such as cost, product desirable characteristics and delivery terms; and (3) that are optimal in terms of customer utility function that is expressed in terms of customer facing metrics. We also propose the design of a repository of virtual things and their artifacts, to be used in support of the virtual things’ markets. The proposed markets of virtual things can lead to democratizing innovation by allowing entrepreneurs without design and manufacturing expertise to bring their ideas to markets quickly.
Download

Paper Nr:	194
Title:	Mechanism of Overfitting Avoidance Techniques for Training Deep Neural Networks
Authors:	Bihi Sabiri, Bouchra El Asri and Maryem Rhanoui
Abstract:	The objective of a deep learning neural network is to have a final model that performs well both on the data used to train it and the new data on which the model will be used to make predictions. Overfitting refers to the fact that the predictive model produced by the machine learning algorithm adapts well to the training set. In this case, the predictive model will capture the generalizable correlations and the noise produced by the data and will be able to give very good predictions on the data of the training set, but it will predict badly on the data that it has not yet seen during his learning phase. This paper proposes two techniques among many others to reduce or prevent overfitting. Furthermore, by analyzing dynamics during training, we propose a consensus classification algorithm that avoids overfitting, we investigate the performance of these two types of techniques in convolutional neural network. Early stopping allowing to save the hyper-parameters of a model at the right time. And the dropout making the learning of the model harder allowing to gain up to more than 50% by decreasing the loss rate of the model.
Download

Paper Nr:	196
Title:	Impact of Hyperparameters on the Generative Adversarial Networks Behavior
Authors:	Bihi Sabiri, Bouchra El Asri and Maryem Rhanoui
Abstract:	Generative adversarial networks (GANs) have become a full-fledged branch of the most important neural network models for unsupervised machine learning. A multitude of loss functions have been developed to train the GAN discriminators and they all have a common structure: a sum of real and false losses which depend only on the real losses and generated data respectively. A challenge associated with an equally weighted sum of two losses is that the formation can benefit one loss but harm the other, which we show causes instability and mode collapse. In this article, we introduce a new family of discriminant loss functions which adopts a weighted sum of real and false parts. With the use the gradients of the real and false parts of the loss, we can adaptively choose weights to train the discriminator in the sense that benefits the stability of the GAN model. Our method can potentially be applied to any discriminator model with a loss which is a sum of the real and fake parts. Our method consists in adjusting the hyper-parameters appropriately in order to improve the training of the two antagonistic models Experiences validated the effectiveness of our loss functions on image generation tasks, improving the base results by a significant margin on dataset Celebdata.
Download

Short Papers

Paper Nr:	6
Title:	An Efficient Contact Lens Spoofing Classification
Authors:	Guilherme Silva, Pedro Silva, Mariana Mota, Eduardo Luz and Gladston Moreira
Abstract:	Spoofing detection, when differentiating illegitimate users from genuine ones, is a major problem for biometric systems and these techniques could be an enhancement in the industry. Nowadays iris recognition systems are very popular, once it is more precise for person authentication when compared to fingerprints and other biometric modalities. Nevertheless, iris recognition systems are vulnerable to spoofing via textured cosmetic contact lenses and techniques to avoid those attacks are imperative for a well system behavior and could be embedded. In this work, attention is centered on a three-class iris spoofing detection problem: textured/colored contact lenses, soft contact lenses, and no lenses. Our approach adapts the Inverted Bottleneck Convolution blocks from the EfficientNets to build deep image representation. Experiments are conducted in comparison with the literature on two public iris image databases for contact lens detection: Notre Dame and IIIT-Delhi. With transfer learning, we surpass previous approaches in most of the cases for both databases with very promising results.
Download

Paper Nr:	20
Title:	Online Set Cover With Rated Subsets
Authors:	Christine Markarian
Abstract:	In this paper, we introduce the Online Set Cover With Rated Subsets problem (OSC-RS), a generalization of the well-known Online Set Cover problem, in which we are given a universe of elements and a collection of subsets of the universe, each associated with a subset cost and a rating cost. In each step, the algorithm is given a request containing elements from the universe. The algorithm serves a request by assigning it to a number of purchased subsets that jointly cover the requested elements. The algorithm pays the subset costs associated with the subsets purchased and for each request, it pays the sum of the rating costs associated with the subsets assigned to the request. The aim is to serve all requests as soon as revealed, while minimizing the total subset and rating costs paid. OSC-RS is motivated by intrinsic client-service-providing scenarios in which service providers are rated and their ratings are included in the decision-making process, so as higher-rated service providers are associated with lower rating costs. That is, the decisions about serving clients take into account the quality of the services provided. We propose the ﬁrst online algorithm for OSC-RS and evaluate it using the standard notion of competitive analysis. The latter compares the performance of the online algorithm to that of an optimal ofﬂine algorithm that is assumed to know all the input sequence in advance.
Download

Paper Nr:	25
Title:	Robust Face Mask Detection with Combined Frontal and Angled Viewed Faces
Authors:	Ivan George L. Tarun, Vidal Wyatt M. Lopez, Patricia Angela R. Abu and Ma. Regina Justina E. Estuar
Abstract:	One such protocol currently enforced by the Philippine government to combat COVID-19 is the mandatory use of face masks in public places. The problem however is that ensuring people follow this protocol is difficult to monitor during a pandemic due to other conflicting health protocols like social distancing and workforce reduction. This study therefore explores on the creation of deep learning models that consider both frontal and side view images of the face for face mask detection. In doing so, improvements to robustness were found when compared to using models that were previously trained on purely frontal images. This was accomplished by first relabeling a subset of images from the FMLD dataset. These images were then split into train, validation, and test sets. Four deep learning models (YOLOv5 Small, YOLOv5 Medium, CenterNet Resnet50 V1 FPN 512x512, CenterNet HourGlass104 512x512) were later trained on the training set of images. These four models were compared with three models (MobileNetV1, ResNet50, VGG16) that were trained previously on purely frontal images. Results show that the four models trained on the relabeled FMLD dataset offer an approximately 20% increase in classification accuracy over the three models that were previously trained on purely frontal images.
Download

Paper Nr:	50
Title:	Zero-shot Mathematical Problem Solving via Generative Pre-trained Transformers
Authors:	Federico A. Galatolo, Mario G. C. A. Cimino and Gigliola Vaglini
Abstract:	Mathematics is an effective testbed for measuring the problem-solving ability of machine learning models. The current benchmark for deep learning-based solutions is grade school math problems: given a natural language description of a problem, the task is to analyse the problem, exploit heuristics generated from a very large set of solved examples, and then generate an answer. In this paper, a descendant of the third generation of Generative Pre-trained Transformer Networks (GPT-3) is used to develop a zero-shot learning approach, to solve this problem. The proposed approach shows that coding based problem-solving is more effective than the natural language reasoning based one. Specifically, the architectural solution is built upon OpenAI Codex, a descendant of GPT-3 for programming tasks, trained on public GitHub repositories, the world’s largest source code hosting service. Experimental results clearly show the potential of the approach: by exploiting the Python as programming language, proposed pipeline achieves the 18.63% solve rate against the 6.82% of GPT-3. Finally, by using a fine-tuned verifier, the correctness of the answer can be ranked at runtime, and then improved by generating a predefined number of trials. With this approach, for 10 trials and an ideal verifier, the proposed pipeline achieves 54.20% solve rate.
Download

Paper Nr:	57
Title:	Process Diagnostics at Coarse-grained Levels
Authors:	Mahsa Pourbafrani, Firas Gharbi and Wil M. P. van der Aalst
Abstract:	Process mining enables the discovery of actionable insights from event data of organizations. Process analysis techniques typically focus on process executions at detailed, i.e., fine-grained levels, which might lead to missed insights. For instance, the relation between the waiting time of process instances and the current states of the process including resources workload is hidden at fine-grained level analysis. We propose an approach for coarse-grained diagnostics of processes while decreasing user dependency and ad hoc decisions compared to the current approaches. Our approach begins with the analysis of processes at fine-grained levels focusing on performance and compliance and proceeds with an automated translation of processes to the time series format, i.e., coarse-grained process logs. We exploit time series analysis techniques to uncover the underlying patterns and potential causes and effects in processes. The evaluation using real and synthetic event logs indicates the efficiency of our approach to discover overlooked insights at fine-grained levels.
Download

Paper Nr:	93
Title:	Kolmogorov’s Gate Non-linearity as a Step toward Much Smaller Artificial Neural Networks
Authors:	Stanislav Selitskiy
Abstract:	The deep architecture of today’s behemoth “foundation” Artificial Neural Network (ANN) models came to be not only because we can do that utilizing computational capabilities of the underlying hardware. The direction of the ANN architecture development was also set at the early stages of ANN research by using algorithms and models that proved to be effective, however limiting. The use of the small set of simple nonlinearity functions moved ANN architectures in the direction of accumulating many layers to achieve reasonable approximation in the emulation of the complex processes. Narrow efficient input domain of the activation functions also led to computational complexities of adding normalization, regularization and back, de-regularization, de- normalization layers. Such layers do not add any value to the process emulation and break the topology and memory integrity of the data. We propose to look back at forgotten shallow and wide ANN architecture to learn what we can use from then at the current state of technology. In particular, we would like to point at the Kolmogorov-Arnold theorem that has such implications for ANN architectures that, given a wide choice of volatile activation functions, even 2-layer ANN of O(n) parameters complexity and Ω(n2) relations complexity (where n is an input dimensionality), may approximate arbitrary non-linear transformation. We investigate the behaviour of the emulation of such volatile activation function using gated architecture inspired by the LSTM and GRU type cells, applied to the feed-forward fully connected ANN, on the financial time series prediction.
Download

Paper Nr:	98
Title:	Opti-Soft+: A Recommender and Sensitivity Analysis for Optimal Software Feature Selection and Release Planning
Authors:	Fernando Boccanera and Alexander Brodsky
Abstract:	Many approaches have been developed to increase the return on a software investment, but each one has drawbacks. Proposed in this paper is the Opti-Soft+ framework that addresses this problem by producing a software release schedule that maximizes the business value of investments in information systems that automate business processes. The optimal release schedule is the result of solving a mixed integer linear programming problem. Opti-Soft+ is an extension of the Opti-Soft framework proposed earlier with (1) a refined cost model, (2) a technique for sensitivity analysis of the normalized cost per unit of production, and (3) an atomic business process model that is driven by output throughputs in addition to input throughputs.
Download

Paper Nr:	99
Title:	Audio Circuits Evolution through Genetic Algorithms
Authors:	P. H. G. Coelho, J. F. M. do Amaral, E. N. Da Rocha and M. C. Bentes
Abstract:	This paper focuses on the implementation of an extrinsic platform in order to develop audio electronic circuits with known topologies. This platform uses genetic algorithms to choose the best components to achieve a specific goal. All the proposed topologies have been consolidated in the audio market and due to this aspect, the technique has not proposed new possibilities for them. However, the generation mechanism of these alternatives is studied in depth, approaching their theoretical potential. User specifications of the proposed interface define the evolution of the circuit. The developed platform compares the graphic of a simulation with a graphic which is generated by a function in MATLAB. The similarity between these curves is used as the fitness function to evolve the circuit components values. All the work was done in MATLAB and Simulink. As MATLAB is used to run the codes and create desired curves with its function, its tool Simulink is used to simulate circuits and to carry out transfer functions analysis. Case studies are presented to illustrate the method which are analogic filters used in audio applications. These circuits were evolved by using the free package GAOT for MATLAB.
Download

Paper Nr:	104
Title:	A Symbolic Time Constraint Propagation Mechanism Proposal for Workflow Nets
Authors:	Lorena Rodrigues Bruno and Stéphane Julia
Abstract:	The model of a Workflow Management System should describe the time constraints of resources over the activities of the corresponding business process. In general, typical temporal phenomena include activity execution delays, limits to the occurrence of valid intervals over the activities, limits to valid intervals over resources (limit to resources life cycle), limits to duration of process execution, time distance between two activities, etc. In this study, a Workflow net model incremented with time intervals to describe the duration of activities and waiting times is presented. To define the execution of activities minimum and maximum intervals, a time constraint propagation mechanism based on the sequent calculus of Linear Logic and on symbolic dates is proposed.
Download

Paper Nr:	106
Title:	A Literature Review on Methods for Learning to Rank
Authors:	Junior Zilles, Giancarlo Lucca and Eduardo Nunes Borges
Abstract:	The increasing number of indexed documents makes manual retrieval almost impossible when they are retrieved or stored automatically. The solution to this problem consists of using information retrieval systems, which seek to present the most relevant data items to the user in order of relevance. Therefore, this work aims to conduct a theoretical survey of the most used algorithms in the Information Retrieval field using Learning to Rank methods. We also provide an analysis regarding the datasets used as benchmarks in the literature. We observed that RankSVM and LETOR collection are the most frequent method and datasets employed in the analyzed works.
Download

Paper Nr:	124
Title:	BPA: A Multilingual Sentiment Analysis Approach based on BiLSTM
Authors:	Iago C. Chaves, Antônio Diogo F. Martins, Francisco D. B. S. Praciano, Felipe T. Brito, Jose Maria Monteiro and Javam C. Machado
Abstract:	Sentiment analysis (SA) is the automatic process of understanding people’s feelings or beliefs expressed in texts such as emotions, opinions, attitudes, appraisals and others. The main task is to identify the polarity level (positive, neutral or negative) of a given text. This task has been the subject of several research competitions in many languages, for instance, English, Spanish and Arabic. However, developing a multilingual sentiment analysis method remains a challenge. In this paper, we propose a new approach, called BPA, based on BiLSTM neural networks, pooling operations and attention mechanism, which is able to automatically classify the polarity level of a text. We evaluated the BPA approach using five different data sets in three distinct languages: English, Spanish and Portuguese. Experimental results evidence the suitability of the proposed approach to multilingual and domain-independent polarity classification. BPA’s best results achieved an accuracy of 0.901, 0.865 and 0.923 for English, Spanish and Portuguese, respectively.
Download

Paper Nr:	149
Title:	Comparative Analysis of Neural Translation Models based on Transformers Architecture
Authors:	Alexander Smirnov, Nikolay Teslya, a Nikolay Shilov, Diethard Frank, Elena Minina and Martin Kovacs
Abstract:	While processing customers’ feedback for an industrial company, one of the important tasks is the classification of customer inquiries. However, this task can produce a number of difficulties when the text of the message can be composed using a large number of languages. One of the solutions, in this case, is to determine the language of the text and translate it into a base language, for which the classifier will be developed. This paper compares open models for automatic translation of texts. The following models based on the Transformers architecture were selected for comparison: M2M100, mBART, OPUS-MT (Helsinki NLP). A test data set was formed containing texts specific to the subject area. Microsoft Azure Translation was chosen as the reference translation. Translations produced by each model were compared with the reference translation using two metrics: BLEU and METEOR. The possibility of fast fine-tuning of models was also investigated to improve the quality of the translation of texts in the problem area. Among the reviewed models, M2M100 turned out to be the best in terms of translation quality, but it is also the most difficult to fine-tune it.
Download

Paper Nr:	162
Title:	Using Deep Learning-based Object Detection to Extract Structure Information from Scanned Documents
Authors:	Alice Nannini, Federico A. Galatolo, Mario G. C. A. Cimino and Gigliola Vaglini
Abstract:	The computer vision and object detection techniques developed in recent years are dominating the state of the art and are increasingly applied to document layout analysis. In this research work, an automatic method to extract meaningful information from scanned documents is proposed. The method is based on the most recent object detection techniques. Specifically, the state-of-the-art deep learning techniques that are designed to work on images, are adapted to the domain of digital documents. This research focuses on play scripts, a document type that has not been considered in the literature. For this reason, a novel dataset has been annotated, selecting the most common and useful formats from hundreds of available scripts. The main contribution of this paper is to provide a general understanding and a performance study of different implementations of object detectors applied to this domain. A fine-tuning of deep neural networks, such as Faster R-CNN and YOLO, has been made to identify text sections of interest via bounding boxes, and to classify them into a specific pre-defined category. Several experiments have been carried out, applying different combinations of data augmentation techniques.
Download

Paper Nr:	181
Title:	Online Non-metric Facility Location with Service-Quality Costs
Authors:	Christine Markarian
Abstract:	In this paper, we study the Online Non-metric Facility Location with Service-Quality Costs problem (Non-metric OFL-SQC), a generalization of the well-known Online Non-metric Facility Location problem (Non-metric OFL), in which facilities have, in addition to opening costs, service-quality costs. Service-quality costs are determined by the quality of the service provided by each facility so as the higher the quality, the lower the service-quality cost. These are motivated by companies wishing to incorporate the quality of third-party services into their optimization decisions. Clients are scattered around facilities and arrive in groups over time. Each arriving group is composed of a number of clients at different locations. Non-metric OFL-SQC asks to serve each client in the group by connecting it to an open facility. Opening a facility incurs an opening cost and connecting a client to a facility incurs a connecting cost, which is the distance between the client and the facility. Moreover, for each group, the algorithm needs to pay the sum of the service-quality costs associated with the facilities serving the clients of the group. The aim is to serve each arriving group while minimizing the total facility opening costs, connecting costs, and service-quality costs. We develop the first online algorithm for non-metric OFL-SQC and analyze it using the standard notion of competitive analysis, in which the online algorithm’s worst-case performance is measured against the optimal offline solution that can be constructed optimally given all the input sequence in advance.
Download

Paper Nr:	185
Title:	An ML Agent using the Policy Gradient Method to win a SoccerTwos Game
Authors:	Victor Ulisses Pugliese
Abstract:	We conducted an investigative study of Policy Gradient methods using Curriculum Learning applied in Video Games, as professors at the Federal University of Goiás created a customized SoccerTwos environment to evaluate the Machine Learning agents of students in a Reinforcement Learning course. We employed the PPO and SAC as state-of-arts in on-policy and off-policy contexts, respectively. Also, the Curriculum could improve the performance based on it is easier to teach people in a complex gradual order than randomly. So, combining them, we propose our agents win more matches than their adversaries. We measured the results by minimum, maximum, mean rewards, and the mean length per episode in checkpoints. Finally, PPO achieved the best result with Curriculum Learning, modifying players’ (position and rotation) and ball’s (speed and position) settings in time intervals. Also, It used fewer training hours than other experiments.
Download

Paper Nr:	190
Title:	Automatic Evaluation of Textual Cohesion in Essays
Authors:	Aluizio Haendchen Filho, Filipe Sateles Porto de Lima, Hércules Antônio do Prado, Edilson Ferneda, Adson Marques da Silva Esteves and Rudimar L. S. Dazzi
Abstract:	Aiming to contribute to studies on the evaluation of textual cohesion in Brazilian Portuguese, this paper presents an approach based on machine learning for automated scoring of textual cohesion, according to the evaluation model adopted in Brazil. The purpose is to verify the mastery of skills and abilities of students who have completed high school. Based on features groups such as lexicon diversity, connectives, readability indexes and overlap of sentences and paragraphs, 91 features, based in TAACO (Tool for the Automatic Analysis of Cohesion), were adopted. Beyond features specifically related to textual cohesion, other were defined for capturing general aspects of the text. The efficiency of the classification model based on Support Vector Machines was measured. It was also demonstrated how normalization and class balancing techniques are essential to improve results using the small dataset available for this task.
Download

Paper Nr:	16
Title:	LSTM Network Learning for Sentiment Analysis
Authors:	Badiâa Dellal-Hedjazi and Zaia Alimazighi
Abstract:	The strong economic issues (e-reputation, buzz detection ...) and political ( opinion leaders identification ...) explain the rapid rise of scientists on the topic of sentiment classification. Sentiment analysis focuses on the orientation of an opinion on an entity or its aspects. It determines its polarity which can be positive, neutral, or negative. Sentiment analysis is associated with texts classification problems. Deep Learning (machine learning technique) is based on multi-layer artificial neural networks. This technology has allowed scientists to make significant progress in data recognition and classification. What makes deep learning different from traditional machine learning methods is that during complex analyses, the basic features of the treatment will no longer be identified by human treatment in a previous algorithm, but directly by the deep learning. In this article we propose a Twitter sentiment analysis application using a deep learning algorithm with LSTM units adapted for natural language processing.
Download

Paper Nr:	31
Title:	Detecting Turistic Places with Convolutional Neural Networks
Authors:	Fabricio Torrico-Pacherre, Ian Maguiña-Mendoza and Willy Ugarte
Abstract:	A mobile application was developed for the recognition of places from a photo using the technique “content based photo geolocation as spatial database queries”. For this purpose, an investigation and analysis of the different existing methods that allow us to recognize images from a photo was carried out in order to select the best possible model and then improve it. Performance comparisons, comparison of number of parameters, Error: imagenet and the Brain-Score were made; once the best model was obtained, the algorithm was implemented and with the results the expected information of the place in the photo was shown. The purpose of this information is to recommend nearby places of interest. In the development stage, first, we implement an architecture with convolutional neural networks VGG16, for the recognition of places, the model was trained, after obtaining a trained model with successful results, the construction phase of the application continued. mobile in order to test the operation of the model. Users will use the app by submitting a photo which will query the trained model, and results will be obtained in seconds, information that will provide a better experience when visiting unknown places.
Download

Paper Nr:	101
Title:	On the Methods to Predict Moisture Content on Wood: A Literature Review
Authors:	Vítor M. Magalhães, Giancarlo Lucca, Alessandro De L. Bicho and Eduardo N. Borges
Abstract:	Wood is the raw material for many manufactured goods. Charcoal, cellulose for the paper industry, laminated wood furniture, and even explosive products, such as gunpowder cotton, are possible destinations for the wood. On the other hand, the growing use of wood as a raw material has increased illegal deforestation and, as a direct consequence, it has changed the climate at a global level. The use of wood in production processes must be optimized to mitigate these adverse effects. One of the determining factors for this optimization is moisture content on wood, i.e., the ratio between the mass of water contained in the wood and dry wood mass. This article reviews the scientific literature published from 1959 to 2019 regarding the use of wood due to a better knowledge of its properties, particularly systems to explain or predict the moisture content. It contributes to the continuity of related research with the theme by ensemble the conducted studies into a single analysis.
Download

Paper Nr:	103
Title:	Prerequisites for Applying Artificial Intelligence for Scheduling in Small- and Medium-sized Enterprises
Authors:	Tatjana Schkarin and Alexander Dobhan
Abstract:	With the increasing spread of Artificial Intelligence (AI), the prerequisites for a successful implementation in practice are becoming more relevant for large enterprises as well as for small and medium-sized enterprises (SMEs). The latter are usually characterized by flat hierarchies, high flexibility, but also by a lack of AI experts and data organisation. One field of AI application for SMEs is scheduling as part of production planning. Scheduling belongs to the most relevant digital solution areas for SMEs. In this article, we examine the prerequisites for the application of AI methods in scheduling in SMEs. For identifying relevant prerequisites, we conduct a literature review and combine it with the results of three AI adoption and readiness models. Afterwards, we describe the results of an interview study on our research question. The main findings include a list of prerequisites. We connect our list with already existing approaches for AI adoption and AI readiness with a strong focus on SMEs and scheduling. Furthermore, we conclude that the prerequisites are dependent on the application context. However, the effect of the size of a company on the prerequisites remains unclear.
Download

Paper Nr:	126
Title:	Applying Edge AI towards Deep-learning-based Monocular Visual Odometry Model for Mobile Robotics
Authors:	Frederico Luiz Martins de Sousa, Mateus Coelho Silva and Ricardo Augusto Rabelo Oliveira
Abstract:	Visual odometry is a relevant problem considering mobile robotics. While intelligent robots can provide mapping and location tasks with a multitude of sensors, it is interesting to evaluate the ability to create models using less information to create similar information. While traditional approaches consider computer vision aspects of proposing solutions, they lack the application of modern perspectives as edge computing and deep learning. This text assesses the problem of evaluating the usage of deep-learning-based visual odometry models in mobile robotics. We expect mobile robots to have embedded computers with limited computing technologies, so we approach this problem through the Edge AI perspective. Our results displayed an improvement of the model considering previous results. Also, we profile the performance of hardware candidates to perform this task in mobile edge devices.
Download

Paper Nr:	129
Title:	A New Approach for Analyzing Financial Markets using Correlation Networks and Population Analysis
Authors:	Zahra Hatami, Hesham Ali, David Volkman and Prasad Chetti
Abstract:	With the availability of massive data sets associated with stock markets, we now have opportunities to apply newly developed big data techniques and data-driven methodologies to analyze these complicated markets. Correlation network analysis makes it possible to structure large data in ways that facilitate finding common patterns and mine-hidden information. In this study, we developed the population analysis with utilizing a correlation network model to conduct a study on stock market data on companies for the years 2000 through 2004. We utilized companies’ parameters for behavior assessment based on the population analysis. After creating the network model, we employed graph-based community algorithms, such as GLay, to identify communities and stocks with similar features associated with their excess returns. Our analysis of the top two communities revealed that companies in the finance sector have the highest share in the market, and companies with a low amount of capitalization have a high excess return, similar to large companies. The proposed correlation network model and the associated population analysis show that investing in companies with high capitalization does not always guarantee higher rates of return on investment. Based on the proposed approach, investors could get similar returns by investing in certain small companies.
Download

Paper Nr:	133
Title:	Detection and Delimitation of Natural Gas in Seismic Images using MLP-Mixer and U-Net
Authors:	Carolina L. S. Cipriano, Domingos A. D. Junior, Petterson S. Diniz, Luiz F. Marin, Anselmo C. Paiva, João O. B. Diniz and Aristófanes C. Silva
Abstract:	The seismic data acquired through the seismic reflection method is important for hydrocarbon prospecting. As an example of hydrocarbon, we have natural gas, one of the leading and most used energy sources in the current scenario. The techniques for analyzing these data are challenging for specialists. Due to the noisy nature of data acquisition, it is subject to errors and divergences between the specialists. The growth of deep learning has brought great highlights to tasks of segmentation, classification, and detection of objects in images from different areas. Consequently, the use of machine learning in seismic data has also grown. Therefore, this work proposes an automatic detection and delimitation of the natural gas region in seismic images (2D) using MLP-Mixer and U-Net. The proposed method obtained competitive results with an accuracy of 99.6% (inline) and 99.55% (crossline); specificity of 99.79% (inline) and 99.73% (crossline).
Download

Paper Nr:	154
Title:	A Performance Benchmark of Formulated Methods for Forecast or Reconstruct Trajectories Associated to the Process Control in Industry 4.0
Authors:	Davi Neves and Ricardo Augusto Rabelo Oliveira
Abstract:	Manufacturing processes are generally modeled through dynamic systems, whose solutions establish a tool for control theory, essential in the elaboration of industrial automation, a pillar of the fourth revolution. Understanding and mastering these technological procedures correspond to the ability to determine and analyze the solutions of a system of differential equations, in order to deploy smart devices in a production line, such as the robotic arm, because this trajectories can be always associated with the running of any equipment. Currently there are many formulated methods to determine (or forecast) these curves, through numerical or stochastic tools, the focus in this work are those capable of reconstructing a state space, such as the Koopman’s operator, convolutional neural network and reinforcement learning technique. Therefore, based on the solutions provided by these methods, a benchmark will assembled to compare them, using topological measures such as Shannon entropy, Lyapunov exponent and Hurst coefficient, thus defining the effectiveness of each one.
Download

Paper Nr:	158
Title:	A Performance Analysis of Classifiers on Imbalanced Data
Authors:	Nathan F. Garcia, Rômulo A. Strzoda, Giancarlo Lucca and Eduardo N. Borges
Abstract:	In the machine learning field, there are many classification algorithms. Each algorithm performs better in certain scenarios, which are very difficult to define. There is also the concept of grouping multiple classifiers, known as ensembles, which aim to increase the model generalization capacity. Comparing multiple models is costly, as, for certain cases, training classifiers can take a long time. In the literature, many aspects of the data have already been studied to help in the task of classifier selection, such as measures of diversity among classifiers that form an ensemble, data complexity measures, among others. In this context, the main objective of this work is to analyze class imbalance and how this measure can be used to guide the selection of classifiers. We also compare the model’s performances when using class balancing techniques such as oversampling and undersampling.
Download

Paper Nr:	182
Title:	Tourism Integrated Recommender System: Setubal Peninsula Case Study
Authors:	Mohamma Julashokri, Suzana Monteiro Leonardi and Pedro Seabra
Abstract:	The diversity and a huge number of different places and attractions can make the decision for tourists difficult. The recommender systems are developed to facilitate people's decision-making process that can help in the area of tourism as well. In this paper, we proposed and implement a recommender system that works integrated with a Setubal peninsula portal to help tourists to choose experiences and points of interest to visit. The implementation was done using collaborative and content-based filtering to make recommendations based on user profiles and activities within the portal.
Download

Area 3 - Information Systems Analysis and Specification

Full Papers

Paper Nr:	26
Title:	ARTIFACT: Architecture for Automated Generation of Distributed Information Extraction Pipelines
Authors:	Michael Sildatke, Hendrik Karwanni, Bodo Kraft and Albert Zündorf
Abstract:	Companies often have to extract information from PDF documents by hand since these documents only are human-readable. To gain business value, companies attempt to automate these processes by using the newest technologies from research. In the field of table analysis, e.g., several hundred approaches were introduced in 2019. The formats of those PDF documents vary enormously and may change over time. Due to that, different and high adjustable extraction strategies are necessary to process the documents automatically, while specific steps are recurring. Thus, we provide an architectural pattern that ensures the modularization of strategies through microservices composed into pipelines. Crucial factors for success are identifying the most suitable pipeline and the reliability of their result. Therefore, the automated quality determination of pipelines creates two fundamental benefits. First, the provided system automatically identifies the best strategy for each input document at runtime. Second, the provided system automatically integrates new microservices into pipelines as soon as they increase overall quality. Hence, the pattern enables fast prototyping of the newest approaches from research while ensuring that they achieve the required quality to gain business value.
Download

Paper Nr:	40
Title:	SERIES: A Task Modelling Notation for Resource-driven Adaptation
Authors:	Paul A. Akiki, Andrea Zisman and Amel Bennaceur
Abstract:	Enterprise Systems (ESs) can make use of tasks that depend on various types of resources such as robots and raw materials. The variability of resources can cause losses to enterprises. For example, the malfunctioning of robots at automated warehouses could delay product deliveries and cause financial losses. These losses can be avoided if resource-driven adaptation is supported. In order to support resource-driven adaptation in ESs, this paper presents a task modelling notation called SERIES, which is used for specifying the tasks of ESs at design time and the enterprise-specific task variants and property values at runtime. SERIES is complemented by a visual tool. We assessed the usability of SERIES using the cognitive dimensions framework. We also evaluated SERIES by developing resource-driven adaptation examples and measuring the performance overhead and source-code intrusiveness. The results showed that SERIES does not hinder performance and is non-intrusive.
Download

Paper Nr:	42
Title:	Effects of Cognitive-driven Development in the Early Stages of the Software Development Life Cycle
Authors:	Victor Hugo Santiago C. Pinto and Alberto Luiz Oliveira Tavares De Souza
Abstract:	The main goal of software design is to continue slicing the code to fit the human mind. A likely reason for that is related to the fact that human work can be improved by a focus on a limited set of data. However, even with advanced practices to support software quality, complex codes continue to be produced, resulting in cognitive overload for the developers. Cognitive-Driven Development (CDD) is an inspiration from cognitive psychology that aims to support the developers in defining a cognitive complexity constraint for the source code. The main idea behind the CDD is keeping the implementation units under this constraint, even with the continuous expansion of software scale. This paper presents an experimental study for verifying the CDD effects in the early stages of development compared to conventional practices. Projects adopted for hiring developers in Java by important Brazilian software companies were chosen. 44 experienced software engineers from the same company attended this experiment and the CDD guided part of them. The projects were evaluated with the following metrics: CBO (Coupling between objects), WMC (Weight Method Class), RFC (Response for a Class), LCOM (Lack of Cohesion of Methods) and LOC (Lines of Code). The result suggests that CDD can guide the developers to achieve better quality levels for the software with lower dispersion for the values of such metrics.
Download

Paper Nr:	54
Title:	Acquisition of Open Intellectual Capital: A Case Study of Innovative, Software-developing SMEs
Authors:	Tomasz Sierotowicz
Abstract:	The existing studies into intellectual capital (IC) focus on its utilisation and effect on selected business performance indicators, mostly achieved by large enterprises. IC is subject to single-stream analyses and understood as an internal enterprise resource. Since IC is used in the business operations of enterprises, it must also be acquired. The aim of this study is to present the results of research conducted in a field of IC acquisition that has not yet been explored. The described research focused on innovative small and medium enterprises (SMEs) that develop software in Poland (2007–2019). Empirical data were obtained in time series form through the use of dedicated statistical tools including the dynamic rate of change. The main conclusion states that IC acquisition in the SMEs covered by the research should be described as a process taking place simultaneously, systematically and continually in two streams: an internal and an external stream of acquisition. Thus, considering the IC acquisition, the concept of Open IC (OIC), which consists of two streams of acquisition: internal and external, was introduced. Future research in this field allow focus on comparative analyses of different branches, which can extend our knowledge of the importance of OIC in businesses.
Download

Paper Nr:	94
Title:	Family Matters: Abusing Family Refresh Tokens to Gain Unauthorised Access to Microsoft Cloud Services Exploratory Study of Azure Active Directory Family of Client IDs
Authors:	Ryan Cobb, Anthony Larcher-Gore and Nestori Syynimaa
Abstract:	Azure Active Directory (Azure AD) is an identity and access management service used by Microsoft 365 and Azure services and thousands of third-party service providers. Azure AD uses OIDC and OAuth protocols for authentication and authorisation, respectively. OAuth authorisation involves four parties: client, resource owner, resource server, and authorisation server. The resource owner can access the resource server using the specific client after the authorisation server has authorised the access. The authorisation is presented using a cryptographically signed Access Token, which includes the identity of the resource owner, client, and resource. During the authorisation, Azure AD assigns Access and Id Tokens that are valid for one hour and a Refresh Token that is valid for 90 days. Refresh Tokens are used for requesting new Access and Id token after their expiration. By OAuth 2.0 standard, Refresh Tokens should only be able to be used to request Access Tokens for the same resource owner, client, and resource. In this paper, we will present findings of a study related to undocumented feature used by Azure AD, the Family of Client ID (FOCI). After studying 600 first-party clients, we found 16 FOCI clients which supports a special type of Refresh Tokens, called Family Refresh Tokens (FRTs). These FRTs can be used to obtain Access Tokens for any FOCI client. This non-standard behaviour makes FRTs primary targets for a token theft and privilege escalation attacks.
Download

Paper Nr:	142
Title:	Challenges in Requirements Engineering and Its Solutions: A Systematic Review
Authors:	Otávio da Cruz Mello and Lisandra Manzoni Fontoura
Abstract:	Software development projects are susceptible to many adversities throughout their life cycle that can be originated, among several reasons, of a low-quality specification and management of requirements. To ensure the Requirements Engineering activities are conducted correctly, researchers study and apply various techniques to predict and avoid the negative impacts that may occur in projects. The main goal of this research is to identify which techniques have been used to solve problems related to requirements management in software projects. We retrieved and reviewed studies published across various scientific databases to answer research questions that were previously defined. From this work, it was possible to obtain a better understanding of the most common problems in the Requirements Engineering field, as well as some techniques that currently exist to solve them. We also identified that Artificial Intelligence has been widely explored to improve the activities of the field.
Download

Short Papers

Paper Nr:	13
Title:	Software Product Line Regression Testing: A Research Roadmap
Authors:	Willian D. F. Mendonça, Wesley K. G. Assunção and Silvia R. Vergilio
Abstract:	Similarly to traditional single-product software, Software Product Lines (SPLs) are constantly maintained and evolved. However, an unrevealed bug in an SPL can be propagated to a wide set of products and impact customers differently, depending on the set of features they are using. In such scenarios, SPL regression testing is paramount to avoid undesired problems and guarantee that the SPL maintenance and evolution are performed accordingly. Although there are several studies on SPL regression testing, the research community lacks a clear set of research opportunities to be addressed in a short and medium term. To fulfill this gap, the goal of this work is to overview the current body of knowledge of SPL regression testing and present a research roadmap for the following years. For this, we conducted a systematic mapping study that found 27 primary studies. We identified techniques used by the approaches, and applied strategies. Test case selection and prioritization techniques are prevalent, as well as fault and coverage based criteria. Furthermore, based on gaps and limitations reported in the studies we distilled a set of future work opportunities that serve as a guide for new research in the field.
Download

Paper Nr:	35
Title:	Digital Twin Paradigm Shift: The Journey of the Digital Twin Definition
Authors:	Martin Tomczyk and Hendrik van der Valk
Abstract:	This paper examines the paradigm shift in the definition of the digital twin in recent years and describes how the definitions differ from each other. After an extensive literature review and the development of a concept matrix, it became apparent that a paradigm shift is taking place from the classic three-dimensional definition – physical and virtual space with a bidirectional connection – to an expanded five-dimensional definition – data and services as extended dimensions. In particular, the focus and developments in Information and Communication Technologies lead to the recognition of the dimensions of data and services as an independent dimension. In addition, further descriptions of the concept of the digital twin were assigned to the known dimensions.
Download

Paper Nr:	66
Title:	Is There an Optimal Sprint Length on Agile Software Development Projects?
Authors:	Nicolas Nascimento, Alan Santos, Afonso Sales and Rafael Chanin
Abstract:	Agile software development is adopted by the industry as a way to develop applications while also remaining flexible to quickly respond and adapt. At its core, agile relies heavily upon time-constrained iterations, usually named “sprints”, which should provide the development team to deliver a functional version of a software product. This study aims at understanding what is the impact of different sprint lengths in agile software teams. In order to achieve it, we have conducted a field study at a mobile software development course for eight months. The course was organized on three stages, where at each stage ten projects were simultaneously conducted. Data collection was based on project outcome including daily logs and deliverables generated by the teams. Each stage had a different sprint length (1-week, 2-week, or 3-week iterations). Our results indicate that there are differences in some aspects, including project evaluation and weekly impediments. These differences were statistically analyzed regarding the impacts of different sprint lengths in agile teams. Further, we have also observed some correlation between weekly impediments and project evaluation, providing indications of a possible impact on overall projects outcome.
Download

Paper Nr:	81
Title:	State-free End-to-End Encrypted Storage and Chat Systems based on Searchable Encryption
Authors:	Keita Emura, Ryoma Ito, Sachiko Kanamori, Ryo Nojima and Yohei Watanabe
Abstract:	Searchable symmetric encryption (SSE) has attracted significant attention because it can prevent data leakage from external devices, e.g., on clouds. SSE appears to be effective to construct such a secure system; however, it is not trivial to construct such a system from SSE in practice because other parts must be designed, e.g., user login management, defining the keyword space, and sharing secret keys among multiple users who usually do not have public key certificates. In this paper, we describe the implementation of two systems based upon the state-free dynamic SSE (DSSE) (Watanabe et al., ePrint 2021), i.e., a secure storage system (for a single user) and a chat system (for multiple users). In addition to the Watanabe et al. DSSE protocol, we employ a secure multipath key exchange (SMKEX) protocol (Costea et al., CCS 2018), which is secure against some classes of unsynchronized active attackers. It allows the chat system users without certificates to share a secret key of the DSSE protocol in a secure manner. To realize end-to-end encryption, the shared key must be kept secret; thus, we must consider how to preserve the secret on, for example, a user’s local device. However, this requires additional security assumptions, e.g., tamper resistance, and it seems difficult to assume that all users have such devices. Thus, we propose a secure key agreement protocol by combining the SMKEX and login information (password) that does not require an additional tamper-resistant device. Combining the proposed key agreement protocol and the underlying state-free DSSE protocol allow users who know the password to use the systems on multiple devices.
Download

Paper Nr:	119
Title:	REVS: A Vulnerability Ranking Tool for Enterprise Security
Authors:	Igor Forain, Robson de Oliveira Albuquerque and Rafael Timóteo de Sousa Júnior
Abstract:	Information security incidents currently affect organizations worldwide. In 2021, thousands of companies suffered cyber attacks, resulting in billions of dollars in losses. Most of these events result from known vulnerabilities in information assets. However, several heterogeneous databases and sources host information about those flaws, turning the risk assessment difficult. This paper proposes a Recommender Exploitation-Vulnerability System (REVS) with the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to rank vulnerability-exploit. The REVS is a dual tool that can pinpoint the best exploits to pentest or the most sensitive vulnerabilities to cybersecurity staff. This paper also presents results in the GNS3 emulator leveraging data from the National Vulnerability Database (NVD), the China National Vulnerability Database (CNVD), and Vulners. They reveal that the CNVD, despite data issues, has 23,281 vulnerabilities entries unmapped in the NVD. Moreover, this work establishes criteria to link heterogeneous vulnerability databases.
Download

Paper Nr:	130
Title:	Efficient Multi-view Change Management in Agile Production Systems Engineering
Authors:	Felix Rinker, Sebastian Kropatschek, Thorsten Steuer, Kristof Meixner, Elmar Kiesling, Arndt Lüder, Dietmar Winkler and Stefan Biffl
Abstract:	Agile Production Systems Engineering (PSE) is a complex, collaborative, and knowledge-intensive process. PSE requires expert knowledge from various disciplines and the integration of discipline-specific perspectives and workflows. This integration is a major challenge due to fragmented views on the production system with a difficult a priori coordination of changes. Hence, proper tracking and management of changes to heterogeneous engineering artifacts across disciplines is key for successful collaboration in such environments. This paper explores effective and efficient multi-view change management for PSE. Therefore, we elicit requirements for multi-view change management. We design the agile Multi-view Change Management (MvCM) workflow by adapting the well-established Git Workflow with pull requests with a multi-view coordination artifact to improve over traditional document-based change management in PSE. We design an information system architecture to automate MvCM workflow steps. We evaluate the MvCM workflow in the context of a welding robot work cell for car parts, using a typical set of changes. The findings indicate that the MvCM workflow is feasible, effective, and efficient for changes of production asset properties in agile PSE.
Download

Paper Nr:	135
Title:	Exploring Azure Active Directory Attack Surface: Enumerating Authentication Methods with Open-Source Intelligence Tools
Authors:	Nestori Syynimaa
Abstract:	Azure Active Directory (Azure AD) is Microsoft’s identity and access management service used globally by 90 per cent of Fortune 500 companies and many other organisations. Recent attacks by nation-state adversaries have targeted these organisations by exploiting known attack vectors. In this paper, open-source intelligence (OSINT) is gathered from organisations using Azure AD to explore the current attack surface. OSINT is collected from Fortune 500 companies and top 2000 universities globally. The collected OSINT includes authentication methods used by the organisation and the full name and phone number of the primary technical contact. The findings reveal that most organisations are using Azure AD and that majority of these organisations are using authentication methods exploited during the recent attacks by nation-state adversaries.
Download

Paper Nr:	136
Title:	Harmonizing the OQuaRE Quality Framework
Authors:	Achim Reiz and Kurt Sandkuhl
Abstract:	Measuring ontology quality using metrics is far from a trivial task – one has to pick the right metrics for the right task and then interpret these values in a meaningful way. Without help, these interpretations are often highly subjective, even for trained knowledge engineers. Quality frameworks can assist and objectify the evaluation. One of the more prominent frameworks in ontology evaluation is OQuaRE, which builds upon the SQuaRE standard for software evaluation. Not only provides it tangible metrics for assessing an ontology, but it also suggests an interpretation for these values in the form of a quality rating and links these metrics to a broader quality framework. However, during an implementation effort, the authors identified some drawbacks. In the last years, various metrics have been proposed that sometimes seem to conflict with each other or are inconclusive in their descriptions. The resources on the quality framework are distributed over web pages and papers. The following paper aims first to present the drawbacks the framework currently has. At the next step, we resolve the current heterogeneities and collect the information of the various sources. We aim to provide a one-stop information resource on OQuaRE to enable our further research and applications efforts.
Download

Paper Nr:	146
Title:	Practical Findings from Applying Quality Assurance Activities in the Development of Three Information Systems for Power Companies
Authors:	Geraldo Braz Junior, Luis Rivero, João Almeida, Simara Rocha, Aristófanes Silva, Anselmo Paiva, Carlos Castro, Darlan Quintanilha, Italo Santos, Erika Alves and Samira Barbosa
Abstract:	Quality can be achieved in software development by identifying and fixing defects, improving the development process and including configuration management activities. However, for novice development teams, including the above activities may be difficult when developing large or complex information systems. Thus, to gain insight on how to improve software quality, novice software engineers may review reports from real software development projects and apply lessons learned. In this paper, we report how software engineering activities for quality assurance were adapted within three power company information system projects. We explain how activities regarding version tracking, software testing and user interface design were carried out by three novice software development teams within a software organization of about 40 collaborators. Our results indicate that version control can be costly at first, but is useful to assess the current state of the development of features. Furthermore, though low-cost evaluation and design approaches, the end product can meet users’ needs and reduce rework when launching a version of an information system.
Download

Paper Nr:	147
Title:	A Hybrid Genetic Algorithm using Progressive Alignment and Consistency based Approach for Multiple Sequence Alignments
Authors:	Vitoria Zanon Gomes, Matheus Carreira Andrade, Anderson Rici Amorim and Geraldo Francisco Donegá Zafalon
Abstract:	The multiple sequence alignment is one of the most important tasks in bioinformatics, since it allows to analyze multiple sequences at the same time. There are many approaches for this problem such as heuristics and metaheuristics, that generally lead to great results in a plausible time, being among the most used approaches. The genetic algorithm is one of the most used methods because of its results quality, but it had a problematic disadvantage: it can be easily trapped in a local optima result, not being able to reach better alignments. In this work we propose a hybrid genetic algorithm with progressive and consistency-based methods as a way to smooth the local optima problem and improve the quality of the alignments. The obtained results show that our method was able to improve the quality of AG results 2 a 27 times, smoothing the local maximum problem and providing results with more biological significance.
Download

Paper Nr:	191
Title:	Towards an Approach for Improving Exploratory Testing Tour Assignment based on Testers’ Profile
Authors:	Letícia De Souza Santos, Rejane Maria Da Costa Figueiredo, Rafael Fazzolino Pinto Barbosa, Auri Marcelo Rizzo Vincenzi, Glauco Vitor Pedrosa and John Lenon Cardoso Gardenghi
Abstract:	This work presents an empirical study on the relationship between the testers’ profile and their efficiency and preference in the application of tours with tourist metaphor for exploratory software testing. For this purpose, we developed and applied a questionnaire based model to gather as much as possible information about the knowledge, expertise and education level from a group of testers. The results indicated that, in fact, the testers’ profile have impact on the application of tours used in the tourist metaphor: there are differences between the tours preferred by different levels of education and most of testers tend to choose those tours based on what they believed to have the shortest execution time. This work raises a valuable discussion about a humanized process of assigning test tasks in order to improve the efficiency of software testing.
Download

Paper Nr:	197
Title:	Industry-Academy Collaboration in Agile Methodology: Preliminary Findings of a Systematic Literature Review
Authors:	Denis de Gois Marques, Tâmara D. Dallegrave, Luis E. L. Barbosa, Cleyton Mario de Oliveira Rodrigues and Wylliams Barbosa Santos
Abstract:	Collaborative Research between Industry and Academia (IAC) in Software Engineering (SE) is being applied and developed in practice. Collaborative practices help both environments, from academic and software industry perspectives. As a way of observing what is being developed in the SE, the objective of this article is to present an exploratory and empirical study of IAC practices in the scope of Agile Software Development (ASD), exploring and characterizing solutions and practices, the challenges found in the application of the IAC and the collaboration. A Systematic Literature Review (SLR) was carried out in five main academic databases, evaluating/analyzing 7143 articles, totalling 12 articles approved following the proposed criteria. As preliminary findings of the data analysis, 76 good practices and 37 challenges in carrying out the IAC were described. As well, practical models for the application of IAC were detailed.
Download

Paper Nr:	198
Title:	A Grey Literature Review on the Impacts of Covid-19 in Software Development
Authors:	Everton Quadros, Rafael Prikladnicki and Regis Lahm
Abstract:	The workplace has been changed by Covid-19. But what is the meaning of the “work from home” phenomenon in software development? This paper aims to investigate the “work from home” pandemic phenomenon in software development. Between October 2019 and December 2021, the Grey Literature review was carried out to investigate 25,251 records, collected through a scraper written in python language. Descriptive analysis was performed using data science and artificial intelligence techniques. We developed a methodology to optimize the collection and extraction of insights from the Grey Literature and reveal perceptions or cognitive distances from the social representation of the impacts of Covid-19 in software development. The main contributions of this paper are to show how Grey Literature may contribute to anticipate findings, reveal changes in the discourse regarding the effects of the pandemic on the work model, and show that in early 2021 the desire for flexibility pressed for a hybrid model. This type of literature review can assist in strategies to deal with events such as Covid-19.
Download

Paper Nr:	107
Title:	Customer Satisfaction as a Critical Success Factor for ERP Design
Authors:	Jamie Plunkett and Craig M. Gelowitz
Abstract:	Enterprise Resource Planning (ERP) systems have been an important tool in managing business processes in corporations worldwide. This paper briefly looks at some popular business process analysis methodologies such as Balanced Scorecard and Critical Success Factors (CSF). It also includes a customer satisfaction analysis as a supplementary mechanism to design and implement an ERP system for small to mid-size enterprises (SMEs). In addition to the traditional metrics, customer satisfaction is included as a critical success factor that drives the changes to business processes and provide insight into the design of an effective ERP for an SME.
Download

Paper Nr:	118
Title:	Cyberwarfare Readiness in the Maritime Environment
Authors:	Diop Ausar Harris
Abstract:	Cybersecurity ensures that sensitive information and data are protected. Even so, there are still roadblocks that impede an organization’s ability to uphold cybersecurity. This study provides an analysis of challenges present in the maritime environment that may impede the preparedness of cyberwarfare. The research showed that the United States tends to be more vulnerable to cyberattacks due to its dependence on information technologies. Numerous cases also showed that the United States, compared to other powers, lacks somewhat in defensive capabilities. Many of the breaches in cybersecurity were due to a lack of cyber awareness leading to human error accidents. Vessels and ports are also likely targets during a cyberwar due to their importance in a country’s economy and security. These challenges present in the maritime environment must be dealt with to prepare for a potential cyberwar. For effective cyberwarfare readiness moving forward, states should aim to increase cybersecurity awareness, improve capabilities, and promote cooperation between states. Proposition for continued research on cyberwarfare readiness in the maritime environment by academic researchers and practitioners is recommended.
Download

Paper Nr:	163
Title:	A Catalog of Process Patterns for Academic Software Projects
Authors:	Caroline Guterres Silva and Lisandra Manzoni Fontoura
Abstract:	Universities have established partnerships with industry or government through technological innovation projects to develop solutions based on problems presented by institutions. Based on a systematic literature review, we identified a lack of software processes suitable for projects developed in academia. This article proposes a catalogue of process patterns documenting practices that have been successfully adopted in academic projects involving external partnerships. Process patterns describe solutions to problems and challenges commonly found in projects developed in the university environment. We conduct a systematic literature review to identify problems commonly encountered in academic projects and the software practices applied to solve them. Later, with the help of the literature, we deepened the understanding of how the software practices can be used in software projects and documented them as process patterns. As a result, we have identified thirteen problems and documented ten process patterns describing possible solutions related to the problem. Eight researchers with experience in software projects in partnership with academia participated in the validation. The validation showed that the proposed process pattern catalogue describes relevant solutions to the problems and is applicable to the academic context.
Download

Area 4 - Software Agents and Internet Computing

Full Papers

Paper Nr:	32
Title:	On-development of a GDPR Compliant Graph-based Recommender Systems
Authors:	Goloviatinski Sergiy, Herbelin Ludovic, José Mancera, Luis Terán, Jhonny Pincay and Edy Portmann
Abstract:	The enforcement of the General Data Protection Regulation (GDPR) in the European Union represents a challenge in designing reliable recommender systems due to user data collection limitations. This work proposes a method to consider GDPR data with a graph-based recommender system to tackle data sparsity and the cold-start problem by representing the data in a knowledge graph. In this work, the authors assess a real dataset provided by Beekeeper AG, a social network company for front-line workers, to model the interactions in a graph database. This work proposes and develops a recommender system on top of the database using the requests made to Beekeeper’s REST API. It explores the API events, neither with knowledge of the content nor the user profiles. Besides, it presents a discussion of multiple approaches for community detection algorithms to retrieve clusters of groups or companies that are part of the social network. This paper proposes several techniques to understand user activity and infer user interactions and events such as likes in posts, comments, and session duration. The recommendation engine presents posts to new and existing users. Thanks to pilot customers who provided consent to access private data, this work verifies the effectiveness of the findings.
Download

Paper Nr:	91
Title:	Patterns for IoT-based Business Process Improvements: Developing a Metamodel
Authors:	Christoph Stoiber and Stefan Schönig
Abstract:	The number of Internet of Things (IoT) devices is constantly growing across all areas of private and professional life. Especially industrial organizations are increasingly recognizing the IoT’s disruptive capabilities and potential benefits for business processes along all value chain activities. In this regard, the integration of IoT technology into existing business processes enables valuable Business Process Improvements (BPI). However, it often remains unclear which BPIs can be expected by organizations and how the anticipated BPIs are realized in detail. Furthermore, the integration of IoT technology into existing business processes constitutes a major challenge caused by a lack of supporting methods, models, or guidelines. The paper at hand addresses this research gap by providing a metamodel that enables the illustration of generic IoT-based BPI patterns. It contains all relevant elements that are comprised by IoT applications with BPI propositions and can be used by industrial organizations as blueprints for conducting IoT projects. The metamodel development follows fundamental principles of design science research (DSR) and is extensively evaluated by deriving a first set of patterns from real-life IoT applications of three market- leading corporations. In addition, an expert survey is conducted to assess the metamodel’s usefulness.
Download

Paper Nr:	164
Title:	Performance Testing Guide for IoT Applications
Authors:	Liana M. Carvalho, Valéria Lelli and Rossana M. C. Andrade
Abstract:	Internet of Things (IoT) applications are characterized by the use of smart objects that are connected to the Internet to provide different types of services. These objects usually generate data that need to be stored and analyzed to contribute to decision making (whether or not they are immediate). In this context, such applications may require high performance, low cost and good scalability. These requirements bring new testing challenges and the need for specific approaches, for example, the detection of performance failures among heterogeneous IoT devices, which process a large amount of data and, under uncertain conditions, must have their resources optimized. Thus, our goal is to propose a performance testing guide for the evaluation of IoT applications. To build the guide, we performed a literature review to identify the IoT standards and analyzed IoT bug repositories. In this paper, we present the Performance Testing Guide for IoT applications. To validate the proposed guide, we conducted two evaluations: (i) an evaluation with the experts; and (ii) a controlled experiment. The results showed that the guide provides a systematization of testing activities, helping the evaluation of IoT aspects intrinsic to performance.
Download

Short Papers

Paper Nr:	21
Title:	Integrating a Multi-Agent Smart Parking System using Cloud Technologies
Authors:	Milton Boos Junior, Lucas Sakurada, Paulo Leitão, Paulo Alves, Gleifer Vaz Alves, André Pinz Borges and Diego Roberto Antunes
Abstract:	Smart parking (SP) systems are becoming a solution to address the increasing traffic in major cities, which are related to the traffic congestion, unnecessary time spent searching for parking spots, and, consequently, environmental issues. These systems intend to help drivers that are searching for available parking spaces in a given desired location. This paper presents a cloud-based solution to integrate a Multi-Agent System (MAS) for SP, which enables the modularization, scalability and robustness of such large-scale systems. The MAS abstraction is a suitable approach to represent the dynamic features of a SP, where multiple drivers arrive, request, search, and leave the parking spots. The cloud services enable to scale up the use of a MAS, being an intermediary in the communication between the MAS and the end user, providing a broad architecture that involves database, asynchronous functions activated by events and real-time message exchange. The cloud agent-based system was deployed in the parking of an University campus, where users driving bicycles and cars can request and schedule parking slots that are managed in a distributed manner by the MAS. The obtained results show the user friendly interaction with the system, the scalability of the system in terms of drivers and parking spots, as well as the efficient management of the parking spots by the MAS system.
Download

Paper Nr:	41
Title:	MovieOcean: Assessment of a Personality-based Recommender System
Authors:	Luca Rolshoven, Corina Masanti, Jhonny Pincay, Luis Terán, José Mancera and Edy Portmann
Abstract:	This research effort explores the incorporation of personality treats into user-user collaborative filtering algorithms. To explore the performance of such a method, MovieOcean, a movie recommender system that uses a questionnaire based on the Big Five model to generate personality profiles, was implemented. These personality profiles are used to precompute personality-based neighborhoods, which are then used to predict movie ratings and generate recommendations. In an offline analysis, the root mean square error metric is computed to analyze the accuracy of the predicted ratings and the F1-score to assess the relevance of the recommendations for the personality-based and a standard-rating-based approach. The obtained results showed that the root mean square error of the personality-based recommender system improves when the personality has a higher weight than the information about the user ratings. A subsequent t-test was conducted for the proposed personality-based approach underperformed based on the root mean square error metric. Furthermore, interviews with users suggested that including aspects of personality when computing recommendations is well-perceived and can indeed help improve current recommendation methods.
Download

Paper Nr:	56
Title:	Using Node Embeddings to Generate Recommendations for Semantic Model Creation
Authors:	Alexander Paulus, Andreas Burgdorf, Alina Stephan, André Pomp and Tobias Meisen
Abstract:	With the ongoing digitalization and the resulting growth in digital heterogeneous data, it is becoming increasingly important for enterprises to manage and control this data. An approach that has established itself over the past years for managing heterogeneous data is the creation and use of knowledge graphs. However, creating a knowledge graph requires the generation of a semantic mapping in the form of a semantic model between datasets and a corresponding ontology. Even though the creation of semantic models can be partially automated nowadays, manual adjustments to the created models are often required, as otherwise no reliable results can be achieved in many real-world use cases. In order to support the user in the refinement of those automatically created models, we propose a content-based recommender system that, based on the present semantic model, automatically suggests concepts that reasonably complement or complete the present semantic model. The system utilizes node embeddings to extract semantic concepts from a set of existing semantic models and utilize these in the recommendation. We evaluate accuracy and usability of our approach by performing synthetic modeling steps upon selected datasets. Our results show that our recommendations are able to identify additional concepts to improve auto-generated semantic models.
Download

Paper Nr:	70
Title:	A Case Study for Minimal Cost Shopping in a Cluster of Online Stores
Authors:	Thiago Alexandre Nakao França, Rodrigo Campiolo and Lilian Caroline Xavier Candido
Abstract:	This work proposes and evaluates mechanisms to converge to the purchase configuration with minimal cost for a product list in an online stores cluster. We collected and stored product prices in a database, and used caching to reduce client response time. We designed, implemented, and compared two Integer Linear Programming solutions to achieve purchase configuration with minimal cost. A case study was conducted to evaluate these mechanisms. The results demonstrated that developed mechanisms found optimal solutions with response time guarantees. Empirical tests in the case study with 100 different products and 118 providers converged in about 9 seconds.
Download

Paper Nr:	89
Title:	Adaptive Learning Content Recommendation using a Probabilistic Cluster Algorithm
Authors:	Adson Marques Esteves, Aluizio Haendchen Filho, André Luiz Alice Raabe, Angélica Karize Viecelli, Jeferson Miguel Thalheimer and Lucas Debatin
Abstract:	Nowadays there are many research using the LDA (Latent Dirichlet Allocation) algorithm to find preferences and characteristics for recommendation systems. In some of the most relevant studies, the recommendation is based on the student's level of evolution within the discipline. This work presents a new recommendation approach with the LDA algorithm. The approach differs from previous LDA studies since the recommendation technique is based on the experiences and preferences from a group of students and not just an individual student. The main objective is to verify, through simulation, whether the methods used, and the algorithm can generate recommendations close to those considered ideal. The obtained results indicate that the application of the LDA for creating groups to generate recommendations provides a good result in delivering content and practices in accordance with the student's interests. It’s empirical research, as the conclusions are drawn from concrete and verifiable evidence used in the simulations.
Download

Paper Nr:	59
Title:	On the Integration of Smart Grid and IoT
Authors:	Salvatore Cavalieri, Giulio Cantali and Andrea Susinna
Abstract:	The paper proposes a novel solution in the field of the integration of Smart Grid and Internet of Things. The definition of a web platform able to offer to a generic web user a RESTful interface to IEC 61850 Servers is proposed. The web platform enables the mapping of information maintained by an IEC 61850 server into MQTT messages.
Download

Paper Nr:	132
Title:	The Push and Pull of Cybersecurity Adoption: A Positional Paper
Authors:	Yang Hoong and Davar Rezania
Abstract:	The intensity and frequency of cyber breaches has brought cybersecurity into the spotlight. This has led to cybersecurity becoming a major concern and stream of research for practitioners and researchers alike. However, despite the negative effects associated with cyber breaches, there remains a limited understanding surrounding the adoption of cybersecurity measures. Specifically, to date, how the interaction of external and internal forces affect cybersecurity adoption remains unclear. We provide an overview of the reasons for a passive posture against cybersecurity, as well as the internal and external forces that push for cybersecurity adoption. We examine the tension of the push and pull of internal and external forces, identify a gap, and propose future research directions.
Download

Area 5 - Human-Computer Interaction

Full Papers

Paper Nr:	52
Title:	Using Lean Personas to the Description of UX-related Requirements: A Study with Software Startup Professionals
Authors:	Gabriel V. Teixeira and Luciana A. M. Zaina
Abstract:	User experience (UX) is a quality requirement widely discussed by software developers. Organizations have targeted to offer software features that carry value to the audience. For software startups, UX-related requirements can represent a competitive edge in their fast-paced environment with constant time pressures and limited resources. However, software startup professionals often have little experience and lack knowledge about UX techniques. Lean persona technique emerges as a slim form of constructing personas to allow the description of end-users needs. In this paper, we investigated the use of the lean persona technique with 21 software professionals, 10 and 11 from software startups and established companies respectively. We carried out a comparison to see whether the startup professionals use the technique in a different way from the established company professionals. Our results revealed that the professionals of both groups used the technique for similar purposes and wrote up UX-related requirements in different levels of abstraction. They also reported positive feedback about the technique acceptance. We saw that the participants’ characteristics as years of experience, prior knowledge about personas technique, or the fact of working in startups did not have an influence on the technique acceptance.
Download

Paper Nr:	71
Title:	Imagination on Interactive Installations: A Systematic Literature Review
Authors:	Maria Jêsca Nobre De Queiroz, Emanuel Felipe Duarte, Julio Cesar Dos Reis and Josiane Rosa De Oliveira Gaia Pimenta
Abstract:	Imagination plays a key role in human development as a natural process between the individual and their surroundings, including environmental possibilities. Today, these surroundings often include ubiquitous and pervasive technologies that enable new interaction possibilities. Although imagination is an important aspect in the theory of enactivism, it remains unclear whether it has been investigated within the context of interactive installations, ubiquitous computing, or other kind of application that emphasizes embodiment. This article presents a systematic literature review investigating if and how imagination has been explored in ubiquitous scenarios of interactive installations. We found that ubiquitous technologies can play an important role in enabling imagination in interactive installations. There is, however, a need for more specific design and evaluation methods and theory adoption to support imagination in the design of interactive systems. On this need, we contribute with a research agenda for further study on this subject.
Download

Paper Nr:	112
Title:	Towards Meaning in User Interface Prototypes
Authors:	Gretchen Torres De Macedo, Lucas Viana and Bruno Gadelha
Abstract:	User interface prototypes aid several activities during the development process lifecycle. However, there are still many manual activities performed during this process. This research project investigates how semantic meaning identification in UI prototypes can help carry out manually performed tasks. We started by analyzing a set of UI prototypes obtained from prototyping activities with students. As a result, we created a database of 856 UI prototypes labeled semantically in 19 classes and two semantic levels: layout and functionalities. Each class of UI prototypes was analyzed to identify its distinguishing characteristics, from which we obtained a set of 19 heuristics specifications. This set of heuristics allows the development of solutions for automatic analysis of UI prototypes, thus supporting software prototyping activities.
Download

Paper Nr:	137
Title:	MCCD: Generating Human Natural Language Conversational Datasets
Authors:	Matheus F. Sanches, Jader M. C. de Sá, Allan M. de Souza, Diego A. Silva, Rafael R. de Souza, Julio C. dos Reis and Leandro A. Villas
Abstract:	In recent years, state-of-the-art problems related to Natural Language Processing (NLP) have been extensively explored. This includes better models for text generation and text understanding. These solutions depend highly on data to training models, such as dialogues. The limitations imposed by the lack of data in a specific language significantly limit the available datasets. This becomes worse as intensive data is required to achieve specific solutions for a particular domain. This investigation proposes MCCD, a methodology to extract human conversational datasets based on several data sources. MCCD identifies different answers to the same message differentiating various conversation flows. This enables the resulting dataset to be used in more applications. Datasets generated by MCCD can train models for different purposes, such as Questions & Answers (QA) and open-domain conversational agents. We developed a complete software tool to implement and evaluate our proposal. We applied our solution to extract human conversations from two datasets in Portuguese language.
Download

Paper Nr:	157
Title:	Perceived Value of IS Collaboration Support in an SME Ecosystem’s Innovation Activity
Authors:	Susanne Marx, Michael Klotz and Kurt Sandkuhl
Abstract:	Networks and ecosystems are involved in Open Innovation (OI) initiatives, their collaboration mediated by technology. A central element of OI is the generation of value as perceived by involved actors. The paper investigates how information systems supporting collaboration (CIS) facilitate the generation of perceived value for OI participants. As multi-method qualitative research, the study uses interview and survey data derived from an innovation activity jointly implemented by two small and medium-sized enterprises and their ecosystem (ten participants), facilitated by two tools: video conferencing and online whiteboard software. The findings suggest specific functionalities and characteristics of these tools to support the development of three types of value: excellence, efficiency, and emotional value. The identified adverse impacts of the CIS encourage providing transparent guidelines for behaviour when using the CIS for an ecosystem’s innovation activity. The tools’ functionalities proved appropriate, with the Perceived Usefulness independent from prior experience. The research advances the understanding of the role of technology in value generation in an ecosystem’s innovation activity and supports practitioners in their decisions for digital support for OI. The study is limited by its small, qualitative approach and focus on the ideation phase of innovation.
Download

Paper Nr:	160
Title:	Challenges of API Documentation from a Provider Perspective and Best Practices for Examples in Public Web API Documentation
Authors:	Gloria Bondel, Arif Cerit and Florian Matthes
Abstract:	Developers frequently have to learn new Web APIs provided by other teams or organizations. Documentation, especially code examples, supports learning and influences the consumers’ perception of an API. Nevertheless, documentation repeatedly fails to address consumers’ information needs. Therefore, we identify four major challenges of creating and maintaining public Web API documentation from a provider perspective which are unknown customer needs, the difficulty of balancing the coverage of varying information needs and keeping documentation concise, the high effort of creating and maintaining documentation, and missing internal guidance and governance for creating API documentation. In addition, we derive 46 best practices candidates for code examples as part of Web API documentation from literature and 13 expert interviews. Moreover, we evaluate a subset of eight of these candidates in the context of the Web API documentation for a public GraphQL API in a case study with 12 participants. As a result, we validate the analyzed eight best practices candidates to be best practices for public Web API documentation.
Download

Paper Nr:	199
Title:	Scoring-based DOM Content Selection with Discrete Periodicity Analysis
Authors:	Thomas Osterland and Thomas Rose
Abstract:	The comprehensive analysis of large data volumes forms the shape of the future. It enables decision-making based on empiric evidence instead of expert experience and its utilization for the training of machine learning models enables new use cases in image recognition, speech analysis or regression and classification. One problem with data is, that it is often not readily available in aggregated form. Instead, it is necessary to search the web for information and elaborately mine websites for specific data. This is known as web scraping. In this paper we present an interactive, scoring based approach for the scraping of specific information from websites. We propose a scoring function, that enables the adaption of threshold values to select specific sets of data. We combine the scoring of paths in a web pages DOM with periodicity analysis to enable the selection of complex patterns in structured data. This allows non-expert users to train content selection models and to label classification data for supervised learning.
Download

Short Papers

Paper Nr:	43
Title:	Adaptable GDPR Assessment Tool for Micro and Small Enterprises
Authors:	Emanuel Löffler, Bettina Schneider, Andreas Goerre and Petra Maria Asprion
Abstract:	The coming into force of the European General Data Protection Regulation (GDPR) has profoundly changed the data protection landscape. Irrespective of their size, organisations inside and outside of Europe are challenged to comply with the requirements posed by the GDPR. Especially micro and small enterprises (MSEs) lack the required internal resources and knowledge to understand the regulation and its implications. In our study, a simplified self-assessment tool dedicated to the situation of MSEs is designed to act as an amplifier for the data protection maturity of this target group. Our research is embedded into the H2020 EU project GEIGER that aims to leverage cybersecurity and data protection of MSEs in Europe. Building upon Hevner’s design science research, our study results in an open source, easy-to-adapt GDPR self-assessment web application targeted to the broad, but so-far rather neglected user group of MSEs. Our privacy-by-design and mobile-first approach ensures the trustworthy handling of user data while focusing on usability.
Download

Paper Nr:	78
Title:	Guidelines’ Parametrization to Assess AAL Ecosystems' Usability
Authors:	Carlos Romeiro and Pedro Araújo
Abstract:	With the aging of the population, healthcare services worldwide are faced with new economic, technical, and demographic challenges. Indeed, an effort has been made to develop viable alternatives capable of mitigating current services’ bottlenecks and of assisting/improving end-user’s life quality. Through a combination of information and communication technologies, specialized ecosystems have been developed; however, multiple challenges (ecosystems autonomy, robustness, security, integration, human-computer interactions and usability) have arisen, compromising their adoption and acceptance among the main stakeholders. Dealing with the technical related flaws has led to a shift in the focus of the development process from the end-user towards the ecosystem’s technological impairments. Although many issues, namely usability, have been reported, solutions are still lacking. This article proposes a set of metrics based on the parametrization of literature guidelines, with the aim of providing a consistent and accurate way of using the heuristic methodology not only to evaluate the ecosystem’s usability compliance level, but also to create the building blocks required to include automation mechanisms.
Download

Paper Nr:	116
Title:	In the Flow: A Case Study on Self-paced Digital Distance Learning on Business Information Systems
Authors:	Anke Schüll and Laura Brocksieper
Abstract:	This paper investigates the acceptance of a self-paced digital distance learning environment on courses about Business Information Systems and Management & Control of IT at a university. The aim of the environment was to avoid monotony and to actively involve the students into their learning process. The course content was split into small units arranged onto an online roadmap. Different design elements were used along the progress on the roadmap, each adding to the content, contributing to clarification, understanding, repetition or memorization. Students could proceed at their own pace, but there was a timetable for discussing the content in accompanying videoconferences and corresponding deadlines for the tasks to be completed. The concept was evaluated in a real-life learning situation following the Unified Theory of Acceptance and Use of Technology (UTAUT), slightly modified to the context. The case study contributes to the body of knowledge by providing a selection of design elements that can be combined to enrich students’ learning experiences. The outcomes of the evaluation underline the importance of “flow” for the acceptance of e-learning environments.
Download

Paper Nr:	148
Title:	An Experimental Study on Usability and User Experience Evaluation Techniques in Mobile Applications
Authors:	Eduardo A. Jesus, Guilherme C. Guerino, Pedro Valle, Walter Nakamura, Ana Oran, Renato Balancieri, Thiago Coleti, Marcelo Morandini, Bruna Ferreira and Williamson Silva
Abstract:	Usability and User Experience (UX) are two quality attributes of Human-Computer Interaction (HCI) relevant to the software development process. Thus, to verify the quality of a system, researchers and developers investigate this area, resulting in different Usability and UX evaluation techniques to improve the quality of applications. However, most of them verify only one of these criteria, being necessary, in many cases, to use more than one technique to evaluate an application in both aspects. Therefore, this research aims to present an experimental study to compare the efficiency, effectiveness, and acceptance of two inspection techniques, Userbility and UX-Tips, which jointly evaluate the Usability and UX of mobile applications. In this way, 99 volunteer participants used the techniques to identify defects in two mobile applications. After the evaluation, the participants answered an acceptance questionnaire about the techniques used. The quantitative comparison results show that the techniques have no significant difference regarding efficiency and effectiveness. However, in terms of participant acceptance, Userbility achieved higher rates of usefulness and future usage intentions, while UX-Tips achieved better rates related to ease of use.
Download

Paper Nr:	153
Title:	Remote Controlled Individuals? The Future of Neuralink: Ethical Perspectives on the Human-Computer Interactions
Authors:	Maria Cernat, Dumitru Borțun and Corina Sorana Matei
Abstract:	In an experiment presented to the public, a monkey with a Neuralink inserted in its brain is able interact directly with the computer. The Neuralink experiment opens the door to an extremely complex debate with questions ranging from ontology to epistemology. We explore the political economy of these cutting-edge technologies. What we aim to investigate are ethical questions, namely: is it ethical to install such a device in someone's brain and connect it to a computer? And who controls the computer, since it is plausible to assume that the communication could be bidirectional? We argue that this is in fact the key question we, as social scientists, IT specialists, and computer science specialists, have to ask and attempt to find answers to. Oftentimes, scientific discoveries could lead to disasters and in the era of "surveillance capitalism" we could easily imagine a scenario where companies are competing to gain access to our consciousness, and where our decisions are being marketed and sold to the higher bidder. Scientific discoveries do not occur in a purely rational society and questions of power, access, and control are vital for a future where technology and society are not at odds.
Download

Paper Nr:	180
Title:	Review: Use of EEG on Measuring Stress Levels When Painting and Programming
Authors:	Nataliya Tupikina, Gcinizwe Dlamini and Giancarlo Succi
Abstract:	For years, brain activity, stress level during programming and painting have been analyzed separately. As the world gets more digital and human life gets more dependent on technology, it has become more important to analyse the relationship between programming, software developers’ brain activity, creative practices (i.e painting) and stress level. In this paper, we present the results of a systematic literature review whereby the research questions are centred around analysing the relationship between stress levels and brain activity when a person is painting or writing a piece of software. The search for relevant studies was done on google scholar and IEEE Xplore. The results of our review show that: (1) EEG can be used to accurately measure stress levels, (2) there is limited research in the analysis of stress level pattern of the stress level when people paint depending on different situations and styles of painting. In light of the systematic literature review result, using EEG we plan to conduct experiments to measure the stress level when a person is painting a picture or programming.
Download

Paper Nr:	33
Title:	Retail Platform with Natural Interaction: LOAD's Vision
Authors:	Pedro Colarejo, Davide Ricardo, André Fernandes, Miguel Fonseca, Pedro Oliveira, Nidhal Cherni, João Abrantes, Hélio Guilherme and António Teixeira
Abstract:	In a context of indirect sales channels, where the product reaches the final consumer through intermediaries, it is very difficult for consumers to know the origin of the product and identify the origin of problems. In this paper an innovative technological platform is proposed, oriented to the retail market, and capable of providing information from the entire product distribution chain to its various stakeholders. The platform is based on 3 pillars: a decentralized blockchain-based information network covering the path of a product from its origin to the end consumer; extraction of information about users/customers and products from images and video; interaction using natural language. A first instantiation of the platform is also presented as well as the first results. In its development, recent technologies were used in the areas of image recognition and dialogue systems.
Download

Paper Nr:	90
Title:	Privacy-preservation and the Use of Data for Research: A COVID-19 Use Case in Randomly Generated Healthcare Records
Authors:	Madalena Lopes E. Silva, Maria Claudia Cavalcanti and Maria Luiza M. Campos
Abstract:	The provision of clinical data for research purposes has become central to monitoring and understanding the COVID-19 outbreak. In such a pandemic scenario, obtaining new research results is an imperative and urgent requirement. However, nowadays, personal data are protected by different legal regulations, to which all these data must comply, especially those related to the health of individuals. Then, a tough challenge arises in the academic sphere: how to provide a large amount of detailed clinical data for research and, simultaneously, guarantee the privacy of the individuals involved? Thus, this article discusses how the biomedical community may face this challenge and it presents the main ongoing initiatives and available emergent technologies that are useful to meet such urgent demand. Moreover, it also shows, through a use case, how it is possible to deal with this challenge, presenting the applicability of privacy-preserving techniques over a randomly generated typical dataset of COVID-19 health records.
Download

Paper Nr:	97
Title:	Perceptions on the Use of an Online Tool in the Teaching-learning Process in Microscopy
Authors:	Breno N. S. Keller, Mariana T. Rezende, Tales M. Machado, Saul Delabrida, Claudia M. Carneiro and Andrea G. C. Bianchi
Abstract:	During the COVID-19 pandemic, remote learning was an alternative to maintaining student participation in subjects, active learning, and knowledge development. This approach is necessary for the experimental demands of the practical content of the Cervical Cytology class. This paper presents and discusses using an online platform to learn practical content in the microscopy subject of Cervical Cytology class. The evaluated scenarios demonstrated that the planning of the discipline and personal factors such as student interest and availability could influence the student performance.
Download

Paper Nr:	161
Title:	UX of Chatbots: An Exploratory Study on Acceptance of User Experience Evaluation Methods
Authors:	Marcus Barbosa, Walter Nakamura, Pedro Valle, Guilherme C. Guerino, Alice F. Finger, Gabriel M. Lunardi and Williamson Silva
Abstract:	Companies increasingly invest in designing, developing, and evaluating conversational agents, mainly text-based chatbots. Chatbots have become the main and the fastest communication channel for providing customer service and helping users interact with systems and, consequently, obtain the requested information. Despite this, the potential market for chatbots,there is still too little known about evaluating the quality of chatbots from a User eXperience (UX) perspective, i.e., the emotions, feelings, and expectations of users towards this application. Besides, relatively little research addresses the feasibility and applicability of UX methods (generic or not) or how to adapt them (if necessary) to evaluate chatbots. The goal of this research is to investigate the adequacy, feasibility of use, and acceptance by users of UX methods when employed to evaluate a chatbot. To achieve this objective, we conducted an exploratory study comparing three UX methods: AttrakDiff, Think Aloud, and Method for the Assessment of eXperience. We compared these methods by assessing the degree of ease of use, usefulness, self-predicted future use from end-users. Also, we performed follow-up interviews to understand users’ perceptions about each method. The results show that users preferred to use the Think Aloud method due to the ease and freedom that the user has to express their positive and negative emotions/feelings while using the chatbot. Based on the results, we believe that combining the three methods was essential to capture the whole user experience when using the chatbot.
Download

Paper Nr:	169
Title:	Increasing Explainability of Clustering Results for Domain Experts by Identifying Meaningful Features
Authors:	Michael Behringer, Pascal Hirmer, Dennis Tschechlov and Bernhard Mitschang
Abstract:	Today, the amount of data is growing rapidly, which makes it nearly impossible for human analysts to comprehend the data or to extract any knowledge from it. To cope with this, as part of the knowledge discovery process, many different data mining and machine learning techniques were developed in the past. A famous representative of such techniques is clustering, which allows the identification of different groups of data (the clusters) based on data characteristics. These algorithms need no prior knowledge or configuration, which makes them easy to use, but interpreting and explaining the results can become very difficult for domain experts. Even though different kinds of visualizations for clustering results exist, they do not offer enough details for explaining how the algorithms reached their results. In this paper, we propose a new approach to increase explainability for clustering algorithms. Our approach identifies and selects features that are most meaningful for the clustering result. We conducted a comprehensive evaluation in which, based on 216 synthetic datasets, we first examined various dispersion metrics regarding their suitability to identify meaningful features and we evaluated the achieved precision with respect to different data characteristics. This evaluation shows, that our approach outperforms existing algorithms in 93 percent of the examined datasets.
Download

Area 6 - Enterprise Architecture

Full Papers

Paper Nr:	12
Title:	ERP Projects in Organizations with Low Maturity in BPM: A Collaborative Approach to Understanding Changes to Come
Authors:	Danilo Lima Dutra, Simone C. dos Santos and Flávia M. Santoro
Abstract:	ERP projects are constantly involved with profound changes in organizations that directly impact business processes and their stakeholders. Therefore, process understanding is an essential step in such projects. However, in companies with low maturity levels in BPM, the processes are informal and unstructured; there are no previous models or outdated. Thus, the primary source of information about processes is the people involved in their execution. But, how to engage ERP project stakeholders in the changes resulting from the new system by understanding current processes and future changes? This study proposes a framework called "Meet2Map" to answer this question. This approach promotes identifying, discovering, and modeling processes involving participants based on collaborative interactions and human-centered design principles. Through a case study, the results showed the adequacy of the Meet2Map framework in supporting the process analyst in As-Is process modeling as an essential step in the implantation of ERP systems.
Download

Paper Nr:	18
Title:	Modelling Advanced Technology Integration for Supply Chains
Authors:	Anna-Maria Nitsche and Wibke Kusturica
Abstract:	The fast-paced evolution of supply chains poses increasing challenges as networks have become more complex and dynamic. The intense interaction between information technology and business drives the spread of the physical internet as a supply chain paradigm. While some of the classic supply chain models provide approaches towards the integration of advanced technologies, few publications focus on a comparison or further development of these models. We strived to critically discuss existing supply chain models and to suggest an improved approach for modelling the digital supply chain. We applied the design science research methodology to systematically analyse and critically evaluate four selected supply chain modelling approaches. Based on a literature review and benefit analysis, we present an outlook on the potential future applicability and provide a roadmap for modelling advanced technology integration for supply chains. The comprehensive analysis highlights if and how selected supply chain models can remain relevant regarding the digitalisation of supply chains. Thus, this article informs researchers on future research opportunities and suggests a potential roadmap for practitioners.
Download

Paper Nr:	29
Title:	A Mapping Study about Digital Transformation of Organizational Culture and Business Models
Authors:	Eduardo C. Peixoto, Hector Paulo, César França and Geber Ramalho
Abstract:	According to some predictions, investments in Digital Transformation (DT) may reach US$ 6.8 trillion in 2023. Nevertheless, about 70% of DT initiatives struggle for success. In fact, few agree about what DT is, how a company becomes digital and how to measure its progress towards it. Organizational Culture and Business Models, however, are seen as key companies’ dimensions. Then, in this article, we report a literature mapping study on the characteristics of Organizational Culture and Business Model in the context of DT. We also investigate how these dimensions are assessed by Digital Maturity Models (DMM). Our data reveal that the most frequent cited Organizational Culture characteristic is Organizational Learning, and that Data and People are companies’ Business Models key resources in the context of DT. The selected studies did not provide enough information about how the characteristics of these two dimensions are evaluated by DMM. We concluded that companies’ dimensions characteristics have not been exhaustively explored and further studies on how to evaluate them in the context of DT are needed.
Download

Paper Nr:	48
Title:	Promoting Collaboration and Creativity in Process Improvement: A Proposal based on Design Thinking and Gamification
Authors:	Caroline Tavares Picanço and Simone C. dos Santos
Abstract:	Business Process Management (BPM) enables companies to track their end-to-end activities, establish objectives and drive their business processes to achieve better results, reduce errors, cut costs, and deliver customer value. On the other hand, for a BPM initiative to be successful, it is necessary to align the continuous improvement of business processes, available technologies, and, above all, the involvement of key stakeholders. Commonly associated with users, managers, process owners, and analysts, business process stakeholders are key players in effecting organizational change. However, they are not always involved in improving business processes, and when they are, not always contribute. In this context, the following question motivates this study: “How to promote collaboration and creativity among stakeholders in improving business processes?”. To answer this question, we propose a method based on the Design Thinking process, called Boomerang, including a creative game to involve people while generating insights and ideas. To evaluate this approach, we carried out a case study, collecting the perceptions of professionals engaged in improving a business process in a public education institution. From the results, it was possible to conclude that Boomerang contributes to the business processes improvement in BPM, maximizing the empathy, interaction, and creativity of its stakeholders.
Download

Paper Nr:	85
Title:	Constraint Formalization for Automated Assessment of Enterprise Models
Authors:	Stef Joosten, Ella Roubtsova and El Makki Haddouchi
Abstract:	Enterprises always do their business within some restrictions. In a team of enterprise architects, the restrictions are transformed into the modelling conventions and the corresponding modelling constraints that should be consistently applied across all enterprise models. This paper presents an approach for refining and formalizing modeling conventions into modelling constraints and using them for assessment of enterprise models by a software component called ArchiChecker. The specifics of the proposed approach is that the modeling conventions are first visualized and formalized using the types of elements and relationships of the ArchiMate modeling language, that is also used for modelling of enterprise views. The ArchiMate elements and relationships serve as types to formulate constraints. The elements and relationships in an ArchiMate model are instances of the ArchiMate elements and relationships. Using these types and instances the ArchiChecker automatically generates the lists of violations of modeling conventions in the enterprise models. Each violation shows how a specific enterprise view deviates from a given modeling convention. The paper reports a case study of application of the proposed approach to enterprise modelling views and modelling conventions used in a medical center. The case study is used to discuss the added value of formalization and automated assessment of modelling constraints in enterprise modelling.
Download

Paper Nr:	186
Title:	The Design of a Public Service Cost Model Tool to Evaluate Digital Transformation in Brazilian Government
Authors:	Rejane Maria Da Costa Figueiredo, John Lenon Cardoso Gardenghi, Letícia De Souza Santos, Rafael Arruda Furtado, Rafael Fazzolino Pinto Barbosa, Lucas Ursulino Boaventura, Augusto Samuel Clementino Modesto and Laura Barros Martins
Abstract:	Digital transformation plays a central role both in private and public sectors. In addition to increasing efficiency and practicality in service delivery, digitalization may generate economic savings for the agency provider and for the citizen user of the service. In this context, a central issue emerges: how to quantify economic savings in digitalization? The Brazilian federal government has been implementing many initiatives to promote digital government, and ways to measure the costs of a service are in constant development. One of these ways is through the use of a methodology adapted from the European standard cost model. The present paper aims to provide the design of a tool to estimate the costs of services both for the government and for the citizen, and for physical and digital delivering of a service, by applying a prototyping technique using the mentioned adapted methodology. We present the design and implementation of the tool. As future work, an analysis of the use of the tool in Brazil, how it impacts in the decision of digitalize a service and real economic savings may take place. Our main contribution is to provide a specification that can serve as a basis for similar tools, besides the idea of how to systematically apply a cost model using a tool which can be easily applied by other organizations without advanced knowledge in the area.
Download

Short Papers

Paper Nr:	10
Title:	An Enterprise Architecture-centred Approach towards Eco-Industrial Networking: A Case Study
Authors:	Ovidiu Noran and Aurelia Noran
Abstract:	Circular Economy is one of the main avenues to tackle the ever-increasing effects of what is becoming the most urgent challenge of our times: climate change. Previous work has advocated a multidisciplinary approach towards the optimal enactment of Circular Economy through Eco-Industrial Networking, due to its many-faceted and complex aspects. This paper aims to further the research in the area by examining the practical application of the concepts proposed within a case study and drawing conclusions on applicability, potential pitfalls and improvements, while at the same time advocating a more Enterprise Architecture (EA)centric stance due to its all-encompassing and integrating nature. Thus, a brief explanation of the theoretical background is followed by the description of the scenario and the proposed EA-focused concepts’ application in practice, including challenges and benefits of the chosen approach. Finally, a reflection is performed and conclusions are drawn together with suggestions for future applications and development of the method.
Download

Paper Nr:	11
Title:	IT Governance, Culture, and Individual Behavior
Authors:	Pedro Fernandes, Rúben Pereira and Guilherme Wiedenhoft
Abstract:	Information technology (IT) has become vital to organizations' success. For this reason, IT Governance (ITG) is necessary to control better solutions, sustainable development, and better decision-making. Since an organization's advantage lies in its employees' behavior, this study analyses the impact of ITG institutionalization on the key dimensions of Organizational Citizenship Behavior (OCB). OCB describes individuals' voluntary commitment to an organization. Plus, to moderate this relationship, Organizational Culture (OC) is recognized as an essential asset that affects OCB behaviors and attitudes and ITG performance. In summary, the findings of this study contribute to the development of a conceptual model that considers different OC types to explain how the institutionalization of ITG affects individuals' behavior.
Download

Paper Nr:	24
Title:	MARE: Semantic Supply Chain Disruption Management and Resilience Evaluation Framework
Authors:	Nour Ramzy, Sören Auer, Hans Ehm and Javad Chamanara
Abstract:	Supply Chains (SCs) are subject to disruptive events that potentially hinder the operational performance. Disruption Management Process (DMP) relies on the analysis of integrated heterogeneous data sources such as production scheduling, order management and logistics to evaluate the impact of disruptions on the SC. Existing approaches are limited as they address DMP process steps and corresponding data sources in a rather isolated manner which hurdles the systematic handling of a disruption originating anywhere in the SC. Thus, we propose MARE a semantic disruption management and resilience evaluation framework for integration of data sources included in all DMP steps, i.e. Monitor/Model, Assess, Recover and Evaluate. MARE, leverages semantic technologies i.e. ontologies, knowledge graphs and SPARQL queries to model and reproduce SC behavior under disruptive scenarios. Also, MARE includes an evaluation framework to examine the restoration performance of a SC applying various recovery strategies. Semantic SC DMP, put forward by MARE, allows stakeholders to potentially identify the measures to enhance SC integration, increase the resilience of supply networks and ultimately facilitate digitalization.
Download

Paper Nr:	36
Title:	Reference Architectures Facilitating a Retailer’s Dual Role on Digital Marketplaces: A Literature Review
Authors:	Tobias Wulfert and Jan Busch
Abstract:	Electronic commerce and digital marketplaces (DMs) have proven to be successful business models compared with traditional brick-and-mortar retailing. Online sales and the simultaneous orchestration of participants from independent market sides on DMs (dual role) pose additional requirements for information systems. Reference architectures (RAs) can be used as blueprints for the implementation of information systems for DMs supporting a retailer’s dual role. However, RAs in retail were mostly developed for brick-and-mortar environments. The peculiarities of electronic commerce and DMs require adaptations and enhancements. Thus, we conduct a literature review following vom Brocke et al. (2009) involving 1,357 research papers to identify RAs supporting a retailer’s dual role on DMs. We identified seven DM-specific architecture requirements and analyzed RAs identified according to Angelov et al. (2012). Our analysis revealed 13 RAs with only limited support for a retailer’s dual role on DMs.
Download

Paper Nr:	47
Title:	Assessing Business Architecture Readiness in Organisations
Authors:	Tiko Iyamu and Irja Shaanika
Abstract:	Business architecture lags because no theoretical framework or model have yet been validated or tested. This study empirically tests a business architecture model that was developed to assess readiness of environment. It is interpretivist research in which the case study approach was employed. Qualitative data was collected through the semi-structured technique. Actor-network theory (ANT) was employed to interpret the outcome from testing the readiness assessment model. The findings suggest that the model solidifies foundation for the deployment of EBA and bring benefits to managers and architects. The result is intended to boost the confidence of promoters and organisations in the concept and possibly increase implementation and practice. This research empirically tested a business architecture readiness assessment model in five South African public and private organisations. The test draws on four main variables: readiness usefulness; value add; design and automation; and ease of use. The variables purportedly help to detect technical and non-technical factors that can derail the implementation or practice of business architecture in an organisation.
Download

Paper Nr:	102
Title:	Enterprise Maps: Zooming in and out of Enterprise Models
Authors:	Maja Spahic-Bogdanovic and Knut Hinkelmann
Abstract:	A company’s architecture can be represented by domain-specific models, which are defined by domain-specific modeling language. Since not all stakeholders are interested in the same models, dedicated views can be created to support navigation through the enterprise models. These views offer a snippet of the entire company and cover stakeholder-specific concerns. The relationships between the different views and models remain hidden and can be unveiled with much effort. The developed concept of the zoomability principle offers the ability to change the degree of detail using zoom in and out of the enterprise model. The different models and modeling languages used to express an enterprise are considered, and a form of navigation is established similar to an online map. The concept is based on two pillars, ”Zoom Within” and ”Zoom into Complements”. For this purpose, a metamodel was developed, which formalizes the elements used in the concept and their relationships. Developing the artifact, rules were defined that contribute to a generic approach allowing an application to another case. Furthermore, a prototype was developed, representing the zoomability principle and offering the possibility to perform zooming behavior. The artifact was evaluated through a demonstration. An additional prototype was created to demonstrate that the developed concept can be applied to a predefined set of situations.
Download

Paper Nr:	114
Title:	How LIME Explanation Models Can Be Used to Extend Business Process Models by Relevant Process Details
Authors:	Myriel Fichtner, Stefan Schönig and Stefan Jablonski
Abstract:	Business process modeling is an established method to describe workflows in enterprises. The resulting models contain tasks that are executed by process participants. If the descriptions of such tasks are too abstract or do not contain all relevant details of a business process, deviating process executions may be observed. This leads to reduced process success regarding different criteria, e.g., product quality. Existing improvement approaches are not able to identify missing details in process models that have an impact on the overall process success. In this work, we present an approach to extract relevant process details from image data. Deep learning techniques are used to predict the success of process executions. We use LIME explanation models to extract relevant features and values that are related to positive process predictions. We show how a general conclusion of these explanations can be derived by applying further image mining techniques. We extensively evaluate our approach by experiments and demonstrate the extension of an existing process model by identified details.
Download

Paper Nr:	150
Title:	Metrics of Parallel Complexity of Operational Business Processes
Authors:	Andrea Chiorrini, Claudia Diamantini, Alex Mircoli and Domenico Potena
Abstract:	This paper addresses the problem of quantifying the parallelism in a business process. Having a synthetic metric to quantify the parallelism of a process may provide an assessment of the complexity of the process and guide certain design choice. In the present paper we discuss the advantages and disadvantages of two metrics presented in the literature, as well of two novel metrics that leverage on the notion of Instance Graph. Analysis is performed by means of use cases that are representative of operational business processes. The proposed metrics show to provide a sensible way to evaluate the overall parallel complexity of a process model.
Download

Paper Nr:	151
Title:	Enterprise Architecture to Identify the Benefits of Enterprise Building Information Model Data: An Example from Healthcare Operations
Authors:	Sobah Abbas Petersen and Tor Åsmund Evjen
Abstract:	This paper explores the concept of Enterprise Building Information Models in a hospital and how they could be used in combination with Enterprise Architecture to support innovation in healthcare operations. The motivation for this work has been to improve the use of easily available data for creating new value-added services that could make an enterprise more flexible and agile. Enterprise Architecture has been used as the approach for structuring and visualising how the data in the Building Information Models can be utilised in combination with other data and applications in a hospital to identify potential new services for the benefit of multiple stakeholders. In this paper, we have considered Enterprise Building Information Models as analogous to the concept of data exchanges identified in some Enterprise Architecture frameworks and use Enterprise Architecture models to describe a business case based on a dynamic scheduling algorithm for cleaning. The main contribution of this work is an Enterprise Architecture model of the hospital, which relates data in Building Information Models to strategic and operational processes.
Download

Paper Nr:	173
Title:	Content-based Filtering for Worklist Reordering to improve User Satisfaction: A Position Paper
Authors:	Sebastian Petter, Myriel Fichtner, Stefan Schönig and Stefan Jablonski
Abstract:	Business Process Management (BPM) is an approach to optimize business processes regarding certain company goals, e.g., the duration or quality of process outcome. Human resources are essential to business processes and are often neglected during process optimization. Considering domains focusing on users, it can be observed that recommender systems are often used to support user decisions and increase user satisfaction. This inspired us to use recommendation techniques in the context of BPM. Employee satisfaction significantly influences productivity, while employees are more satisfied when their preferences are taken into account during process execution. In this work, we propose to adopt the concept of content-based filtering to recommend worklist items to process participants they probably prefer. Since this work is part of a research project, we illustrate our approach on a simplified real-world business process from one of our application partners.
Download

Paper Nr:	17
Title:	Expediting Omni-channel Retailing at C&A Brazil: A Timely Response to the COVID19 Pandemic Powered by RFID
Authors:	Rebecca Angeles
Abstract:	This case study features the experiences of C&A Brazil, in deploying radio frequency identification (RFID) initiative to lay the foundation for omni-channel retailing capabilities as it meets the retailing requirements of the marketplace during the Covid19 pandemic. The exploration of C&A Brazil is successful beyond even the firm’s initial expectations. The significant benefits experienced by C&A Brazil after rolling out RFID in a number of stores to manage its inventory systems more efficiently include: (1) reduction in inventory inaccuracy from 20 percent down to 3 percent; (2) cutting back on customer order cancellations from 10+ percent to less than 3 percent; (3) sales growth in RFID-enabled stores reported 80 to 100 percent increase compared to stores without RFID; (4) omni-channel business operations were conducted two times faster in RFID stores versus non-RFID stores; and (5) C&A Brazil expects to break even from investments made in its RFID projects by 2022. This case study uses the qualitative research method of content analysis and the “structurational model of technology” as its theoretical framework in understanding the firm’s experiences.
Download

Paper Nr:	134
Title:	Towards Holistic Enterprise Modelling: Value Creation Concept in Fractal Enterprise Model (FEM)
Authors:	Victoria Klyukina
Abstract:	Some researchers argue that traditional applications of enterprise modelling (EM) may provide limited value when performing holistic analysis due to disjoint modelling domains that comprise organizations. Fractal Enterprise Modelling (FEM) is a promising EM approach addressing this issue. FEM uses a modelling technic that articulates an organisational fractal structure of an enterprise, and has been used for representing different practical challenges in organisations. This paper is a part of an ongoing research where FEM is used for a holistic analysis of organizational change associated with a strategy implementation in an organization. Particularly, the paper discusses the application of a previously emerged modelling pattern that was useful for supporting operational decision making. In this paper, it is argued that the same pattern is also useful for representation of a value chain concept that allows to connect a high organisational level to the elements of a lower operational level. The results imply that the usage of this pattern might be beneficial to promote systematic and holistic modelling for business analysis using FEM.
Download

Paper Nr:	138
Title:	A Tool for Developing and Evaluating Methods (TDEM)
Authors:	Murad Huseynli, Michael Chima Ogbuachi and Udo Bub
Abstract:	Digital Innovation (DI) is the use of digital technologies during the process of innovation or as the result of innovation. While we see a huge amount of interest in DI research from IS scholars, there is still scarcity in terms of using Design Science Research (DSR) for the engineering of DI, specifically from a method perspective and a possible integration of both fields. In this paper, we propose a tool for developing and evaluating methods to build method design theories, not just in the DI field, but also generally applicable to method engineering in other fields as well. Thus, we contribute systematically to the body of knowledge for information systems with a tool that shall be an effective assistant in method engineering. The tool builds on the eight components of a design theory first introduced by (Gregor and Jones, 2007) and conceptualized by (Offermann et al., 2010a) for the artifact type “method”. Finally, we analyze the role and utility of the tool in detail, concentrating on a method for the engineering of DI, in a DI project related to the microservices architecture, and lastly, we show generalizability of the tool in terms of design process evaluation, not specifically following a DSR paradigm.
Download

Paper Nr:	143
Title:	A New Approach for a Dynamic Enterprise Architecture Model using Ontology and Case-based Reasoning
Authors:	Imane Ettahiri, Karim Doumi and Noureddine Falih
Abstract:	To meet the demands of a dynamic and constantly changing environment, (DRP) Disaster Recovery Plans, (BCP) Business Continuity Plans, change management, agile activities, and best practice guides are developed with the ultimate objective of providing enterprises with the tools to deal with change rapidly and flexibly. Starting from the premise that Enterprise architecture remains the instrument ensuring this alignment Strategy//business//IT, dynamic aspects should be present in the EA representation but also should be perceived in the reaction of enterprises managing the change. On the other hand, ontologies offer a formal and a shared representation of the domain studied; EA in our case. Once formalized, the representation became computable so, all the EA reactions became dynamic towards the triggers of change. To benefit from the previous experiences, Case-based reasoning is introduced in our approach allowing a problem resolution via similarity and adaptation of knowledge to the current context.
Download

Paper Nr:	159
Title:	From BPMN Model to Design Sequence Diagrams
Authors:	Wiem Khlif, Samar Daoudi and Nadia Bouassida
Abstract:	Today’s enterprises, independently of their size, depend on the successful development of an automated Information System (IS). This moves them to the software development world. The success of this move is often hindered by the difficulty in collecting the IS knowledge to produce a software that is aligned with the business logic of the enterprise. For enterprise systems, this transformation must consider the enterprise context where the system will be deployed. However, the complexity of today's Business Process (BP)-IS alignment impedes their maintaining when the enterprise develops a new IS or changes its IS. The problem is expressed from the dissimilarities in the knowledge of the information system developers and the business process experts. To face these difficulties, the current paper presents a methodology to derive design sequence diagrams. Our methodology is based on a set of rules that transform a business process model into design sequence diagrams. Its originality resides in the Computation Independent Model (CIM) to Platform Independent Model (PIM) transformations which account for the BP structural and semantic perspectives in order to generate an aligned IS model.
Download

Paper Nr:	176
Title:	Configuration Settings on Nokia 7750 SR Routers to Optimize Network Performance
Authors:	Indrit Enesi and Anduel Kuqi
Abstract:	One of the main advantages of the Nokia 7750 SR-7 / SR-12 router type is the fact that it meets the most advanced innovations of the time, enabling a network of very high performance (high capacity and speed). But even in these types of routers there are configurations which must be avoided in order to obtain exactly the optimal performance. In this paper we will treat two configurations for 7750 SR-7 which led to a problem where there was not enough capacity in the LAG and we will replace the configurations in order to have optimal parameters.
Download

Paper Nr:	187
Title:	Lifting Existing Applications to the Cloud: Abstractions, Separation of Responsibilities and Tooling Support
Authors:	He Huang, Zhicheng Zeng and Tu Ouyang
Abstract:	The benefits from running applications on the cloud: easy to scale up, low cost from competition of many cloud vendors, and many others, are compelling reasons for more and more applications being developed and/or deployed against the cloud environment. This trend promptes application developers to rethink the structure of their applications, the runtime assumption of the applications, and what are the appropriate input/output abstractions. New generation of applications can be built from the scratch with the recent development of the cloud-native primitives (clo, nd). However there are many existing applications which were previously developed against single-machine environment. Some of them now are needed to be lift to the cloud so as to enjoy the benefits from cloud environment such as computation elasticity. What does a principled process look like to lift such applicaitons to the clcoud? In this paper, we present what we have learnt from helping our customers to lift their existing applications to cloud. We identify the key challenges from common questions being asked in practice, and we present our proposed methodologicl framework, to partially address these key challenges. The solution is comprised of various methods identifing right abstractions for cloud resources, separating the responsibilities between application developer and cloud DevOps, and how to leverage tooling to streamline the whole process. We try to design our methods as much cloud-vendor agnostic as possible. We use the lifting process of one application, a web crawler from our practice, to exemplify various aspects of the proposed methodological framewor.
Download