Abstract: |
Social media sites have emerged in cyberspace during the last 5-7 years and have attracted hundreds of millions of users. The sites are often viewed as instances of Web 2.0 technologies and support easy uploading and downloading of user-generated content. This content contains valuable real-time information about the state of affairs in various parts of the world, and it is often public or at least semi-public. Many governments, businesses, and individuals are interested in this information for various reasons. In this paper we describe how ontologies can be used in constructing monitoring software that extracts useful information from social media sites and stores it over time for further analysis. Ontologies can serve in at least two roles in this context. First, the crawler accessing a site must know the “native ontology” of that site in order to parse the pages it returns, extract the relevant information (such as the friends of a user), and store it in the persistent generic (graph) model instance at the monitoring site. Second, ontologies can be used in data analysis to capture and filter the collected data in order to find information and phenomena of interest, for example through influence analysis and grouping of users. In this paper we mainly discuss the construction of the ontology-guided crawler. |