A Time Series Clique (TSC) consists of multiple time series that are related to each other by natural relations. Which natural relations hold between the time series depends on the application domain. For example, a TSC can consist of time series that are trajectories in video and have spatial relations. In conventional time […]
Archives for March 2012
A Framework for Learning Comprehensible Theories in XML Document Classification
XML has become the universal data format for a wide variety of information systems. The large number of XML documents existing on the web and in other information storage systems makes classification an important task. As a typical type of semistructured data, XML documents have both structures and contents. Traditional text learning techniques are not […]
Intertemporal Discount Factors as a Measure of Trustworthiness in Electronic Commerce
In multiagent interactions, such as e-commerce and file sharing, being able to accurately assess the trustworthiness of others is important for agents to protect themselves from losing utility. Focusing on rational agents in e-commerce, we prove that an agent’s discount factor (time preference of utility) is a direct measure of the agent’s trustworthiness for a […]
pCloud: A Distributed System for Practical PIR
Computational Private Information Retrieval (cPIR) protocols allow a client to retrieve one bit from a database, without the server inferring any information about the queried bit. These protocols are too costly in practice because they invoke complex arithmetic operations for every bit of the database. In this paper, we present pCloud, a distributed system that […]
Topic Mining over Asynchronous Text Sequences
Time-stamped texts, or text sequences, are ubiquitous in real-world applications. Multiple text sequences are often related to each other by sharing common topics. The correlation among these sequences provides more meaningful and comprehensive clues for topic mining than those from each individual sequence. However, it is nontrivial to explore the correlation with the existence […]
Identifying Evolving Groups in Dynamic Multimode Networks
A multimode network consists of heterogeneous types of actors with various interactions occurring between them. Identifying communities in a multimode network can help understand the structural properties of the network, address data shortage and imbalance problems, and assist tasks like targeted marketing and finding influential actors within or between groups. In general, a network […]
Effective Pattern Discovery for Text Mining
Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopt term-based approaches, they all suffer from the problems of polysemy and synonymy. […]
BibPro: A Citation Parser Based on Sequence Alignment
The dramatic increase in the number of academic publications has led to growing demand for efficient organization of these resources to meet researchers’ needs. As a result, a number of network services have compiled databases from the public resources scattered over the Internet. However, publications by different conferences and journals adopt different citation styles. It is […]
Falcons Concept Search: A Practical Search Engine for Web Ontologies
Web ontologies provide shared concepts for describing domain entities and thus enable semantic interoperability between applications. To facilitate concept sharing and ontology reuse, we developed Falcons Concept Search, a novel keyword-based ontology search engine. In this paper, we illustrate how the proposed mode of interaction helps users quickly find ontologies that satisfy their needs and […]
Predictable High-Performance Computing Using Feedback Control and Admission Control
Historically, batch scheduling has dominated the management of High-Performance Computing (HPC) resources. One of the most significant limitations of this approach is an inability to predict both the start time and end time of jobs. Although existing research, such as resource reservation and queue-time prediction, partially addresses this issue, a more predictable HPC system is […]