Process mining is a technique for extracting process models
from event logs. Event logs contain abundant information about each
event, such as the timestamp of the event and the actions that trigger
it. Much of the existing process mining research has focused
on discovering the process models behind event logs. How to uncover the
timing constraints from event logs that are associated with the discovered
process models has not been well studied. In this paper, we present an approach
that extends existing process mining techniques to not only mine but
also integrate timing constraints with process models discovered and
constructed by existing process mining algorithms. The approach contains
three major steps. First, for a given process model constructed by
an existing process mining algorithm and represented as a workflow net,
extract a time dependent set for each transition in the workflow net model.
Second, based on the time dependent sets, develop an algorithm to extract
timing constraints from event logs for each transition in the model. Third,
extend the original workflow net into a time Petri net in which the discovered
timing constraints are associated with their corresponding transitions. A
real-life road traffic fine management process scenario is used as a case
study to show how timing constraints in the fine management process
can be discovered from event logs with our approach.
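The second step can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not define the time dependent set, so the sketch assumes it is the set of activities whose most recent occurrence enables a transition, and it records the smallest and largest observed delay as a firing interval of the kind a time Petri net attaches to a transition. The function name `mine_intervals`, the `depends_on` mapping, and the activity names are all hypothetical.

```python
from collections import defaultdict


def mine_intervals(log, depends_on):
    """Estimate a [min, max] delay interval for each activity.

    log        -- list of traces; each trace is a list of
                  (activity, timestamp) pairs sorted by timestamp
    depends_on -- assumed "time dependent set": maps an activity to the
                  activities whose occurrence enables it
    """
    delays = defaultdict(list)
    for trace in log:
        last_seen = {}  # most recent timestamp of each activity in this trace
        for activity, ts in trace:
            enablers = [last_seen[a]
                        for a in depends_on.get(activity, ())
                        if a in last_seen]
            if enablers:
                # Assumption: the transition becomes enabled when the
                # latest of its enabling activities has occurred.
                delays[activity].append(ts - max(enablers))
            last_seen[activity] = ts
    return {a: (min(d), max(d)) for a, d in delays.items()}


# Toy log with hypothetical activity names and integer timestamps.
log = [
    [("create_fine", 0), ("send_fine", 5), ("payment", 12)],
    [("create_fine", 0), ("send_fine", 3), ("payment", 20)],
]
depends_on = {"send_fine": {"create_fine"}, "payment": {"send_fine"}}
intervals = mine_intervals(log, depends_on)
```

Each resulting pair could then be attached to the matching transition of the workflow net as its earliest and latest firing time, which corresponds to the third step.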
Using Event Log Timing Information to Assist Process Scenario Discoveries
Event logs contain abundant information, such as activity
names, time stamps, activity executors, etc. However, much of the existing
trace clustering research has focused on using activity names
to assist process scenario discovery. In addition, many trace
clustering algorithms commonly used in the literature, such as the k-means
clustering approach, require prior knowledge of the number of
process scenarios existing in the log, which is sometimes not known
a priori. This paper presents a two-phase approach that obtains timing
information from event logs and uses the information to assist process
scenario discoveries without requiring any prior knowledge about process
scenarios. We use five real-life event logs to compare the performance of
the proposed two-phase approach for process scenario discoveries with
the commonly used k-means clustering approach in terms of the
harmonic mean of the models' weighted average fitness and precision, i.e., the
F1 score. The experimental data show that (1) the process scenario
models obtained with the additional timing information have both higher
fitness and precision scores than the models obtained without the timing
information; (2) the two-phase approach not only removes the need for
prior information related to k, but also achieves an F1 score comparable
to that of the k-means approach with the optimal k obtained
through exhaustive search.
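The evaluation metric can be made concrete with a short Python sketch. It assumes the per-scenario fitness and precision scores are weighted by the number of traces in each scenario; the abstract does not state the weights, and the helper names and numbers below are illustrative only.

```python
def weighted_average(scores, weights):
    """Average of scores weighted by, e.g., traces per scenario (assumed)."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def f1_score(fitness, precision):
    """Harmonic mean of fitness and precision."""
    if fitness + precision == 0:
        return 0.0
    return 2 * fitness * precision / (fitness + precision)


# Two hypothetical scenario models covering 60 and 40 traces.
trace_counts = [60, 40]
fitness = weighted_average([0.9, 0.8], trace_counts)    # about 0.86
precision = weighted_average([0.7, 0.9], trace_counts)  # about 0.78
f1 = f1_score(fitness, precision)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a clustering scores well only when its scenario models are both fitting and precise.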
- PAR ID: 10311280
- Date Published:
- Journal Name: 2020 IEEE Third International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Process Mining is a technique for extracting process models from event logs. Event logs contain abundant explicit information related to events, such as the timestamp and the actions that trigger the event. Much of the existing process mining research has focused on discovering the process models behind these event logs. However, Process Mining relies on the assumption that these event logs contain accurate representations of an ideal set of processes. These ideal sets of processes imply that the information contained within the log represents what is really happening in a given environment. However, many of these event logs might contain noisy, infrequent, missing, or false process information that is generally classified as outliers. Extending beyond process discovery, there are many research efforts towards cleaning the event logs to deal with these outliers. In this paper, we present an approach that uses hidden Markov models to filter out outliers from event logs prior to applying any process discovery algorithms. Our proposed filtering approach can detect outlier behavior, and consequently, help process discovery algorithms return models that better reflect the real processes within an organization. Furthermore, we show that this filtering method outperforms two commonly used filtering approaches, namely the Matrix Filter approach and the Anomaly Free Automation approach for both artificial event logs and real-life event logs.
-
Villazón-Terrazas, B. (Ed.) Given the ubiquity of unstructured biomedical data, significant obstacles still remain in achieving accurate and fast access to online biomedical content. Accompanying the growing volume of biomedical content on the internet with semantic annotations is critical to enhancing search engines’ context-aware indexing and improving search speed and retrieval accuracy. We propose a novel methodology for annotation recommendation in the biomedical content authoring environment by introducing a socio-technical approach in which users can get recommendations from each other for accurate and high-quality semantic annotations. To evaluate the proposed socio-technical approach, we performed experiments recording system-level performance with and without socio-technical features in three scenarios of different context. At the system level, we achieved 89.98% precision, 89.61% recall, and an 89.45% F1-score for semantic annotation recollection. Similarly, the socio-technical approach achieves a high accuracy of 90%, compared with 73% without it. Scenario-1 and scenario-2 achieve nearly equal precision, recall, and F1-score of 90%, whereas scenario-3 achieves a slightly lower precision, recall, and F1-score of 88%. We conclude that our proposed socio-technical approach produces proficient annotation recommendations that could be helpful for various uses ranging from context-aware indexing to retrieval accuracy.
-
Deep neural network clustering is superior to conventional clustering methods due to deep feature extraction and nonlinear dimensionality reduction. Nevertheless, a deep neural network yields only a rough representation of the inherent relationships among the data points. Therefore, it is still difficult for a deep neural network to exploit an effective structure for direct clustering. To address this issue, we propose a robust embedded deep K-means clustering (RED-KC) method. The proposed RED-KC approach utilizes the δ-norm metric to constrain the feature mapping process of the auto-encoder network, so that data are mapped to a latent feature space that is more conducive to robust clustering. Compared to existing auto-encoder networks with a fixed prior, the proposed RED-KC is adaptive during the process of feature mapping. More importantly, the proposed RED-KC embeds the clustering process in the auto-encoder network, such that deep feature extraction and clustering can be performed simultaneously. Accordingly, a direct and efficient clustering can be obtained in a single step, avoiding the drawbacks of multiple separate stages, namely, the loss of pivotal information and correlation. Extensive experiments are provided to validate the effectiveness of the proposed approach.
-
Radio channel propagation models for the millimeter wave (mmWave) spectrum are extremely important for planning future 5G wireless communication systems. Transmitted radio signals are received as clusters of multipath rays. Identifying these clusters provides better spatial and temporal characteristics of the mmWave channel. This paper deals with the clustering process and its validation across a wide range of frequencies in the mmWave spectrum below 100 GHz. By way of simulations, we show that in outdoor communication scenarios clustering of received rays is influenced by the frequency of the transmitted signal. This demonstrates the sparse characteristic of the mmWave spectrum (i.e., we obtain a lower number of rays at the receiver for the same urban scenario). We use the well-known k-means clustering algorithm to group arriving rays at the receiver. The accuracy of this partitioning is studied with both cluster validity indices (CVIs) and score fusion techniques. Finally, we analyze how the clustering solution changes with narrower-beam antennas, and we provide a comparison of the cluster characteristics for different types of antennas.