Abstract
Frequent episode discovery is a popular framework for temporal pattern discovery in event streams. An episode is a partially ordered set of nodes, with each node associated with an event type. Existing algorithms discover episodes only when the associated partial order is a total order (serial episode) or trivial (parallel episode). In this paper, we propose efficient algorithms for discovering frequent episodes with unrestricted partial orders when the associated event types are unique. These algorithms can be easily specialized to discover only serial or parallel episodes. They are also flexible enough to be specialized for mining in the space of certain interesting subclasses of partial orders. We point out that frequency alone is not a sufficient measure of interestingness in the context of partial order mining. We propose a new interestingness measure for episodes with unrestricted partial orders which, when used along with frequency, results in an efficient scheme of data mining. Simulations are presented to demonstrate the effectiveness of our algorithms.
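To make the notion of an episode with unique event types concrete, the following is a minimal illustrative sketch and not the algorithm proposed in the paper. It assumes an event stream given as (event type, time) pairs and an episode given as a set of distinct event types plus a set of (before, after) precedence pairs. The function name `count_non_overlapped` and the greedy left-to-right tracking strategy are assumptions made for illustration only. An empty precedence set corresponds to a parallel episode, and a chain of pairs corresponds to a serial episode.

```python
from typing import Dict, Iterable, Set, Tuple

Event = Tuple[str, float]  # (event type, occurrence time)


def count_non_overlapped(
    stream: Iterable[Event],
    event_types: Set[str],
    order: Set[Tuple[str, str]],
) -> int:
    """Greedily count non-overlapped occurrences of an episode.

    Hypothetical helper for illustration: `order` holds pairs (a, b)
    meaning "a must occur strictly before b". Every type named in
    `order` is assumed to be a member of `event_types`.
    """
    # Direct predecessors of each event type in the partial order.
    preds: Dict[str, Set[str]] = {e: set() for e in event_types}
    for before, after in order:
        preds[after].add(before)

    seen: Dict[str, float] = {}  # types accepted so far -> acceptance time
    count = 0
    for etype, t in stream:
        # Ignore types outside the episode and types already accepted.
        if etype not in preds or etype in seen:
            continue
        # Accept only if every predecessor was accepted strictly earlier.
        if all(p in seen and seen[p] < t for p in preds[etype]):
            seen[etype] = t
            if len(seen) == len(preds):
                # One full occurrence found; start tracking the next one.
                count += 1
                seen = {}
    return count


stream = [("A", 1), ("B", 2), ("C", 3), ("B", 4), ("A", 5), ("C", 6), ("C", 7)]
types = {"A", "B", "C"}

general = count_non_overlapped(stream, types, {("A", "C"), ("B", "C")})
serial = count_non_overlapped(stream, types, {("A", "B"), ("B", "C")})
parallel = count_non_overlapped(stream, types, set())
```

In this toy stream, the partial order "A and B in either order, both before C" is matched by more occurrences than the serial order A, B, C, which hints at why restricting attention to serial or parallel episodes can miss structure in the data.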
Responsible editor: Eamonn Keogh.
About this article
Cite this article
Achar, A., Laxman, S., Viswanathan, R. et al. Discovering injective episodes with general partial orders. Data Min Knowl Disc 25, 67–108 (2012). https://doi.org/10.1007/s10618-011-0233-y
DOI: https://doi.org/10.1007/s10618-011-0233-y