Abstract
The quest for a real-time suffix tree construction algorithm is over three decades old. To date there is no convincing, understandable solution to this problem. This paper takes a step in this direction by constructing a suffix tree online in O(log n) time per input symbol. Clearly, it is impossible to achieve better than O(log n) time per symbol in the comparison model; therefore no true real-time algorithm can exist for infinite alphabets. The best that can be hoped for is thus that the construction time for every symbol does not exceed O(log n) (as opposed to the amortized O(log n) time per symbol achieved by currently known algorithms). To our knowledge, our algorithm is the first that spends O(log n) time in the worst case on every single input symbol.
We also provide a simple algorithm that constructs an indexing structure (the BIS) online in O(log n) time per input symbol, where n is the number of text symbols input thus far. This structure, together with fast LCP (Longest Common Prefix) queries on it, provides the backbone for the suffix tree construction. Together, our two data structures provide a searching algorithm for a pattern of length m whose running time is \(O(\min(m \log |\Sigma|, m + \log n) + tocc)\), where tocc is the number of occurrences of the pattern.
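To make the idea of an online index with pattern search concrete, the following is a minimal Python sketch. It is not the BIS and does not reach the bounds stated in the abstract: it keeps the suffixes of the text in sorted order, inserts one new suffix per input symbol, and answers a query with two binary searches. The class name `NaiveOnlineSuffixIndex` and its methods are hypothetical. The sketch also assumes that symbols are prepended to the text, so that each step adds exactly one suffix and leaves the existing ones unchanged; that is a simplification made here, not something stated in the abstract.

```python
class NaiveOnlineSuffixIndex:
    """Illustrative stand-in for an online suffix index (hypothetical, not the BIS)."""

    def __init__(self):
        self.text = ""    # text read so far; it grows at the front (assumption)
        self.order = []   # suffix lengths, sorted by the suffix each one denotes

    def _suffix(self, length):
        # The suffix of the given length is the last `length` symbols of the text.
        return self.text[len(self.text) - length:]

    def prepend(self, symbol):
        # Existing suffixes are unchanged; the whole text becomes the one new suffix.
        self.text = symbol + self.text
        new = self.text
        lo, hi = 0, len(self.order)
        while lo < hi:  # binary search for the insertion point
            mid = (lo + hi) // 2
            if self._suffix(self.order[mid]) < new:
                lo = mid + 1
            else:
                hi = mid
        self.order.insert(lo, len(self.text))

    def _bound(self, pattern, strict):
        # First index whose length-m prefix is >= pattern (or > pattern if strict).
        m = len(pattern)
        lo, hi = 0, len(self.order)
        while lo < hi:
            mid = (lo + hi) // 2
            head = self._suffix(self.order[mid])[:m]
            if head < pattern or (strict and head == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    def occurrences(self, pattern):
        # Suffixes that start with the pattern form one contiguous sorted range.
        first = self._bound(pattern, strict=False)
        last = self._bound(pattern, strict=True)
        n = len(self.text)
        return sorted(n - length for length in self.order[first:last])


index = NaiveOnlineSuffixIndex()
for symbol in reversed("banana"):  # feed the text from its last symbol to its first
    index.prepend(symbol)
print(index.occurrences("ana"))  # prints [1, 3]
print(index.occurrences("x"))    # prints []
```

In this sketch every comparison may scan a whole suffix, so both insertion and search cost far more than O(log n) per step. Replacing those full string comparisons with fast LCP queries on a balanced structure is the kind of improvement the abstract attributes to the BIS.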
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Amir, A., Kopelowitz, T., Lewenstein, M., Lewenstein, N. (2005). Towards Real-Time Suffix Tree Construction. In: Consens, M., Navarro, G. (eds) String Processing and Information Retrieval. SPIRE 2005. Lecture Notes in Computer Science, vol 3772. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11575832_9
DOI: https://doi.org/10.1007/11575832_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29740-6
Online ISBN: 978-3-540-32241-2
eBook Packages: Computer Science (R0)