Algorithms on Strings
The area of algorithms on strings deals with one of the simplest and most fundamental data structures: a string (or sequence) of characters. Classical algorithmic problems on strings include exact pattern matching, string comparison, and approximate pattern matching. Algorithms on strings have numerous applications in different areas of science and technology, from simple text editors to sophisticated data mining and telecommunications. A particularly rich source of applications is computational molecular biology, where the role of strings is played by nucleotide sequences in a DNA molecule.
DIMAP members have contributed to research on string algorithms for several decades. These contributions include the fastest known algorithm for computing the edit distance between a pair of strings; several improvements to the classical Boyer--Moore exact pattern matching algorithm; efficient approximation of shortest common superstrings; efficient computation of string periods; efficient parallel algorithms for exact and approximate pattern matching problems in a number of parallel computation models.
The current research of DIMAP in the area of string algorithms is concerned with designing efficient algorithms for detailed string comparison and approximate pattern matching. Parts of this research are carried out in collaboration with Warwick Systems Biology Centre. We also investigate the communication requirements of parallel string comparison algorithms, and have recently designed the first parallel string comparison algorithm with scalable communication.
Sample publications:
- A. Tiskin. Fast distance multiplication of unit-Monge matrices. In Proceedings of the 21st ACM-SIAM Symposium on Discrete Algorithms (SODA'10), pages 1287 - 1296, Austin, Texas, January 17 - 19, 2010.
- P. Krusche and A. Tiskin. Longest increasing subsequences in scalable time and memory. In Proceedings of PPAM, 2009, to appear.
- P. Krusche and A. Tiskin. Computing alignment plots efficiently. In Proceedings of ParCo, 2009, to appear.
- A. Tiskin. Periodic string comparison. In Proceedings of CPM, vol. 5577 of Lecture Notes in Computer Science, pp. 193—-206, 2009.
- P. Krusche and A. Tiskin. String comparison by transposition networks. In Proceedings of London Algorithmics Workshop, 2008. Appeared in London Algorithmics 2008: Theory and Practice, vol. 11 of Texts in Algorithmics, College Publications, pp. 184—204, 2009.
- A. Tiskin. Semi-local longest common subsequences in subquadratic time. Journal of Discrete Algorithms, 6(4): 570 - 581, December 2008.
- A. Tiskin. Semi-local string comparison: Algorithmic techniques and applications. Mathematics in Computer Science, 1(4), pp. 571 - 603, June 2008.
- P. Krusche and A. Tiskin. Efficient parallel string comparison. In Proceedings of ParCo, vol. 38 of NIC Series, John von Neumann Institute for Computing, pp. 193 - 200, 2007.
- A. Tiskin. Longest common subsequences in permutations and maximum cliques in circle graphs. In Proceedings of the 17th Annual Symposium on Combinatorial Pattern Matching (CPM'06), pages 271 - 282, 2006.
- A. Czumaj and L. Gasieniec. On the Complexity of Determining the Period of a String. In Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching (CPM '00), pages 412 - 422, 2000.
- M. Crochemore, A. Czumaj, L. Gasieniec, T. Lecroq, W. Plandowski, and W. Rytter. Fast Practical Multi-Pattern Matching. Information Processing Letters, 71(3-4): 107 - 113, 1999.
- A. Czumaj, L. Gasieniec, M. Piotrow, and W. Rytter. Sequential and Parallel Approximation of Shortest Superstrings. Journal of Algorithms, 23(1): 74 - 100, 1997.
- A. Czumaj, P. Ferragina, L. Gasieniec, S. Muthukrishnan, and J. L. Träff. The Architecture of a Software Library for String Processing. In Proceedings of the 1st Workshop on Algorithm Engineering, Venice, Italy, September 11 - 13, 1997.
- R. Cole, R. Hariharan, M. Paterson, and U. Zwick. Tighter Lower Bounds on the Exact Complexity of String Matching. SIAM Journal on Computing, 24(1): 30 - 45, 1995.
- A. Czumaj, Z. Galil, L. Gasieniec, K. Park, and W. Plandowski. Work-Time-Optimal Parallel Algorithms for String Problems. In Proceedings of the 27th ACM Symposium on Theory of Computing (STOC '95), pages 713 - 722, 1995.
- M. Crochemore, A. Czumaj, L. Gasieniec, S. Jarominek, T. Lecroq, W. Plandowski, and W. Rytter. Speeding Up Two String-Matching Algorithms. Algorithmica, 12(4/5): 247 - 267, 1994.
- W.J. Masek and M.S. Paterson. A Faster Algorithm for Computing String Editing Distances. Journal of Computer and System Sciences, 20: 18 - 31, 1980.
- M.J. Fischer and M.S. Paterson. String Matching and other Products. In R. Karp (Ed.), Proceedings of the 7th SIAM–AMS Complexity of Computation, pages 113 - 125, 1974.