This book constitutes the refereed proceedings of the 13th International Conference on String Processing and Information Retrieval, SPIRE 2006. The 26 revised full papers and 5 revised short papers presented together with 2 invited talks were carefully reviewed and selected. The papers are organized in topical sections on Web clustering and text categorisation, strings, user behaviour, Web search algorithms, compression, correction, information retrieval applications, bio-informatics, and Web search engines.
Web Clustering and Text Categorization.- MP-Boost: A Multiple-Pivot Boosting Algorithm and Its Application to Text Categorization.- TreeBoost.MH: A Boosting Algorithm for Multi-label Hierarchical Text Categorization.- Cluster Generation and Cluster Labelling for Web Snippets: A Fast and Accurate Hierarchical Solution.- Principal Components for Automatic Term Hierarchy Building.- Strings.- Computing the Minimum Approximate λ-Cover of a String.- Sparse Directed Acyclic Word Graphs.- On-Line Repetition Detection.- User Behavior.- Analyzing User Behavior to Rank Desktop Items.- The Intention Behind Web Queries.- Web Search Algorithms.- Compact Features for Detection of Near-Duplicates in Distributed Retrieval.- Inverted Files Versus Suffix Arrays for Locating Patterns in Primary Memory.- Efficient Lazy Algorithms for Minimal-Interval Semantics.- Output-Sensitive Autocompletion Search.- Compression.- A Compressed Self-index Using a Ziv-Lempel Dictionary.- Mapping Words into Codewords on PPM.- Correction.- Improving Usability Through Password-Corrective Hashing.- Word-Based Correction for Retrieval of Arabic OCR Degraded Documents.- Information Retrieval Applications.- A Statistical Model of Query Log Generation.- Using String Comparison in Context for Improved Relevance Feedback in Different Text Media.- A Multiple Criteria Approach for Information Retrieval.- English to Persian Transliteration.- Bio Informatics.- Efficient Algorithms for Pattern Matching with General Gaps and Character Classes.- Matrix Tightness: A Linear-Algebraic Framework for Sorting by Transpositions.- How to Compare Arc-Annotated Sequences: The Alignment Hierarchy.- Web Search Engines.- Structured Index Organizations for High-Throughput Text Querying.- Adaptive Query-Based Sampling of Distributed Collections.- Short Papers.- Dotted Suffix Trees: A Structure for Approximate Text Indexing.- Phrase-Based Pattern Matching in Compressed Text.- Discovering Context-Topic Rules in Search Engine Logs.- Incremental Aggregation of Latent Semantics Using a Graph-Based Energy Model.- A New Algorithm for Fast All-Against-All Substring Matching.