Word Sense Disambiguation
Edited by Eneko Agirre and Philip Edmonds

Chapter 8: Knowledge Sources for WSD

Eneko Agirre, Mark Stevenson

Abstract

This chapter explores the different sources of linguistic knowledge that can be employed by WSD systems. These knowledge sources are more abstract than the features used by WSD algorithms, which are encoded at the algorithmic level and normally extracted from a lexical resource or from corpora. The chapter begins by listing a comprehensive set of knowledge sources, with examples of their application, and then explains whether this linguistic knowledge may be found in corpora, lexical knowledge bases, or machine-readable dictionaries. An analysis of the knowledge sources used in actual WSD systems is then presented. The best results are often obtained by combining knowledge sources, and the chapter concludes by analyzing experiments on the effect of different knowledge sources, which have implications for the effectiveness of each.

Contents

8.1 Introduction

8.2 Knowledge sources relevant to WSD

8.2.1 Syntactic

Part of speech (KS 1)

Morphology (KS 2)

Collocations (KS 3)

Subcategorization (KS 4)

8.2.2 Semantic

Frequency of senses (KS 5)

Semantic word associations (KS 6)

Selectional preferences (KS 7)

Semantic roles (KS 8)

8.2.3 Pragmatic/Topical

Domain (KS 9)

Topical word association (KS 10)

Pragmatics (KS 11)

8.3 Features and lexical resources

8.3.1 Target-word specific features

8.3.2 Local features

8.3.3 Global features

8.4 Identifying knowledge sources in actual systems

8.4.1 Senseval-2 systems

8.4.2 Senseval-3 systems

8.5 Comparison of experimental results

8.5.1 Senseval results

8.5.2 Yarowsky and Florian (2002)

8.5.3 Lee and Ng (2002)

8.5.4 Martínez et al. (2002)

8.5.5 Agirre and Martínez (2001a)

8.5.6 Stevenson and Wilks (2001)

8.6 Discussion

8.7 Conclusions

Acknowledgments

References

Copyright © 2006 Springer. All rights reserved.