Guest Speaker
Effective Probabilistic Models for Syntactic
and Semantic Disambiguation
Kristina Toutanova
Stanford University
Thurs., Mar. 3rd
3:30pm; 480 Dreese Labs
All interested parties are invited.
Refreshments will be served immediately preceding the talk.
Abstract:
This talk presents statistical classifiers for sophisticated
syntactic and semantic natural language representations. In
natural language processing, many important problems can be
formulated as disambiguation problems. The goal is to disambiguate
or classify an ambiguous natural language object in context.
For solving high-level problems, such as question answering,
the labels are complex structures: syntactic parse trees and
semantic role structures. This entails new challenges for classification
algorithms.
I advance the state of the art in several domains by (i)
choosing representations that encode domain knowledge more effectively
and (ii) developing machine learning algorithms that deal with
the specific properties of linguistic disambiguation tasks:
sparsity of training data and large, structured spaces of hidden
labels.
I focus on three tasks. For syntactic disambiguation, I
present a novel representation of parse trees that connects
the words of the sentence with the hidden syntactic structure
in a direct way and leads to a natural definition of tree kernels
via convolution of string kernels. For disambiguation of the
semantic role structure of verbs, I describe a model that captures
regularities of the label space and incorporates the knowledge
that the semantic frame of a verb is a joint structure with strong
dependencies between arguments. Finally, I present a method for
combining multiple knowledge sources to estimate lexical distributions,
based on constructing random walks in the space of words.
Host: Donna Byron