CSE 794L: Foundations of Spoken Language Processing

TR 2-3:18, 304 Journalism
Instructor: Eric Fosler-Lussier, fosler at cse

Description

Fundamentals of automatic speech recognition and speech synthesis; lab projects concentrating on building systems to process speech.

Level, Credits, Class Time Distribution, Prerequisites

Level  Credits  Class Time Distribution  Prerequisites
UG     3        2 1.5-hr cl              625 or Ling 484.01; 630; 730 or Stat 428

Quarters Offered

General Information, Exclusions, Cross-listings, etc.

Intended Learning Outcomes

Texts and Other Course Materials

Topics

Number of Hours Topic
3 Human hearing, acoustics, and phonetics
3 Finite state transducers
2 ASR toolkits
3 Dynamic time warping and acoustic modeling
4 HMMs, expectation-maximization, and search
3 Language modeling
3 Text analysis
2 Speech synthesis
1 Speech processing in context (systems)
1 Speaker recognition
1 Quizzes
4 Project presentations

Representative Lab Assignments

Grades

Homeworks 40%
Final Project 30%
Exams (2 x 10%) 20%
Participation 10%

Expectations

This course will be a hybrid lecture/seminar course. Each week we will explore a new topic in spoken language processing. On Tuesdays, I will lecture to give everyone a grounding in the topic. Most Thursdays will be reserved for hands-on group work. You must do the readings before class, especially on Thursdays, since you will be expected to participate in the group discussions. The schedule of readings will be kept on the calendar.

The Thursday hands-on assignments will usually require people to bring in laptops (one per group); let me know if you can bring a laptop to class, and if so, what OS it's running.

There will be approximately 6 labs, each carrying equal weight; however, you are only required to do four of them. (This is to give a bit of flexibility in terms of scheduling your work.) Labs are to be done individually or in teams of 2, although you may not have the same partner for more than one lab.

There will be two 30-minute quizzes, each worth 10% of the final grade. These are mainly to ensure that you're keeping up with the material. There will be no final exam.

You will be required to participate in a final group project involving spoken language processing. You will be required to:

Groups will be composed of 2-3 students; the level of expertise in the group will define my expectations (for example, a group of 1 grad and 1 undergrad would be expected to do less work than 3 grad students). You are encouraged (but not required) to form groups with people of different backgrounds. We will begin forming groups sometime in week 3, and then I will meet with each group to try to help define projects.

Individual projects will only be permitted after serious, significant consultation with the instructor.

You have been (or will be) given access to the departmental Linux servers for this class (numbered lcc03e - lcc11e). These are relatively slow Linux servers, but you are the only people using them, so you can run jobs without the CPU limitations that apply on stdsun.

Policy on Academic Misconduct

As with any class at this university, you are required to follow the Ohio State "Code of Student Conduct." If you are unfamiliar with this policy, you should read it at http://oaa.osu.edu/coam/code.html. In particular, you should note that you are not allowed to, among other things, (a) knowingly provide or receive information during exams, (b) knowingly provide or receive assistance on homeworks unless I say it's OK, and (c) submit plagiarized (copied but unacknowledged) work for credit. If any violation occurs, I am required to report the violation to the Council on Academic Misconduct.