Natural Language Processing Fundamentals

CMPE 561

Instructor: Prof. Dr. John Doe

Semester: Fall 2023

Credits: 3

Course Description

This course provides an introduction to Natural Language Processing (NLP), covering both traditional and neural approaches to processing and analyzing human language. Students will learn the fundamental concepts and techniques used in NLP, including text preprocessing, language modeling, part-of-speech tagging, syntactic parsing, semantic analysis, and basic neural network approaches for NLP tasks.

Learning Objectives

By the end of this course, students will be able to:

  1. Understand the fundamental challenges in processing natural language
  2. Apply basic text preprocessing techniques
  3. Implement and evaluate language models
  4. Develop solutions for various NLP tasks such as part-of-speech tagging, named entity recognition, and sentiment analysis
  5. Understand and implement basic neural network architectures for NLP tasks
  6. Evaluate NLP systems using appropriate metrics

Prerequisites

  • CMPE 322: Algorithms
  • Basic knowledge of probability and statistics
  • Programming experience in Python

Course Outline

Weeks 1-2: Introduction to NLP

  • Overview of NLP and its applications
  • Text preprocessing: tokenization, normalization, stemming, lemmatization
  • Regular expressions for text processing
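
As a taste of the preprocessing topics above, a minimal regex tokenizer fits in a few lines. The pattern and the lowercasing step are illustrative choices, not a reference implementation; real pipelines (stemming, lemmatization) use libraries such as NLTK or spaCy.

```python
import re

def tokenize(text):
    """Toy regex tokenizer: lowercase, then match alphanumeric runs,
    keeping simple contractions like "isn't" as single tokens."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

print(tokenize("NLP isn't magic; it's engineering."))
# → ['nlp', "isn't", 'magic', "it's", 'engineering']
```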

Weeks 3-4: Language Modeling

  • N-gram models
  • Smoothing techniques
  • Evaluation metrics: perplexity
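
To make these topics concrete, the sketch below trains a bigram model with add-one (Laplace) smoothing and computes perplexity on a sentence. The corpus format (lists of token lists), the `<s>`/`</s>` boundary markers, and the choice of add-one smoothing are illustrative assumptions.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigram contexts and bigrams over sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        toks = ["<s>"] + sent + ["</s>"]
        unigrams.update(toks[:-1])          # contexts: everything but </s>
        bigrams.update(zip(toks, toks[1:]))
    vocab = {t for s in corpus for t in s} | {"</s>"}
    return unigrams, bigrams, len(vocab)

def perplexity(sent, unigrams, bigrams, V):
    """Perplexity under add-one smoothed bigram probabilities."""
    toks = ["<s>"] + sent + ["</s>"]
    logp = 0.0
    for w1, w2 in zip(toks, toks[1:]):
        p = (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)
        logp += math.log(p)
    return math.exp(-logp / (len(toks) - 1))

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
unis, bis, V = train_bigram(corpus)
print(perplexity(["the", "cat", "sat"], unis, bis, V))
```

A sentence seen in training should receive lower perplexity than a scrambled one; that is the property the evaluation metric rewards.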

Weeks 5-6: Part-of-Speech Tagging and Morphological Analysis

  • Hidden Markov Models
  • Maximum Entropy Markov Models
  • Morphological analysis for Turkish
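
HMM taggers like those covered here are usually decoded with the Viterbi algorithm. The sketch below runs it on a made-up two-tag model; the tag set, probabilities, and dictionary-based model format are invented for illustration.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely HMM state (tag) sequence for the observations obs."""
    # V[t][s] = (probability of the best path ending in state s at time t, predecessor)
    V = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p][0] * trans_p[p][s])
            V[t][s] = (V[t - 1][prev][0] * trans_p[prev][s]
                       * emit_p[s].get(obs[t], 0.0), prev)
    # Backtrace from the best final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        last = V[t][last][1]
        path.append(last)
    return path[::-1]

# Toy two-tag model: after a noun, "fish" is more likely a verb.
states = ["N", "V"]
start_p = {"N": 0.6, "V": 0.4}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit_p = {"N": {"they": 0.7, "fish": 0.3}, "V": {"they": 0.1, "fish": 0.9}}
print(viterbi(["they", "fish"], states, start_p, trans_p, emit_p))  # → ['N', 'V']
```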

Weeks 7-8: Syntactic Parsing

  • Context-free grammars
  • Dependency parsing
  • Evaluation metrics
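
For the dependency-parsing evaluation bullet, the standard unlabeled attachment score (UAS) is simple enough to sketch directly. Representing each token by the index of its head, with 0 for the root, is an assumption about the data format (it mirrors the common CoNLL-style convention).

```python
def uas(gold_heads, pred_heads):
    """Unlabeled attachment score: the fraction of tokens whose
    predicted head index matches the gold head index."""
    assert len(gold_heads) == len(pred_heads)
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)

# "the cat sat": "the" -> "cat" (2), "cat" -> "sat" (3), "sat" is root (0).
print(uas([2, 3, 0], [2, 3, 3]))  # 2 of 3 heads correct → 0.666...
```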

Weeks 9-10: Word Representations

  • Distributional semantics
  • Word embeddings: Word2Vec, GloVe
  • Contextual embeddings: ELMo, BERT
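
Similarity between word embeddings is almost always measured with cosine similarity. A minimal version over plain Python lists is below; real systems operate on NumPy arrays of pretrained vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine([1.0, 2.0], [2.0, 4.0]))  # parallel vectors → ≈1.0
print(cosine([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors → 0.0
```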

Weeks 11-12: Neural Networks for NLP

  • Feed-forward neural networks
  • Recurrent neural networks
  • Transformer architecture
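
As a preview of the Transformer material, its core operation, scaled dot-product attention, fits in a short function. Plain Python lists of row vectors stand in for the tensors a real implementation would use.

```python
import math

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d)) V,
    with matrices given as lists of row vectors."""
    d = len(K[0])
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / math.sqrt(d)
               for k_row in K] for q_row in Q]
    out = []
    for row in scores:
        m = max(row)                      # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        weights = [e / z for e in exps]   # softmax over the key positions
        out.append([sum(w * v_row[j] for w, v_row in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two one-hot values yields a convex combination.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
```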

Weeks 13-14: Applications

  • Named Entity Recognition
  • Sentiment Analysis
  • Question Answering
  • Machine Translation
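
Of the applications listed, sentiment analysis has the simplest possible baseline: summing word polarities from a lexicon. The tiny lexicon below is made up for illustration; real lexicons contain thousands of scored entries.

```python
# Made-up polarity lexicon for illustration only.
LEXICON = {"good": 1, "great": 2, "bad": -1, "awful": -2}

def sentiment(tokens):
    """Baseline sentiment score: sum of per-token polarities.
    Positive → positive sentiment, negative → negative, 0 → neutral."""
    return sum(LEXICON.get(t.lower(), 0) for t in tokens)

print(sentiment(["The", "movie", "was", "great"]))  # → 2
print(sentiment(["an", "awful", "plot"]))           # → -2
```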

Assessment

  • Assignments (40%)
  • Midterm Exam (20%)
  • Final Project (30%)
  • Participation (10%)

Textbooks

  • Jurafsky, D., & Martin, J. H. (2021). Speech and Language Processing (3rd ed. draft).
  • Goldberg, Y. (2017). Neural Network Methods for Natural Language Processing.

Additional Resources

  • Manning, C. D., & Schütze, H. (1999). Foundations of Statistical Natural Language Processing.
  • Course materials will be available on the course GitHub repository.