CS-4395

Repository for CS 4395 (NLP), taken in Fall 2022 at UTD


Project maintained by adityaguin Hosted on GitHub Pages — Theme by mattgraham

Welcome to Aditya Guin’s HLT page!

My name is Aditya Guin, and I recently graduated from UTD. During the fall semester of my senior year (Fall 2022), I completed various projects in CS 4395 (Human Language Technologies). The course focuses primarily on NLP, a rapidly growing field within AI. These projects are written primarily in Python and incorporate libraries such as NLTK.

Portfolio Contents

  1. Overview of NLP – My views on the NLP field, alongside why I want to study it
  2. Text Processing with Python – Learning basic text processing in Python
  3. Exploring NLTK – Exploring the NLTK API.
  4. Guessing game! – Exploring lexical diversity, part-of-speech tagging, lemmatization, and implementing a fun guessing game!
  5. Word Net – Exploring WordNet, its hierarchies, lexical analysis, SentiWordNet, collocations, mutual information, and more!
  6. NGrams – Exploring unigrams and bigrams, and creating language models for three languages (English, French, Italian) (worked on with Varin Sikand)!
  7. WebCrawler – Crawling web pages about Magnus Carlsen (given the chess situation) and creating a knowledge base for a future chatbot (worked on with Varin Sikand)!
  8. Syntax Parsing – Exploring three different methods of parsing a sentence!
  9. Author Attribution – Exploring different models to classify an author based on text, including neural networks as well as Naive Bayes and Logistic Regression from scikit-learn!
  10. Chester 1.0 – A chatbot that answers questions about chess grandmasters and tournaments, built with Varin Sikand. Running instructions are in the Readme.md, and a full description is given in the report.