NLP (Natural Language Processing) Predictive Text App

Project type
R, Data Science, Natural Language Processing


This NLP (Natural Language Processing) text prediction app was built as the final capstone project for the ten-course Data Science Specialization offered by Johns Hopkins University. For this final project, students were challenged to delve into an unfamiliar area of data science that had not been taught in any of the previous courses. As stated in the instructions for the assignment on Coursera:

"You will use all of the skills you have learned during the Data Science Specialization in this course, but you'll notice that we are tackling a brand new application: analysis of text data and natural language processing. This choice is on purpose. As a practicing data scientist you will be frequently confronted with new data types and problems. A big part of the fun and challenge of being a data scientist is figuring out how to work with these new data types to build data products people love."

Overall, I had a great (and challenging) time diving into the world of NLP and learning the nuances, main challenges, and steps involved in creating a working text prediction algorithm. It took me around three weeks of on-and-off work to complete this project, starting with no prior knowledge of NLP.

Milestone Report: Exploratory Analysis of the Text Data

About halfway through the project, students were also instructed to write a report analyzing the key features of the provided text data.

Click the link below to view the exploratory analysis that was performed on the data.


Click the link below to access and try out the app yourself!

Additionally, the app includes an “App Info” section that goes into more detail about how the algorithm works.

NOTE: The app has been tested and confirmed working by multiple people at the time of writing. If for some reason the app does not work or does not output predictions properly, please let me know by emailing me at:


Click the link below to view the GitHub repository for the project. Descriptions of each file are listed in the linked “README” file.