

We analyzed real user data from Duolingo to uncover the key factors that influence language learning success. The project focused on learning patterns, word-level difficulty, and engagement to understand why some learners thrive while others disengage. The goal was to identify what drives high accuracy, where learners struggle, and when they are most likely to drop out.
Using 13 million rows of user activity collected over 13 days, we cleaned and explored the data in Python within Google Colab. We measured accuracy, engagement, and learning intervals, and compared results across language pairs and word attributes such as frequency, length, and part of speech. Finally, we visualized engagement and churn patterns to pinpoint where users tend to lose interest. These findings can help improve retention and learning outcomes.
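
As a rough illustration of the kind of measurement described above, here is a minimal pandas sketch that computes accuracy per language pair and each user's last active day (a simple proxy for where drop-off happens). The column names (`user_id`, `ui_language`, `learning_language`, `timestamp`, `session_seen`, `session_correct`), the helper function names, and the toy rows are assumptions made for illustration; the actual schema and the project's own code may differ.

```python
import pandas as pd


def accuracy_by_pair(df: pd.DataFrame) -> pd.Series:
    """Share of correct answers per (ui_language, learning_language) pair.

    Assumes hypothetical columns `session_correct` and `session_seen`
    holding per-row counts of correct and total exercises.
    """
    totals = df.groupby(["ui_language", "learning_language"])[
        ["session_correct", "session_seen"]
    ].sum()
    return (totals["session_correct"] / totals["session_seen"]).rename("accuracy")


def last_active_day(df: pd.DataFrame) -> pd.Series:
    """Last day (0-based, relative to the first day in the data) each user was seen.

    Assumes a hypothetical `timestamp` column in Unix seconds.
    """
    days = pd.to_datetime(df["timestamp"], unit="s").dt.normalize()
    day_index = (days - days.min()).dt.days
    return day_index.groupby(df["user_id"]).max().rename("last_active_day")


# Toy rows standing in for the real activity log (illustrative values only).
toy = pd.DataFrame(
    {
        "user_id": ["u1", "u1", "u2", "u2"],
        "ui_language": ["en", "en", "es", "es"],
        "learning_language": ["es", "es", "en", "en"],
        "timestamp": [0, 86400 * 3, 0, 86400],
        "session_seen": [4, 6, 5, 5],
        "session_correct": [3, 6, 2, 3],
    }
)

acc = accuracy_by_pair(toy)
last = last_active_day(toy)
print(acc)
print(last)
```

Summing counts before dividing weights each pair's accuracy by the number of exercises, rather than averaging per-row ratios; either choice is defensible depending on the question being asked.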