

Developed a next-generation matching system for LGBTQ+ women and non-binary individuals using deep learning and natural language processing. Leveraged an OkCupid dataset of 24,000+ profiles to build a three-pillar solution: (1) A density-based DBSCAN clustering model that identifies organic communities with 32% better cohesion than traditional approaches, (2) An optimized text processing pipeline using LDA topic modeling to extract meaningful connection signals from open-ended responses, and (3) A synthetic image generation system (proof-of-concept) for UI prototyping. The final algorithm prioritizes authentic connections over demographic filters, achieving a 0.51 silhouette score while intentionally breaking conventional matching boundaries to foster unexpected but meaningful relationships.
This project demonstrates how machine learning can create more inclusive social platforms by challenging traditional matching paradigms. Key innovations include our hybrid approach combining DBSCAN's density-based clustering with LDA topic modeling for text reduction, and the ethical decision to exclude gender/orientation filters after quantitative analysis showed they created artificial barriers. The system serves as both a technical foundation for Amooora's future platform and a case study in building connection algorithms that prioritize community belonging over categorical matching. Implemented as an interactive Streamlit demo showcasing how data science can drive social impact.
Demo day video
Tech stack
Python

Numpy

Google Cloud

Docker
.png)
Pandas

SciKit-Learn
TensorFlow
Keras
FastAPI

Streamlit