Mike Terekhov

MS CS @ University of Southern California

Boston University - B.S. Mechanical Engineering with a Concentration in Machine Learning

University of Southern California - M.S. Computer Science

Contact Me

Current Research Projects

HIV Research AI Regimen Project

I am developing an AI-powered clinical decision tool that uses Retrieval-Augmented Generation (RAG) to optimize antiretroviral therapy (ART) recommendations for HIV patients globally. The project features a round-table-style framework in which multiple AI agents simulate the perspectives of virologists, clinicians, and pharmacists to collaboratively refine treatments. I am implementing agentic workflows with planning and reasoning chains to analyze drug-mutation interactions and generate personalized treatment recommendations.

Vision Transformer Research

I am exploring machine learning methods to detect voids in composite oriented strand boards, targeting up to 95% accuracy. I formulated and implemented a Vision Transformer architecture with an encoder and 8 attention heads to explore advanced image representation techniques. My preprocessing pipeline segments each 128×128 image into 16×16 patches for the Vision Transformer, enabling efficient processing of micro-CT scan data for void detection.
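The patching step can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: it assumes single-channel (grayscale) slices and non-overlapping patches, and the function name `image_to_patches` and the synthetic `scan` array are hypothetical. Under those assumptions, a 128×128 image yields 8 × 8 = 64 patches of 256 values each.

```python
import numpy as np


def image_to_patches(image: np.ndarray, patch_size: int = 16) -> np.ndarray:
    """Split a 2-D image into flattened, non-overlapping square patches."""
    height, width = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    # (rows, patch, cols, patch) -> (rows, cols, patch, patch)
    grid = image.reshape(rows, patch_size, cols, patch_size).swapaxes(1, 2)
    # Flatten each patch into one vector: (rows * cols, patch * patch)
    return grid.reshape(rows * cols, patch_size * patch_size)


# Synthetic stand-in for one micro-CT slice (assumed grayscale).
scan = np.arange(128 * 128, dtype=np.float32).reshape(128, 128)
patches = image_to_patches(scan)
print(patches.shape)  # (64, 256)
```

Each row of `patches` would then typically be passed through a learned linear projection and combined with a position embedding before entering the encoder.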

Previous Research Projects

AI vs Human Text Classifier

In this project, my team and I explored whether machine learning models could effectively distinguish between human-written and AI-generated high school essays. We gathered a diverse dataset that included original human essays, AI-generated essays from models such as GPT-4, GPT-2, Mistral, and Gemma, and both AI- and human-paraphrased texts. We applied classical methods such as Bag of Words, TF-IDF, and SVMs, alongside a neural network using GloVe embeddings. While the traditional methods performed well on standard AI text, they struggled to identify AI-paraphrased versions of human essays. Our GloVe-based model, however, showed strong performance even in these more nuanced cases. Through this work, we demonstrated both the potential and the challenges of detecting AI-generated content, particularly as language models become more adept at mimicking human writing.
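For readers unfamiliar with TF-IDF, here is a rough sketch of the weighting using only the Python standard library. It is not the project's code: the `tf_idf` function, the whitespace tokenizer, and the two toy sentences are assumptions for illustration, and it uses the plain `tf * log(N / df)` form, whereas libraries often apply smoothed variants.

```python
import math
from collections import Counter


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Toy documents, invented for this example.
docs = ["the essay argues clearly", "the essay delves deeply"]
vectors = tf_idf(docs)
print(vectors[0])
```

Terms that appear in every document, such as "the" here, receive a weight of zero, so the terms that separate one document from another dominate the vector passed to a classifier such as an SVM.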

View Paper

LLM Text to SQL Model

For this project, we developed a text-to-SQL system by fine-tuning the DeepSeek-Coder 1.3B model using Low-Rank Adaptation (LoRA) on a curated dataset of more than 1,000 natural-language question and SQL query pairs from an NBA database. We integrated a Retrieval-Augmented Generation (RAG) module that dynamically improves prompts based on schema similarity, which helped optimize model performance. Through fine-tuning and data augmentation strategies, we increased SQL validity by 22% and result correctness by 23% over the baseline. I also built a custom evaluation framework to track SQL validity, result accuracy, and query matching across experiments.
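The three evaluation metrics can be illustrated with a small sketch built on Python's standard `sqlite3` module. This is one possible reading of the metrics, not the framework from the project: the `evaluate_query` function, the `players` table, and its rows are all hypothetical. Here "valid" means the query executes, "correct" means it returns the same rows as the reference regardless of order, and "exact_match" compares the query text after whitespace and case normalization.

```python
import sqlite3


def evaluate_query(conn: sqlite3.Connection, predicted_sql: str,
                   reference_sql: str) -> dict[str, bool]:
    """Score one predicted query against a reference query."""
    def normalize(sql: str) -> str:
        return " ".join(sql.strip().rstrip(";").lower().split())

    exact_match = normalize(predicted_sql) == normalize(reference_sql)
    expected = conn.execute(reference_sql).fetchall()
    try:
        actual = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # The query failed to parse or run, so it cannot be correct.
        return {"valid": False, "correct": False, "exact_match": exact_match}
    # Compare result sets without regard to row order.
    correct = sorted(actual, key=repr) == sorted(expected, key=repr)
    return {"valid": True, "correct": correct, "exact_match": exact_match}


# Hypothetical toy schema and data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points REAL)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("A", "X", 20.5), ("B", "X", 11.0), ("C", "Y", 30.1)],
)

reference = "SELECT name FROM players WHERE points > 15"
equivalent = evaluate_query(
    conn, "SELECT name FROM players WHERE points > 15.0 ORDER BY name", reference)
wrong = evaluate_query(conn, "SELECT name FROM players WHERE team = 'X'", reference)
broken = evaluate_query(conn, "SELEC name FROM players", reference)
print(equivalent, wrong, broken)
```

Separating the metrics matters because a query can be valid and return the right rows without matching the reference text, as the `equivalent` case shows. In practice, model-generated SQL should be run against a read-only or disposable copy of the database.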

View Paper