Iterative Chain-of-Thought Refinement

Tech Stack: PyTorch, Hugging Face, LoRA, LLaMA, Gemma, GPT-2 Embeddings, CUDA, NLTK
GitHub URL: Project Link

This project introduces a self-correcting feedback loop between a LLaMA-3B generator and a Gemma-2B verifier that iteratively improves reasoning chains. The approach is evaluated on the GSM8K and REVEAL datasets.
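
The generator-verifier loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `refine` function, the `Generator` and `Verifier` callable signatures, the `max_rounds` cap, and the toy stand-ins are all assumptions made for this example. In the real system the two callables would presumably wrap the LLaMA generator and the Gemma verifier.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions for this sketch, not the project's API).
# Generator: (question, critiques so far) -> reasoning chain
# Verifier:  (question, chain) -> (accepted, critique)
Generator = Callable[[str, List[str]], str]
Verifier = Callable[[str, str], Tuple[bool, str]]


def refine(question: str, generate: Generator, verify: Verifier,
           max_rounds: int = 3) -> Tuple[str, int]:
    """Regenerate a reasoning chain until the verifier accepts it.

    Returns the last chain and the number of generation rounds used.
    """
    feedback: List[str] = []
    chain = ""
    for round_idx in range(1, max_rounds + 1):
        chain = generate(question, feedback)
        accepted, critique = verify(question, chain)
        if accepted:
            return chain, round_idx
        # Feed the critique back so the next attempt can correct the error.
        feedback.append(critique)
    return chain, max_rounds


# Toy stand-ins so the loop can be run without loading any model.
def toy_generate(question: str, feedback: List[str]) -> str:
    # The first attempt contains an arithmetic slip; any critique fixes it.
    return "3 + 4 = 7" if feedback else "3 + 4 = 8"


def toy_verify(question: str, chain: str) -> Tuple[bool, str]:
    lhs, rhs = chain.split("=")
    a, b = (int(x) for x in lhs.split("+"))
    ok = a + b == int(rhs)
    critique = "" if ok else f"{a} + {b} is not {rhs.strip()}"
    return ok, critique


if __name__ == "__main__":
    print(refine("What is 3 + 4?", toy_generate, toy_verify))
```

Capping the number of rounds keeps the loop bounded when the verifier never accepts a chain; the cap of 3 is arbitrary here.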