RESEARCH

Things I have worked on.

Beyond RAG: Enabling Explicit Memory in Pretrained LLMs for Enhanced Reasoning and Efficiency

Abstract: Explicit memory refers to highly sparse attention key-value pairs derived from the reference text. As a form of knowledge that offers lower decoding costs compared to plain-text retrieval-augmented generation (RAG) and lower encoding costs compared to model parameters, explicit memory can help large language models (LLMs) reduce the cost of acquiring new knowledge. Previous work has explored training models from scratch to reason with explicit memory, achieving significant capability improvements while reducing training and inference costs. However, many existing pretrained models lack the ability to utilize explicit memory. In this work, we study how to enable pretrained models to learn to use explicit memory through supervised fine-tuning (SFT) without forgetting their pretrained knowledge. We curated 120,000 training examples and designed a training strategy to teach the model to use explicit memory to solve math and coding tasks. Experimental results show that explicit memory can help models solve mathematical problems; however, the improvements are limited and come at the cost of degraded performance on coding tasks. Furthermore, the results reveal that pretrained models are prone to catastrophic forgetting during memory-involving fine-tuning. This work serves as an initial attempt to explore how pretrained models can be equipped with explicit memory, and we plan to continue investigating better training and data strategies in future research. All of our work can be found on our GitHub homepage: https://github.com/szjiozi/Explicit-Memory

Quantifying Risk Aversion and Risk-Seeking Behavior Through Utility-Based Policy Learning in Blackjack

Abstract: Modeling human decision-making under risk and uncertainty remains a significant challenge, with implications in fields like economics and cognitive science. Many decision-making models fail to fully account for individual risk preferences and utility functions. This work investigates the relationship between payoff and utility across diverse risk profiles, using Blackjack as a simulation to quantify rewards and outcomes. By modeling the game as a deterministic environment, we apply dynamic programming to compute action values for each state and subsequent subgame encountered during gameplay based on different utility functions, and select the highest-value action, weighted by probabilistic outcomes. Drawing on cognitive theories such as Prospect Theory and Expected Utility Theory, our findings show that variations in utility functions significantly influence decision-making and financial outcomes, offering new insights into decision-making under uncertainty.
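
To make the idea of utility-weighted action values concrete, here is a minimal sketch and not the implementation used in the work. It assumes a heavily simplified game: an infinite deck, only hit and stand, a unit bet with payoffs of +1, 0, and -1, no bonus for a natural blackjack, and a dealer who stands on every 17. All function names and the three example utility functions are hypothetical choices made for illustration.

```python
from functools import lru_cache
import math

# Assumption: infinite deck. Ace is card 1; 10, J, Q, K all count as 10.
CARD_PROBS = {c: 1 / 13 for c in range(1, 10)}
CARD_PROBS[10] = 4 / 13


def add_card(total, soft, card):
    """Add a card to a hand; `soft` means one ace is currently counted as 11."""
    total += card
    if card == 1 and not soft and total + 10 <= 21:
        total, soft = total + 10, True   # count the new ace as 11
    if total > 21 and soft:
        total, soft = total - 10, False  # demote the ace from 11 to 1
    return total, soft


@lru_cache(maxsize=None)
def dealer_outcomes(total, soft):
    """Distribution over final dealer totals, with 22 standing for a bust."""
    if total > 21:
        return ((22, 1.0),)
    if total >= 17:  # assumption: dealer stands on all 17s
        return ((total, 1.0),)
    dist = {}
    for card, p in CARD_PROBS.items():
        for final, q in dealer_outcomes(*add_card(total, soft, card)):
            dist[final] = dist.get(final, 0.0) + p * q
    return tuple(dist.items())


def stand_value(player_total, dealer_card, utility):
    """Expected utility of standing on `player_total` against the dealer's card."""
    value = 0.0
    for dealer_final, p in dealer_outcomes(*add_card(0, False, dealer_card)):
        if dealer_final > 21 or player_total > dealer_final:
            payoff = 1
        elif player_total == dealer_final:
            payoff = 0
        else:
            payoff = -1
        value += p * utility(payoff)
    return value


def action_values(player_total, soft, dealer_card, utility):
    """Expected utility of 'stand' and 'hit', computed by memoized recursion."""

    @lru_cache(maxsize=None)
    def best(total, is_soft):
        if total > 21:  # player busts and loses the bet
            return utility(-1)
        return max(q(total, is_soft).values())

    @lru_cache(maxsize=None)
    def q(total, is_soft):
        hit = sum(p * best(*add_card(total, is_soft, card))
                  for card, p in CARD_PROBS.items())
        return {"stand": stand_value(total, dealer_card, utility), "hit": hit}

    return q(player_total, soft)


# Hypothetical utility functions over the payoff x in {-1, 0, +1}.
def risk_neutral(x):
    return x


def risk_averse(x):   # concave
    return 1 - math.exp(-x)


def risk_seeking(x):  # convex
    return math.exp(x) - 1


if __name__ == "__main__":
    for name, u in [("neutral", risk_neutral), ("averse", risk_averse),
                    ("seeking", risk_seeking)]:
        values = action_values(16, False, 10, u)
        print(name, max(values, key=values.get), values)
```

Swapping the utility function changes the value assigned to each outcome and may therefore change which action has the highest expected utility in a given state, which is the kind of comparison the abstract describes.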