It is well known that modern functional programming languages are naturally amenable to parallel programming. Achieving efficient parallelism using functional languages, however, remains difficult. Perhaps the most important reason for this is their lack of support for efficient in-place updates, i.e., mutation, which is important for the implementation of both parallel algorithms and the run-time system services (e.g., schedulers and synchronization primitives) used to execute them. In this paper, we propose techniques for efficient mutation in parallel functional languages. To this end, we couple the memory manager with the thread scheduler to make reading and updating data allocated by nested threads efficient. We describe the key algorithms behind our technique, implement them in the MLton Standard ML compiler, and present an empirical evaluation. Our experiments show that the approach performs well, significantly improving efficiency over existing functional language implementations.