This is the second part of my thoughts on writing good SRS prompts after reading this guide by Andy Matuschak. In part 1, I mused on what it means for a prompt to be good. One aspect of judging whether a prompt is good has to do with what we intend to learn with the prompt. That is, how good is the learning goal underlying the learning tasks defined by the prompt? But given the diversity of goals that we might have, how do we know if the spaced repetition system is the right approach to achieve them?
To be more specific, SRS is tailor-made for memorization-type learning goals. But ideally, we want to learn things well beyond remembering facts, and it would be very exciting if SRS could be an effective tool for those as well. So this time, I want to focus on the very idea of using SRS beyond memorization. Do we have any reason to believe this is a good idea? What could go wrong? While I want to express certain concerns, I’ll also make some conjectures about how we might address them.
Using SRS for Memorization
Spaced repetition systems are best known for helping us remember facts. And indeed, there are good explanations for why SRS is effective for memorization. So let’s start there, before we go beyond memorization.
Spaced repetition systems are designed around two principles that help us remember things better: retrieval practice and optimal scheduling.
Retrieval practice: Trying to recall something is a more effective way to memorize it than restudying it. This is how flash cards work.
Optimal scheduling: This is where a digital SRS diverges from a physical deck of flash cards. If we review cards randomly or at a fixed interval, we’ll review many cards either too early (recall is so easy that it barely reinforces our memory) or too late (we’ve forgotten too much, fail to recall, and find the experience frustrating). Timing the reviews just right makes our review sessions more efficient. Typically, this means the gaps between reviews start short and grow exponentially.
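To make the scheduling idea concrete, here is a minimal sketch of exponentially growing review intervals. The multiplier and starting gap are illustrative assumptions of mine, not the parameters of Anki, Orbit, or any particular scheduler:

```python
# A toy sketch of SM-2-style exponential spacing. The multiplier (2.5)
# and the 1-day reset are made-up illustrative values, not the exact
# behavior of any real SRS.

def next_interval(current_days: float, recalled: bool, multiplier: float = 2.5) -> float:
    """Return the gap in days until the next review of a card."""
    if not recalled:
        return 1.0                      # failed recall: restart with a short gap
    return current_days * multiplier    # successful recall: grow the gap

# A run of successful reviews: the gaps grow exponentially.
gap = 1.0
schedule = []
for _ in range(5):
    schedule.append(gap)
    gap = next_interval(gap, recalled=True)

print(schedule)  # [1.0, 2.5, 6.25, 15.625, 39.0625]
```

Real schedulers also adapt the multiplier per card based on how difficult it has proven to be, but the exponential shape above is the core idea.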
These two principles are most often associated with memorization, rather than sense-making. But we already know that SRS is good for remembering things. Do we also have reasons to suspect that SRS can be good for learning beyond memorization?
Using SRS Beyond Memorization
For one thing, remembering is fundamental to understanding.
Memorization as a Foundation to Understanding
Knowledge builds on top of other knowledge. Remembering facts is often a prerequisite to understanding deeper concepts.
Mathematical definitions and theorems are some of the first examples that come to my mind. I studied math at the graduate level. At some point, I naively thought that math was great for someone like me who hates memorizing things; reasoning and understanding were all I needed. But looking back, this was not the case. I needed to remember the definitions of mathematical objects and operations before I could understand their properties. I then needed to internalize those properties, rather than reasoning them out from scratch every time (though doing so occasionally is fine), to access the more intricate and interesting results. Although much of that memorization happened “naturally” rather than through dedicated effort, the deeper understanding couldn’t have happened without some memorization.
There are some learning-related models that embody this idea. In Bloom’s taxonomy, the six cognitive domains are: remember, understand, apply, analyze, evaluate, and create. Associated with this model is the idea that those lower levels are building blocks for the higher ones. In other words, remembering is fundamental to deeper understanding.
There is a related formulation in the Knowledge-Learning-Instruction (KLI) framework by Dr. Koedinger. One idea that the KLI framework explores is the matching between learning processes and knowledge types. Some learning processes seem to be especially effective for low-complexity knowledge components, like facts, while others are great for high-complexity knowledge components, like understanding principles. What if we switch up the matching? The observation is that learning processes good for low-complexity knowledge are often also good for high-complexity knowledge (although maybe not to the same degree), but not vice versa.
What I infer from these is this: Even if SRS is primarily good for memorization, we’d still expect it to be helpful for deeper understanding.
Other Good Learning Principles for SRS
To go a step further in considering the effectiveness of SRS beyond memorization, we might ask: does using SRS involve other good learning principles that benefit understanding? I think at least two fit naturally here: self-explanation and interleaving.
Self-explanation: Trying to explain something using our own words is an effective way to help understand it better. During SRS review sessions, we’d often engage in reasoning around a prompt and not merely recalling the answer. That could either be because it helps us synthesize the answer, or because we’ve explicitly asked “why” in our prompt.
Interleaving: Alternating between different topics during practice sessions might help us learn better, versus “blocking,” or grouping the practice of the same topic together. In SRS we often have a big bucket of mixed prompts from multiple sources, so during a review session we’d be hopping from one topic to another, and that can be a good thing.
I’m glossing over all kinds of finer details, but if we write prompts that ask us to explain things or otherwise induce mental processes other than mere factual recall, then I can see how principles such as these are at play to help us develop deeper understanding.
Andy brought up the concept of salience for what he calls “salience prompts” in the guide. I find it really interesting and think it deserves special mention.
Salience is about what you notice. When something is fresh in your mind, you notice it more around you. This is a mechanism firmly rooted in our biological hardware: scientists have identified areas dubbed the “salience network” in the brain. It could be an underlying mechanism for the higher-level learning principles above, like interleaving.
Reviewing prompts in an SRS brings the things we review to our active attention. We might write prompts that do this more explicitly, asking us “what might you think about or notice in this situation?” But most likely, it’s not just these kinds of purpose-crafted prompts that affect the way we notice things in our lives. Rather, all prompts do.
This is quite an interesting effect of using an SRS, and it tells us how we might use SRS beyond memorization. By writing a prompt, we’re “brewing” ideas over time, bringing them to our conscious attention on a schedule, and each time they remain salient for a while afterward. Those ideas will have more time and more opportunities to grow roots by connecting to other things we see and think about, creating more enduring memories and a better-connected expert knowledge network.
Sometimes we might even see serendipitous connections between ideas. It’s a cool feeling when you’ve reviewed a prompt and the thought somehow “clicks” later that day as you notice how it’s relevant in a completely different context. However, this is just a secondary effect. If this kind of connection-making is important, then I’d prefer more intentionality, such as explicitly trying to find different contexts in which the idea might apply.
In a sense, we can think of writing a prompt as a declaration of which ideas we want to pay more attention to. Our brains would then keep tabs on those ideas even when we’re not consciously thinking about them. I think this fits well with the development of deeper understanding — giving ideas time to grow. Meanwhile, I wonder how much we know about this phenomenon. What are the chances that we might misuse this power?
Should We Be Worried?
So when we add a prompt to our SRS, the system all but guarantees that we practice and remember it, think about it over a long period, and notice it more than other things. This is really amazing, but also alarming.
Of course, the concern is whether we’re focusing all this energy and attention on the right things. When we use SRS, we are usually in a self-directed learning context. If we’re not making good decisions about what to write prompts for, then we’d be wasting time and effort. For example, remembering the US state capitals may not have much benefit for most people. But could there be situations where wasted effort is not the worst outcome? Thinking in terms of salience, I worry that there might be situations where poor decisions could cause active harm.
Before getting into that, maybe we should ask, should we even doubt whether we know how to pick the right things to make prompts for? I suspect the answer is yes, especially when we’re still a novice in a subject area.
Novices and the Difficulties of Self-Directed Learning
When someone is a novice, not only do they know less about the subject, they also know less about how to advance. Our self-learning ability correlates with our expertise in the subject area that we’re learning.
As a software developer, I often see articles giving advice to novice developers. One common piece of advice is to stop trying to memorize all the syntax of a language, or every API in a framework. Another is not to fall prey to the “shiny object syndrome” — continuously chasing after new technologies and trends. In these cases, a novice would benefit more from thinking through patterns and approaches to problem-solving than from over-committing to surface-level knowledge.
As another example, when I first ventured into learning sciences, I found it difficult to decide which research papers to read, which parts to read, and how much time to spend reading them. As far as I can tell, this is a common problem for students new to any field of research.
Novices face a number of metacognitive challenges. For novices, a lot of knowledge lies in the realm of “unconscious incompetence,” or to put it more plainly, things they aren’t even aware that they don’t know. Some things are better internalized while others can be looked up on demand, but it’s hard to tell which is which. It’s also hard to distinguish something we like, e.g. shiny objects, from something that is harder and less interesting, but better for developing expert-level understanding.
So yes, we should be concerned that people will probably make some bad choices when deciding what to learn, and the problem is worse when one’s level of expertise is still low. In terms of SRS usage, that means creating ineffective prompts that waste the time spent writing and reviewing them. I don’t want to exaggerate this loss of efficiency into a disaster; the journey of exploring new knowledge can itself be valuable. But I do want to run with this train of thought and hypothesize about the worst-case scenario.
In machine learning, overfitting is a situation where an ML model “specializes” too much on a particular dataset, so that it performs really well on those data, but is terrible at generalizing the inference to data it hasn’t seen before. Overfitting might happen when the model is too powerful, the dataset is too small, and/or the training process is longer than needed.
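The failure mode can be made concrete with a toy model that literally memorizes its training data. This sketch and its numbers are my own illustration, not taken from the guide:

```python
# Toy illustration of overfitting: a "model" that memorizes its training
# set perfectly versus one that learns a simple general rule.
# The data roughly follow y = 2x; the (3, 7) point is noisy.

train = {1: 2, 2: 4, 3: 7}   # (x, y) training pairs

def memorizer(x):
    """Maximally powerful model: a lookup table. Zero training error,
    but it has nothing to say about unseen inputs."""
    return train.get(x)       # returns None off the training set

def simple_rule(x):
    """Low-capacity model: estimate a single slope by averaging y/x."""
    slope = sum(y / x_ for x_, y in train.items()) / len(train)
    return slope * x

print(memorizer(2))               # 4: perfect on seen data
print(memorizer(10))              # None: useless on unseen data
print(round(simple_rule(10), 2))  # 21.11: imperfect, but it generalizes
```

The memorizer is analogous to drilling isolated facts picked for the wrong reasons: flawless on the exact material reviewed, and unhelpful the moment the situation changes.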
Overfitting is what I keep coming back to when I think about how using an SRS could go wrong, in light of the power described above. Given how easy it is (especially for novices) to target the wrong things to learn, and how the SRS ensures that you keep coming back to them and maintain their salience, this feels just like a machine learning model extensively trained on a small, noisy dataset. Such a model would probably perform poorly and make unexpected mistakes in prediction.
I don’t love this analogy, though. One reason is that it has the smell of overthinking, something I’m prone to do. The other reason is that I don’t believe machine learning is a good source of inference/inspiration for human learning.
That said, with SRS we can easily target a few ideas and incubate them. The opposite would be to expose ourselves to a wide range of ideas without dwelling long on any one. The challenge of when to go deep versus when to go broad is almost a multi-armed bandit problem, and writing prompts corresponds to “exploit,” where you double down on what you think is valuable based on limited information. We achieve the optimal expected return only when we strike a good balance between exploitation and exploration.
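The exploit/explore tension can be sketched as an epsilon-greedy bandit. The “topics” and their payoff numbers are made up purely for illustration:

```python
import random

# Epsilon-greedy bandit sketch: arms stand in for topics we could write
# prompts about; their payoffs are invented numbers for illustration.

random.seed(0)

true_value = {"topic_a": 0.3, "topic_b": 0.8, "topic_c": 0.5}
estimate = {arm: 0.0 for arm in true_value}   # our running value estimates
pulls = {arm: 0 for arm in true_value}        # how often we chose each arm

def choose(epsilon: float = 0.1) -> str:
    if random.random() < epsilon:                  # explore: sample broadly
        return random.choice(list(true_value))
    return max(estimate, key=estimate.get)         # exploit: go deep

for _ in range(2000):
    arm = choose()
    reward = random.gauss(true_value[arm], 0.1)    # noisy payoff
    pulls[arm] += 1
    # incremental mean update of the estimate for this arm
    estimate[arm] += (reward - estimate[arm]) / pulls[arm]

# With a little exploration mixed in, most pulls end up going to the
# genuinely highest-value topic rather than whichever looked good first.
print(max(pulls, key=pulls.get))
```

The point of the sketch is the shape of the policy, not the numbers: pure exploitation can lock onto whichever topic happened to look good early, while a small, steady dose of exploration protects against that.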
In Andy’s guide, he mentions how the “Baader-Meinhof phenomenon” (namely, once you notice something, you’ll notice it more often) may sometimes be unhelpful. Meanwhile, looking at the neighboring concept of deliberate practice, one major consideration there is that while it helps people become more efficient at a skill, it might also make them less adaptive. Both have the same vibe as what I’m calling overfitting, perhaps lending some legitimacy to my concern.
How might this look in practice? Here’s my attempt at a bad example, made by extending an (originally good) example taken from the guide. Say for a cooking prompt, we write that instead of adding water to a dish, we can consider using stock to get more flavor. We incubate that thought over time by reviewing the prompt. The next time we encounter a situation where we might use water, we immediately think of using stock, but that inhibits other ideas like condensing liquids or turning the dish into something different. Maybe the more fundamental concepts here are managing fluids and layering flavor, and we should have kept those at the forefront of our minds instead of looking for more opportunities to replace water with stock.
If you buy these arguments, then the follow-up question is: Given that we sometimes write ineffective prompts, what can we do about it?
The most obvious remedy, and I suspect the best, is to have a robust routine for revising our prompts.
What we want to avoid is the excessive incubation of ideas that aren’t helpful but that we’ve put into the system through poor judgment. We likely put them in because we didn’t yet know enough at the time to make better decisions.
Andy’s guide, as well as other SRS-related guides, recommends that we add prompts in several passes when we study a topic, flexibly pick things up as we go, and revise or discard any old prompts that are no longer appropriate. This makes perfect sense, since as our understanding of a topic keeps improving, we can course-correct and reorient our focus on what we need most.
However, I personally find this difficult to follow through in practice. The natural tendency is to keep moving forward and adding new prompts into the system. If we do our reviews in fragmented time slots on a smartphone, then when we notice that a prompt might need to be updated, the most we could do is to flag it and deal with it later. Updating old prompts seems like a skill of its own, and an additional process that we need to intentionally set time for.
Aside from having a routine to revise prompts, I think we simply need to be more conscious about keeping our prompts targeted on the right things. If we’re not the best judges ourselves, we could use some scaffolding or outside support. I’ve brainstormed a couple of ideas here.
Evaluate confidence: Well-designed courses and guides give us some sense of what we should pay attention to, explicitly or implicitly. So if we’re making prompts based on those, we might not need to worry as much about targeting the right things. Meanwhile, if we’re writing prompts based on articles, papers, books and other unstructured materials with less of an instructional nature, then we might want to be more mindful about revising those prompts down the road.
Speaking of courses and guides, Andy goes a step further and offers author-written prompts. Using the integrated Orbit platform is quite an interesting experience, and I’ll try to organize my thoughts on that next time.
Mentorship: Instead of trying to do everything right ourselves, perhaps we could consult human mentors. Have a mentor look at the prompts you’ve made, and they can quickly correct any misunderstandings, spot missed opportunities and advise future directions. It’d also be a great way to communicate what we’re thinking and learning.
Self-reflection techniques and support: Tips for writing good prompts are naturally also useful for inspecting and revising old prompts. But are there specific strategies and actions that we can adopt, during review sessions or otherwise, that help us become better at revising our prompts? In part 1, I mentioned how I try to imagine the application context when I review prompts. This also helps me spot stale prompts due for maintenance. How else might we tell if a prompt we’re reviewing is no longer appropriate?
It’s quite natural that we might try to expand the use of spaced repetition systems to learning beyond simple memorization. We might do this explicitly by writing prompts that promote higher-level thinking, or have it happen implicitly by the way our attention works.
As they say, SRS makes memory a choice. Along with that, it makes keeping ideas salient easy. So even for simpler memorization prompts, it is worth considering: Is this something I want to notice more, and have in my mind more often?
Self-directed learning has many challenges, especially when we’re relative novices in a field. If Anki gives me the power to choose whatever I want to remember, do I naively abuse the fact that I could remember as much as I want, or do I have the wisdom to know what’s not worth reviewing repeatedly? If keeping ideas salient is easy, how do I know I’m not overdoing it, such that it skews my perspective or makes me think too rigidly? Do we understand the full implications of using SRS, and how it changes our thinking?
While I’m sure I’ll gain more insight into these questions as I continue to use SRS myself and read others’ work, what I’d really like to see is more rigorous research. One way that could happen is if we build SRS tools that incorporate the means for generating such insights. This already seems to be happening with the Orbit system that Andy built and used for this guide. Next time, I’d like to collect some remaining thoughts about the experience of using Orbit, author-written prompts, and data-informed prompt improvement.