The cleanest way to find out whether a supplement does anything for you is to run a structured n=1 with pre-registered outcomes, calendared retest dates, and stopping rules written before you start. Most "creatine for cognition" experiments fail not because creatine doesn't work but because the experimenter didn't measure anything, didn't adhere, or didn't stratify by the population variable that actually predicts response.
This article is a Blueprint-style protocol you can run on yourself, anchored to published trials rather than personal anecdote.
The hypothesis
Creatine raises brain phosphocreatine stores, which buffers cognitive performance under conditions where ATP demand exceeds the rested baseline: sleep deprivation, hypoxia, hypoglycemia, mental fatigue, and aging. The trials that show the cleanest cognitive signal are those that selected participants with low baseline creatine (vegetarians, whose muscle creatine stores are ~20-30% lower, and whose brain stores are presumably lower too, because their diet supplies little creatine) or that stressed the system experimentally (sleep-deprivation crossovers). In well-fed, well-rested young omnivores tested at baseline, the brain phosphocreatine pool is already near saturation, and there is little room for supplementation to move the needle.
The corollary hypothesis: if you fit one of the three responder profiles (vegetarian, sleep-deprived, over 60), you should see a measurable improvement on working memory and reaction-time-under-load tasks within 12 weeks. If you do not fit any of those profiles, you probably will not.
The protocol
| Phase | Duration | Action |
|---|---|---|
| Baseline | Week 0 | Three cognitive batteries on three separate days. Average the three. |
| Loading (optional) | Days 1-5 | 20 g/day in 4 doses (only if you want faster saturation). |
| Maintenance | Weeks 1-12 | 5 g/day creatine monohydrate, any time, any drink. |
| Retest 1 | Week 4 | Repeat the three batteries on three separate days. Average. |
| Retest 2 | Week 8 | Repeat. |
| Retest 3 | Week 12 | Repeat. Compare to baseline. Decide. |
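A minimal sketch of how the table's test dates could be calendared before you start. The function name, the two-day gap between sessions, and the choice to place baseline sessions in the week before the first dose are all assumptions for illustration; the protocol itself only requires three separate days per timepoint.

```python
from datetime import date, timedelta

def retest_schedule(first_dose: date, sessions: int = 3, gap_days: int = 2) -> dict:
    """Return {timepoint: [session dates]} for baseline and the week 4/8/12 retests.

    Assumptions (not from the protocol table): baseline sessions start one week
    before the first dose, and sessions within a timepoint are gap_days apart.
    """
    # Offsets in weeks relative to the first dose; -1 puts baseline in "week 0".
    offsets = {"baseline": -1, "retest_1": 4, "retest_2": 8, "retest_3": 12}
    schedule = {}
    for label, weeks in offsets.items():
        first_session = first_dose + timedelta(weeks=weeks)
        schedule[label] = [
            first_session + timedelta(days=i * gap_days) for i in range(sessions)
        ]
    return schedule

if __name__ == "__main__":
    for label, days in retest_schedule(date(2025, 1, 6)).items():
        print(label, [d.isoformat() for d in days])
```

Writing the dates down up front is the point: a retest that slides by a week because you "felt off" is a quiet form of post-hoc selection.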
Adherence target: 90% or better. Log each day taken in a checkbox tracker. The Lally 2009 habit-formation cohort suggests a cue-stacked daily ritual (next to your toothbrush, coffee maker, or pill organizer) reaches automaticity in roughly 66 days; build the habit into your existing routine rather than relying on willpower.
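The 90% target is easy to check from the checkbox log. The sketch below assumes the log is simply a list of booleans, one per day of the 12-week (84-day) maintenance phase; the function names are hypothetical.

```python
def adherence_rate(log: list) -> float:
    """Fraction of logged days on which the dose was taken."""
    if not log:
        raise ValueError("adherence log is empty")
    return sum(1 for taken in log if taken) / len(log)

def meets_target(log: list, target: float = 0.90) -> bool:
    """True if adherence is at or above the target (90% by default)."""
    return adherence_rate(log) >= target

if __name__ == "__main__":
    # 84 days with 8 missed doses: 76/84 is about 90.5%, which just clears the bar.
    log = [True] * 76 + [False] * 8
    print(round(adherence_rate(log), 3), meets_target(log))
```

Over 84 days, a 90% target leaves room for at most 8 missed doses; a ninth miss drops you to about 89%.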
Co-conditions to keep stable:
- Sleep duration and timing within roughly 30 minutes of your typical pattern.
- Caffeine intake at the same time of day, same dose.
- Test always taken at the same time of day, same caffeine state, same time-since-last-meal.
- No other new supplements introduced during the 12 weeks.
The point is to keep the only varying input the creatine itself. Confounding the trial with a new sleep tracker, a new training block, and a magnesium experiment in the same window is the most common reason these protocols return uninterpretable data.
The evidence foundation
The case for creatine as a cognitive supplement is strongest in three populations, with one systematic review tying them together.
Vegetarians. Rae 2003 (Proceedings of the Royal Society B, n=45) ran a 6-week double-blind placebo-controlled crossover in healthy vegetarian adults at 5 g/day creatine monohydrate (Rae, Digney, McEwan & Bates 2003). Backward digit span improved by roughly 1 item (effect size d ~0.55), and Raven progressive matrices accuracy improved meaningfully versus placebo. The vegetarian inclusion criterion is doing real work here: vegetarians have lower baseline tissue creatine because dietary creatine (almost exclusively from animal flesh) is roughly 1-2 g/day in omnivores and near zero in strict vegetarians.
The sleep-deprived. McMorris 2006 (Psychopharmacology, n=19) ran a 36-hour sleep-deprivation crossover RCT with 7 days of 20 g/day creatine loading prior to the deprivation challenge (McMorris, Harris, Swain et al. 2006). Random number generation, mood, and reaction time were all preserved on creatine versus a clear decline on placebo. Effect sizes were moderate-to-large (d in the 0.4-0.7 range across measures). The n is small and the loading dose is non-standard, but the direction has replicated across several smaller military and exam-week studies in subsequent years.
The elderly. Rawson & Venezia 2011 (Amino Acids review) summarized the controlled-trial evidence in older adults, with consistent positive signals for working memory and reasoning tasks at 5-10 g/day for 1-12 weeks (Rawson & Venezia 2011). Roschel 2021 (Nutrients review) updated the synthesis and added the mechanistic argument: brain phosphocreatine declines with age, brain ATP turnover slows, and creatine supplementation partially restores the buffer (Roschel, Gualano, Ostojic & Rawson 2021).
The synthesis. Avgerinos 2018 (Experimental Gerontology) ran a systematic review of 6 RCTs in healthy adults (Avgerinos, Spyrou, Bougioukas & Kapogiannis 2018). The conclusion: reliable improvement in short-term memory and reasoning under stress, null effects at baseline cognition in young healthy omnivores. The evidence base has grown since 2018, particularly in older adults and stressed populations, but the population-stratification finding has held: who you are matters more than how much you take.
What to track
Three batteries, free, validated, and runnable in 15-20 minutes total per session.
1. Backward digit span (working memory). The test Rae 2003 used. A research-grade administration is available through the Cognitive Ability Test (CAT) on cognitivefun.net or via the Psytoolkit "Digit Span" experiment (https://www.psytoolkit.org/experiment-library/digit_span.html). Read 5-9 digits, repeat them in reverse. Record longest correctly-reversed sequence on three trials, average. Healthy adults score 4-7.
2. Stroop interference (executive function under load). Free implementations at psytoolkit.org and at the Princeton Open Source psychology toolkit. Color-word incongruent reaction time minus congruent reaction time = your interference score. Lower interference is better. Score is sensitive to fatigue, attention, and creatine in the McMorris-style studies.
3. Paired associates learning (PAL) or n-back (working-memory load). PAL is part of the Cambridge Brain Sciences free battery. Alternatively, dual n-back implementations (n=2 to n=3) are available free at brainscale.net. Pick one and stick with it across the 12 weeks; switching tools mid-study breaks comparability.
Why these three. They cover working memory (digit span), executive function under interference (Stroop), and learning rate (PAL or n-back). The Rae 2003 and McMorris 2006 trials, and the trials pooled in the Avgerinos 2018 review, all used at least one of these or close cousins. Use the same time of day, same setup, same caffeine state across all 12 sessions.
Optional add-ons. Self-rated mental fatigue on a 0-10 visual analog scale at the same daily time. Sleep duration logged via wearable. Resistance training volume logged via app. These are not the primary outcome but help interpret an ambiguous result at week 12.
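The scoring rules for the first two batteries reduce to a couple of lines of arithmetic. This is a sketch under stated assumptions: reaction times are in milliseconds, only correct trials are passed in, and the mean (rather than the median) summarizes each condition. Your testing tool may compute these for you, possibly with different conventions.

```python
from statistics import mean

def stroop_interference(congruent_rts_ms, incongruent_rts_ms) -> float:
    """Interference = mean incongruent RT minus mean congruent RT (lower is better)."""
    return mean(incongruent_rts_ms) - mean(congruent_rts_ms)

def digit_span_score(longest_spans) -> float:
    """Average of the longest correctly reversed sequence across the trials."""
    return mean(longest_spans)

def timepoint_average(session_scores) -> float:
    """Average one measure across the three sessions of a timepoint."""
    return mean(session_scores)

if __name__ == "__main__":
    # Made-up numbers, purely to show the call pattern.
    print(stroop_interference([600, 620, 640], [700, 720, 740]))
    print(digit_span_score([5, 6, 6]))
```

Whatever conventions you pick, fix them at baseline and keep them for all 12 sessions, for the same reason you should not switch tools mid-study.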
Expected effects and confidence intervals
Effect sizes in the published literature, translated to what you might see on yourself:
Vegetarian profile. Backward digit span: expect a 0.5-1 item improvement on the longest correctly-recalled span by week 8-12. Stroop interference: expect a 5-15% reduction in the interference score. PAL: small but detectable improvement in trials-to-criterion.
Sleep-deprived profile. Test under your typical low-sleep state (the closest everyday analogue to the deprivation the McMorris trial induced experimentally). Expect Stroop reaction time to be ~10-20% faster than your unsupplemented low-sleep baseline; expect digit span to lose less ground after a poor night. The "improvement" here is mostly resilience, not a raised ceiling.
Over-60 profile. Expect a small but real improvement in working memory and reasoning, on the order of d = 0.3, by week 12. Some trials show a larger effect at higher doses (10 g/day) in older adults; if 5 g/day is null at week 8, consider a single dose escalation step before stopping.
Healthy young omnivore profile. Expect very little. Avgerinos 2018 was fairly clear on this: at baseline cognition in young healthy adults eating animal protein, supplementation does not move the needle in most measures most of the time. If you run the protocol anyway, take the lack of effect as informative, not as a refutation of the supplement.
Honest framing. A d = 0.3 effect means the supplemented mean is 0.3 standard deviations above the placebo mean. In an n=1 with a single-digit sample of test sessions, you may not detect this against your own day-to-day cognitive variability. Triple-baseline (three sessions) and triple-retest (three sessions per timepoint) is the minimum design for any chance of seeing it cleanly.
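A rough back-of-envelope for that noise problem, as a sketch rather than a power analysis. It assumes sessions are independent, that your session-to-session standard deviation is the same before and after, and that the effect is expressed in units of that within-person standard deviation (published d values are between-person, so this is only an analogy). The threshold of two standard errors is an arbitrary illustrative choice.

```python
import math

def se_of_difference(sd: float, k: int) -> float:
    """Standard error of (retest mean - baseline mean), each averaged over k sessions."""
    return sd * math.sqrt(2 / k)

def sessions_needed(effect_sd: float, z: float = 2.0) -> int:
    """Smallest k per timepoint at which the effect is at least z standard errors.

    Solves effect >= z * sqrt(2 / k) for k, with sd normalized to 1.
    """
    return math.ceil(2 * (z / effect_sd) ** 2)

if __name__ == "__main__":
    print(round(se_of_difference(1.0, 3), 3))   # noise on a 3-vs-3 comparison
    print(sessions_needed(0.3))                 # sessions per timepoint for a 0.3 SD shift
    print(sessions_needed(1.0))                 # sessions per timepoint for a 1.0 SD shift
```

With three sessions per timepoint, the standard error of the before/after difference is about 0.8 of your session-to-session standard deviation, so a 0.3 SD shift sits well inside the noise. Under these assumptions only larger shifts, closer to the vegetarian and sleep-deprivation effect sizes, have a realistic chance of standing out in a three-session design.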
When to stop or pivot
A pre-registered stopping rule beats post-hoc reasoning. Three branches:
Branch 1: clear positive at week 8. Continue the 5 g/day maintenance dose indefinitely. Cost per year is roughly $25-50 for the supplement plus the time for any periodic retesting you choose to keep doing. The asymmetry favors continuation.
Branch 2: ambiguous at week 12 with high adherence. Run a single loading-phase troubleshoot: 20 g/day in 4 split doses for 5 days, then back to 5 g/day for an additional 4 weeks. Re-test at week 16. The Syrotuik & Bell 2004 data suggest roughly 30% of subjects are slow saturators on the 5 g/day-only protocol, and a loading phase clarifies whether the issue is undersaturation or genuine non-response (Syrotuik & Bell 2004, n=34). If still null at week 16, stop.
Branch 3: no detectable change at week 12 with high adherence and no loading-phase rescue. Stop. The published trials suggest you are likely either (a) already cognitively saturated, (b) outside the responder phenotypes, or (c) running a measurement protocol that cannot detect a d ~0.3 effect against your noise floor. None of those are improved by continuing to spend money on the supplement.
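The three branches can be written down as a small decision function before the trial starts, which is one way to make the stopping rule genuinely pre-registered. This is a sketch: the result labels ("positive", "ambiguous", "null") are hypothetical, and what counts as each is something you define in advance against your own thresholds. Treating low adherence as uninterpretable is an assumption implied by, but not spelled out in, the branches above.

```python
def next_step(week: int, result: str, adherence: float, rescue_done: bool = False) -> str:
    """Map a retest outcome to the pre-registered action.

    result is one of "positive", "ambiguous", "null" (hypothetical labels).
    rescue_done means the Branch 2 loading-phase troubleshoot has already been run.
    """
    if adherence < 0.90:
        # Assumption: every branch presumes high adherence, so a low-adherence
        # result is treated as uninterpretable rather than as a null.
        return "uninterpretable"
    if result == "positive" and week >= 8:
        return "continue"                    # Branch 1
    if week >= 16:
        return "stop"                        # still not positive after the rescue
    if week >= 12:
        if result == "ambiguous" and not rescue_done:
            return "load_and_retest_week_16"  # Branch 2
        return "stop"                        # Branch 3
    return "keep_testing"                    # too early to decide

if __name__ == "__main__":
    print(next_step(12, "ambiguous", 0.95))
    print(next_step(16, "ambiguous", 0.95, rescue_done=True))
```

The useful property is that the function takes no argument for how you feel about the supplement.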
Side-effect monitoring. If serum creatinine rises by around 0.3 mg/dL at the next blood panel, that is most likely the cosmetic creatine-to-creatinine conversion rather than a kidney issue (the Kreider et al. 2017 ISSN review covers this in detail). Cystatin-C-based eGFR clarifies if you are concerned. Flag genuine GI distress, bloating beyond the first 2 weeks, or any acute change in renal function as reasons to consult a clinician.
A specific dissenter: Stuart Phillips, a major figure in protein-and-aging research, has been more skeptical of generalized creatine-as-nootropic enthusiasm than the popular podcast world. His position, which I find defensible, is that the cognitive case is real but population-bounded and that the marketing has run somewhat ahead of the average effect size. The protocol above is calibrated to that view; it tells you to run a fair test in the population where the evidence is strongest, and to stop if your own data does not support continuing.