Cessen's Ramblings

2019-01-09

Rust Community Norms for Unsafe Code

I recently released Ropey 1.0, a text rope library for Rust. Ropey uses unsafe code internally, and its use of unsafe unsurprisingly came up in the 1.0 release thread on Reddit.

The ensuing discussion (especially thanks to Shnatsel) helped me significantly reduce the amount of unsafe code in Ropey with minimal (though not non-existent) performance degradation. But the whole thing nevertheless got me thinking about unsafe code and community norms around it, and I figured writing some of those thoughts down might be useful.

My hope is this post will be part of a community-wide discussion, and I would love for others (likely smarter than me) to write their thoughts on this topic as well. This post is simply my take on it.

Different Use-cases

I've gotten the impression that, on-the-whole, the Rust community's attitude towards unsafe code is something along the lines of, "Don't ever use it unless you're a wizard... and you're not a wizard." Unsafe code, it seems to me, is seen as this incredibly dangerous thing, and if it's in your crate or application then there's something wrong.

(Having said that, I'm not at all sure whether this impression is actually accurate, but I imagine if I've gotten that impression, many others likely have as well.)

In any case, I absolutely don't see unsafe code that way, and I think this is because my use-cases for Rust are different.

As I understand it, Rust more-or-less began its life at Mozilla with the goal of making browsers more secure. From that standpoint I largely agree with avoiding unsafe code. We've seen time and time again that memory safety bugs are both easy to introduce and a significant attack vector. Eliminating that whole class of bugs is good for security.

However, for my own projects I'm mostly uninterested in the web. I suspect this puts me in a minority, both within the Rust community and within the software development community as a whole. The vast majority of software developed today seems to be web-facing in some way or another. However, the software I'm most interested in writing runs locally on people's computers and is not web-facing. For example, my 3d renderer.

Now, I'm not saying there aren't security concerns with local software. Local software has a way of eventually finding itself on networks (for example, render farms in the case of 3d rendering). Nevertheless, the cost-benefit analysis is different: trading some risk of unsafety for better performance is almost always the right choice for something like an offline 3d renderer.

So for me Rust's memory safety guarantees aren't about security, they're about making software development easier. Tracking down memory safety bugs is a pain—in my experience, far more of a pain than fighting the borrow checker. But if I need to drop into unsafe code in a few places to squeeze more performance out of my renderer, I'm certainly going to do it.

Of course, there are a lot of use-cases where that is absolutely the wrong trade-off, and I suspect that includes most software being written today. But I think it's worth making this distinction: different software has (legitimately) different priorities. And I would like to see more acknowledgment of that in the Rust ecosystem.

Which leads me to the next part of this post...

Marking Crates

Ropey has a prominent notice in its readme about its use of unsafe code. I put it there precisely because of varying use-cases: people who are writing software where security is a high priority probably shouldn't use Ropey, or should only utilize it in a compartmentalized way. On the other hand, people who are writing high performance text editors that can handle crazy-huge files without breaking a sweat...? That's precisely what I designed Ropey for.

Ropey's readme notice—or something like it—is something I would like to see adopted as a norm in the Rust community. Different software has different needs, and I don't think a one-size-fits-all attitude towards unsafe code makes sense. But I do think we all need to be forthright about the priorities of our crates. In fact, that forthrightness probably shouldn't be limited to just the use of unsafe code.

So I guess what I'm saying is: I don't know exactly what I want this to look like, but I would like there to be some generally accepted way for crates to advertise what their priorities are. This would allow people to better choose crates with trade-offs and cost/benefit analysis appropriate to their own projects.

Safety Feature Flag

At the most recent Seattle Rust Meetup I had a great conversation about unsafety with Ivan (don't know his last name, alas, but thanks Ivan!) and he had an excellent idea for Ropey that I think may be more widely applicable: provide a feature flag that switches to an all-safe version of Ropey. In Ropey's case this is pretty straightforward: the unsafe code is well compartmentalized, and could be easily swapped out with safe (but slower / more memory hungry) equivalents based on a feature flag.

I am seriously considering this for Ropey, but I think it can apply to other crates as well. And maybe having a standard flag for this would be good. It obviously wouldn't be useful for every crate, but it would allow crate consumers to make this trade-off themselves whenever a crate does provide the flag.
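To make the idea concrete, here is a minimal sketch of what such a flag could look like. The feature name and function here are made up for illustration, not taken from Ropey's actual code:

```rust
// Hypothetical feature named "all_safe", declared in Cargo.toml as:
//
//     [features]
//     all_safe = []
//
// With the flag off, the crate uses its unsafe fast path; with it
// on, a safe (but slower) equivalent is compiled instead.

/// Copy `src` into `dst`. Panics if the lengths differ.
#[cfg(not(feature = "all_safe"))]
pub fn copy_into(src: &[u8], dst: &mut [u8]) {
    assert_eq!(src.len(), dst.len());
    unsafe {
        // Fast path: a single memcpy with no per-element bounds checks.
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
}

/// Copy `src` into `dst`. Panics if the lengths differ.
#[cfg(feature = "all_safe")]
pub fn copy_into(src: &[u8], dst: &mut [u8]) {
    // Safe equivalent with the same API.
    dst.copy_from_slice(src);
}
```

A consumer could then opt in with `cargo build --features all_safe` (or via the feature list in their own Cargo.toml) without changing any code.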

But Can't We Just Make Safe Code Faster?

One of the things I legitimately love about the Rust ecosystem is the push to get unsafe code compartmentalized into crates with safe APIs. Having a rich set of crates that are working hard to make unsafe code correct, and expose it safely for others, is super awesome.

However, this will never be able to cover all possible situations. Covering reasonably common cases is feasible, but eventually it becomes a game of whac-a-mole. There are always going to be things that won't be covered in the ecosystem, especially the more niche you get.

Moreover, some optimizations are specific to the needs of a holistic design. (John Carmack's talk about systems engineering explains this much better than I can, and is also just a great talk.) And sometimes these sorts of optimizations will require unsafe code, and because of their specificity won't be generally useful as a separate crate.

For these and other reasons, I believe there will always be legitimate cases for using unsafe code in one-off ways. The more we can shrink the number of such cases, the better, for sure. But it will never shrink to zero.

Wrapping Up

I don't mean any of this to be an "excuse" for people using unsafe code haphazardly. Even in my own case, I was able to significantly reduce (though not eliminate) the unsafe code in Ropey with a little prodding from Shnatsel on Reddit. When you're in a performance mindset it's easy to reach for the lowest level before you strictly need to.

But legitimate uses of unsafe code are inevitable, and I think a "know and broadcast your use-case" ethos would be healthier and more useful for the Rust community than a more-or-less "don't use unsafe code" attitude.

2018-12-12

Rust 2019 - It's the Little Things

(This post is in response to this call for posts on the Rust blog. Also, this post focuses on Rust as a language, because I feel like the crates ecosystem is something that will naturally grow and mature over time as more people use Rust for more use-cases. Also, in most respects I agree strongly with Jonathan Turner's Rust 2019 post, which has a similar flavor to this one.)

This might be an uncommon opinion—especially among those motivated enough to write a Rust 2019 post—but I actually think Rust is pretty much at a good place now. For the kinds of things that I want to do (e.g. my path tracer), there isn't much that Rust is lacking as a language. There are some fiddly things like "placement new" that could be useful, but nothing really major. And of course, well-designed new features are always welcome, they just don't seem particularly critical to me at this point.

In other words, I'm pretty much satisfied. Mission accomplished, as far as I'm concerned. I think the rest is just polish. Just the little things.

But... the little things aren't necessarily little in terms of effort and hours needed to accomplish them. I don't think any of the following is trivial. I mean "little" with respect to shininess and visibility/marketing optics. Also, little != unimportant here.

So without further delay, these are the things I would like to see Rust focus on in 2019.

Type-level Integers (a.k.a. Integers as Generic Arguments)

This paper cut is pretty easy to run into. I was at the Seattle Rust Meetup this month, and while helping out a new rustacean we ended up stumbling over the list of impls for fixed-size arrays in the standard library. It's... silly. Tons of impls that are identical except for varying the length of the array. I felt like I had to apologize to the person I was helping, "Yeah... yeah, it's super stupid. Yeah, I'm sorry. Yeah, you're totally right. There are plans to improve the situation. No, I'm not sure when, or how far along it is. But you can work around it this way."

I realize that allowing values of (almost) any type as generic arguments is a seductive and appealing idea. But from my perspective, Rust already has type-level integer arguments, in the form of fixed-size arrays. It's just that users don't have access to them for their own types, nor for impls of their own traits over arrays. It's a lot like how Go has built-in generic maps, etc. but you can't make your own generic types in it. There's this sort of implicit acknowledgement in the language design that "this is useful to be able to do", but at the same time it doesn't let you do it yourself.

So I think that just focusing on pushing type-level integers across the finish line should be a goal for 2019, without all the complications of more general type-level values. It's a long-standing paper cut that feels like a missing feature, rather than a wish-list new feature. It makes Rust feel inconsistent, and has concrete, negative, practical impacts even in the standard library itself as per the link above.
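For illustration, here is roughly what this looks like with const generics. The syntax below is the proposed one (per RFC 2000) and wasn't stable at the time of writing. A single impl covers arrays of every length, instead of thirty-two near-identical ones:

```rust
// A trait implemented for fixed-size arrays of *any* length,
// using a type-level integer as a generic argument.
trait Zeroed {
    fn zeroed() -> Self;
}

impl<const N: usize> Zeroed for [u8; N] {
    fn zeroed() -> Self {
        [0u8; N]
    }
}

fn main() {
    let small = <[u8; 4]>::zeroed();
    let big = <[u8; 1024]>::zeroed(); // No "max 32 elements" limit.
    assert_eq!(small.len() + big.len(), 1028);
}
```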

Expanding Supported Architectures

This isn't directly interesting to me—I'm pretty happy in x86/x64 land. But moving more architectures to the higher support tiers seems like it would be good. Rust is a "systems" programming language, and even though people don't seem to agree what "systems programming" means any more than they agree on what "art" means, I nevertheless think having a rich set of well-supported architectures is worth aiming for in a language like Rust. It would be nice if people didn't have to reach for C or other memory unsafe languages just because they're targeting a different chipset.

Compile Times

This is a boring topic to bring up, but Rust's compile times still leave a lot to be desired. I am really pleased with the progress already made, and in no way mean to belittle that or the efforts that have led to that progress. It's more to say: more please! I think this is actually really important. Performance matters. In fact, caring about performance is one of the reasons for using Rust, right? So I think prioritizing performance and other not-so-shiny things (as opposed to e.g. new features) for 2019 might be worth considering.

In Short...

This is all to say: I think Rust is great, and I'm actually really happy with where it's at now. I realize that my use-cases are not the same as everyone else's (e.g. I don't personally have much use for async/await). But there are a lot of little things that I would like to see some focus put on. That is my hope for 2019: the year of polish. Not shiny new features, not even ergonomics improvements per se as in the previous cycle. But polish on technical items.

Rust 2021 Edition

As for the possible future Rust 2021 Edition... well, this may also be an uncommon opinion, but I hope we don't have one. Frankly, I hope we don't ever have another edition of Rust at all. I hope 2018 is the first and last one.

That may seem harsh, but let me clarify: Rust 2018 was absolutely a good thing, and I'm glad that we did it. But it was also a bad thing, in that it makes the world of Rust more confusing.

This was highlighted for me at the recent Seattle Rust Meetup, where a new user couldn't get his code to compile. I also struggled to figure out why while helping him. Eventually someone else came by and asked, "Is this in Rust 2018?" We checked, and indeed the new project was Rust 2018, but his code was written in Rust 2015 style in a way that caused an error in 2018 (I don't recall how anymore).

Rust 2018 is backwards compatible in the sense that the compiler can still build 2015 code, and even use both 2015 and 2018 crates together. This is an impressive feat, and a great way to move forward with breaking changes. But technological compatibility is not the only important kind of compatibility.

All of the articles and tutorials on Rust, all of the Stack Overflow questions and answers, and really anything written about Rust at all is now not only out of date but incompatible with the user's first out-of-the-box experience. We went through this once already with Rust 1.0. It's not a good thing. It's quite bad, actually. And I would really, really like to avoid doing it again.

So what I really meant above is not that we shouldn't ever have another edition, but rather that we should try not to. If there are good reasons and things we really want to improve, sure, let's go for it. But let's absolutely not take editions as a given thing, and certainly not as a regular thing.

Let's make new editions only when justified: when the benefits outweigh the drawbacks. Not as part of a cadence. And let's try to make them as infrequent as possible. If we can go ten years without a new edition, I think that should be viewed as a good thing. Editions are a tool we can reach for when needed, but they are a tool with a cost.

2018-10-11

Optimal Anki Settings - Take 2

I discovered a significant error in the simulations I ran in the previous post. So I'd like to present the corrected simulations, and also explain more deeply the assumptions and known weaknesses even in these fixed simulations. I also made the charts easier to use. However, I won't re-explain the background, so please see the previous post for that.

Summary for those who don't want to read the whole thing: Using an interval modifier near 140% is probably a reasonable default for language learning with Anki.

The Error(s)

The major mistake I made was failing to account for the additional reviews that you do after lapsing on a card. This, unsurprisingly, has a huge impact on the results. In fact, the amount of extra time you spend on a lapse has a far more significant impact than the initial time you spend creating and learning a new card.

I also had an error that Matt vs Japan caught in his own work, which is that calculating the total number of cards you know isn't quite as simple as just counting the cards in your deck. There's a formula that the SuperMemo people derived to get a correct approximation of known cards at any given point in time, and I am now using that. This also turns out to have a significant impact on the results.

Assumptions and Limitations

Before I jump into the updated results, I want to lay out more explicitly what the assumptions of my simulations are, as well as some limitations you should be aware of.

Anki Settings

I'll actually cover a few variations in the results, but the simulation's assumption about Anki settings is that you are pretty much using the defaults (aside from the Interval Modifier, of course).

The only exception is the lapse "new interval" setting, which determines how much a card's interval is reduced when you lapse on it. The Anki default is to reset the interval to zero, but that seems like a flagrantly bad choice to me: if you already successfully remembered it just one interval prior, there's no reason to drop you back to square one. Matt vs Japan uses 75% as his default, but that also seems not optimal. If your interval grows by, for example, 10x every time you get it right, then dropping it to 75% seems like it doesn't go far enough. Alternatively, if you're only growing by 1.1x, then 75% drops you back too far.

What I've come up with—and this is totally just hand-waving "this seems reasonable" territory—is to configure it so that lapsing twice will reverse a single success. So, for example, if getting a card right increases the interval 4x, then lapsing will halve the interval, because halving something twice results in 1/4 the size, reversing the 4x. For the curious, the way to calculate that is 1.0 / sqrt(N), where N is how much your interval is multiplied on success (4.0 in the example I just gave).
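In code form, that's just a one-liner. A small sketch (the 53% figure below assumes Anki's default 250% ease together with the 140% interval modifier recommended at the end of this post):

```rust
// New-interval fraction after a lapse, chosen so that two lapses
// undo one success: f * f * N == 1, therefore f = 1 / sqrt(N).
fn lapse_new_interval(success_multiplier: f64) -> f64 {
    1.0 / success_multiplier.sqrt()
}

fn main() {
    // If a success multiplies the interval by 4x, a lapse halves it.
    assert!((lapse_new_interval(4.0) - 0.5).abs() < 1e-12);

    // With Anki's default 250% ease and a 140% interval modifier,
    // N = 2.5 * 1.4 = 3.5, which gives a new interval of ~53%.
    println!("{:.0}%", lapse_new_interval(2.5 * 1.4) * 100.0);
}
```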

Other than that, it's all Anki defaults (except where noted in the variations). My reasoning for this is that it's not clear what effect all of the settings have on retention etc., so sticking to the defaults gives us some reasonable confidence that the formulas for the interval modifier will apply with some accuracy.

The Simulated User

There are two properties of the simulated user that affect the simulation:

  • How much time they spend creating and learning a new card.
  • How much time they spend per review.

However, for the graphs the only thing that actually matters is the ratio between these two items. Knowing their absolute magnitudes is unnecessary for calculating optimal efficiency, although it may still be interesting.

The assumption I've used is that creating a new card and doing the initial reviews to learn it take a combined time of 120 seconds (or two minutes). And each individual review after that takes 20 seconds. These seem like reasonable numbers to me. Moreover, except at bizarre extremes (e.g. new cards take almost no time) it doesn't appear that the ratio impacted the graphs significantly.
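To sketch why only the ratio matters: the efficiency being optimized is cards learned per hour of total study time, along these lines (the time constants are the assumptions above; the function itself is illustrative, not the simulator's actual code):

```rust
// Study-time efficiency: cards known per hour of total study time.
const NEW_CARD_SECONDS: f64 = 120.0; // creating + initially learning a card
const REVIEW_SECONDS: f64 = 20.0;    // each subsequent review

fn efficiency(cards_known: f64, new_cards: u64, reviews: u64) -> f64 {
    let total_hours = (new_cards as f64 * NEW_CARD_SECONDS
        + reviews as f64 * REVIEW_SECONDS)
        / 3600.0;
    cards_known / total_hours
}
```

Scaling both constants by the same amount scales every efficiency value identically, so the location of the optimum (and thus the normalized graphs) depends only on their ratio.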

Length of the Simulation

For all the simulations in this post, I use a simulated time of 365 days (one year). Doing it for fewer days or more days does impact the results some, but basing things on one year seems reasonable to me. If your cards grow older than a year, it's not totally clear to me how much you're really getting out of them—at least in the context of language learning. And studying for significantly less than a year doesn't make sense for seriously learning a language.

General Limitations

I alluded to this earlier, but there are things that these simulations don't account for. The biggest one is that it's not at all clear how the following variables impact retention of cards:

  • Increasing/decreasing the number of reviews for initial learning of new cards.
  • Increasing/decreasing the number of additional reviews after lapses.
  • Increasing/decreasing the lapse "new interval" setting.

I mean, it's pretty clear that increasing the first two and decreasing the third will improve retention, but it's not at all clear how to quantify that or create formulas for it (or at least I don't know how). This is also a limitation of Matt vs Japan's work, and of the simulator that he's now using.

Because of that, the retention impacts of these factors are completely unaccounted for in these simulations. This means that the particular choice used for one of these factors, even though it affects the simulation, does not affect it accurately. For example, you could set the lapse new-interval setting to 100000% (1000x multiplier), so that whenever you lapse a card it will launch its interval into the stratosphere. From the simulation's perspective, that's a massive increase in efficiency, because it assumes the retention rate for that card still stays the same. But that's obviously false in reality—in reality that setting would be roughly equivalent to deleting all lapsed cards from your deck.

That's why I'm generally trying to stick to Anki's defaults, or at least "reasonable" settings. And that also means that all results from not only my simulations, but also from Matt or anybody else, should be treated as fuzzy guides, not as an exact science. There's a lot that we're not accounting for, and it's not totally clear how that affects the simulations.

Results

So with that out of the way, here are the results. As in the last post, here's how to interpret the graphs:

  • The vertical axis is the Interval Modifier setting in Anki.
  • The horizontal axis is your personal retention rate at (roughly) default Anki settings.
  • The white strip is 99+% optimal efficiency.
  • The light gray strip is 95+% optimal efficiency.

"Efficiency" in this case means "cards learned per hour of time studying". And studying includes both reviews and time spent creating and learning new cards.

All of these graphs are normalized, and therefore don't reflect efficiency differences between different graphs. They are also normalized individually within each vertical slice of pixels, and therefore also don't reflect differences in efficiency between personal retention rates. The latter is intentional, as it makes it easy to e.g. find your personal retention rate and determine what interval modifier results in your personal optimal efficiency. See the previous post for more details.

The "Reasonable" Graph

I actually have quite a few graphs this time, but this is the one I think most people should use:

It represents precisely the settings and simulated user I described earlier, and I think is a reasonable graph to work from. In particular, it assumes:

  • An average of two minutes spent creating + learning each new card.
  • An average of 20 seconds per review.
  • A single additional review per lapse.
  • A "max lapse" setting of 8 (cards are suspended when they exceed 8 lapses).
  • The 1.0 / sqrt(N) lapse new-interval setting I described earlier.

The main take-away is that an interval modifier of around 140% gets you 95% efficiency in almost the entire 75%-95% personal retention rate range. Which is pretty great! So I think this is probably a good default. But going up as high as 160% also seems quite reasonable. And if you have especially good retention, even as high as 200% might make sense.

But now that we have the "reasonable" graph out of the way, let's start playing with the settings!

Varying Max Lapses

The above graph changes the "max lapse" setting to 4.

And this one changes the "max lapse" setting to 12.

These graphs together illustrate something useful: increasing your max lapse setting beyond 8 doesn't make much difference in efficiency, but lower numbers definitely do!

It's also worth noting (although not illustrated in the normalized graphs) that decreasing your max lapse setting has a negative impact on your efficiency. In general, with a max lapse setting of 8 or higher, you're at 99+% of optimal efficiency, but a setting of zero can roughly halve your efficiency, depending on your personal retention rate. A setting of 4 keeps you at 95+% of optimal efficiency.

As far as the simulation is concerned, increasing your max lapse setting always improves efficiency. I think this makes sense. Although it's not in these simulations, I also did some sims where each card had a slightly different difficulty (i.e. individual retention rate), and the variance between cards had to get pretty huge before it wasn't always beneficial to increase max lapses.

So my takeaway is this: going beyond a max lapse setting of 8 doesn't really make a difference to efficiency. But feel free to max out the setting if it makes you feel better.

However, there is also a psychological component to studying, and culling out cards you're having a tough time with might make sense. In that case, a setting of 4 probably makes sense, since it's pretty low but still has a minimal impact on efficiency.

Varying Additional Lapse Reviews

This should be taken with a significant grain of salt, as per the "limitations" section earlier in the article. But here are a few graphs varying how many additional reviews are done when you lapse. The "reasonable graph" earlier is "one additional review".

^ Zero additional reviews.

^ Two additional reviews.

^ Three additional reviews.

The differences are pretty stark. Especially at zero additional reviews, it seems like you can get great efficiency with much larger interval modifiers! However, that seems very suspect to me, because the simulation isn't accounting for how these additional reviews may improve retention.

Having said that, because we don't know exactly how much those additional reviews help, any of these graphs are potentially as valid as the "reasonable" one that I presented. This is an open area for additional work. If anyone has any data about the impact of additional lapse reviews, I would be very interested to see it!

Varying Lapse New Interval

Finally, let's see how varying the lapse "new interval" setting (how much intervals are reduced on lapses) impacts things.

^ Matt's 75% setting.

^ Anki's default 0% setting.

As you can see, this setting also has a notable impact. But just like varying the additional lapse reviews, I have no idea how this impacts things in reality—this simulation doesn't account for important (but currently unknown) factors. So this also needs further study!

Wrap-up

I'm reasonably confident in the results from these simulations, except for the limitations and unknown factors I've described throughout this post. If anyone has any insight around those issues, I would be very interested to hear from you!

But for now, I think it is at least reasonable to go with an interval modifier of 140%. If you want to use my lapse "new interval" scheme, that corresponds to a lapse "new interval" setting of 53%.

2018-10-04

Optimal Anki Settings

UPDATE: I found a major problem with the below simulations, which I have fixed in a follow up post. Please DO NOT use the graphs or settings advice below. Instead see the follow up post.

(If you're already familiar with all this background and just want to see the results, skip down to the Results section.)

Something I haven't talked about on this blog yet is that I'm learning Japanese.

One of the tools that many people use to aid in learning a new language is an application called Anki. It's a flashcard application with automated review scheduling based on spaced repetition. You can make flash cards in Anki for just about anything, but in the context of language learning it's really useful for moving vocabulary, grammar, etc. into your long-term memory. As I understand it, it shouldn't be your primary study method, but it can accelerate your language learning when used as a supplement to e.g. immersion/input-based approaches.

One of the resources I've found useful in figuring out how to even approach learning Japanese has been the YouTube channel Matt vs Japan, and he recently posted a video suggesting a different approach to configuring Anki's spaced repetition settings.

Traditionally, you try to maximize your retention rates (i.e. minimize forgotten cards), balanced with how much time you're willing to spend studying. But in this video Matt presents a remarkable insight: what matters isn't your retention rate, what matters is how many total cards you memorize per unit of time you spend studying. And it turns out, you can memorize more total cards if you're also willing to forget more cards (up to a point) by making your intervals larger.

Matt took a crack at calculating what those optimal intervals would be. It depends on a variety of factors, so he came up with a formula that people can use to determine their own optimal intervals. However, in the video he also encouraged people to not take it on faith, and instead make sure the math actually works themselves. He is (like all of us) human, after all.

So I decided to take a different approach to solving the same problem: simulation. Partly this is to verify Matt's work, and partly this is because we can take more variables into account with simulation, and potentially get more accurate optimal intervals.

Results

The code for the simulation can be found on my GitHub. I haven't put a license on the repo yet, but please consider it public domain. I encourage anyone who wants to mess around with it or build on it to do so. It's written in Rust, so it should be pretty painless to build and run for anyone with basic command line experience.

My simulation makes most of the same assumptions as in Matt's video, so if the base assumptions are wrong, my simulation is wrong as well. So take this with a certain amount of salt.

The fun part of my simulation is that it produces visualizations. For example, this:

The vertical axis is the card's interval factor, and the horizontal axis is the user's retention rate with default Anki settings. The brightness of the pixels represents how many cards you memorize per hour of study: brighter is more, darker is fewer.

(A side-note about "Interval Factor": this is literally the number that your card's interval is multiplied by when you answer "good" on a card. I find this easier to reason about during simulation. Getting the Interval Modifier setting—which Matt talked in terms of—from this number is easy: just multiply by 40. For example, an Interval Factor of 7.5x = 7.5 * 40 = Interval Modifier 300%. This assumes that you leave your new card ease at 250%, as is default in Anki.)
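Or, as a tiny sanity check in code (assuming the default 250% ease):

```rust
// Interval Modifier (%) = Interval Factor / ease * 100.
// With the default ease of 250% (i.e. 2.5), that's "multiply by 40".
fn interval_modifier_percent(interval_factor: f64) -> f64 {
    interval_factor / 2.5 * 100.0
}

fn main() {
    assert_eq!(interval_modifier_percent(7.5), 300.0);
    assert_eq!(interval_modifier_percent(5.0), 200.0);
}
```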

One of the things that is immediately obvious is that having better personal retention is... better. Which shouldn't be surprising. As you get closer and closer to 100% retention, the interval factor matters less and less, and you can crank it up crazy high. In other words, if you never forget anything, you never have to review! So it is completely unsurprising that the brightest pixels in the chart are in the upper-right corner: having 100% retention is super time-efficient!

Of course, no human has 100% memory retention (that I'm aware of). And, in fact, for any given person, the only thing that matters in this graph is the vertical slice of pixels corresponding to their personal retention rate. So I have a modified chart that helps us visualize that:

This is the same graph as before, except that each vertical slice of pixels has been individually normalized so that its own brightest pixel is white. In other words, the white pixels show the curve of optimal interval factors. (Note: the right-most part of the image has a strange discontinuity—this is because the optimal factor goes off the top of the chart.)
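For the curious, the normalization itself is straightforward. A minimal sketch (the actual code lives in the repo linked above and may differ in detail):

```rust
// Normalize each vertical slice (column) of the image independently,
// so that the brightest pixel in every column becomes 1.0 (white).
fn normalize_columns(columns: &mut [Vec<f64>]) {
    for column in columns.iter_mut() {
        let max = column.iter().cloned().fold(0.0_f64, f64::max);
        if max > 0.0 {
            for pixel in column.iter_mut() {
                *pixel /= max;
            }
        }
    }
}
```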

One of the hilarious things about this image is the left side: at some point, if you're really bad at memorizing things, super long intervals start to be optimal again. However, I'm pretty sure this is a weakness in the simulation, as its assumptions start to break down. It probably doesn't actually match reality. Nevertheless, I find it funny.

In any case, the most obvious thing about this image is that the optimal intervals curve upwards as the personal retention rate increases. This, again, makes perfect sense given what we know: if you can hold cards in your memory longer, you don't need to review as often, so you can be more efficient with your time at longer intervals.

Another thing worth noting about this chart is that the falloff from white (optimal) is very smooth and gradual. This is important, because it means that the exact settings aren't delicate. You can be off by a decent bit and still be close to optimally efficient.

To drive that last point home even stronger, here is the last (and most useful) chart:

The white strip is the area where you are within 99% of optimal efficiency. The light gray strip outside of that is the area within 95% of optimal efficiency.

For example, if your personal retention rate at default Anki settings is 90%, then you can use an interval factor anywhere between 6x and 8x and still be 99% efficient. And if you're willing to go as low as 95% efficiency, you can range between 4x and 11.5x. That's a huge range.

However, keep in mind that this is all a bit fuzzy. There are many assumptions made in this simulation, and the sampling has a small bit of noise in it (which is why the strips aren't smooth curves). So aiming for the middle of the 99% efficiency strip is probably best if you have good data on your retention rate.

None of this is especially revolutionary, so far. Matt already figured most of this stuff out in his video. So this is mostly just validating his formula.

But what I find most interesting about this graph is what it means for people who are just starting out, and don't already have data on their retention rate. In other words: if you have no idea what your retention rate is, what interval factor should you use? This graph helps us answer that question.

Although I don't have any data to back this up, my guess is that most people fall somewhere in the 80-90% retention range, with maybe some outliers going as low as 75%. So if we look at the entire range between 75% and 90%, we can see that a 5x interval factor gives us 95%+ efficiency in the entire 80-90% range, and even at a 75% retention rate you barely squeeze in at 95% of optimal efficiency.

My guess at typical retention rates could be off, but I think it's at least a reasonable guess (if anyone has real data on this, I'd love to know!). So my suggestion is this: when starting out, use a 5x interval factor (or interval modifier of 200%, as per Matt's video). Then you're likely operating within 95% of optimal efficiency. Once you've been using Anki for several months, then you can take a look at your actual retention rate and adjust from there.

Do note, however, that the above graphs have personal retention rates corresponding to default Anki settings. So here's another graph for retention rates using the 200% interval modifier:

Once you've collected enough retention data using an interval modifier of 200%, use this graph to find your optimal setting. That is, if you care about squeezing out that last 5% of efficiency.

Final Notes

There's more I'd like to say on this topic, and I've glossed over quite a few things in this post. But it's already quite long, so I'm going to end it here. But I'll mention a couple final things.

First, tweaking the other settings of the simulation impact these graphs a bit. In particular, the "max lapses" setting can have a really significant impact. I set it to 8 lapses for the simulations in this post. In my testing, this closely matches most max lapse settings you might choose, except for very low ones (e.g. 3 or less). Aside from that, I've tried to use fairly "typical" settings for the graphs in this post, and therefore this should be pretty close to accurate for most configurations. But if your settings are especially atypical, I recommend re-running the simulations yourself with your own Anki settings.

Second, as Matt noted in his video, these ideas are untested in the real world. The assumptions and math used in these simulations might not match real human beings. Therefore, take this with an appropriate grain of salt. However, I do think the principle is sound, and I would be surprised if, for example, an interval factor of 4x (interval modifier 160%) didn't improve efficiency, even if the assumptions break down at more extreme settings like the ones suggested in this post and in Matt's video. In any case, I plan to use the 5x factor (200% modifier) myself, and see how it goes. If you're up for being a guinea pig, feel free to join me!