.. _active_mm:

=========
Active MM
=========

::

 List: linux-kernel
 Subject: Re: active_mm
 From: Linus Torvalds <torvalds () transmeta ! com>
 Date: 1999-07-30 21:36:24

 Cc'd to linux-kernel, because I don't write explanations all that often,
 and when I do I feel better about more people reading them.

 On Fri, 30 Jul 1999, David Mosberger wrote:
 >
 > Is there a brief description someplace on how "mm" vs. "active_mm" in
 > the task_struct are supposed to be used? (My apologies if this was
 > discussed on the mailing lists---I just returned from vacation and
 > wasn't able to follow linux-kernel for a while).

 Basically, the new setup is:

  - we have "real address spaces" and "anonymous address spaces". The
    difference is that an anonymous address space doesn't care about the
    user-level page tables at all, so when we do a context switch into an
    anonymous address space we just leave the previous address space
    active.

    The obvious use for an "anonymous address space" is any thread that
    doesn't need any user mappings - all kernel threads basically fall into
    this category, but even "real" threads can temporarily say that for
    some amount of time they are not going to be interested in user space,
    and that the scheduler might as well try to avoid wasting time on
    switching the VM state around. Currently only the old-style bdflush
    sync does that.

  - "tsk->mm" points to the "real address space". For an anonymous process,
    tsk->mm will be NULL, for the logical reason that an anonymous process
    really doesn't _have_ a real address space at all.

  - however, we obviously need to keep track of which address space we
    "stole" for such an anonymous user. For that, we have "tsk->active_mm",
    which shows what the currently active address space is.

    The rule is that for a process with a real address space (ie tsk->mm is
    non-NULL) the active_mm obviously always has to be the same as the real
    one.

    For an anonymous process, tsk->mm == NULL, and tsk->active_mm is the
    "borrowed" mm while the anonymous process is running. When the
    anonymous process gets scheduled away, the borrowed address space is
    returned and cleared.

 To support all that, the "struct mm_struct" now has two counters: a
 "mm_users" counter that is how many "real address space users" there are,
 and a "mm_count" counter that is the number of "lazy" users (ie anonymous
 users) plus one if there are any real users.

 Usually there is at least one real user, but it could be that the real
 user exited on another CPU while a lazy user was still active, so you do
 actually get cases where you have an address space that is _only_ used by
 lazy users. That is often a short-lived state, because once that thread
 gets scheduled away in favour of a real thread, the "zombie" mm gets
 released because "mm_count" becomes zero.

 Also, a new rule is that _nobody_ ever has "init_mm" as a real MM any
 more. "init_mm" should be considered just a "lazy context when no other
 context is available", and in fact it is mainly used just at bootup when
 no real VM has yet been created. So code that used to check

   if (current->mm == &init_mm)

 should generally just do

   if (!current->mm)

 instead (which makes more sense anyway - the test is basically one of "do
 we have a user context", and is generally done by the page fault handler
 and things like that).

 Anyway, I put a pre-patch-2.3.13-1 on ftp.kernel.org just a moment ago,
 because it slightly changes the interfaces to accommodate the alpha (who
 would have thought it, but the alpha actually ends up having one of the
 ugliest context switch codes - unlike the other architectures where the MM
 and register state is separate, the alpha PALcode joins the two, and you
 need to switch both together).

 (From http://marc.info/?l=linux-kernel&m=93337278602211&w=2)
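
The borrowing and counting rules from the mail can be sketched as a small
user-space C model. This is an illustration under stated assumptions, not
kernel code: ``struct mm``, ``struct task``, ``mm_grab()``, ``mm_drop()``,
``mm_put_user()``, ``has_user_context()``, ``switch_mm_model()`` and
``lazy_mm_scenario()`` are hypothetical names, and plain ``int`` counters
stand in for the atomic counters a real implementation would need.

.. code-block:: c

   #include <stddef.h>

   /* Toy stand-ins for the kernel structures; not the real definitions. */
   struct mm {
       int mm_users;   /* number of real address space users */
       int mm_count;   /* lazy users, plus one while mm_users > 0 */
       int freed;      /* set once mm_count reaches zero */
   };

   struct task {
       struct mm *mm;          /* NULL for an anonymous process */
       struct mm *active_mm;   /* address space in use while running */
   };

   static void mm_grab(struct mm *mm) { mm->mm_count++; }

   static void mm_drop(struct mm *mm)
   {
       if (--mm->mm_count == 0)
           mm->freed = 1;      /* the "zombie" mm is released here */
   }

   /* A real user goes away; the last one drops the shared +1 in mm_count. */
   static void mm_put_user(struct mm *mm)
   {
       if (--mm->mm_users == 0)
           mm_drop(mm);
   }

   /* The "do we have a user context" test: true only for real users. */
   static int has_user_context(const struct task *t) { return t->mm != NULL; }

   /* Model of a context switch from prev to next. */
   static void switch_mm_model(struct task *prev, struct task *next)
   {
       struct mm *oldmm = prev->active_mm;

       if (!has_user_context(next)) {
           /* Anonymous: borrow whatever was active, no VM switch. */
           next->active_mm = oldmm;
           mm_grab(oldmm);
       } else {
           /* Real user: active_mm is always the same as mm. */
           next->active_mm = next->mm;
       }

       if (!has_user_context(prev)) {
           /* Anonymous task scheduled away: return and clear the loan. */
           prev->active_mm = NULL;
           mm_drop(oldmm);
       }
   }

   /* Walks through the case described in the mail; returns 0 on success. */
   int lazy_mm_scenario(void)
   {
       struct mm m = { .mm_users = 1, .mm_count = 1, .freed = 0 };
       struct mm other = { .mm_users = 1, .mm_count = 1, .freed = 0 };
       struct task user = { .mm = &m, .active_mm = &m };
       struct task kthread = { .mm = NULL, .active_mm = NULL };
       struct task user2 = { .mm = &other, .active_mm = &other };

       switch_mm_model(&user, &kthread);    /* kthread borrows m */
       if (kthread.active_mm != &m || m.mm_count != 2)
           return 1;

       mm_put_user(&m);                     /* real user exits elsewhere */
       if (m.mm_users != 0 || m.mm_count != 1 || m.freed)
           return 2;                        /* only a lazy user remains */

       switch_mm_model(&kthread, &user2);   /* loan returned, m released */
       if (kthread.active_mm != NULL || !m.freed)
           return 3;
       if (user2.active_mm != &other)
           return 4;
       return 0;
   }

In this model ``lazy_mm_scenario()`` returns 0 when each step matches the
mail: the anonymous thread borrows the previous mm, the mm outlives its last
real user while the lazy user is still running, and it is released as soon as
the lazy user is scheduled away in favour of a real thread.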