mbind — Set memory policy for a memory range
#include <numaif.h>
int mbind(void *start, unsigned long len, int policy,
          unsigned long *nodemask, unsigned long maxnode,
          unsigned flags);
cc ... -lnuma
mbind() sets the NUMA memory
      policy for the memory
      range starting with start and continuing for
      len bytes. The memory
      of a NUMA machine is divided into multiple nodes. The memory
      policy defines in which node memory is allocated.
      mbind() only has an effect for
      new allocations; if the pages inside the range have been
      already touched before setting the policy, then the policy
      has no effect.
Available policies are MPOL_DEFAULT, MPOL_BIND, MPOL_INTERLEAVE, and MPOL_PREFERRED. All policies except
      MPOL_DEFAULT require the caller
      to specify the nodes to which the policy applies in the
      nodemask parameter.
      nodemask is a bitmask
      of nodes containing up to maxnode bits. The actual number
      of bytes transferred via this argument is rounded up to the
      next multiple of sizeof(unsigned
      long), but the kernel will only use bits up to
      maxnode. A NULL
      argument means an empty set of nodes.
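As a rough illustration of the nodemask layout described above, the following sketch builds a bitmask in an array of unsigned long. The helper names (nodemask_words, nodemask_clear, nodemask_set, nodemask_isset) are hypothetical and are not part of numaif.h or libnuma; they only show how maxnode bits map onto whole words.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned long words needed to hold maxnode bits
 * (the size is rounded up to a whole number of words). */
size_t nodemask_words(unsigned long maxnode)
{
    return (maxnode + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
}

/* Clear every word of a mask that holds maxnode bits. */
void nodemask_clear(unsigned long *mask, unsigned long maxnode)
{
    memset(mask, 0, nodemask_words(maxnode) * sizeof(unsigned long));
}

/* Set the bit for one node; node must be below maxnode. */
void nodemask_set(unsigned long *mask, unsigned long maxnode,
                  unsigned long node)
{
    assert(node < maxnode);
    mask[node / BITS_PER_ULONG] |= 1UL << (node % BITS_PER_ULONG);
}

/* Return 1 if the bit for node is set, 0 otherwise. */
int nodemask_isset(const unsigned long *mask, unsigned long maxnode,
                   unsigned long node)
{
    if (node >= maxnode)
        return 0;
    return (int)((mask[node / BITS_PER_ULONG] >> (node % BITS_PER_ULONG)) & 1UL);
}
```

A caller would typically declare an array of nodemask_words(maxnode) words, clear it, set the wanted nodes, and pass the array together with maxnode to mbind().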
The MPOL_DEFAULT policy is
      the default and means to use the underlying process policy
      (which can be modified with set_mempolicy(2)). Unless
      the process policy has been changed this means to allocate
      memory on the node of the CPU that triggered the allocation.
      nodemask should be
      specified as NULL.
The MPOL_BIND policy is a
      strict policy that restricts memory allocation to the nodes
      specified in nodemask. There won't be
      allocations on other nodes.
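Because the policy only affects pages that have not been touched yet, a typical pattern is to map the range, set the policy, and only then write to it. The sketch below is a minimal illustration under several assumptions: node 0 exists, alloc_on_node0 is a hypothetical helper name, and the call goes through syscall(2) with constants from <linux/mempolicy.h> only so that the example builds without libnuma. With the numactl package installed, one would include <numaif.h> and call mbind() with the same arguments.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() and MAP_ANONYMOUS */
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>   /* MPOL_BIND; <numaif.h> provides the same name */

/* Hypothetical helper: map len bytes, bind them to node 0, then touch them.
 * Returns the mapping, or NULL with errno set if mmap or mbind failed. */
void *alloc_on_node0(size_t len)
{
    unsigned long nodemask = 1UL;                 /* bit 0 set: node 0 */
    unsigned long maxnode = 8 * sizeof(nodemask); /* bits available in the mask */

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Set the policy before the first touch of the pages. */
    if (syscall(SYS_mbind, p, (unsigned long)len, (long)MPOL_BIND,
                &nodemask, maxnode, 0UL) != 0) {
        int saved = errno;
        munmap(p, len);
        errno = saved;
        return NULL;
    }

    memset(p, 0, len);   /* first touch: pages are allocated under the policy */
    return p;
}
```

On a kernel without CONFIG_NUMA, or in a sandbox that blocks the system call, the helper is expected to return NULL with errno set rather than a bound mapping.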
MPOL_INTERLEAVE interleaves
      allocations across the nodes specified in nodemask. This optimizes for
      bandwidth instead of latency. To be effective the memory area
      should be fairly large, at least 1 MB.
MPOL_PREFERRED sets the
      preferred node for allocation. The kernel will try to
      allocate on this node first and fall back to other nodes if
      the preferred node is low on free memory. Only the first
      node in the nodemask
      is used. If no node is set in the mask, then the memory is
      allocated on the node of the CPU that triggered the
      allocation.
If MPOL_MF_STRICT is passed
      in flags and
      policy is not
      MPOL_DEFAULT, then the call
      will fail with the error EIO
      if the existing pages in the mapping don't follow the policy.
      In 2.6.16 or later the kernel will also try to move pages to
      the requested node with this flag.
If MPOL_MF_MOVE is passed in
      flags, then an
      attempt will be made to move all the pages in the mapping so
      that they follow the policy. Pages that are shared with other
      processes are not moved. If MPOL_MF_STRICT is also specified, then the
      call will fail with the error EIO if some pages could not be moved.
If MPOL_MF_MOVE_ALL is
      passed in flags, then
      all pages in the mapping will be moved regardless of whether
      other processes use the pages. The calling process must be
      privileged (CAP_SYS_NICE) to
      use this flag. If MPOL_MF_STRICT is also specified, then the
      call will fail with the error EIO if some pages could not be moved.
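The combination of MPOL_MF_MOVE and MPOL_MF_STRICT can be sketched as follows for a range whose pages may already have been touched. This is only an illustration: migrate_to_node0 is a hypothetical name, node 0 is assumed to exist, and the call is made through syscall(2) with constants from <linux/mempolicy.h> so that it builds without libnuma (with numactl installed, <numaif.h> and mbind() would be used instead). The move flags need Linux 2.6.16 or later.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() */
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>   /* MPOL_BIND, MPOL_MF_MOVE, MPOL_MF_STRICT */

/* Hypothetical helper: bind [start, start+len) to node 0 and ask the kernel
 * to move pages that are already allocated elsewhere.
 * Returns  0 if the call succeeded,
 *          1 if it failed with EIO (some pages could not be moved),
 *         -1 on any other error (errno is left set). */
int migrate_to_node0(void *start, unsigned long len)
{
    unsigned long nodemask = 1UL;                 /* bit 0 set: node 0 */
    unsigned long maxnode = 8 * sizeof(nodemask);
    unsigned long flags = MPOL_MF_MOVE | MPOL_MF_STRICT;

    long rc = syscall(SYS_mbind, start, len, (long)MPOL_BIND,
                      &nodemask, maxnode, flags);
    if (rc == 0)
        return 0;
    return errno == EIO ? 1 : -1;
}
```

A return value of 1 corresponds to the documented case in which pages shared with other processes, or otherwise unmovable pages, remain on a node outside the policy.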
On success, mbind() returns
      0; on error, −1 is returned and errno is set to indicate the error.
EFAULT
      There was an unmapped hole in the specified memory range or a passed pointer was not valid.
EINVAL
      An invalid value was specified for flags or policy; or start + len was less than
      start; or policy was MPOL_DEFAULT and nodemask
      pointed to a non-empty set; or policy was MPOL_BIND or MPOL_INTERLEAVE and nodemask pointed to an
      empty set.
ENOMEM
      System out of memory.
EIO
      MPOL_MF_STRICT was
      specified and an existing page was already on a node
      that does not follow the policy.
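The invalid-argument conditions listed above can be checked in user space before the call is made. The sketch below mirrors only the documented rules about the range and the nodemask; it does not validate flags or policy values, and mbind_args_plausible and mask_nonempty are hypothetical names, not part of any library. The policy constants are taken from <linux/mempolicy.h>, which is an assumption made so the example builds without libnuma.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/mempolicy.h>   /* MPOL_DEFAULT, MPOL_BIND, MPOL_INTERLEAVE */

/* Return 1 if any of the first maxnode bits of mask is set.
 * A NULL mask is treated as the empty set. */
static int mask_nonempty(const unsigned long *mask, unsigned long maxnode)
{
    const unsigned long bits = sizeof(unsigned long) * CHAR_BIT;

    if (mask == NULL)
        return 0;
    for (unsigned long n = 0; n < maxnode; n++)
        if (mask[n / bits] & (1UL << (n % bits)))
            return 1;
    return 0;
}

/* Hypothetical pre-check: returns 1 if the arguments pass the documented
 * range and nodemask rules, 0 if mbind() would be expected to reject them. */
int mbind_args_plausible(uintptr_t start, unsigned long len, int policy,
                         const unsigned long *mask, unsigned long maxnode)
{
    int nonempty = mask_nonempty(mask, maxnode);

    if (start + len < start)                      /* range wraps around */
        return 0;
    if (policy == MPOL_DEFAULT && nonempty)       /* default takes no nodes */
        return 0;
    if ((policy == MPOL_BIND || policy == MPOL_INTERLEAVE) && !nonempty)
        return 0;                                 /* these need at least one node */
    return 1;
}
```

Such a check is advisory only; the kernel remains the authority, and the call can still fail for the other reasons in the list.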
NUMA policy is not supported on file mappings.
MPOL_MF_STRICT is ignored on
      huge page mappings right now.
It is unfortunate that the same policy name, MPOL_DEFAULT, has different effects for
      mbind(2) and set_mempolicy(2). To select
      "allocation on the node of the CPU that triggered the
      allocation" (like set_mempolicy(2)
      MPOL_DEFAULT) when calling
      mbind(), specify a policy of MPOL_PREFERRED with an empty nodemask.
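The local-allocation idiom just described can be sketched as below. The function name set_local_alloc is hypothetical, and the call is made through syscall(2) with the constant from <linux/mempolicy.h> purely so that the example builds without libnuma; with numactl installed, the equivalent is mbind(start, len, MPOL_PREFERRED, NULL, 0, 0) after including <numaif.h>.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for syscall() */
#endif
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>   /* MPOL_PREFERRED */

/* Hypothetical helper: request allocation on the node of the CPU that
 * triggers the allocation, for the range [start, start+len).
 * Returns 0 on success, -1 on error with errno set. */
long set_local_alloc(void *start, unsigned long len)
{
    /* A NULL nodemask with maxnode 0 denotes the empty set of nodes. */
    return syscall(SYS_mbind, start, len, (long)MPOL_PREFERRED,
                   (unsigned long *)NULL, 0UL, 0UL);
}
```

As with any mbind() call, this only influences pages in the range that have not been allocated yet.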
The mbind(), get_mempolicy(2), and
        set_mempolicy(2) system
        calls were added to the Linux kernel with version 2.6.7.
        They are only available on kernels compiled with
        CONFIG_NUMA.
Support for huge page policy was added with 2.6.16. For interleave policy to be effective on huge page mappings, the memory range covered by the policy needs to be tens of megabytes or larger.
MPOL_MF_MOVE and
        MPOL_MF_MOVE_ALL are only
        available on Linux 2.6.16 and later.
These system calls should not be used directly. Instead,
        the higher level interface provided by the numa(3) functions in the
        numactl package
        is recommended. The numactl package is
        available at ftp://ftp.suse.com/pub/people/ak/numa/.
You can link with -lnuma to get system call
        definitions. libnuma is available in the
        numactl package.
        This package also has the numaif.h header.
numa(3), numactl(8), set_mempolicy(2), get_mempolicy(2), mmap(2)
Copyright 2003,2004 Andi Kleen, SuSE Labs.
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Since the Linux kernel and libraries are constantly changing, this manual page may be incorrect or out-of-date. The author(s) assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
Formatted or processed versions of this manual, if unaccompanied by the source, must acknowledge the copyright and authors of this work.
2006-02-03, mtk, substantial wording changes and other improvements