Limit memory and cpu with lxc-execute
First of all, I would like you to understand cgroups (control groups), the kernel feature that LXC relies on for resource control. When you run containers, you obviously want to make sure that none of them starves any other container or process. With this in mind, Daniel Lezcano, the nice guy behind the LXC project, integrated cgroups with the container technology he was creating, i.e. LXC. So if you want to assign resource usage, you need to look into configuring your cgroups.
Cgroups allow you to allocate resources (such as CPU time, system memory, network bandwidth, or combinations of these) among user-defined groups of tasks (processes) running on a system. You can monitor the cgroups you configure, deny cgroups access to certain resources, and even reconfigure your cgroups dynamically on a running system. The cgconfig (control group config) service can be configured to start at boot time and re-establish your predefined cgroups, making them persistent across reboots.
Cgroups can have multiple hierarchies, because each hierarchy is attached to one or more subsystems (also known as resource controllers, or simply controllers). The result is multiple separate, unconnected trees. There are nine subsystems available:
- blkio sets limits on input/output access to block devices
- cpu uses the scheduler to give cgroup tasks access to the CPU
- cpuacct generates reports on the CPU resources used by tasks in a cgroup
- cpuset assigns CPUs and memory nodes to a cgroup
- devices allows or denies access to devices by tasks in a cgroup
- freezer suspends or resumes tasks in a cgroup
- memory sets limits on memory use by tasks in a cgroup and reports on it
- net_cls tags network packets so that the Linux traffic controller (tc) can identify traffic from a cgroup's tasks
- ns the namespace subsystem
We can list the subsystems available in our kernel with the command:
lssubsys -am
lxc-cgroup gets or sets a value in the control group associated with a container name; in other words, it manages the control group associated with a container. Example usage:
lxc-cgroup -n foo cpuset.cpus "0,3"
This assigns processors 0 and 3 to the container foo.
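Since the question is about memory and CPU specifically, the same lxc-cgroup pattern extends to the memory and cpu controllers. Here is a minimal sketch, not a definitive recipe: the container name foo, the 256M limit and the cpu.shares value 512 are illustrative assumptions, and the helper limit_cmds is hypothetical. It only prints the commands so you can review them before running them as root against a running container.

```shell
# Hypothetical dry-run helper: prints lxc-cgroup commands, does not run them.
# Arguments: container name, memory limit, cpu.shares value, cpuset list.
limit_cmds() {
    name=$1
    mem=$2
    shares=$3
    cpus=$4
    printf 'lxc-cgroup -n %s memory.limit_in_bytes %s\n' "$name" "$mem"
    printf 'lxc-cgroup -n %s cpu.shares %s\n' "$name" "$shares"
    printf 'lxc-cgroup -n %s cpuset.cpus "%s"\n' "$name" "$cpus"
}

# Print the commands for a container called foo.
limit_cmds foo 256M 512 0,3
```

If you would rather have the limits in place from the start, the same keys can go into the configuration file you pass to lxc-execute with -f, prefixed with lxc.cgroup. (for example lxc.cgroup.memory.limit_in_bytes = 256M). That assumes a host using the cgroup v1 layout described in this answer.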
Now, in my opinion, I have answered your original question. But let me add a bit about the parameters that might be useful when configuring your container with LXC. What follows is a condensed form of Red Hat's resource control documentation.
BLKIO Modifiable Parameters:
blkio.reset_stats : write any integer to reset the statistics recorded by the blkio subsystem
blkio.weight : 100 - 1000 (relative proportion of block I/O access available to the cgroup by default)
blkio.weight_device : major, minor and weight (100 - 1000); overrides blkio.weight for the given device
blkio.time : major, minor and time (device type and node numbers, and length of access in milliseconds); note that this one is a report and is read-only
blkio.throttle.read_bps_device : major, minor and bytes_per_second; the upper limit on the rate of read operations a device can perform, in bytes per second
blkio.throttle.read_iops_device : major, minor and operations_per_second; the upper limit on the number of read operations per second a device can perform
blkio.throttle.write_bps_device : major, minor and bytes_per_second; the upper limit on the rate of write operations a device can perform, in bytes per second
blkio.throttle.write_iops_device : major, minor and operations_per_second; the upper limit on the number of write operations per second a device can perform
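To make the throttle entries less abstract: the value written to a blkio.throttle.*_bps_device file has the form "major:minor bytes_per_second". The sketch below builds such an entry. The helper throttle_entry is hypothetical, and the device 8:0 with a 10 MiB/s cap is only an example (8:0 is conventionally the first SCSI/SATA disk, /dev/sda; check your own device numbers with ls -l /dev).

```shell
# Hypothetical helper: builds a "major:minor bytes_per_second" entry
# from a device number pair and a rate given in MiB per second.
throttle_entry() {
    major=$1
    minor=$2
    mib_per_sec=$3
    printf '%s:%s %s\n' "$major" "$minor" "$((mib_per_sec * 1024 * 1024))"
}

# Example values only: device 8:0 capped at 10 MiB/s.
throttle_entry 8 0 10
```

The printed string is what you would then hand to something like lxc-cgroup -n foo blkio.throttle.read_bps_device "8:0 10485760", where foo is again a made-up container name.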
CFS Modifiable Parameters:
cpu.cfs_period_us : specifies a period of time in microseconds for how regularly a cgroup's access to CPU resources should be reallocated. If tasks in a cgroup should be able to access a single CPU for 0.2 seconds out of every 1 second, set cpu.cfs_quota_us to 200000 and cpu.cfs_period_us to 1000000.
cpu.cfs_quota_us : the total amount of time in microseconds for which all tasks in a cgroup can run during one period (as defined by cpu.cfs_period_us). Once the tasks have used up the quota, they are throttled and not allowed to run again until the next period.
cpu.shares : contains an integer value that specifies the relative share of CPU time available to tasks in a cgroup.
Note: For example, tasks in two cgroups that both have cpu.shares set to 1 will receive equal CPU time, but tasks in a cgroup that has cpu.shares set to 2 receive twice the CPU time of tasks in a cgroup where cpu.shares is set to 1. Note that shares of CPU time are distributed per CPU. If one cgroup is limited to 25% of CPU and another cgroup is limited to 75% of CPU, then on a multi-core system each cgroup may still use 100% of a different CPU.
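The arithmetic behind the quota and shares settings above is simple enough to sketch in a few lines. Both helpers are hypothetical and only do the calculation; they do not touch any cgroup files. The first turns "N percent of one CPU" into a cpu.cfs_quota_us value for a given period, the second estimates the percentage of CPU time a cgroup gets under full contention from its cpu.shares and the sum of shares of all competing cgroups (integer division, so the result is rounded down).

```shell
# Hypothetical helper: quota in microseconds for a given percentage of one CPU
# over a given period (also in microseconds).
cfs_quota_us() {
    percent=$1
    period_us=$2
    echo $((period_us * percent / 100))
}

# Hypothetical helper: percentage of CPU time under contention, given the
# cgroup's own cpu.shares and the total shares of all competing cgroups.
share_percent() {
    own=$1
    total=$2
    echo $((100 * own / total))
}

# 0.2 seconds out of every 1 second, as in the cpu.cfs_period_us example.
cfs_quota_us 20 1000000
# A cgroup with shares 2 competing with one sibling that has shares 1.
share_percent 2 3
```

Keep in mind that the quota is a hard cap, while shares only matter when cgroups actually compete for the CPU.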
RT Modifiable Parameters:
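cpu.rt_period_us : applies to real-time scheduling tasks only; the period of time in microseconds for how regularly a cgroup's access to CPU resources should be reallocated.
cpu.rt_runtime_us : applies to real-time scheduling tasks only; the longest continuous period in microseconds during which the tasks in a cgroup have access to CPU resources.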
CPUset :
The cpuset subsystem assigns individual CPUs and memory nodes to cgroups.
Note: some of the parameters here are mandatory; they must be set before tasks can be moved into the cgroup.
Mandatory:
cpuset.cpus : specifies the CPUs that tasks in this cgroup are permitted to access. This is a comma-separated list in ASCII format, with dashes ("-") to represent ranges. For example, 0-2,16 represents CPUs 0, 1, 2, and 16.
cpuset.mems : specifies the memory nodes that tasks in this cgroup are permitted to access. It uses the same list format as cpuset.cpus.
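If the list format for cpuset.cpus and cpuset.mems is unclear, this small sketch expands such a list into the individual numbers it stands for. The function expand_cpu_list is hypothetical and purely illustrative; the kernel does this parsing itself when you write the value.

```shell
# Hypothetical helper: expands a cpuset-style list such as "0-2,16"
# into the individual CPU (or memory node) numbers, separated by spaces.
expand_cpu_list() {
    out=""
    for part in $(printf '%s' "$1" | tr ',' ' '); do
        case $part in
            *-*)
                # A range: walk from the start value to the end value.
                i=${part%-*}
                end=${part#*-}
                while [ "$i" -le "$end" ]; do
                    out="$out $i"
                    i=$((i + 1))
                done
                ;;
            *)
                # A single number.
                out="$out $part"
                ;;
        esac
    done
    # Strip the leading space before printing.
    printf '%s\n' "${out# }"
}

expand_cpu_list "0-2,16"
```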
Optional:
cpuset.cpu_exclusive : contains a flag (0 or 1) that specifies whether cpusets other than this one and its parents and children can share the CPUs specified for this cpuset. By default (0), CPUs are not allocated exclusively to one cpuset.
cpuset.mem_exclusive : contains a flag (0 or 1) that specifies whether other cpusets can share the memory nodes specified for this cpuset. By default (0), memory nodes are not allocated exclusively to one cpuset. Reserving memory nodes for the exclusive use of a cpuset (1) is functionally the same as enabling a memory hardwall with the cpuset.mem_hardwall parameter.
cpuset.mem_hardwall : contains a flag (0 or 1) that specifies whether kernel allocations of memory page and buffer data should be restricted to the memory nodes specified for this cpuset. By default (0), page and buffer data is shared across processes belonging to multiple users. With a hardwall enabled (1), each task's user allocation can be kept separate.
cpuset.memory_pressure_enabled : contains a flag (0 or 1) that specifies whether the system should compute the memory pressure created by the processes in this cgroup.
cpuset.memory_spread_page : contains a flag (0 or 1) that specifies whether file system buffers should be spread evenly across the memory nodes allocated to this cpuset. By default (0), no attempt is made to spread memory pages for these buffers evenly, and buffers are placed on the same node on which the process that created them is running.
cpuset.memory_spread_slab : contains a flag (0 or 1) that specifies whether kernel slab caches for file input/output operations should be spread evenly across the cpuset. By default (0), no attempt is made to spread kernel slab caches evenly, and slab caches are placed on the same node on which the process that created them is running.
cpuset.sched_load_balance : contains a flag (0 or 1) that specifies whether the kernel will balance loads across the CPUs in this cpuset. By default (1), the kernel balances loads by moving processes from overloaded CPUs to less heavily used CPUs.
Devices:
The devices subsystem allows or denies access to devices by tasks in a cgroup.
devices.allow : specifies devices to which tasks in a cgroup have access. Each entry has four fields: type, major, minor, and access.
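type can take one of the following three values:
a - applies to all devices, both character and block
b - block devices
c - character devices
major and minor are the device node numbers.
access is a sequence of one or more of the following letters:
r - read from the device
w - write to the device
m - create device files that do not yet exist
devices.deny : specifies devices that tasks in a cgroup cannot access; the syntax is the same as for devices.allow
devices.list : reports the devices for which access controls have been set for tasks in this cgroup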
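Put together, a devices.allow or devices.deny entry is written as "type major:minor access". The sketch below just formats such an entry; the helper device_entry is hypothetical, and the example uses character device 1:3, which is /dev/null on Linux.

```shell
# Hypothetical helper: formats a devices.allow / devices.deny entry
# as "type major:minor access".
device_entry() {
    printf '%s %s:%s %s\n' "$1" "$2" "$3" "$4"
}

# Character device 1:3 (/dev/null), read and write access.
device_entry c 1 3 rw
```

You could then pass the result on with something like lxc-cgroup -n foo devices.allow "c 1:3 rw", with foo standing in for your container name.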
Memory:
The memory subsystem generates automatic reports on memory resources used by the tasks in a cgroup, and sets limits on memory use by those tasks.
Memory modifiable parameters:
memory.limit_in_bytes : sets the maximum amount of user memory. You can use suffixes such as K for kilobytes and M for megabytes. This only limits groups lower in the hierarchy, i.e. the root cgroup cannot be limited.
memory.memsw.limit_in_bytes : sets the maximum amount for the sum of memory and swap usage. Again, this cannot limit the root cgroup.
Note: memory.limit_in_bytes should always be set before memory.memsw.limit_in_bytes, because the swap limit can only be set after the memory limit.
memory.force_empty : when set to 0, empties memory of all pages used by tasks in this cgroup
memory.swappiness : sets the tendency of the kernel to swap out process memory used by tasks in this cgroup instead of reclaiming pages from the page cache. The default value is 60. Values lower than 60 decrease the kernel's tendency to swap out process memory, values greater than 60 increase the kernel's tendency to swap out process memory, and values greater than 100 permit the kernel to swap out pages that are part of the address space of the processes in this cgroup.
Note: Swappiness can only be assigned to leaf groups in the cgroup hierarchy, i.e. if a cgroup has a child cgroup, we cannot set the swappiness for it.
memory.oom_control : contains a flag (0 or 1) that enables or disables the Out of Memory killer for a cgroup. If enabled (0), tasks that attempt to consume more memory than they are allowed are immediately killed by the OOM killer. If disabled (1), such tasks are paused until more memory becomes available.
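The ordering rule for memory.limit_in_bytes and memory.memsw.limit_in_bytes is easy to get wrong, so here is a rough dry-run sketch of it. Both helpers are hypothetical and nothing is written to any cgroup; the commands are only printed. The extra check that the memory+swap limit is not smaller than the memory limit is my own addition, based on the kernel rejecting such a value. The container name foo and the sizes are example values.

```shell
# Hypothetical helper: converts a value with an optional K, M or G suffix to bytes.
to_bytes() {
    case $1 in
        *K) echo $(( ${1%K} * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}

# Hypothetical dry-run helper: prints the two writes in the required order and
# refuses a memory+swap limit that is smaller than the memory limit.
memory_limit_cmds() {
    name=$1
    mem=$2
    memsw=$3
    if [ "$(to_bytes "$memsw")" -lt "$(to_bytes "$mem")" ]; then
        echo "memsw limit must not be below the memory limit" >&2
        return 1
    fi
    # The memory limit goes first, the memory+swap limit second.
    printf 'lxc-cgroup -n %s memory.limit_in_bytes %s\n' "$name" "$mem"
    printf 'lxc-cgroup -n %s memory.memsw.limit_in_bytes %s\n' "$name" "$memsw"
}

memory_limit_cmds foo 256M 512M
```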
net_cls:
The net_cls subsystem tags network packets with a class identifier (classid) that allows the Linux traffic controller (tc) to identify packets originating from a particular cgroup. The traffic controller can be configured to assign different priorities to packets from different cgroups.
net_cls.classid : 0xAAAABBBB
AAAA = major number (hex)
BBBB = minor number (hex)
net_cls.classid contains a single value that indicates a traffic control handle. The value read from the net_cls.classid file is presented in decimal format, while the value written to the file is expected in hexadecimal format. For example, 0x100001 represents the handle 10:1 (leading zeros may be omitted, so this is the same as 0x00100001).
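Because the hexadecimal-in, decimal-out behaviour of net_cls.classid tends to confuse people, here is a small sketch of the conversion. The two helpers are hypothetical; they take the major and minor parts of a tc handle as hexadecimal digits (the way tc writes them) and print the value you would write and the value you would read back.

```shell
# Hypothetical helper: builds the hexadecimal classid for a tc handle
# given as hexadecimal major and minor numbers.
classid_hex() {
    printf '0x%x\n' "$(( (0x$1 << 16) | 0x$2 ))"
}

# Hypothetical helper: the same value in decimal, as it appears when read back.
classid_dec() {
    echo "$(( (0x$1 << 16) | 0x$2 ))"
}

# The handle 10:1 from the example above.
classid_hex 10 1
classid_dec 10 1
```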
net_prio:
The Network Priority (net_prio) subsystem provides a way to dynamically set the priority of network traffic per network interface for applications within various cgroups. A network's priority is a number assigned to network traffic and used internally by the system and network devices. Network priority is used to differentiate packets that are sent, queued, or dropped. The tc command can be used to set a network's priority.
net_prio.ifpriomap : network_interface priority (for example in /cgroup/net_prio/iscsi/net_prio.ifpriomap)
The contents of the net_prio.ifpriomap file can be modified by echoing a string in the above format into the file (as root), for example:
echo "eth0 5" > /cgroup/net_prio/iscsi/net_prio.ifpriomap