Difference between buffer and cache
free
Buffer and Cache are two of the values reported by the free command.
buffers
Memory used by kernel buffers (Buffers in /proc/meminfo)
cache
Memory used by the page cache and slabs (Cached and SReclaimable in /proc/meminfo)
buff/cache
Sum of buffers and cache
The free manual describes buffer and cache as follows.
Buffers is the memory used by kernel buffers, corresponding to the Buffers value in /proc/meminfo.
Cache is the memory used by the kernel page cache and by Slab, corresponding to the sum of Cached and SReclaimable in /proc/meminfo.
This description tells us that the values come from /proc/meminfo, but what Buffers, Cached, and SReclaimable actually mean is still not clear.
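The mapping described in the manual can be checked by hand. The following is a minimal sketch, not the actual implementation of free: free_columns is a hypothetical helper name, and the formula (cache = Cached + SReclaimable) is taken from the manual text quoted above, so it may differ on systems with a different version of free.

```shell
# free_columns [FILE]: recompute free's buffers, cache, and buff/cache values (in KB)
# from a meminfo-formatted file. FILE defaults to /proc/meminfo.
# Hypothetical helper; the formula follows the free manual quoted above.
free_columns() {
    awk '
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        $1 == "SReclaimable:" { sreclaimable = $2 }
        END {
            cache = cached + sreclaimable
            printf "buffers=%d cache=%d buff/cache=%d\n", buffers, cache, buffers + cache
        }
    ' "${1:-/proc/meminfo}"
}

# Only meaningful on Linux, where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    free_columns
fi
```

Comparing the result with the output of free -k should give values that are close. Small differences are expected because memory usage changes between the two reads.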
To figure out what they mean, your first reaction is probably to search Baidu or Google. Most of the time a web search does turn up an answer. But apart from the time and effort spent filtering the results, it is hard to be sure that the answer is accurate.
Note that a conclusion found on the Internet may be correct and still not match your environment. Put simply, the same indicator can mean quite different things depending on the kernel version and the version of the performance tool. That is why I always emphasize general ideas and methods in this column instead of asking you to memorize conclusions. For hands-on cases, the machine environment is our biggest constraint.
So, is there an easier and more accurate way to query their meaning?
proc file system
As I mentioned earlier in the CPU performance module, /proc is a special file system provided by the Linux kernel and an interface through which users interact with the kernel. For example, from /proc you can query the kernel's running status and configuration options, query the running status and statistics of processes, and so on. You can also modify the kernel configuration through /proc.
The proc file system is also the ultimate data source for many performance tools. For example, the free command we just saw gets memory usage by reading /proc/meminfo.
Back to /proc/meminfo. Since the indicators Buffers, Cached, and SReclaimable are not easy to understand, we need to check the proc file system documentation for their detailed definitions.
Run man proc and you will get detailed documentation of the proc file system.
Note that this document is long, so it is best to search within it (for example, for meminfo) to reach the memory part faster.
Buffers %lu
Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so).
Cached %lu
In-memory cache for files read from the disk (the page cache). Doesn't include SwapCached.
...
SReclaimable %lu (since Linux 2.6.19)
Part of Slab, that might be reclaimed, such as caches.
SUnreclaim %lu (since Linux 2.6.19)
Part of Slab, that cannot be reclaimed on memory pressure.
Through this document, we can see:
Buffers is temporary storage for raw disk blocks, that is, it caches disk data, and it is usually not particularly large (about 20MB). This lets the kernel gather scattered writes and optimize them together, for example by merging multiple small writes into a single large write.
Cached is the page cache for files read from disk, that is, it caches data read from files. The next time you access that file data, it can be served directly from memory without going back to the slow disk.
SReclaimable is part of Slab. Slab consists of two parts: the reclaimable part is recorded as SReclaimable, and the non-reclaimable part is recorded as SUnreclaim.
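The two-part split of Slab can be confirmed from the same file. This is a small sketch under the assumption that the kernel exposes Slab, SReclaimable, and SUnreclaim in /proc/meminfo (per the man page excerpt, the last two exist since Linux 2.6.19); slab_breakdown is a hypothetical helper name.

```shell
# slab_breakdown [FILE]: print Slab next to the sum of its two parts (all in KB).
# FILE defaults to /proc/meminfo. Hypothetical helper for illustration.
slab_breakdown() {
    awk '
        $1 == "Slab:"         { slab = $2 }
        $1 == "SReclaimable:" { reclaimable = $2 }
        $1 == "SUnreclaim:"   { unreclaimable = $2 }
        END {
            printf "Slab=%d SReclaimable=%d SUnreclaim=%d sum=%d\n",
                   slab, reclaimable, unreclaimable, reclaimable + unreclaimable
        }
    ' "${1:-/proc/meminfo}"
}

# Only meaningful on Linux, where /proc/meminfo exists.
if [ -r /proc/meminfo ]; then
    slab_breakdown
fi
```

If the definition holds, the sum field should match the Slab field.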
Well, we have finally found detailed definitions of these three indicators. At this point, have you breathed a sigh of relief, satisfied that you finally understand Buffer and Cache? But do you really understand these definitions? Here are two questions for you. Try to answer them first.
The first question: the documentation for Buffers does not say whether it buffers data read from disk or data written to disk, and many web search results claim that Buffer only holds data waiting to be written to disk. Does it also cache data read from disk?
The second question: the documentation says that Cache caches data read from files. Does it also cache data written to files?
To answer these two questions, I will use several cases to show how Buffer and Cache are used in different scenarios.
Case
Your preparation
As in previous experiments, today's case is based on Ubuntu 18.04, although other Linux systems work as well. This is my case environment.
Machine configuration: 2 CPUs, 8GB memory.
Install the sysstat package in advance, for example with apt install sysstat.
We will use vmstat to observe the changes in Buffer and Cache. vmstat itself ships with the procps package, which is normally preinstalled; sysstat adds companion tools such as iostat and sar. The same values can be read from /proc/meminfo, but the vmstat output is easier to follow.
In addition, these cases use dd to simulate disk and file I/O, so we also need to observe the changes in I/O.
After the tools are installed, open two terminals and connect to the Ubuntu machine.
One last preparation step: to reduce the impact of the cache, remember to run the following command in the first terminal to clear the system cache:
# Clear the caches of file pages, directory entries, inodes, etc. (requires root)
$ echo 3 >/proc/sys/vm/drop_caches
Here /proc/sys/vm/drop_caches is an example of changing kernel behavior through the proc file system. Writing 3 clears the caches for file pages, directory entries, inodes, and so on. You don't need to worry about the differences between these caches for now; we will come back to them later.
Scenario 1: Disk and file write case
Let's start with the first scenario. In the first terminal, run the following vmstat command:
# Print one set of data every second
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 7743608 1112 92168 0 0 0 0 52 152 0 1 100 0 0
0 0 0 7743608 1112 92168 0 0 0 0 36 92 0 0 100 0 0
In this output, the columns to focus on are buff and cache in the memory section and bi and bo in the io section.
- buff and cache are the Buffers and Cache we saw earlier, in KB.
- bi and bo are the amounts read from and written to block devices, respectively, in blocks per second. Because the block size in Linux is 1KB, this unit is equivalent to KB/s.
Under normal circumstances, on an idle system, these values should stay the same across multiple samples.
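To keep only these four columns in view, the vmstat output can be filtered with awk. This is a sketch that assumes the column order shown in the header above (buff is field 5, cache is field 6, bi is field 9, bo is field 10); other vmstat versions or options may use a different layout. vmstat_mem_io is a hypothetical helper name.

```shell
# vmstat_mem_io: read vmstat output on stdin and print only buff, cache, bi, and bo.
# Header lines are skipped because their first field is not a number.
# Field positions are an assumption based on the header shown above.
vmstat_mem_io() {
    awk '$1 ~ /^[0-9]+$/ { printf "buff=%s cache=%s bi=%s bo=%s\n", $5, $6, $9, $10 }'
}

# Demo on the header and one data row copied from the output above.
vmstat_mem_io <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 0 0 0 7743608 1112 92168 0 0 0 0 52 152 0 1 100 0 0
EOF
# prints: buff=1112 cache=92168 bi=0 bo=0
```

On a live system the same helper can be placed after a pipe, as in vmstat 1 | vmstat_mem_io.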
Next, go to the second terminal and run dd to read from the random device and generate a 500MB file:
$ dd if=/dev/urandom of=/tmp/file bs=1M count=500
Then go back to the first terminal and observe the changes of Buffer and Cache:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 7499460 1344 230484 0 0 0 0 29 145 0 0 100 0 0
1 0 0 7338088 1752 390512 0 0 488 0 39 558 0 47 53 0 0
1 0 0 7158872 1752 568800 0 0 0 4 30 376 1 50 49 0 0
1 0 0 6980308 1752 747860 0 0 0 0 24 360 0 50 50 0 0
0 0 0 6977448 1752 752072 0 0 0 0 29 138 0 0 100 0 0
0 0 0 6977440 1760 752080 0 0 0 152 42 212 0 1 99 1 0
...
0 1 0 6977216 1768 752104 0 0 4 122880 33 234 0 1 51 49 0
0 1 0 6977440 1768 752108 0 0 0 10240 38 196 0 0 50 50 0
From the vmstat output, we can see that while dd is running, Cache keeps growing and Buffer stays basically unchanged.
Looking further at the I/O, you will see that:
- While Cache was growing, block device I/O was very small: bi showed 488 KB/s only once and bo showed 4 KB/s only once. After some time, a large amount of block device writes appears, for example bo reaching 122880.
- After dd finishes, Cache stops growing, but block device writes continue for a while, and the writes add up to the 500MB of data that dd wrote.
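The claim that the writes add up can be checked by summing the bo column. The sketch below assumes one sample per second (so blocks per second equals blocks per sample), a 1KB block size, and bo as field 10; sum_bo_kb is a hypothetical helper name. The demo uses only the rows printed above, and since that output is elided with "...", the total here is lower than the full 500MB.

```shell
# sum_bo_kb: read vmstat output on stdin and print the total of the bo column in KB.
# Assumes a 1-second sampling interval and bo in field 10, as in the output above.
sum_bo_kb() {
    awk '$1 ~ /^[0-9]+$/ { total += $10 } END { printf "%d KB\n", total }'
}

# Demo on the data rows shown above (the elided rows are not included).
sum_bo_kb <<'EOF'
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 0 0 0 7499460 1344 230484 0 0 0 0 29 145 0 0 100 0 0
 1 0 0 7338088 1752 390512 0 0 488 0 39 558 0 47 53 0 0
 1 0 0 7158872 1752 568800 0 0 0 4 30 376 1 50 49 0 0
 1 0 0 6980308 1752 747860 0 0 0 0 24 360 0 50 50 0 0
 0 0 0 6977448 1752 752072 0 0 0 0 29 138 0 0 100 0 0
 0 0 0 6977440 1760 752080 0 0 0 152 42 212 0 1 99 1 0
 0 1 0 6977216 1768 752104 0 0 4 122880 33 234 0 1 51 49 0
 0 1 0 6977440 1768 752108 0 0 0 10240 38 196 0 0 50 50 0
EOF
# prints: 133276 KB
```

When capturing a full run, leave out the first row that vmstat prints, because that row reports averages since boot rather than the last second.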
Comparing this result with the definition of Cache we just read, you may be a little confused. The documentation says that Cache is the page cache for files read from disk, so why is it also used when we write a file?
Let's set this question aside for now and look at another write case, this time writing to a disk. After both cases, we will analyze them together.
Before the next case, however, I must emphasize one point:
The following command is very demanding on the environment. It requires a system with multiple disks, and the disk partition /dev/sdb1 must be unused. If you have only one disk, do not try it, otherwise it will overwrite and damage your disk partition.
If your system meets these requirements, run the following commands in the second terminal. After clearing the cache, they write 2GB of random data to the disk partition /dev/sdb1:
# First clear the cache
$ echo 3 >/proc/sys/vm/drop_caches
# Then run the dd command to write 2GB of data to the disk partition /dev/sdb1
$ dd if=/dev/urandom of=/dev/sdb1 bs=1M count=2048
Then, go back to terminal one and observe the changes in memory and I/O:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 7584780 153592 97436 0 0 684 0 31 423 1 48 50 2 0
1 0 0 7418580 315384 101668 0 0 0 0 32 144 0 50 50 0 0
1 0 0 7253664 475844 106208 0 0 0 0 20 137 0 50 50 0 0
1 0 0 7093352 631800 110520 0 0 0 0 23 223 0 50 50 0 0
1 1 0 6930056 790520 114980 0 0 0 12804 23 168 0 50 42 9 0
1 0 0 6757204 949240 119396 0 0 0 183804 24 191 0 53 26 21 0
1 1 0 6591516 1107960 123840 0 0 0 77316 22 232 0 52 16 33 0
From this you can see that, although both cases write data, writing to a disk and writing to a file behave differently. When writing to the disk (that is, when bo is greater than 0), both Buffer and Cache grow, but Buffer clearly grows much faster.
This shows that writing to a disk uses a lot of Buffer, which matches the definition we found in the documentation.
Comparing the two cases, we find that Cache is used to cache data when writing files, and Buffer is used when writing to a disk. So, returning to the earlier question: although the documentation only says that Cache is a cache for file reads, Cache in fact also caches data when files are written.
Scenario 2: Disk and file read case
Now that we understand disk and file writes, let's turn it around: what happens when a disk or a file is read?
Return to the second terminal and run the following commands. After clearing the cache, they read data from the file /tmp/file and write it to the null device:
# First clear the cache
$ echo 3 >/proc/sys/vm/drop_caches
# Run the dd command to read the file data
$ dd if=/tmp/file of=/dev/null
Then, go back to terminal one and observe the changes in memory and I/O:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 1 0 7724164 2380 110844 0 0 16576 0 62 360 2 2 76 21 0
0 1 0 7691544 2380 143472 0 0 32640 0 46 439 1 3 50 46 0
0 1 0 7658736 2380 176204 0 0 32640 0 54 407 1 4 50 46 0
0 1 0 7626052 2380 208908 0 0 32640 40 44 422 2 2 50 46 0
From the vmstat output, you will find that when the file is read (that is, when bi is greater than 0), Buffer stays unchanged while Cache keeps growing. This is consistent with the definition we found: "Cache is a page cache for reading files".
So, what about disk reads? Let's run the second case to find out.
First, go back to the second terminal and run the following commands. After clearing the cache, they read data from the disk partition /dev/sda1 and write it to the null device:
# First clear the cache
$ echo 3 >/proc/sys/vm/drop_caches
# Run the dd command to read the disk partition
$ dd if=/dev/sda1 of=/dev/null bs=1M count=1024
Then, go back to terminal one and observe the changes in memory and I/O:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 7225880 2716 608184 0 0 0 0 48 159 0 0 100 0 0
0 1 0 7199420 28644 608228 0 0 25928 0 60 252 0 1 65 35 0
0 1 0 7167092 60900 608312 0 0 32256 0 54 269 0 1 50 49 0
0 1 0 7134416 93572 608376 0 0 32672 0 53 253 0 0 51 49 0
0 1 0 7101484 126320 608480 0 0 32748 0 80 414 0 1 50 49 0
From the vmstat output, you will find that when the disk is read (that is, when bi is greater than 0), both Buffer and Cache grow, but Buffer clearly grows much faster. This means that when a disk is read, the data is cached in Buffer.
After the two cases in the previous scenario, you can reach the same conclusion by comparison: data is cached in Cache when reading files, and in Buffer when reading disks.
By now you should see that, although the documentation describes Buffer and Cache, it does not cover all the details. For example, today we learned two things:
- Buffer can be used both as a "buffer for data to be written to disk" and as a "buffer for data read from disk".
- Cache can be used both as a "page cache for data read from a file" and as a "page cache for data written to a file".
This answers the two questions raised before the cases.
In simple terms, Buffer is a cache for disk data, and Cache is a cache for file data. Both are used for read requests as well as write requests.
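The experiments above can also be repeated without watching vmstat, by reading Buffers and Cached from /proc/meminfo before and after a command. This is a rough sketch: meminfo_kb and buffer_cache_delta are hypothetical helper names, and the numbers are only approximate because other processes change memory usage at the same time. Clearing the cache first, as in the cases above, gives the cleanest result.

```shell
# meminfo_kb FIELD [FILE]: print one field (in KB) from a meminfo-formatted file.
# FILE defaults to /proc/meminfo. Hypothetical helper for illustration.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# buffer_cache_delta CMD...: run CMD and report how much Buffers and Cached changed.
# The command's own output is discarded so that only the summary is printed.
buffer_cache_delta() {
    buffers_before=$(meminfo_kb Buffers)
    cached_before=$(meminfo_kb Cached)
    "$@" >/dev/null 2>&1
    buffers_after=$(meminfo_kb Buffers)
    cached_after=$(meminfo_kb Cached)
    echo "Buffers: $((buffers_after - buffers_before)) KB, Cached: $((cached_after - cached_before)) KB"
}

# Example: read the file created in Scenario 1 (skipped if it does not exist).
if [ -r /proc/meminfo ] && [ -r /tmp/file ]; then
    buffer_cache_delta dd if=/tmp/file of=/dev/null bs=1M
fi
```

Based on the results above, a file read should mostly increase Cached, while reading a disk partition such as /dev/sda1 in the same way should mostly increase Buffers.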