PHP 8: New String Helpers

PHP 8 adds three string functions: str_starts_with, str_ends_with, and str_contains. Given how convenient PHP's built-in functions are, it is hard to believe these were missing for so long.

str_starts_with

Checks whether a string starts with another string. In PHP 7 and earlier:

$id = 'inv_abcdefgh';
$result = strpos($id, 'inv_') === 0;
var_dump($result); // true

In PHP 8 you can write it directly:

$result = str_starts_with($id, 'inv_');

str_ends_with

Checks whether a string ends with another string. In PHP 7 and earlier this was more awkward. A common approach:

$id = 'abcd_inv';
$result = strpos(strrev($id), strrev('_inv')) === 0;

or

$result = substr($id, -1 * strlen('_inv')) === '_inv';

or with a regular expression:

$result = preg_match('/_inv$/', $id) === 1;

All of these are clumsy. In PHP 8 this simplifies to:

$id = 'abcd_inv';
$result = str_ends_with($id, '_inv');

str_contains

Checks whether a string contains another string. Before PHP 8 this was usually done with strpos:

$url = 'https://example?for=bar';
$result = strpos($url, '?') !== FALSE;

PHP 8 is more direct:

$result = str_contains($url, '?');

Difference between buffer and cache

free

Buffer and Cache are two of the memory metrics reported by the free command.

buffers
    Memory used by kernel buffers (Buffers in /proc/meminfo)

cache
    Memory used by the page cache and slabs (Cached and SReclaimable in /proc/meminfo)

buff/cache
    Sum of buffers and cache

The free manual describes buffer and cache as follows:

Buffers is the memory used by kernel buffers, corresponding to the Buffers value in /proc/meminfo.

Cache is the memory used by the kernel page cache and Slab, corresponding to the sum of Cached and SReclaimable in /proc/meminfo.

This tells us that the values come from /proc/meminfo, but what Buffers, Cached, and SReclaimable actually mean is still unclear.
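The mapping in the manual excerpt can be sketched in a few lines of Python. This is only an illustration of the description quoted above, not the actual source of free; the function names and the sample values are made up for the example, and other versions of free may compute the columns differently.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts:
            info[name.strip()] = int(parts[0])  # values are in kB
    return info


def free_columns(info):
    """Derive the buffers / cache / buff/cache columns as the manual describes."""
    buffers = info["Buffers"]
    cache = info["Cached"] + info["SReclaimable"]
    return {"buffers": buffers, "cache": cache, "buff/cache": buffers + cache}


# Hypothetical sample values, not taken from the article's machine.
SAMPLE = """\
MemTotal:        8169348 kB
MemFree:         7743608 kB
Buffers:            1112 kB
Cached:            80000 kB
SReclaimable:      12168 kB
SUnreclaim:        20000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on systems without /proc
    print(free_columns(parse_meminfo(text)))
```

On a Linux machine, the printed values should be close to what free reports in KB at the same moment, provided your version of free follows the same definition.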

To figure out what they are, your first reaction is probably to search Baidu or Google. A web search will give an answer most of the time. But quite apart from the time and effort spent filtering the results, it is hard to be sure the answer is accurate.

Note that a conclusion found online may be correct and still not match your environment. Put simply, the exact meaning of the same metric can differ considerably between kernel versions and between versions of the performance tools. That is why this column always emphasizes general ideas and methods rather than asking you to memorize conclusions. For hands-on cases, the machine environment is our biggest constraint.

So, is there a simpler and more accurate way to find out what they mean?

proc file system

As I mentioned earlier in the CPU performance module, /proc is a special file system provided by the Linux kernel and serves as an interface between users and the kernel. For example, users can query the kernel's running state and configuration options from /proc, as well as the running state and statistics of processes. You can also modify kernel configuration through /proc.

The proc file system is also the ultimate data source for many performance tools. For example, the free command we just looked at gets its memory usage figures by reading /proc/meminfo.

Back to /proc/meminfo. Since Buffers, Cached, and SReclaimable are not self-explanatory, we need to keep digging in the proc file system documentation for their detailed definitions.

Run man proc to get the detailed documentation of the proc file system.

Note that this document is fairly long, so it is best to search within it (for example, for meminfo) to reach the memory section quickly.

Buffers %lu
    Relatively temporary storage for raw disk blocks that shouldn't get tremendously large (20MB or so).

Cached %lu
    In-memory cache for files read from the disk (the page cache). Doesn't include SwapCached.

...

SReclaimable %lu (since Linux 2.6.19)
    Part of Slab, that might be reclaimed, such as caches.

SUnreclaim %lu (since Linux 2.6.19)
    Part of Slab, that cannot be reclaimed on memory pressure.

From this document, we can see that:

Buffers is temporary storage for raw disk blocks, that is, it is used to cache disk data, and it is usually not very large (about 20MB). This lets the kernel gather scattered writes and optimize them together. For example, several small writes can be merged into one large write.

Cached is the page cache for files read from disk, that is, it caches data read from files. The next time you access that file data, it can be served directly from memory without going back to the slow disk.

SReclaimable is part of Slab. Slab consists of two parts: the reclaimable part is recorded as SReclaimable, and the unreclaimable part as SUnreclaim.
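To make the idea of "merging several small writes into one large write" concrete, here is a toy sketch. It is not how the kernel's block layer is implemented; the function name and the (offset, length) representation are assumptions made for the example.

```python
def coalesce_writes(writes):
    """Merge (offset, length) writes that touch or overlap into larger writes.

    A toy model of write combining: requests are sorted by offset, and any
    request that starts inside or right at the end of the previous one is
    folded into it.
    """
    merged = []
    for offset, length in sorted(writes):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            start, old_length = merged[-1]
            end = max(start + old_length, offset + length)
            merged[-1] = (start, end - start)
        else:
            merged.append((offset, length))
    return merged


if __name__ == "__main__":
    # Four 512-byte writes arriving out of order; three of them are adjacent.
    small_writes = [(0, 512), (512, 512), (2048, 512), (1024, 512)]
    print(coalesce_writes(small_writes))
```

With the four small writes in the example, the three adjacent ones collapse into a single 1536-byte write at offset 0, and the write at offset 2048 stays separate.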

Well, we have finally found detailed definitions of these three metrics. At this point, did you breathe a sigh of relief, satisfied that you have finally figured out Buffer and Cache? But do you really understand these definitions? Here are two questions for you. Try to answer them first.

First, the documentation for Buffers does not say whether it buffers data read from the disk or data written to it, and many web search results claim that Buffer only holds data waiting to be written to disk. So does it also cache data read from disk?

Second, the documentation says that Cached caches data read from files. Does it also cache data written to files?

To answer these two questions, I will use several cases to show how Buffer and Cache are used in different scenarios.

Case

Preparation

Like the previous experiments, today's cases are based on Ubuntu 18.04, although other Linux systems work as well. My case environment is as follows:

Machine configuration: 2 CPUs, 8GB of memory.

Install the sysstat package in advance, for example with apt install sysstat.

We will use vmstat to observe the changes in Buffer and Cache. (Strictly speaking, vmstat ships with the procps package, which Ubuntu normally installs by default; sysstat supplies companion tools such as iostat and sar.) The same values can be read from /proc/meminfo, but vmstat's output is easier to follow.

In addition, these cases use dd to simulate disk and file I/O, so we also need to watch how I/O changes.

Once the tools above are installed, open two terminals and connect to the Ubuntu machine.

One last preparation step: to reduce the influence of existing caches, run the following command in the first terminal to clear the system caches:

# Clear various caches of file pages, directory entries, inodes, etc.
$ echo 3 >/proc/sys/vm/drop_caches

Here, /proc/sys/vm/drop_caches is an example of changing kernel behavior through the proc file system. Writing 3 tells the kernel to drop the file page cache, directory entries, inodes, and so on. You don't need to worry about the differences between these cache types yet; we will cover them later.
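For reference, the kernel's sysctl documentation describes three values for drop_caches: 1 drops the page cache, 2 drops dentries and inodes, and 3 drops both. The sketch below wraps the same write in Python. The helper name and the path parameter are assumptions added so the function can be tried against an ordinary file; writing to the real path requires root.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"

# Meanings of the values as described in the kernel's vm sysctl documentation.
LEVELS = {
    1: "page cache",
    2: "dentries and inodes",
    3: "page cache, dentries and inodes",
}


def drop_caches(level=3, path=DROP_CACHES):
    """Ask the kernel to drop clean caches; returns a description of what was dropped.

    `path` is a hypothetical parameter for testing against a regular file.
    """
    if level not in LEVELS:
        raise ValueError(f"level must be one of {sorted(LEVELS)}, got {level!r}")
    os.sync()  # flush dirty data first; only clean cache entries can be dropped
    with open(path, "w") as f:
        f.write(f"{level}\n")
    return LEVELS[level]
```

Calling drop_caches() as root is roughly equivalent to running sync followed by the echo command above.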

Scenario 1: Disk and file write case

Let's simulate the first scenario. In the first terminal, run the following vmstat command:


# Output 1 set of data every 1 second
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd    free    buff  cache si so    bi     bo in  cs us sy  id wa st
 0  0    0 7743608    1112  92168  0  0     0      0 52 152  0  1 100  0  0
 0  0    0 7743608    1112  92168  0  0     0      0 36  92  0  0 100  0  0

In this output, the columns to focus on are buff and cache in the memory section, and bi and bo in the io section.

buff and cache are the Buffers and Cache we saw earlier, in KB.

bi and bo are the amounts read from and written to block devices, respectively, in blocks per second. Because vmstat treats a Linux block as 1KB, this unit is equivalent to KB/s.

Normally, on an idle system, these values stay the same across successive samples.
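If you would rather pick out these four columns programmatically than by eye, a small parser is enough. The sketch below is a minimal example, with a made-up function name, that keys each value by the column names vmstat prints in its own header, so it does not depend on column positions.

```python
def parse_vmstat(text):
    """Turn `vmstat` output into a list of {column name: int} dictionaries."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "procs":
            continue  # blank line or the group banner
        if fields[0] == "r":
            header = fields  # column names: r b swpd free buff cache ...
            continue
        if header and len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


# The two idle samples shown above.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd    free    buff  cache si so    bi     bo in  cs us sy  id wa st
 0  0    0 7743608    1112  92168  0  0     0      0 52 152  0  1 100  0  0
 0  0    0 7743608    1112  92168  0  0     0      0 36  92  0  0 100  0  0
"""

if __name__ == "__main__":
    for row in parse_vmstat(SAMPLE):
        print({key: row[key] for key in ("buff", "cache", "bi", "bo")})
```

Lines that are not complete numeric samples, such as an elided "..." row, are simply skipped.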

Next, go to the second terminal and run dd to read from the random device and generate a 500MB file:

$ dd if=/dev/urandom of=/tmp/file bs=1M count=500

Then go back to the first terminal and observe the changes of Buffer and Cache:


procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd    free    buff  cache si so    bi     bo in  cs us sy  id wa st
 0  0    0 7499460    1344 230484  0  0     0      0 29 145  0  0 100  0  0
 1  0    0 7338088    1752 390512  0  0   488      0 39 558  0 47  53  0  0
 1  0    0 7158872    1752 568800  0  0     0      4 30 376  1 50  49  0  0
 1  0    0 6980308    1752 747860  0  0     0      0 24 360  0 50  50  0  0
 0  0    0 6977448    1752 752072  0  0     0      0 29 138  0  0 100  0  0
 0  0    0 6977440    1760 752080  0  0     0    152 42 212  0  1  99  1  0
...
 0  1    0 6977216    1768 752104  0  0     4 122880 33 234  0  1  51 49  0
 0  1    0 6977440    1768 752108  0  0     0  10240 38 196  0  0  50 50  0

From the vmstat output, we can see that while dd is running, Cache keeps growing and Buffer stays essentially unchanged.

Looking more closely at the I/O, you will see that:

  • While Cache is growing, block device I/O is very small: bi shows 488 KB/s only once and bo shows 4 KB/s only once. After a while, heavy block device writes appear, for example bo reaches 122880.
  • After dd finishes, Cache stops growing, but block device writes continue for a while, and the writes across those samples add up to the 500MB that dd wrote.
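The gap between cache growth and actual disk writes is easy to quantify from the samples above. The following sketch is just arithmetic on the numbers already shown, with hypothetical helper names. It computes how much the cache column grew between consecutive one-second samples while dd was running and compares that with bo.

```python
# (cache in KB, bo in KB/s) from the four samples taken while dd was running.
SAMPLES = [(230484, 0), (390512, 0), (568800, 4), (747860, 0)]


def cache_growth(samples):
    """Per-interval growth of the cache column, in KB."""
    return [later[0] - earlier[0] for earlier, later in zip(samples, samples[1:])]


def kb_to_mib(kb):
    """Convert KB (as reported by vmstat) to MiB."""
    return kb / 1024


if __name__ == "__main__":
    for grown in cache_growth(SAMPLES):
        print(f"cache grew by {kb_to_mib(grown):.0f} MiB in one second")
    print("total written to disk meanwhile:", sum(bo for _, bo in SAMPLES), "KB")
    print(f"later write-back burst: {kb_to_mib(122880):.0f} MiB/s")
```

On this data, the page cache grows by roughly 156 to 175 MiB per second while only 4 KB reaches the block device. The later bo value of 122880 corresponds to 120 MiB/s, which is the delayed write-back of that cached data.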

Comparing this result with the definition of Cache we just read, you may be a little confused. The documentation says Cache is the page cache for file reads, so why is it also used when we write a file?

Let's set this question aside for now and look at another write case, this time writing to a disk. After both cases, we will analyze them together.

Before the next case, however, I must stress one point:

The following command is very demanding on the environment. It requires a system with multiple disks, and the disk partition /dev/sdb1 must be unused. If you have only one disk, do not try it, or you will destroy the data on your disk partition.

If your system meets these requirements, run the following commands in the second terminal. After clearing the cache, they write 2GB of random data to the disk partition /dev/sdb1:

# First clear the cache
$ echo 3 >/proc/sys/vm/drop_caches
# Then run the dd command to write 2G data to the disk partition /dev/sdb1
$ dd if=/dev/urandom of=/dev/sdb1 bs=1M count=2048

Then, go back to terminal one and observe the changes in memory and I/O:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd    free    buff  cache si so    bi     bo in  cs us sy  id wa st
 1  0    0 7584780  153592  97436  0  0   684      0 31 423  1 48  50  2  0
 1  0    0 7418580  315384 101668  0  0     0      0 32 144  0 50  50  0  0
 1  0    0 7253664  475844 106208  0  0     0      0 20 137  0 50  50  0  0
 1  0    0 7093352  631800 110520  0  0     0      0 23 223  0 50  50  0  0
 1  1    0 6930056  790520 114980  0  0     0  12804 23 168  0 50  42  9  0
 1  0    0 6757204  949240 119396  0  0     0 183804 24 191  0 53  26 21  0
 1  1    0 6591516 1107960 123840  0  0     0  77316 22 232  0 52  16 33  0

From this you can see that, although both cases write data, writing to a disk and writing to a file behave differently. When writing to the disk (that is, when bo is greater than 0), both Buffer and Cache grow, but Buffer clearly grows much faster.

This shows that writing to a disk uses a lot of Buffer, which matches the definition we found in the documentation.

Comparing the two cases, we find that file writes are cached in Cache, while disk writes are cached in Buffer. So, to return to the earlier question: although the documentation only says that Cache caches file reads, Cache in fact also caches data when files are written.

Scenario 2: Disk and file read case

Now that we understand disk and file writes, let's turn it around: what happens when the disk and files are read?

Return to the second terminal and run the following commands. After clearing the cache, they read data from the file /tmp/file and write it to the null device:

# First clear the cache
$ echo 3 >/proc/sys/vm/drop_caches
# Run the dd command to read the file data
$ dd if=/tmp/file of=/dev/null

Then, go back to terminal one and observe the changes in memory and I/O:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd    free    buff  cache si so    bi     bo in  cs us sy  id wa st
 0  1    0 7724164    2380 110844  0  0 16576      0 62 360  2  2  76 21  0
 0  1    0 7691544    2380 143472  0  0 32640      0 46 439  1  3  50 46  0
 0  1    0 7658736    2380 176204  0  0 32640      0 54 407  1  4  50 46  0
 0  1    0 7626052    2380 208908  0  0 32640     40 44 422  2  2  50 46  0

Looking at the vmstat output, you will find that while the file is being read (that is, when bi is greater than 0), Buffer stays unchanged and Cache keeps growing. This is consistent with the definition we found: "Cache is the page cache for reading files".

So what about disk reads? Let's run the second case to find out.

Go back to the second terminal and run the following commands. After clearing the cache, they read data from the disk partition /dev/sda1 and write it to the null device:

# First clear the cache
$ echo 3 >/proc/sys/vm/drop_caches
# Run the dd command to read the disk partition
$ dd if=/dev/sda1 of=/dev/null bs=1M count=1024

Then, go back to terminal one and observe the changes in memory and I/O:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd    free    buff  cache si so    bi     bo in  cs us sy  id wa st
 0  0    0 7225880    2716 608184  0  0     0      0 48 159  0  0 100  0  0
 0  1    0 7199420   28644 608228  0  0 25928      0 60 252  0  1  65 35  0
 0  1    0 7167092   60900 608312  0  0 32256      0 54 269  0  1  50 49  0
 0  1    0 7134416   93572 608376  0  0 32672      0 53 253  0  0  51 49  0
 0  1    0 7101484  126320 608480  0  0 32748      0 80 414  0  1  50 49  0

Looking at the vmstat output, you will find that while the disk is being read (that is, when bi is greater than 0), both Buffer and Cache grow, but Buffer clearly grows much faster. This means that data read from the disk is cached in Buffer.

Of course, after the two cases in the previous scenario, you could also reach this conclusion by comparison: data is cached in Cache when reading files, and in Buffer when reading disks.

By now you should see that although the documentation describes Buffer and Cache, it does not cover every detail. For example, today we learned two things:

  • Buffer can be used as both a "buffer to write data to disk" and a "buffer to read data from disk".
  • Cache can be used as both a "page cache for reading data from a file" and a "page cache for writing to a file".

With that, we have answered the two questions raised before the cases.

In simple terms, Buffer caches disk data and Cache caches file data. Both are used for read requests as well as write requests.
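As a quick check of this summary against the numbers, the sketch below takes the first and last vmstat sample shown for each of the four cases and reports which column grew more. The data is copied from the outputs above; the dictionary layout and function name are just one convenient way to arrange it.

```python
# (buff, cache) in KB from the first and last vmstat sample of each case above.
CASES = {
    "file write": ((1344, 230484), (1768, 752108)),
    "disk write": ((153592, 97436), (1107960, 123840)),
    "file read": ((2380, 110844), (2380, 208908)),
    "disk read": ((2716, 608184), (126320, 608480)),
}


def dominant(first, last):
    """Return which of Buffer or Cache grew more between two samples."""
    buff_growth = last[0] - first[0]
    cache_growth = last[1] - first[1]
    return "Buffer" if buff_growth > cache_growth else "Cache"


if __name__ == "__main__":
    for name, (first, last) in CASES.items():
        print(f"{name}: mostly {dominant(first, last)}")
```

Both file cases come out as Cache and both disk cases as Buffer, which matches the conclusion above.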
