Volatile and CPU caches

Question


Hello, I have a question. As far as I understand, volatile variables must always be written straight to memory, whereas a regular variable is usually kept in the processor cache for faster access. So when volatile is used, the value is written directly to RAM.

Say there is a processor with 4 cores, so effectively 4 independent processors, each with its own caches (for example, its own first- and second-level caches and a shared third-level one). We start several threads, and let's say each of them runs on a separate core with its own cache. We make the variable volatile, and it is written directly to memory. Now run the same program on a single-core processor, where the OS only simulates parallel execution of threads. The processor then has one cache, and all threads run on that one processor. The question is this: is a regular variable cached at the processor level or at the software level? That is, even with a single core, when multiple threads are running, do they split the processor cache into independent memory regions, simulating multiple cores? Or do they all share the same cache, in which case volatile makes no sense? Please explain how it all actually works. Thanks.

EDIT

I meant that, for fast access, a variable is not written to memory immediately. For example, x=1 will not go straight to memory, since that takes many CPU cycles; the value is first written to the processor cache and only later to memory. If the variable is not volatile and threads change it at the same time, the new value stays in the cache rather than being written directly to memory. If the threads run on several cores, everything is clear: each core has its own cache, and if we read while the latest value sits in some other core's cache, we get the wrong value (unless, of course, we read from the same thread that wrote last). But on a single core there is one cache and all threads write to it, so what is the point of volatile? Or am I misunderstanding something?

11

English java volatile

Author: Виталина, 2014-08-19


3 answers

volatile has one benefit: it requires that the value of the variable is always read from memory. And the processor is not the only thing that can write to that memory...

Or a second case: there is JNI code that modifies the variable, and the Java code only reads it. To keep the optimizer from throwing the read away, you need to add volatile.
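The point about the optimizer can be illustrated with the classic stop-flag pattern. This is only a minimal sketch: the class and method names are hypothetical, and a second Java thread stands in for the JNI writer mentioned above. Without volatile, the JIT compiler is allowed to hoist the read of the flag out of the loop, so the worker might never notice the change, even on a single core.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the names are illustrative and not from the thread.
class StopFlagDemo {
    // volatile forbids the JIT from caching this field in a register
    // and guarantees that a write becomes visible to other threads.
    private static volatile boolean running = true;

    /** Starts a spinning worker, clears the flag, and reports whether the worker exited. */
    static boolean workerStopsAfterFlagCleared() {
        running = true;
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (running) {      // volatile read on every iteration
                spins++;
            }
        });
        worker.setDaemon(true);    // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            TimeUnit.MILLISECONDS.sleep(100);
            running = false;       // volatile write
            worker.join(5_000);    // wait up to 5 seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStopsAfterFlagCleared());
    }
}
```

If the volatile modifier is removed, the program may still happen to stop on a given JVM, but the Java memory model no longer guarantees it.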

4

Author: KoVadim, 2014-08-19 12:29:25

You should break this stream of thoughts into separate questions. For now, the simplest answer that can be given is: volatile is not related to the cache.

Well, I mean, a regular variable is stored in the cache for quick access?

No. The right way to think about it is that cache and memory are a single place.

4

Author: ,

score 11 · Accepted Answer

That is, if many threads have changed it, each one has its own cached copy, and a read will not return what you need.

On a read, each processor sees its local copy of the "variable in memory" (from its cache), but on a write to such a variable it tells the other processors that they must update the corresponding cache line in their caches before reading it again.

This behavior is usually provided by the cache coherence property. Some processors have it, and for others you have to adapt the code to keep the data consistent. And volatile alone is not always enough here (on ARM, for example).

P.S.: again, this has nothing to do with volatile. Roughly speaking, it is a property of shared memory in the SMP architecture.

In this architecture there is no place for caches and the other tedious details that you write about with such enviable persistence.

@Barmaley, I've thought a bit more about "there is no place for caches, etc.", and I still come to the conclusion that without this knowledge you can hardly count on adequate performance from your code on SMP, even in Java. Here is an example from real life, with arrays.

But when I was looking for this material, I was actually thinking not about some peculiarity of how arrays are laid out, but about plain parallel processing of an ordinary array: split the array into n parts and process each part in its own thread (with some share of writes to the array, of course, e.g. "parallel data filling").
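The "parallel data filling" scenario could look roughly like the sketch below. The class name, the method name, and the fill rule (data[i] = i * 2) are assumptions made for illustration only. Each thread owns one contiguous slice, and Thread.join() makes the writes visible to the caller, so no volatile is needed here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names and the fill rule are illustrative only.
class ParallelFillDemo {
    /** Fills data[i] = i * 2 using nThreads threads, each writing its own contiguous slice. */
    static int[] fill(int length, int nThreads) {
        int[] data = new int[length];
        List<Thread> threads = new ArrayList<>();
        int chunk = (length + nThreads - 1) / nThreads;  // ceiling division
        for (int t = 0; t < nThreads; t++) {
            final int from = Math.min(length, t * chunk);
            final int to = Math.min(length, from + chunk);
            Thread worker = new Thread(() -> {
                for (int i = from; i < to; i++) {
                    data[i] = i * 2;                     // slices never overlap
                }
            });
            threads.add(worker);
            worker.start();
        }
        for (Thread worker : threads) {
            try {
                worker.join();   // join() establishes happens-before with the worker's writes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return data;
    }

    public static void main(String[] args) {
        int[] result = fill(1_000, 4);
        System.out.println("last element: " + result[result.length - 1]);
    }
}
```

The result is correct either way. What the cache affects is speed: elements at the edges of neighboring slices can share one cache line, and with smaller or interleaved slices this happens far more often.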

If you do not take into account that a neighboring processor can evict your cache line, and the cache miss that results, you will not get adequate results for it (as far as I know, Java cannot yet read the programmer's mind and add padding up to the cache line boundary for arrays that it decides to parallelize this way).
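Manual padding can be sketched as follows, assuming a 64-byte cache line (8 long values); the real line size depends on the hardware. The class and its names are hypothetical. Each thread increments its own counter, stored either in adjacent array slots (stride 1) or in slots spaced one assumed cache line apart (stride 8).

```java
// Hypothetical demo class; the 64-byte cache line is an assumption, not a guarantee.
class PaddedCountersDemo {
    static final int STRIDE = 8;  // 8 longs * 8 bytes = 64 bytes, the assumed cache line size

    /** Each thread increments its own slot; slots are `stride` longs apart. */
    static long[] count(int nThreads, int stride, long iterations) {
        long[] slots = new long[nThreads * stride];
        Thread[] threads = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int index = t * stride;
            threads[t] = new Thread(() -> {
                for (long i = 0; i < iterations; i++) {
                    slots[index]++;           // only this thread touches this slot
                }
            });
            threads[t].start();
        }
        long[] result = new long[nThreads];
        for (int t = 0; t < nThreads; t++) {
            try {
                threads[t].join();            // makes the worker's writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            result[t] = slots[t * stride];
        }
        return result;
    }

    public static void main(String[] args) {
        for (int stride : new int[] {1, STRIDE}) {
            long start = System.nanoTime();
            count(4, stride, 50_000_000L);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("stride " + stride + ": " + elapsedMs + " ms");
        }
    }
}
```

Both variants produce the same counts. Whether the padded variant is measurably faster depends on the CPU and on how the JIT compiles the loop (it may keep the counter in a register), so the timing printed by main is only indicative.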