Some optimizing compilers are capable of automatically restructuring loops to avoid this problem. ![]() As a result, the code is generating 8 times as much traffic as necessary. By the time the code got around to working on row 2 of the matrix, the A(2,1) entry might well be flushed out of the cache to make way for other needed data. The access to A(1,2) in the second iteration would then bring another 8 entries in from main memory, and so on. , A(8,1) into the cache from main memory, but 7/8 of these entries wouldn't be used. In this case, the first access would bring A(1,1), A(2,1). If in the alternative, we structure the loops so that the column index varies in the innermost loop, then the entries of A would be accessed in the order A(1,1), A(1,2), A(1,3). The next 8 iterations of the innermost loop work on this data without any additional main memory transfers. , A(8,1) into the cache from main memory. When the first entry A(1,1) is accessed, the system will bring a cache line containing A(1,1), A(2,1). If the loops are nested so that the innermost loop updates the row subscript, then the array entries will be accessed in the order A(1,1), A(2,1), A(3,1). Now, suppose that $A$ is an array with 10,000 rows and columns, and I'm looping over all of the entries. (In practice the cache architecture is now quite complicated with as many as 3 or 4 levels of cache memory, but the basic idea can be explained with a one-level cache of the sort that computers had in my younger days.) These chunks of data are temporarily stored in a fast memory cache and written back out as needed. The second important thing that you need to understand is that modern processors don't access memory one location at a time, but rather load and store "cache lines" of 64 or even 128 contiguous bytes (8 or 16 double precision floating point numbers) at a time from memory. This choice of column major order is arbitrary- we could just easily adopt a "row major order" convention, and in fact that's what is done in C and some other programming languages. ![]() if A is a 2 by 3 by 10 matrix, then the entries will be stored in memory in the order There are two key things that you need to understand.įirst, MATLAB (and Fortran, but not C and most other programming languages) stores arrays in memory in "column major order." e.g. ![]() A somewhat longer answer that explains why it's more efficient to have the left most index varying most rapidly.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |