PIO Cache Coherency Issue on Cortex-A9


Phenomenon:

1) With the new sdhci 2.0 driver on the Prima-II FPGA, a PIO read of the SD/MMC MBR returns a wrong magic number and the kernel prints “unknown partition table”. The sdhci 1.0 driver shows no error. DMA works with both drivers.

2) After porting sdhci 2.0 to the Prima EVB, both PIO and DMA work.

3) Building the kernel with the I-Cache and D-Cache alternately enabled and disabled gives the results below:

I-Cache   D-Cache   Result
OK        OK        OK
NO        NO        OK
OK        NO        OK
NO        OK        NO

This means the D-Cache causes the issue. A simple memory test in U-Boot passes.

Common Knowledge:

Cacheline

I-Cache

D-Cache

Cache alias

Page cache

Analysis and Research:

Cache type

An alias means several VAs refer to one PA (VAs<-->PA), and an ambiguity means one VA maps to several PAs (VA<-->PAs). Both are handled by flushing the cache or by disabling it.

There are four types of cache: VIVT, VIPT, PIPT and PIVT (not currently used). Prima-II has a 32KB PIPT D-Cache and a 32KB aliasing VIPT I-Cache.

Physically indexed, physically tagged (PIPT) caches use the physical address for both the index and the tag. While this is simple and avoids problems with aliasing, it is also slow, as the physical address must be looked up (which could involve a TLB miss and access to main memory) before that address can be looked up in the cache.

Virtually indexed, virtually tagged (VIVT) caches use the virtual address for both the index and the tag. This caching scheme can result in much faster lookups, since the MMU doesn't need to be consulted first to determine the physical address for a given virtual address. However, VIVT suffers from aliasing problems, where several different virtual addresses may refer to the same physical address. The result is that such addresses would be cached separately despite referring to the same memory, causing coherency problems. Another problem is homonyms, where the same virtual address maps to several different physical addresses. It is not possible to distinguish these mappings by only looking at the virtual index, though potential solutions include: flushing the cache after a context switch, forcing address spaces to be non-overlapping, tagging the virtual address with an address space ID (ASID), or using physical tags. Additionally, there is a problem that virtual-to-physical mappings can change, which would require flushing cache lines, as the VAs would no longer be valid.

Virtually indexed, physically tagged (VIPT) caches use the virtual address for the index and the physical address in the tag. The advantage over PIPT is lower latency, as the cache line can be looked up in parallel with the TLB translation; however, the tag can't be compared until the physical address is available. The advantage over VIVT is that since the tag has the physical address, the cache can detect homonyms. VIPT requires more tag bits, as the index bits no longer represent the same address.
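Whether a virtually indexed cache can alias depends on its geometry: if one way of the cache spans more than one page, some index bits come from the virtual page number, and two VAs for the same PA can select different sets. The following is a minimal sketch of that check. The struct and function names are hypothetical, and the 4-way associativity, 32-byte line and 4KB page used in the usage note are assumptions that the text above does not state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Geometry of a set-associative cache (illustrative type, not a kernel one). */
struct cache_geom {
    size_t size_bytes;  /* total capacity */
    size_t ways;        /* associativity */
    size_t line_bytes;  /* cache line length */
};

/* Bytes covered by one way, i.e. the span of the index + line-offset bits. */
size_t way_size(const struct cache_geom *g)
{
    return g->size_bytes / g->ways;
}

/* Number of sets selected by the index bits. */
size_t num_sets(const struct cache_geom *g)
{
    return way_size(g) / g->line_bytes;
}

/* A virtually indexed cache can alias when the index bits reach above
 * the page offset, i.e. when one way is larger than a page. */
bool vipt_can_alias(const struct cache_geom *g, size_t page_bytes)
{
    return way_size(g) > page_bytes;
}

/* Number of page "colours": how many different index ranges one
 * physical page can land in, depending on the VA used to reach it. */
size_t cache_colours(const struct cache_geom *g, size_t page_bytes)
{
    size_t w = way_size(g);
    return w > page_bytes ? w / page_bytes : 1;
}
```

With an assumed 32KB, 4-way cache and 4KB pages, way_size() is 8KB, so vipt_can_alias() is true and there are two colours. An assumed 16KB, 4-way cache has a 4KB way and cannot alias.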

Physically indexed, virtually tagged caches are only theoretical as they would basically be useless.

Page Cache

TBD

Harvard Architecture

The Harvard architecture is a computer architecture with physically separate storage and signal pathways for instructions and data. The term originated from the Harvard Mark I relay-based computer, which stored instructions on punched tape (24 bits wide) and data in electro-mechanical counters. These early machines had limited data storage, entirely contained within the central processing unit, and provided no access to the instruction storage as data. Programs needed to be loaded by an operator; the processor could not boot itself.

Cache coherency breaks down in three situations:

1) In a system with a DMA controller that reads memory locations that are held in the data cache of a processor, a breakdown of coherency occurs when the processor has written new data in the data cache, but the DMA controller reads the old data held in memory.

2) In a Harvard architecture of caches, a breakdown of coherency occurs when new instruction data has been written into the data cache, but the instruction cache still contains the old instruction data. This is the I-D cache incoherency issue.

3) In an aliasing cache, one PA is mapped to several VAs, so the same data can sit in several cache lines with different tags. A breakdown occurs when the data is updated through one VA but stale data is read through another VA. An example is a buffer allocated with kmalloc at VA1 and later mapped to user space at VA2.
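The first situation can be reproduced in a toy model: one word of memory behind a single write-back cache line. This is only an illustrative sketch. All names are hypothetical, and the model ignores everything a real cache does apart from holding dirty data until it is cleaned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy system: one word of main memory fronted by one write-back line. */
struct toy_system {
    uint32_t memory;      /* what a DMA controller sees */
    uint32_t cache_data;  /* what the CPU sees while the line is valid */
    bool     valid;
    bool     dirty;
};

/* CPU store: updates the cache line only (write-back policy). */
void cpu_write(struct toy_system *s, uint32_t value)
{
    s->cache_data = value;
    s->valid = true;
    s->dirty = true;
}

/* CPU load: fills the line from memory on a miss. */
uint32_t cpu_read(struct toy_system *s)
{
    if (!s->valid) {
        s->cache_data = s->memory;
        s->valid = true;
        s->dirty = false;
    }
    return s->cache_data;
}

/* DMA read: bypasses the cache and goes straight to memory. */
uint32_t dma_read(const struct toy_system *s)
{
    return s->memory;
}

/* Clean: write a dirty line back to memory so other observers see it. */
void cache_clean(struct toy_system *s)
{
    if (s->valid && s->dirty) {
        s->memory = s->cache_data;
        s->dirty = false;
    }
}
```

After cpu_write() the CPU reads the new value, while dma_read() still returns the old memory contents. The two views agree again only after cache_clean().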

Kernel Management (2.6.32)

flush_dcache_page() is used when the kernel has written to a page cache page at virtual address page->virtual. It ensures cache coherency between the kernel mapping and the userspace mapping of this page.

flush_kernel_dcache_page() should only be called for pages that may be mapped to user space (page cache pages); it is not needed for control buffers. So when read_dev_sector() reads mapped pages, the flush is currently a no-op on CPUs with a VIPT non-aliasing data cache.

PG_arch_1: This flag indicates that the page pointed to by a pte is dirty and requires cleaning before it is returned to the user.

In the ARM case with PIPT Harvard caches (new processors), the kernel reading from a page that may be mapped in user space shouldn't need cache flushing. The kernel writing to such a page would require D-cache flushing because of coherency with the I-cache.

We could of course flush the caches every time we get a page fault, but that's far from optimal, especially since DMA-capable drivers do not pollute the D-cache and don't need this extra flushing. Note that the recent ARM processors have PIPT caches, but separate ones for I and D, and it's the PIO drivers that pollute the D-cache.

Community Discussion

Should there be an API for PIO similar to the DMA API?

Solution:

As a temporary fix, flush the D-cache in flush_kernel_dcache_page() even on VIPT non-aliasing caches.

--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -406,7 +406,7 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
 static inline void flush_kernel_dcache_page(struct page *page)
         /* highmem pages are always flushed upon kunmap already */
-        if ((cache_is_vivt() || cache_is_vipt_aliasing()) && !PageHighMem(page))
+        if (!PageHighMem(page))
                 __cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
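The effect of the changed condition can be checked outside the kernel with a small model of the two predicates. This is a sketch only. The struct and its fields are hypothetical stand-ins for cache_is_vivt(), cache_is_vipt_aliasing() and PageHighMem(page), and the functions merely report whether the flush would run.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's cache-type and page tests. */
struct flush_inputs {
    bool dcache_is_vivt;           /* models cache_is_vivt() */
    bool dcache_is_vipt_aliasing;  /* models cache_is_vipt_aliasing() */
    bool page_is_highmem;          /* models PageHighMem(page) */
};

/* Condition before the patch: flush only for VIVT or aliasing VIPT. */
bool flushes_before_patch(const struct flush_inputs *in)
{
    return (in->dcache_is_vivt || in->dcache_is_vipt_aliasing) &&
           !in->page_is_highmem;
}

/* Condition after the patch: flush every lowmem page. */
bool flushes_after_patch(const struct flush_inputs *in)
{
    return !in->page_is_highmem;
}
```

For a lowmem page on a D-cache that is neither VIVT nor aliasing VIPT, flushes_before_patch() is false and flushes_after_patch() is true. That is the case the temporary fix targets. Highmem pages are skipped by both versions.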

Reference:

[PATCH] Fix flush_kernel_dcache_page for VIPT non-aliasing caches

http://www.spinics.net/lists/arm-kernel/msg91343.html

A letter about VIPT cache aliasing

http://hi.baidu.com/systemsoftware/blog/item/23d3ced01892cd80a1ec9cf2.html

Wikipedia: CPU cache

http://en.wikipedia.org/wiki/CPU_cache

ARM Architecture Reference Manual, ARMv7-A and ARMv7-R edition.

Kernel Document: Documentation/cachetlb.txt