Clarifying the concepts of bus address, physical address, and virtual address

Source: Internet  Published by: 北京蓝箭火箭公司知乎  Editor: 程序博客网  Time: 2024/05/16 12:37

Now, on normal PCs the bus address is exactly the same as the physical address, and things are very simple indeed. However, they are that simple because the memory and the devices share the same address space, and that is not generally necessarily true on other PCI/ISA setups.
Now, just as an example, on the PReP (PowerPC Reference Platform), the CPU sees a memory map something like this (this is from memory):
   0-2 GB       "real memory"
   2 GB-3 GB    "system IO" (inb/out and similar accesses on x86)
   3 GB-4 GB    "IO memory" (shared memory over the IO bus)
Now, that looks simple enough. However, when you look at the same thing from the viewpoint of the devices, you have the reverse, and the physical memory address 0 actually shows up as address 2 GB for any IO master. So when the CPU wants any bus master to write to physical memory 0, it has to give the master address 0x80000000 as the memory address.
So, for example, depending on how the kernel is actually mapped on the PPC, you can end up with a setup like this:
physical address:  0
virtual address:    0xC0000000
bus address:       0x80000000

where all the addresses actually point to the same thing.  It's just seen through different translations. Similarly, on the Alpha, the normal translation is
physical address:  0
virtual address:    0xfffffc0000000000
bus address:       0x40000000
(but there are also Alphas where the physical address and the bus address are the same).
Anyway, the way to look up all these translations, you do
   #include <asm/io.h>
   phys_addr = virt_to_phys(virt_addr);
   virt_addr = phys_to_virt(phys_addr);
   bus_addr = virt_to_bus(virt_addr);
   virt_addr = bus_to_virt(bus_addr);
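For the simple linear cases above, each translation amounts to adding or subtracting a fixed offset. The following is a minimal user-space sketch of that arithmetic, not the kernel's implementation. The struct addr_map type and the model_* function names are hypothetical, and the offsets are the PReP and Alpha values quoted earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one platform's linear address layout. */
struct addr_map {
    uint64_t virt_base; /* kernel virtual address of physical address 0 */
    uint64_t bus_base;  /* bus address of physical address 0 */
};

/* Offsets taken from the PReP and Alpha examples in the text. */
static const struct addr_map prep_map  = { 0xC0000000ULL, 0x80000000ULL };
static const struct addr_map alpha_map = { 0xfffffc0000000000ULL, 0x40000000ULL };

static uint64_t model_virt_to_phys(const struct addr_map *m, uint64_t virt)
{
    assert(virt >= m->virt_base); /* only direct-mapped addresses are valid */
    return virt - m->virt_base;
}

static uint64_t model_phys_to_virt(const struct addr_map *m, uint64_t phys)
{
    return phys + m->virt_base;
}

static uint64_t model_virt_to_bus(const struct addr_map *m, uint64_t virt)
{
    return model_virt_to_phys(m, virt) + m->bus_base;
}

static uint64_t model_bus_to_virt(const struct addr_map *m, uint64_t bus)
{
    assert(bus >= m->bus_base); /* below the window there is no RAM to map */
    return model_phys_to_virt(m, bus - m->bus_base);
}
```

With prep_map, model_virt_to_bus(&prep_map, 0xC0000000) gives 0x80000000, which matches the PReP table: one location, three different addresses.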

Now, when do you need these?
There are actually _three_ different ways of looking at memory addresses, and in this case we actually want the third, the so-called "bus address".
Essentially, the three ways of addressing memory are (this is "real memory", that is, normal RAM--see later about other details):
- CPU untranslated.  This is the "physical" address.  Physical address 0 is what the CPU sees when it drives zeroes on the memory bus.

- CPU translated address. This is the "virtual" address, and is completely internal to the CPU itself with the CPU doing the appropriate translations into "CPU untranslated".

- bus address. This is the address of memory as seen by OTHER devices, not the CPU. Now, in theory there could be many different bus addresses, with each device seeing memory in some device-specific way, but happily most hardware designers aren't actually actively trying to make things any more complex than necessary, so you can assume that all external hardware sees the memory the same way.
You want the _virtual_ address when you are actually going to access that pointer from the kernel. So you can have something like this:
   /*
    * this is the hardware "mailbox" we use to communicate with
    * the controller. The controller sees this directly.
    */
   struct mailbox {
      __u32 status;
      __u32 bufstart;
      __u32 buflen;
      ..
   } mbox;

   unsigned char * retbuffer;

   /* get the address from the controller */
   retbuffer = bus_to_virt(mbox.bufstart);
   switch (retbuffer[0]) {
      case STATUS_OK:
         ...
On the other hand, you want the bus address when you have a buffer that you want to give to the controller:
   /* ask the controller to read the sense status into "sense_buffer" */
   mbox.bufstart = virt_to_bus(&sense_buffer);
   mbox.buflen = sizeof(sense_buffer);
   mbox.status = 0;
   notify_controller(&mbox);
And you generally _never_ want to use the physical address, because you can't use that from the CPU (the CPU only uses translated virtual addresses), and you can't use it from the bus master.

So why do we care about the physical address at all? We do need the physical address in some cases, it's just not very often in normal code.  The physical address is needed if you use memory mappings, for example, because the "remap_page_range()" mm function wants the physical address of the memory to be remapped.

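The above only covers "real memory", that is, CPU memory (RAM). There is a completely different type of memory too: memory that lives on the PCI or ISA bus, such as a packet buffer on a network card. This memory is called "PCI memory" or "shared memory" or "IO memory" or whatever, and there is only one way to access it: the readb/writeb and related functions. You should never take the address of such memory, because there is really nothing you can do with such an address: it's not conceptually in the same memory space as "real memory" at all, so you cannot just dereference a pointer.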

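Physical address and bus address

1) The physical address is tied to the CPU. It is what appears on the CPU's address signal lines. A virtual address in a program instruction, after segment mapping and page mapping, yields a physical address, and that physical address is driven onto the CPU's address lines.
2) The bus address, as the name suggests, is tied to the bus: it is the signal on the bus address lines, or the signal produced during the address cycle. Peripherals use bus addresses.
3) The relationship between physical and bus addresses is determined by the system design. On x86, the physical address and the PCI bus address are the same. On other platforms there may be some translation, usually a linear one.

For example, suppose the CPU needs to access the location at physical address 0xfa000. On x86, this produces an access to address 0xfa000 on the PCI bus. That location may be in main memory, or it may be a storage location on some card, or there may be no memory at that address at all. On another platform, the access produced on the PCI bus might instead target address 0x1fa000.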

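Bus address and physical address: worked examples
On PPC processors, memory as seen by the CPU and the address seen by a PCI device may differ, so virt_to_bus and bus_to_virt are defined as: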
/* the local DRAM has a different
 * address from the PCI point of view, thus buffer addresses also
 * have to be modified [mapped] appropriately.
 */
extern inline unsigned long virt_to_bus(volatile void * address)
{
#ifndef CONFIG_APUS
      if (address == (void *)0)
               return 0;
      return (unsigned long)address - KERNELBASE + PCI_DRAM_OFFSET;
#else
      return iopa ((unsigned long) address);
#endif
}

extern inline void * bus_to_virt(unsigned long address)
{
#ifndef CONFIG_APUS
      if (address == 0)
               return NULL;
      return (void *)(address - PCI_DRAM_OFFSET + KERNELBASE);
#else
      return (void*) mm_ptov (address);
#endif
}

ARM:
On the S3C2410 platform, the physical address and the bus address are identical:
/*
 * Virtual <-> DMA view memory address translations
 * Again, these are *only* valid on the kernel direct mapped RAM
 * memory.  Use of these is *deprecated* (and that doesn't mean
 * use the __ prefixed forms instead.)  See dma-mapping.h.
 */
static inline __deprecated unsigned long virt_to_bus(void *x)
{
      return __virt_to_bus((unsigned long)x);
}

static inline __deprecated void *bus_to_virt(unsigned long x)
{
      return (void *)__bus_to_virt(x);
}

/*
 * These are exactly the same on the S3C2410 as the
 * physical memory view.
 */

#define __virt_to_bus(x) __virt_to_phys(x)
#define __bus_to_virt(x) __phys_to_virt(x)

On the arch-ixp23xx platform, the relationship between bus addresses and physical addresses is as follows:
#define __virt_to_bus(v)                                          \
      ({ unsigned int ret;                                        \
      ret = ((__virt_to_phys(v) - 0x00000000) +                   \
      (*((volatile int *)IXP23XX_PCI_SDRAM_BAR) & 0xfffffff0));   \
      ret; })

#define __bus_to_virt(b)                                          \
      ({ unsigned int data;                                       \
      data = *((volatile int *)IXP23XX_PCI_SDRAM_BAR);            \
      __phys_to_virt((((b - (data & 0xfffffff0)) + 0x00000000))); })
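The ixp23xx macros boil down to adding the base held in the PCI SDRAM BAR to the physical address, after masking off the low four bits of the register, which in a PCI memory BAR are flag bits rather than address bits. A rough user-space model of that arithmetic follows. The register value is made up, and fake_pci_sdram_bar and the model_* names are hypothetical stand-ins for the real register read.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the register read through IXP23XX_PCI_SDRAM_BAR (hypothetical value). */
static volatile uint32_t fake_pci_sdram_bar = 0x40000008u;

/* The low four bits of a PCI memory BAR carry flags, not address bits. */
#define BAR_ADDR_MASK 0xfffffff0u

/* Physical -> bus: add the base address programmed into the BAR. */
static uint32_t model_phys_to_bus(uint32_t phys)
{
    return phys + (fake_pci_sdram_bar & BAR_ADDR_MASK);
}

/* Bus -> physical: subtract the same base again. */
static uint32_t model_bus_to_phys(uint32_t bus)
{
    uint32_t base = fake_pci_sdram_bar & BAR_ADDR_MASK;

    assert(bus >= base); /* addresses below the window do not map to SDRAM */
    return bus - base;
}
```

Because the base is read from a register at run time, the offset on this platform is not a compile-time constant, unlike the S3C2410 case.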
For the arch-iop3xx platform:

/*
 * Virtual view <-> PCI DMA view memory address translations
 * virt_to_bus: Used to translate the virtual address to an
 *             address suitable to be passed to set_dma_addr
 * bus_to_virt: Used to convert an address for DMA operations
 *             to an address that the kernel can use.
 */
#if defined(CONFIG_ARCH_IOP321)

#define __virt_to_bus(x) (((__virt_to_phys(x)) & ~(*IOP321_IATVR2)) | ((*IOP321_IABAR2) & 0xfffffff0))
#define __bus_to_virt(x) (__phys_to_virt(((x) & ~(*IOP321_IALR2)) | ( *IOP321_IATVR2)))

#elif defined(CONFIG_ARCH_IOP331)

#define __virt_to_bus(x) (((__virt_to_phys(x)) & ~(*IOP331_IATVR2)) | ((*IOP331_IABAR2) & 0xfffffff0))
#define __bus_to_virt(x) (__phys_to_virt(((x) & ~(*IOP331_IALR2)) | ( *IOP331_IATVR2)))

#endif
So the relationship between bus addresses and physical addresses appears to depend directly on the platform design. Typical ARM handheld devices have no PCI interface, so presumably their bus addresses and physical addresses are the same?

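1. Allocating and freeing DMA buffers
A region of memory used to exchange data with a peripheral is called a DMA buffer. If the device does not support scatter/gather (SG) operations, the DMA buffer must be physically contiguous.
For ISA devices, DMA can only be performed on memory below 16 MB. Therefore, when allocating a DMA buffer with kmalloc(), __get_free_pages(), or similar functions, the GFP_DMA flag should be used, which guarantees that the memory obtained is DMA-capable. The kernel defines __get_dma_pages() as a DMA "shortcut" for __get_free_pages(); it adds GFP_DMA to the allocation flags: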
#define __get_dma_pages(gfp_mask, order) \
               __get_free_pages((gfp_mask) | GFP_DMA,(order))
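If you would rather not pass the log2 size (the order) as the argument when allocating DMA memory, you can use another function, dma_mem_alloc(), whose source is shown in Listing 11.17.
Listing 11.17 The dma_mem_alloc() function
static unsigned long dma_mem_alloc(int size)
{
   int order = get_order(size); /* size -> order */
   return __get_dma_pages(GFP_KERNEL, order);
}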
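To make the size-to-order conversion concrete, here is a small user-space sketch of what a get_order()-style function computes: the smallest order for which a block of 2^order pages covers the requested size. It is an illustration rather than the kernel's code; model_get_order and the MODEL_* constants are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12                              /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Smallest order such that (MODEL_PAGE_SIZE << order) >= size. */
static int model_get_order(size_t size)
{
    int order = 0;
    size_t pages;

    assert(size > 0);
    /* Round the byte count up to a whole number of pages. */
    pages = (size + MODEL_PAGE_SIZE - 1) >> MODEL_PAGE_SHIFT;
    /* Find the first power of two that covers that many pages. */
    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

Under these assumptions a 20000-byte request needs 5 pages and is rounded up to order 3 (8 pages), so order-based allocation can waste a noticeable amount of memory for awkward sizes.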
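DMA-based hardware uses bus addresses rather than physical addresses. On a PC, the bus address is the same as the physical address for both ISA and PCI, but that is not true on every platform. Sometimes the interface bus is connected through bridge circuitry, and the bridge maps I/O addresses to different physical addresses. Some systems also provide a page-mapping mechanism that can map arbitrary pages to contiguous addresses on the peripheral bus. The kernel provides the following functions for simple conversion between virtual addresses and bus addresses:
unsigned long virt_to_bus(volatile void *address);
void *bus_to_virt(unsigned long address);
Where an IOMMU or bounce buffers must be used, the functions above generally do not work correctly. Moreover, using these two functions is discouraged. As Figure 11.13 shows, an IOMMU works much like the MMU inside the CPU, except that it translates between peripheral bus addresses and memory addresses. Because an IOMMU lets a peripheral see "virtual addresses", once the mapping registers have been programmed, the separate buffer segments of an SG list can be made to appear contiguous to the peripheral.
Figure 11.13 MMU and IOMMU
A device cannot necessarily perform DMA on every memory address. In that case, the DMA address mask should be set with the following function:
int dma_set_mask(struct device *dev, u64 mask);
For example, a device that can only perform DMA on 24-bit addresses should call dma_set_mask(dev, 0xffffff).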
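The mask is simply a value with the low N bits set, and a buffer is usable only if every byte of it lies at or below that value. The user-space sketch below illustrates the idea; it mirrors the intent of the kernel's DMA_BIT_MASK() macro but is not kernel code, and the model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `bits` bits set, e.g. 24 -> 0xffffff. */
static uint64_t model_dma_bit_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined, so handle that case apart. */
    return bits == 64 ? ~0ULL : ((1ULL << bits) - 1);
}

/* Returns 1 if the whole buffer [addr, addr + len) is addressable under `mask`. */
static int model_dma_reachable(uint64_t addr, uint64_t len, uint64_t mask)
{
    assert(len > 0);
    /* Written without computing addr + len, which could overflow. */
    return addr <= mask && len - 1 <= mask - addr;
}
```

With a 24-bit mask, a 4 KiB buffer at 0xfff000 is still reachable because its last byte is 0xffffff, while anything starting at 0x1000000 is not. That is the ISA 16 MB limit mentioned above.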
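DMA mapping involves two tasks: allocating a DMA buffer, and generating an address for that buffer that the device can access. DMA mapping must also take cache coherency into account. The kernel provides the following function to allocate a DMA-coherent memory region:
void * dma_alloc_coherent(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp);
The return value of this function is the virtual address of the allocated DMA buffer. In addition, the function returns the bus address of the buffer through the handle parameter. handle has type dma_addr_t, which represents a bus address.
dma_alloc_coherent() allocates a DMA buffer, sets up the address mapping, and guarantees cache coherency for the buffer. The corresponding release function is:
void dma_free_coherent(struct device *dev, size_t size, void *cpu_addr, dma_addr_t handle);
The following function allocates a write-combining DMA buffer:
void * dma_alloc_writecombine(struct device *dev, size_t size, dma_addr_t *handle, gfp_t gfp);
The release "function" matching dma_alloc_writecombine(), dma_free_writecombine(), is in fact just dma_free_coherent(), because it is defined as:
#define dma_free_writecombine(dev,size,cpu_addr,handle) \
   dma_free_coherent(dev,size,cpu_addr,handle)
In addition, the Linux kernel provides pci_alloc_consistent() for PCI devices to allocate DMA buffers. Its prototype is:
void * pci_alloc_consistent(struct pci_dev *pdev, size_t size, dma_addr_t *dma_addrp);
The corresponding release function is pci_free_consistent(), with the prototype:
void pci_free_consistent(struct pci_dev *pdev, size_t size, void *cpu_addr, dma_addr_t dma_addr);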
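Compared with coherent DMA mappings, the interface for streaming DMA mappings is more complex. For a single buffer that has already been allocated, dma_map_single() creates a streaming DMA mapping. Its prototype is:
dma_addr_t dma_map_single(struct device *dev, void *buffer, size_t size,
                          enum dma_data_direction direction);
If the mapping succeeds, the bus address is returned. On failure an error value is returned; because the return type is dma_addr_t rather than a pointer, the result should be checked with dma_mapping_error() instead of being compared against NULL. The fourth parameter is the DMA direction; possible values include DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_BIDIRECTIONAL, and DMA_NONE.
The "inverse" of dma_map_single() is dma_unmap_single(), with the prototype:
void dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
                      enum dma_data_direction direction);
Normally, a device driver should not access a streaming DMA buffer while it is still mapped, that is, before it has been unmapped. If it really must, it can first take ownership of the DMA buffer with the following function:
void dma_sync_single_for_cpu(struct device *dev, dma_addr_t bus_addr,
                             size_t size, enum dma_data_direction direction);
Once the driver has finished accessing the DMA buffer, it should hand ownership back to the device with the following function:
void dma_sync_single_for_device(struct device *dev, dma_addr_t bus_addr,
                                size_t size, enum dma_data_direction direction);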
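If a device needs a large DMA buffer and supports SG mode, allocating several non-contiguous, relatively small DMA buffers is the usual way to avoid requesting one overly large physically contiguous region. In the Linux kernel, the following function maps an SG list:
int dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
               enum dma_data_direction direction);
nents is the number of entries in the scatterlist. The return value is the number of DMA buffers, which may be less than nents. For each entry in the scatterlist, dma_map_sg() generates an appropriate bus address for the device, and it merges memory regions that are physically adjacent.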
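The reason the return value can be smaller than nents is that adjacent entries may be merged. The user-space sketch below models only that merging idea, with a deliberately simplified segment type. It is not how dma_map_sg() is implemented, real merging also depends on the platform and any IOMMU, and struct model_seg and model_merge_segs are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified segment: a physical start address and a length. */
struct model_seg {
    uint64_t addr;
    uint32_t len;
};

/*
 * Merge neighbouring segments that are physically contiguous, in place.
 * Returns the number of segments left, which is never more than nents.
 */
static int model_merge_segs(struct model_seg *segs, int nents)
{
    int out = 0; /* index of the segment currently being extended */
    int i;

    assert(nents >= 0);
    if (nents == 0)
        return 0;

    for (i = 1; i < nents; i++) {
        if (segs[out].addr + segs[out].len == segs[i].addr)
            segs[out].len += segs[i].len; /* contiguous: extend the current segment */
        else
            segs[++out] = segs[i];        /* gap: start a new segment */
    }
    return out + 1;
}
```

A driver would then program the device with the returned count of segments, not with the original nents.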
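The scatterlist structure is defined as shown in Listing 11.18. It contains a pointer to the page structure for the entry, the offset of the buffer within the page (offset), the buffer length (length), and the bus address (dma_address).
Listing 11.18 The scatterlist structure
struct scatterlist
{
   struct page *page;
   unsigned int offset;
   dma_addr_t dma_address;
   unsigned int length;
};
After dma_map_sg() has run, sg_dma_address() returns the bus address of the buffer for a scatterlist entry, and sg_dma_len() returns the length of that buffer. Their prototypes are:
dma_addr_t sg_dma_address(struct scatterlist *sg);
unsigned int sg_dma_len(struct scatterlist *sg);
When the DMA transfer is complete, the mapping can be removed with dma_unmap_sg(), the inverse of dma_map_sg():
void dma_unmap_sg(struct device *dev, struct scatterlist *list,
                  int nents, enum dma_data_direction direction);
SG mappings are streaming DMA mappings. As with a streaming mapping of a single buffer, if the driver must access the SG buffers while they are mapped, it should first call:
void dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
                         int nents, enum dma_data_direction direction);
After the access, ownership is returned to the device with:
void dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
                            int nents, enum dma_data_direction direction);
Linux also offers a relatively simple way to preallocate a buffer: reserving memory with the "mem=" parameter. For example, on a system with 64 MB of memory, passing mem=62MB on the kernel command line reserves the top 2 MB for use as I/O memory. This 2 MB can be statically mapped (Section 11.5) or mapped with ioremap().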
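The physical range that ends up reserved follows directly from the total RAM size and the "mem=" limit. Here is a small user-space sketch of that arithmetic, assuming RAM starts at physical address 0 as in the 64 MB example; struct model_region and model_reserved_top are hypothetical names, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical description of the region left outside the kernel's view. */
struct model_region {
    uint64_t base; /* physical start of the reserved area */
    uint64_t size; /* length of the reserved area in bytes */
};

/* Everything between the "mem=" limit and the end of RAM is reserved. */
static struct model_region model_reserved_top(uint64_t ram_base,
                                              uint64_t ram_size,
                                              uint64_t mem_limit)
{
    struct model_region r;

    assert(mem_limit <= ram_size);
    r.base = ram_base + mem_limit;
    r.size = ram_size - mem_limit;
    return r;
}
```

For the example in the text, model_reserved_top(0, 64 * MiB, 62 * MiB) describes a 2 MB region starting at physical address 0x3E00000, which is the range a driver would hand to ioremap().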

0 0