Introduction to hwloc

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 07:03
http://www.open-mpi.org/projects/hwloc/doc/hwloc-v1.9.1-letter.pdf


1.4 Programming Interface
Each hwloc object contains a cpuset describing the list of processing units that it contains. These bitmaps may be used for CPU binding and memory binding.

Note: hardware topologies vary widely, and operating systems differ in their support for CPU and memory binding.
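
To make the cpuset idea concrete, here is a minimal sketch that models a bitmap as a plain Python integer, one bit per processing unit. It does not use the hwloc API; the helper names `cpuset_from_indexes` and `cpuset_to_indexes` are hypothetical, and the hex output is only loosely modeled on how hwloc prints short bitmaps. The PU numbers are those of NUMANode P#0 and P#1 in the lstopo example at the end of these notes.

```python
def cpuset_from_indexes(indexes):
    """Build a bitmap (a Python int) with one bit set per PU index."""
    mask = 0
    for i in indexes:
        mask |= 1 << i
    return mask


def cpuset_to_indexes(mask):
    """List the PU indexes whose bits are set, in increasing order."""
    return [i for i in range(mask.bit_length()) if (mask >> i) & 1]


# Physical PU numbers taken from the lstopo example in these notes.
node0 = cpuset_from_indexes(range(0, 32, 4))  # P#0, P#4, ..., P#28
node1 = cpuset_from_indexes(range(1, 32, 4))  # P#1, P#5, ..., P#29

print(f"0x{node0:08x}")                  # cpuset of NUMANode P#0
print(f"0x{node0 | node1:08x}")          # union: both nodes together
print(cpuset_to_indexes(node0 & node1))  # intersection: no shared PU
```

Set operations on cpusets (union, intersection) become bitwise operations on the integers, which is why a bitmap is a convenient representation for binding masks.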


---------------------------TERMS AND DEFINITIONS

Object, a kind of part of the system, such as a Core, a Cache, a Memory node ...


CPU set, the set of logical processors logically included in an object, always expressed using the physical (OS) numbers of the logical processors


Node set, the set of NUMA memory nodes logically included in an object, always expressed using physical node numbers


Bitmap, a possibly infinite set of bits used for describing sets of objects, such as CPUs (CPU sets) or memory nodes (Node sets)


Parent object, the object logically containing the current object


Arity, the number of children of an object


Sibling objects, objects which have the same parent


Sibling rank, the index that uniquely identifies an object among the objects that have the same parent; it is always in the range [0, parent_arity)


Cousin objects, objects of the same type (and depth) as the current object, even if they do not have the same parent


Level, the set of objects of the same type and depth; all these objects are cousins


Depth, nesting level in the object tree


OS/physical index, the index that the OS uses to identify the object; it may be completely arbitrary and non-unique


Logical index, the index that uniquely identifies objects of the same type and depth, automatically computed by hwloc according to the topology


Processing Unit (PU), the smallest processing element that can be represented by a hwloc object; it may be, for example, a single-core processor or a single hardware thread
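
The terms above can be tried out on a toy object tree. The following is only an illustrative model, not the hwloc data structure: the class `Obj`, the function `level`, and the 2-socket machine are all made up for this sketch. It shows arity, sibling rank and depth, and how the logical index (position within a level) differs from the OS index, which here is deliberately non-unique for cores and interleaved for PUs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Obj:
    type: str
    os_index: int = 0
    children: List["Obj"] = field(default_factory=list)
    parent: Optional["Obj"] = None

    def add(self, child):
        """Attach a child object and return it."""
        child.parent = self
        self.children.append(child)
        return child

    @property
    def arity(self):
        """Number of children."""
        return len(self.children)

    @property
    def sibling_rank(self):
        """Position among the objects sharing the same parent."""
        if self.parent is None:
            return 0
        return next(i for i, c in enumerate(self.parent.children) if c is self)

    @property
    def depth(self):
        """Nesting level in the tree; the root is at depth 0."""
        return 0 if self.parent is None else self.parent.depth + 1


def level(root, depth):
    """All objects at the given depth, left to right (they are cousins)."""
    objs = [root]
    for _ in range(depth):
        objs = [c for o in objs for c in o.children]
    return objs


# Hypothetical machine: 2 sockets x 2 cores x 1 PU, with OS-style numbering.
machine = Obj("Machine")
for s in range(2):
    socket = machine.add(Obj("Socket", s))
    for c in range(2):
        core = socket.add(Obj("Core", c))  # core OS indexes repeat per socket
        core.add(Obj("PU", s + 2 * c))     # PU OS indexes are interleaved

for logical, core in enumerate(level(machine, 2)):
    print(f"Core L#{logical} P#{core.os_index} sibling_rank={core.sibling_rank}")
for logical, pu in enumerate(level(machine, 3)):
    print(f"PU L#{logical} P#{pu.os_index} depth={pu.depth}")
```

The core level has OS indexes 0, 1, 0, 1, so the OS index alone cannot identify a core, while the logical index 0 to 3 can.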


-----------------------topology-------------------


Machine Level { .depth = 0 }

Socket Level { .depth = 1 }

Cache Level { .depth = 2 }

Core Level { .depth = 3 }

PU Level { .depth = 4 }


---------------------------------------------------


COMMAND LINE TOOLS


lstopo, displays the hierarchical topology map of the current system. The output may be graphical or textual and can be exported to formats such as PDF, PNG and XML. It can also display the processes currently bound to a part of the machine (--ps)


hwloc-bind, binds processes to specific hardware objects through a flexible syntax


hwloc-calc, creates bitmap strings to pass to hwloc-bind


hwloc-info, dumps information about the given objects


hwloc-distrib, generates a set of bitmap strings that are uniformly distributed across the machine for the given number of processes


hwloc-ps, displays the bindings of processes that are currently running on the local machine


hwloc-distances, displays all distance matrices attached to the topology


hwloc-annotate, adds object attributes such as string information; it reads an input topology from an XML file and outputs the annotated topology as another XML file


hwloc-diff, computes the difference between two topologies and outputs it to another XML file


hwloc-patch, reads such a difference file and applies it to another topology


----------- CPU & Memory Binding Overview -------------


-------------IO Devices----------
hwloc usually manipulates processing units and memory, but it can also discover IO devices and report their locality. This is especially useful for placing IO-intensive applications on cores near the IO devices they use.


----------MULTI-NODE TOPOLOGIES-----------
hwloc is usually used for consulting and manipulating single-machine topologies. This includes large systems, as long as a single instance of the OS manages the entire system. However, it is sometimes desirable to have multiple independent hosts inside the same topology, so hwloc offers the ability to aggregate multiple host topologies into a single global one.


---------------OBJECT ATTRIBUTES---------------

OSName, OSRelease, OSVersion, HostName, Architecture(Machine Object)


Backend (Machine object or topology root object), the name of the hwloc backend/component that filled the topology. If several components were combined, multiple Backend keys may exist, e.g. x86, Linux, pci


LinuxCgroup, the name of the Linux control group where the calling process is placed


SyntheticDescription(topology root object), the description string that was given to hwloc to build this synthetic topology


CPUModel(Socket or Machine),  the processor model name, usually added to Socket objects


CPUType (Socket), a Solaris-specific general processor type name, e.g. "i86pc"


CPUVendor, CPUModelNumber, CPUFamilyNumber


CPURevision


PlatformName PlatformModel...


SystemVersionRegister,...


PCIVendor, PCIDevice


CoProcType e.g. MIC CUDA OpenCL


GPUVendor, ...


OpenCLDeviceType, OpenCLPlatformName


CUDAGlobalMemorySize, ..


MICSerialNumber, MICFamily, ...


DMIBoardVendor, DMIBoardName, .


Address, Port(Network interface OS devices), MAC address and the port number of a software network interface, e.g. eth4


Example


>> lstopo -p
Machine (256GB)
  NUMANode P#0 (64GB) + Socket P#0 + L3 (24MB)
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#4
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#8
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8 + PU P#12
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#17 + PU P#16
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#18 + PU P#20
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#24 + PU P#24
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#25 + PU P#28
  NUMANode P#1 (64GB) + Socket P#1 + L3 (24MB)
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#1
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#5
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#9
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8 + PU P#13
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#17 + PU P#17
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#18 + PU P#21
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#24 + PU P#25
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#25 + PU P#29
  NUMANode P#2 (64GB) + Socket P#2 + L3 (24MB)
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#2
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#6
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#10
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8 + PU P#14
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#17 + PU P#18
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#18 + PU P#22
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#24 + PU P#26
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#25 + PU P#30
  NUMANode P#3 (64GB) + Socket P#3 + L3 (24MB)
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#3
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#7
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#2 + PU P#11
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#8 + PU P#15
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#17 + PU P#19
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#18 + PU P#23
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#24 + PU P#27
    L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#25 + PU P#31
  HostBridge P#0
    PCIBridge
      PCI 1000:0072
        Block "sda"
        Block "sdb"
        Block "sdc"
        Block "sdd"
    PCIBridge
      PCI 14e4:1639
        Net "em1"
      PCI 14e4:1639
        Net "em2"
    PCIBridge
      PCI 102b:0532
    PCI 8086:3a20
      Block "sr0"
  HostBridge P#1
    PCIBridge
      PCI 8086:10fb
        Net "p3p1"
      PCI 8086:10fb
        Net "p3p2"
    PCIBridge
      PCI 1077:7322
        Net "ib0"
        OpenFabrics "qib0"
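
The P# numbers in this output are physical (OS) indexes. As a rough illustration of how they relate to logical indexes, the sketch below reads an abridged copy of the output and numbers the PUs in their order of appearance, which is the order of the PU level in the tree. The function `physical_to_logical` and the `SAMPLE` string are made up for this example, and because the sample keeps only two PUs per node, the logical numbers it yields are not those of the full machine.

```python
import re

# Abridged copy of the lstopo -p output above (two PUs per NUMA node).
SAMPLE = """\
NUMANode P#0 (64GB) + Socket P#0 + L3 (24MB)
  L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#0
  L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#4
NUMANode P#1 (64GB) + Socket P#1 + L3 (24MB)
  L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#0 + PU P#1
  L2 (256KB) + L1d (32KB) + L1i (32KB) + Core P#1 + PU P#5
"""


def physical_to_logical(text):
    """Map each physical PU index to its position in order of appearance."""
    order = [int(m) for m in re.findall(r"\bPU P#(\d+)", text)]
    return {phys: logical for logical, phys in enumerate(order)}


mapping = physical_to_logical(SAMPLE)
for phys, logical in mapping.items():
    print(f"PU P#{phys} -> L#{logical}")
```

On this machine the OS interleaves PU numbers across the four NUMA nodes, so neighbouring physical numbers such as P#0 and P#1 sit on different sockets, whereas neighbouring logical numbers are close together in the tree.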


>> hwloc-info
depth 0: 1 Machine (type #1)
 depth 1: 4 NUMANode (type #2)
  depth 2: 4 Socket (type #3)
   depth 3: 4 L3Cache (type #4)
    depth 4: 32 L2Cache (type #4)
     depth 5: 32 L1dCache (type #4)
      depth 6: 32 L1iCache (type #4)
       depth 7: 32 Core (type #5)
        depth 8: 32 PU (type #6)
Special depth -3: 7 Bridge (type #9)
Special depth -4: 8 PCI Device (type #10)
Special depth -5: 11 OS Device (type #11)

