linux qos


overview:

https://github.com/Mellanox/mlxsw/wiki/Quality-of-Service

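Packets are placed into the port's headroom buffer according to their Switch Priority (SP). The port's headroom buffer (PG buffer) stores the port's incoming packets while they are being processed by the switch's pipeline, and it also stores lossless flows that are not allowed into the shared buffer. Use the up2tc option of lldptool's ETS-CFG TLV to set the SP to PG buffer mapping.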

Once the switch's pipeline has finished processing a packet, its ingress port, Switch Priority (SP), egress port and TC are known. Based on this information the packet is classified into ingress and egress shared buffer pools. (There is really only one shared buffer; the ingress pools and egress pools are just virtual containers that help you formulate the admission rules to the shared buffer. When a packet matches a pool, it is counted against that pool's usage. For example, binding a {Port, TC} to pool 4 with devlink simply means that the packet will be counted as part of pool 4. You can have up to 4 pools for each direction.) Before a packet enters the shared buffer, the shared buffer quotas associated with it are checked to decide whether it is admitted. The quotas are configured with devlink sb.

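A packet stays in the shared buffer until it is transmitted from the egress port. Packets are placed into different queues according to their TC, and the queues are then scheduled according to each TC's TSA (transmission selection algorithm), which is either Strict Priority or ETS. Use lldptool ETS-CFG to set the SP to TC mapping and the TSA of each TC.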

> Thank you, Ido.
>
> So, according to your explanation, my understanding is:
>
> The packet only has one TC (I thought the packet had different ingress TC and egress TC before),
> and the TC is determined by the egress port's up2tc setting?

Ingress TC = PG.

When a packet arrives, it's classified to a PG buffer based on its 802.1p priority and the up2tc mapping you configured on the ingress port. The packet then goes through the switch's pipeline, which determines its egress port. The egress TC is determined based on the packet's 802.1p priority and the egress port's up2tc mapping.

You now have the following information about the packet: Ingress Port, Ingress PG, Egress Port, Egress TC, which the switch uses to check admission for the shared buffer, where the packet is stored prior to transmission.

> Once out of the switch's pipeline, the packet's TC is known, we assume it is TC1.
>
> If the TC1 packet passes the Admission Rules below, it will be sent to the shared buffer.
>     Ingress{Port}.Usage < Thres && Ingress{Port,PG}.Usage < Thres && Egress{Port}.Usage < Thres && Egress{Port,TC}.Usage < Thres
>
> We assume the packet is received from Port1 and egresses Port2.
> And the mapping between port TC and pool is like this:
>     devlink sb tc bind set Port1 tc 1 type ingress pool 0 th 9

Packet will be mapped to ingress TC (PG) 1 according to up2tc configuration on Port1.

>     devlink sb tc bind set Port2 tc 1 type egress pool 4 th 9

Packet will be mapped to egress TC 1 according to up2tc configuration of Port2.

> Then the packet will be counted as part of pool 0 (because the packet is received from Port1, and its TC is 1, so map it to pool 0 according to the above setting),
> and the packet will also be counted as part of pool 4 (because it will egress Port2, and its TC is 1, so map it to pool 4 according to the above setting).
>
> Is this right?

Yes.

When a packet is admitted to the shared buffer it increments four quotas:
Ingress{Port}, Ingress{Port, PG}, Egress{Port}, Egress{Port, TC}

Please let me know if further clarifications are required.
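The admission rule and the four quota counters from this exchange can be sketched as a toy Python model. This is only an illustration, not how the switch implements it: the class name, the key layout and the byte-based thresholds are all assumptions made for the example, and the `th` value passed to devlink may not be a plain byte limit.

```python
from collections import defaultdict


class SharedBufferModel:
    """Toy model of the four-quota admission check (hypothetical, for illustration)."""

    def __init__(self, thresholds):
        # thresholds maps a quota key such as ("ingress", "Port1") or
        # ("egress", "Port2", 1) to a limit in bytes (assumed unit).
        self.thresholds = thresholds
        self.usage = defaultdict(int)

    @staticmethod
    def quota_keys(in_port, pg, out_port, tc):
        # Ingress{Port}, Ingress{Port,PG}, Egress{Port}, Egress{Port,TC}
        return [
            ("ingress", in_port),
            ("ingress", in_port, pg),
            ("egress", out_port),
            ("egress", out_port, tc),
        ]

    def admit(self, in_port, pg, out_port, tc, size):
        keys = self.quota_keys(in_port, pg, out_port, tc)
        # All four usages must be below their thresholds.
        if any(self.usage[k] >= self.thresholds[k] for k in keys):
            return False
        # An admitted packet increments all four quotas.
        for k in keys:
            self.usage[k] += size
        return True


limits = {
    ("ingress", "Port1"): 3000,
    ("ingress", "Port1", 1): 2000,
    ("egress", "Port2"): 3000,
    ("egress", "Port2", 1): 2000,
}
sb = SharedBufferModel(limits)
results = [sb.admit("Port1", 1, "Port2", 1, 1500) for _ in range(3)]
```

In this run the first two packets are admitted because every usage is still below its limit when they arrive; the third is refused because the {Port, PG} and {Port, TC} usages have already reached 3000, above their 2000 limits.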

If a packet belongs to a lossless flow (a flow is lossless when PFC is enabled for its priority) and the packet is not allowed into the shared buffer, it is stored in the headroom.
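As a rough sketch of that decision, the Python function below returns where a packet ends up. The function name and the return strings are made up for the example, and the "dropped" outcome for a lossy packet that is refused admission is an assumption; the notes here only describe the lossless case.

```python
def place_packet(admitted, priority, pfc_enabled):
    """Return where a packet ends up in this simplified model.

    admitted:    result of the shared buffer admission check
    priority:    the packet's priority (0-7)
    pfc_enabled: set of priorities that have PFC enabled (lossless)
    """
    if admitted:
        return "shared buffer"
    if priority in pfc_enabled:
        # Lossless flow that was refused admission stays in the headroom.
        return "headroom"
    # Assumption: a lossy packet that is refused admission is discarded.
    return "dropped"


pfc = {1, 2, 3}
outcomes = [
    place_packet(True, 5, pfc),
    place_packet(False, 2, pfc),
    place_packet(False, 5, pfc),
]
```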

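lldptool -T -i sw1p5 -V ETS-CFG up2tc=0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7

This command creates the SP to TC mapping and also the SP to PG buffer mapping.

lldptool -T -i sw1p5 -V ETS-CFG tsa=0:ets,1:ets,2:ets,3:ets,4:ets,5:ets,6:ets,7:ets tcbw=12,12,12,12,13,13,13,13

(The tcbw values must sum to 100.)

devlink sb can set, for the egress direction, the pool and quota used by each TC of each port.
It can also set, for the ingress direction, the pool and quota used by each TC of each port.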
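If these commands are generated from a script, the sum-to-100 rule is easy to check before anything is sent to lldptool. The helper below is a hypothetical sketch (the name `ets_cfg_args` is not part of any tool); it only builds the argument lists shown above and does not execute them.

```python
def ets_cfg_args(ifname, up2tc, tsa, tcbw):
    """Build the two lldptool ETS-CFG command lines as argument lists.

    up2tc: dict priority -> traffic class
    tsa:   dict traffic class -> algorithm name, e.g. "ets" or "strict"
    tcbw:  list of bandwidth percentages, one per traffic class
    """
    if sum(tcbw) != 100:
        raise ValueError("tcbw percentages must sum to 100")
    base = ["lldptool", "-T", "-i", ifname, "-V", "ETS-CFG"]
    up2tc_arg = "up2tc=" + ",".join(f"{p}:{t}" for p, t in sorted(up2tc.items()))
    tsa_arg = "tsa=" + ",".join(f"{t}:{a}" for t, a in sorted(tsa.items()))
    tcbw_arg = "tcbw=" + ",".join(str(b) for b in tcbw)
    return [base + [up2tc_arg], base + [tsa_arg, tcbw_arg]]


cmds = ets_cfg_args(
    "sw1p5",
    up2tc={p: p for p in range(8)},
    tsa={t: "ets" for t in range(8)},
    tcbw=[12, 12, 12, 12, 13, 13, 13, 13],
)
```

Each returned list could be handed to something like subprocess.run on a machine where lldpad is running; that step is left out here on purpose.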

ETS:

The transmit path of a network port is modeled as a set of queues called traffic classes which are numbered 0 through N-1, where N is in the range 1 to 8. The user priorities 0-7 are mapped to the set of traffic classes. Further details and definition of the default priority to traffic class mappings are provided in the IEEE Standard 802.1Q-2011.

A transmission selection algorithm is used to select which traffic class is chosen next to dequeue a frame and transmit to the LAN. The default transmission selection algorithm is the Strict Priority algorithm. This algorithm always selects the highest numbered traffic class which has frames to transmit first before a lower numbered traffic class is selected.
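A minimal Python sketch of the Strict Priority selection just described, assuming each traffic class is modeled as a simple FIFO queue (the function name and the queue representation are illustrative only):

```python
from collections import deque


def strict_priority_dequeue(queues):
    """Pick the next frame under Strict Priority.

    queues is a list indexed by traffic class number; each entry is a
    deque of frames. Returns (tc, frame), or None if all queues are empty.
    """
    # Scan from the highest numbered traffic class down to class 0.
    for tc in range(len(queues) - 1, -1, -1):
        if queues[tc]:
            return tc, queues[tc].popleft()
    return None


queues = [deque(["a"]), deque(), deque(["b", "c"])]
order = [strict_priority_dequeue(queues) for _ in range(4)]
```

Here traffic class 2 is drained completely before the frame waiting on traffic class 0 is sent, which is exactly the starvation risk that motivates ETS.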

Since the Strict Priority algorithm could allow a traffic flow on a higher numbered traffic class to block a lower numbered traffic class from getting a chance to transmit, another traffic selection algorithm has been defined for DCB called the Enhanced Transmission Selection (ETS) algorithm. ETS works by assigning a percentage of available bandwidth to traffic classes. Available bandwidth is defined as the amount of bandwidth left after higher priority transmission algorithms (like Strict Priority) have executed. The bandwidth percentage allocated to an ETS traffic class is the guaranteed amount of available bandwidth which will be made available to that traffic class. If an ETS traffic class does not use all of the bandwidth allocated to it, then other ETS traffic classes may be able to exceed their bandwidth allocations.

ETS allows multiple traffic flows operating on different traffic classes to each receive their fair share of network bandwidth. Obviously, if the strict priority algorithm is used in combination with the ETS algorithm, then care should be taken to ensure that the traffic flows on the strict priority traffic classes are relatively low volume flows.
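The bandwidth sharing described above can be illustrated with a small Python sketch. It treats traffic as a fluid and hands unused allocation to the classes that still have demand, in proportion to their percentages. This is an assumed model for illustration; real hardware uses a packet scheduler, and the function name and parameters are hypothetical. The `available` argument stands for the bandwidth left after the strict priority classes have been served.

```python
def ets_allocate(available, weights, demands):
    """Split `available` bandwidth among ETS traffic classes.

    weights: dict tc -> bandwidth percentage
    demands: dict tc -> offered load, in the same unit as `available`
    Returns a dict tc -> bandwidth granted.
    """
    grant = {tc: 0.0 for tc in weights}
    active = {tc for tc in weights if demands[tc] > 0 and weights[tc] > 0}
    remaining = float(available)
    while active and remaining > 1e-9:
        total_w = sum(weights[tc] for tc in active)
        satisfied = set()
        spent = 0.0
        for tc in active:
            # Each active class is offered its weighted share of what is left.
            share = remaining * weights[tc] / total_w
            give = min(share, demands[tc] - grant[tc])
            grant[tc] += give
            spent += give
            if grant[tc] >= demands[tc] - 1e-9:
                satisfied.add(tc)
        remaining -= spent
        if not satisfied:
            break  # every active class took its full share
        # Classes with no more demand leave; their unused share is redistributed.
        active -= satisfied
    return grant


light_and_heavy = ets_allocate(100, {0: 50, 1: 50}, {0: 20, 1: 200})
both_heavy = ets_allocate(100, {0: 25, 1: 75}, {0: 200, 1: 200})
```

In the first call class 0 only needs 20 of its guaranteed 50, so class 1 is allowed to exceed its own 50 and use the rest. In the second call both classes are saturated and each gets exactly its configured percentage.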

lldptool Priority-based Flow Control (PFC)

To enable PFC for priorities 1, 2 and 3, run:
$ lldptool -T -i sw1p5 -V PFC enabled=1,2,3

To bind packets originating from a {Port, PG} to an ingress pool, run:
devlink sb tc bind set pci/0000:03:00.0/1 tc 0 type ingress pool 0 th 9
Similarly for egress, to bind packets directed to a {Port, TC} to an egress pool, run:
devlink sb tc bind set sw1p17 tc 0 type egress pool 4 th 9
