About cpu freq governor fixing for periodic

来源:互联网 发布:概率矩阵分解 python 编辑:程序博客网 时间:2024/05/02 23:09

Hi Rafael,

 

> -----Original Message-----

> From: Wysocki, Rafael J

> Sent: Thursday, April 21, 2016 7:21 PM

> To: Zheng, Lv; Chen, Yu C

> Cc: Brown, Len

> Subject: Re: About the cpufreq fix

>

> On 4/21/2016 4:53 AM, Zheng, Lv wrote:

> > Hi,

> >

> >> From: Chen, YuC

> >> Subject: Re: About the cpufreq fix

> >>

> >> Hi,

> >> On 04/21/2016 06:18 AM, Rafael J. Wysockiwrote:

> >>> On 4/20/2016 3:38 PM, Yu Chen wrote:

> >>>> Hi,

> >>>>

> >>>> On 04/20/2016 07:53 PM, Rafael J.Wysocki wrote:

> >>>>> On 4/20/2016 10:54 AM, Zheng,Lv wrote:

> >>>>>> Hi, Rafael

> >>>>> Hi,

> >>>>>

> >>>>> I've dropped Thomas from theCC.  He doesn't need to be involved

> >>>>> in our internal discussions.

> >>>>>

> >>>>>> Yu asked me to take a lookat this patch.

> >>>>>> https://patchwork.kernel.org/patch/8884461/

> >>>>>>

> >>>>>> I didn't receive it so Icouldn't reply directly.

> >>>>>> Let me reply here:

> >>>>>> +        if ((s64)cur_idle_time <=(s64)j_cdbs->prev_cpu_idle)

> >>>>>> + {

> >>>>>>

> >>>>>> This looks wrong to me.

> >>>>>> It should be a wrap-overissue and the code should be look like:

> >>>>>> +        if ((s64)(cur_idle_time -j_cdbs->prev_cpu_idle) < 0)

> >>>>>> + {

> >>>>> I'm not sure this mattersreally.

> >>>> I guess Lv means this situation:

> >>>>

> >>>> if cur_idle = 0xf000 0001, andprev_idle = 0xf000 0000, since

> >>>> (s32) cur_idle  < (s32) prev_idle, we always get the idle0% util

> >>>> next overflow.

> >>>>

> >>> In this case cur_idle == prev_idle + 1regardless of the sign, is it not?

> >>>

> >>>>>> And according to Yu, weneed to handle 32-bit zapping done by a

> >>>>>> jiffies_to_cputime() APIinvoked previously.

> >>>>> Ah, OK.  I didn't take the zapping part into account.

> >>>>>

> >>>>>> So it doesn't look likethat it can be fixed by this change.

> >>>>>> This should be fixedfurther in this style:

> >>>>>> +        if (cur_idle_time -j_cdbs->prev_cpu_idle > (max/2 +

> >>>>>> + 1)) {

> >>>>>> "max" might bejiffies_to_cputime(UINT32_MAX).

> >>>>> Hmm, well.

> >>>>>

> >>>>> Something like that maybe, butI'll have a deeper look into it

> >>>>> later today.

> >>>> Hi Lv,

> >>>> I did not quite understand this solution,but there are some

> >>>> special cases that it can not dealwith:

> >>>> if cur_idle = 0x1, and prev_idle =0xffff ffff, then

> >>>> (u64)(cur_idle

> >>>> - prev_idle) = 0xffffffff00000002,which is  bigger than

> >>>> UNINT32_MAX, and we will always get0% idle percent, which is not

> >>>> expected.

> >>>>

> >>>> So we may need to change it to:

> >>>> if (((u32)cur_idle_time -(u32)j_cdbs->prev_cpu_idle > (max/2 +

> >>>> 1)) but this breaks the system withCONFIG_NO_HZ, because the

> >>>> latter uses

> >>>> u64 to save the idle time, weshould not only check the lower

> >>>> 32bit of idle time.

> >>>>

> >>>>

> >>> The problem in question only happensfor CONFIG_HZ_PERIODIC and it

> >>> happens becauseget_cpu_idle_time_jiffy() mishandles negative

> >>> jiffies AFAICS.

> >>>

> >>> On 64-bit, jiffies is cast to u64 inget_jiffies_64(), so if it is

> >>> negative, it turns into a relativelylarge positive number.

> >>> Subtracting busy_time from that to getthe idle time doesn't

> >>> really make

> sense.

> >>>

> >> Yes, the problem is here, and my previouspatch did not consider

> >> this situation:( After revert previouspatch, I'm thinking of the

> >> following solution, in this way theidle_time would always

> >> be'increasing':

> >>

> >> @@ -140,14 +140,10 @@ static inline u64get_cpu_idle_time_jiff

> >>

> >>       cur_wall_time =jiffies64_to_cputime64(get_jiffies_64());

> >>

> >> -    busy_time= kcpustat_cpu(cpu).cpustat[CPUTIME_USER];

> >> -    busy_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_SYSTEM];

> >> -    busy_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_IRQ];

> >> -    busy_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_SOFTIRQ];

> >> -    busy_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_STEAL];

> >> -    busy_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_NICE];

> >> -

> >> -    idle_time= cur_wall_time - busy_time;

> >> +   idle_time= kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE];

> >> +   idle_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT];

> >> +   idle_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_GUEST];

> >> +   idle_time+= kcpustat_cpu(cpu).cpustat[CPUTIME_GUEST_NICE];

> >>       if (wall)

> >>                       *wall =cputime_to_usecs(cur_wall_time);

> >>

> >>

> > [Lv Zheng]

> > If this can make the returning value monotoneincreasing, than this

> > could be

> a valid fix.

>

> No, it wouldn't, because it is shifted with respectto cur_wall_time

> and the user of this compares/subtracts the two.

 

[Yu]

 In above solution,we don't need wall_time to get the idle_time, and in governor we compare delta(wall_time )  with  delta( idle_time), for example, Ifcur_wall_time = 0xffff fffe, prev_wall_time = 0xffff ff00, (very big after convertedto unsigned int) and if cur_idle_time = 0x10, prev_idle_time = 0x11,

 

thus the idle percent is (0x11-0x10) / (0xffff fffe - 0xffff ff00) , so I assume there is noproblem?

 

>

> > My question is:

> > How does cpufreq module know about the statvalues required for idle

> calculation?

> > Why don't we let the timekeeping core provide afunction for us

> > instead of

> providing this here?

>

> Essentially, it looks like get_cpu_idle_time_jiffy()is old code that

> had been overlooked when some changes to thetimekeeping were made.

>

> The right long-term fix would be to makeget_cpu_idle_time_us() work

> with CONFIG_HZ_PERIODIC and dropget_cpu_idle_time_jiffy() entirely then.

>

> Short-term I think the right approach is to subtractINITIAL_JIFFIES

> from the return value of get_jiffies_64() beforeapplying

> jiffies64_to_cputime64(), because that would removethe shift between cur_wall_time and busy_time.

> Still, though, cputime_to_usecs() would beproblematic then, so I

> guess it could just be replaced with an open-codeddivision.

>

Yes  this is moreacceptable according to its original context. Kernel is dangerous, sorry for myregression :-0

 

0 0
原创粉丝点击