1451 - Average

Source: Internet. Editor: 程序博客网. Time: 2024/06/04 11:18

A DNA sequence consists of four letters, A, C, G, and T. The GC-ratio of a DNA sequence is the number of Cs and Gs of the sequence divided by the length of the sequence. GC-ratio is important in gene finding because DNA sequences with relatively high GC-ratios might be good candidates for the starting parts of genes. Given a very long DNA sequence, researchers are usually interested in locating a subsequence whose GC-ratio is maximum over all subsequences of the sequence. Since short subsequences with high GC-ratios are sometimes meaningless in gene finding, a length lower bound is given to ensure that a long subsequence with high GC-ratio could be found. If, in a DNA sequence, a 0 is assigned to every A and T and a 1 to every C and G, the DNA sequence is transformed into a binary sequence of the same length. GC-ratios in the DNA sequence are now equivalent to averages in the binary sequence.
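The transformation described above can be sketched in a few lines of C++. The helper names `to_binary` and `gc_count` are hypothetical and chosen only for illustration; the sketch assumes the input contains nothing but the uppercase letters A, C, G, and T.

```cpp
#include <cassert>
#include <string>

// Map a DNA string to its binary form: C/G -> '1', A/T -> '0'.
// Assumption: the input contains only the uppercase letters A, C, G, T.
std::string to_binary(const std::string& dna) {
    std::string bits;
    bits.reserve(dna.size());
    for (char c : dna) {
        assert(c == 'A' || c == 'C' || c == 'G' || c == 'T');
        bits.push_back((c == 'C' || c == 'G') ? '1' : '0');
    }
    return bits;
}

// Number of Cs and Gs in dna[first..last] (0-based, inclusive).
// The GC-ratio of that range is this count divided by (last - first + 1).
int gc_count(const std::string& dna, int first, int last) {
    int count = 0;
    for (int k = first; k <= last; ++k)
        if (dna[k] == 'C' || dna[k] == 'G') ++count;
    return count;
}
```

For example, `to_binary("ACGT")` yields `"0110"`, and the number of 1s in any range of the binary string equals `gc_count` over the same range of the DNA string.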

 

Position index:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Sequence:        0  0  1  0  1  0  1  1  0  1  1  0  1  1  0  1  0

For the binary sequence above, if the length lower bound is 7, the maximum average is 6/8, which happens in the subsequence [7,14]. Its length is 8, which is greater than the length lower bound 7. If the length lower bound is 5, then the subsequence [7,11] gives the maximum average 4/5. The length is 5, which is equal to the length lower bound. For the subsequence [7,11], 7 is its starting index and 11 is its ending index.
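The two averages quoted in this example can be checked with prefix sums. The following is a minimal sketch; `prefix_ones` and `ones_in_range` are hypothetical helper names, and indices are 1-based as in the problem statement.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// prefix[k] = number of 1s among the first k characters (prefix[0] = 0).
std::vector<int> prefix_ones(const std::string& bits) {
    std::vector<int> prefix(bits.size() + 1, 0);
    for (std::size_t k = 1; k <= bits.size(); ++k)
        prefix[k] = prefix[k - 1] + (bits[k - 1] - '0');
    return prefix;
}

// Number of 1s in the 1-based inclusive range [a, b].
// The average of the range is this value divided by (b - a + 1).
int ones_in_range(const std::vector<int>& prefix, int a, int b) {
    assert(1 <= a && a <= b && b < static_cast<int>(prefix.size()));
    return prefix[b] - prefix[a - 1];
}
```

With the sequence `00101011011011010`, the range [7,14] contains 6 ones over 8 positions and the range [7,11] contains 4 ones over 5 positions, matching the averages 6/8 and 4/5.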

Given a binary sequence and a length lower bound L, write a program to find a subsequence of the binary sequence whose length is at least L and whose average is maximum over all subsequences of the binary sequence. If two or more subsequences have the maximum average, then find the shortest one; and if two or more shortest subsequences with the maximum average exist, then find the one with the smallest starting index.
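The selection rule above (maximum average, then shortest length, then smallest starting index) can be made concrete with a brute-force reference. This is only a sketch for checking small inputs: it takes time quadratic in n, so it is not intended for the full input limits. The function name `best_subsequence_bruteforce` is hypothetical, and the sketch assumes L does not exceed the sequence length.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reference solver: returns the 1-based {start, end} of the answer.
// Assumption: 1 <= L <= bits.size(), and bits contains only '0' and '1'.
std::pair<int, int> best_subsequence_bruteforce(const std::string& bits, int L) {
    const int n = static_cast<int>(bits.size());
    assert(1 <= L && L <= n);

    // sum[k] = number of 1s among the first k characters.
    std::vector<long long> sum(n + 1, 0);
    for (int k = 1; k <= n; ++k) sum[k] = sum[k - 1] + (bits[k - 1] - '0');

    int bestL = 1, bestR = L;
    // Visit candidates in tie-break order: shorter lengths first, then smaller
    // starts. Replacing the best only on a strictly larger average therefore
    // keeps the shortest, leftmost subsequence among equal averages.
    for (int len = L; len <= n; ++len) {
        for (int a = 1; a + len - 1 <= n; ++a) {
            int b = a + len - 1;
            // Compare ones/len against bestOnes/bestLen by cross-multiplication.
            long long lhs = (sum[b] - sum[a - 1]) * (bestR - bestL + 1);
            long long rhs = (sum[bestR] - sum[bestL - 1]) * len;
            if (lhs > rhs) { bestL = a; bestR = b; }
        }
    }
    return {bestL, bestR};
}
```

Calling it with `"00101011011011010"` and L = 5 returns {7, 11}, and with `"11100111100111110000"` and L = 4 it returns {6, 9}, which agrees with the sample output.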

Input 

Your program is to read from standard input. The input consists of T test cases. The number of test cases T is given in the first line of the input. Each test case starts with a line containing two integers n (1 ≤ n ≤ 100,000) and L (1 ≤ L ≤ 1,000), which are the length of a binary sequence and a length lower bound, respectively. In the next line, a binary sequence is given as a string of length n.

Output 

Your program is to write to standard output. Print the starting and ending index of the subsequence.

The following shows sample input and output for two test cases.

Sample Input 

2
17 5
00101011011011010
20 4
11100111100111110000

Sample Output 

7 11
6 9

Code:

#include <cstdio>

using namespace std;

const int maxn = 100000 + 5;

int n, L;
char s[maxn];
int sum[maxn], p[maxn];  // average of i~j is (sum[j]-sum[i-1])/(j-i+1)

// Compare the averages of x1~x2 and x3~x4 by cross-multiplication:
// positive if x1~x2 is larger, zero if equal, negative if smaller.
// The products can reach about 10^10, so they are computed in long long.
long long compare_average(int x1, int x2, int x3, int x4)
{
    return (long long)(sum[x2]-sum[x1-1]) * (x4-x3+1)
         - (long long)(sum[x4]-sum[x3-1]) * (x2-x1+1);
}

int main()
{
    int T;
    scanf("%d", &T);
    while (T--)
    {
        // s is read at offset 1 so that indices are 1-based
        scanf("%d%d%s", &n, &L, s+1);

        // prefix sums: sum[i] = number of 1s in s[1..i]
        sum[0] = 0;
        for (int i = 1; i <= n; i++)
        {
            sum[i] = sum[i-1] + s[i] - '0';
        }

        int ansL = 1, ansR = L;

        // p[i..j) is the sequence of candidate start points
        int i = 0, j = 0;
        for (int t = L; t <= n; t++)   // end point
        {
            while (j-i > 1 && compare_average(p[j-2], t-L, p[j-1], t-L) >= 0)
            {
                j--;   // remove concave points
            }
            p[j++] = t-L+1;  // new candidate

            while (j-i > 1 && compare_average(p[i], t, p[i+1], t) <= 0)
            {
                i++;   // update tangent point
            }

            // compare and update solution: larger average wins, then shorter length
            long long c = compare_average(p[i], t, ansL, ansR);
            if (c > 0 || (c == 0 && t - p[i] < ansR - ansL))
            {
                ansL = p[i];
                ansR = t;
            }
        }
        printf("%d %d\n", ansL, ansR);
    }
    return 0;
}
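The solution never divides: `compare_average` decides which of two averages is larger by cross-multiplying, which avoids floating-point rounding. The idea can be sketched in isolation as follows. The function name `compare_fractions` is hypothetical, and 64-bit arithmetic is used here as a precaution, since a count of up to 100,000 multiplied by a length of up to 100,000 does not fit in a 32-bit integer.

```cpp
#include <cassert>

// Returns +1, 0, or -1 according to the sign of num1/len1 - num2/len2,
// computed without division. Assumption: both lengths are positive.
int compare_fractions(long long num1, long long len1,
                      long long num2, long long len2) {
    assert(len1 > 0 && len2 > 0);
    // num1/len1 > num2/len2  <=>  num1*len2 > num2*len1  (for positive lengths)
    long long diff = num1 * len2 - num2 * len1;
    return (diff > 0) - (diff < 0);
}
```

For instance, comparing 4/5 with 6/8 gives 4*8 - 6*5 = 2, so 4/5 is the larger average, while 1/2 and 2/4 compare as equal.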
