Chapter06-Phylogenetic Trees Inherited (POJ 2414) (state-compression DP)

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 15:39

Phylogenetic Trees Inherited

Time Limit: 3000MS

Memory Limit: 65536K

Total Submissions: 480

Accepted: 297

Special Judge

Description

Among other things, Computational Molecular Biology deals with processing genetic sequences. Considering the evolutionary relationship of two sequences, we can say that they are closely related if they do not differ very much. We might represent the relationship by a tree, putting sequences from ancestors above sequences from their descendants. Such trees are called phylogenetic trees.
Whereas one task of phylogenetics is to infer a tree from given sequences, we'll simplify things a bit and provide a tree structure - this will be a complete binary tree. You'll be given the n leaves of the tree. Sure you know, n is always a power of 2. Each leaf is a sequence of amino acids (designated by the one-character-codes you can see in the figure). All sequences will be of equal length l. Your task is to derive the sequence of a common ancestor with minimal costs.

Amino Acid       3-letter   1-letter
Alanine          Ala        A
Arginine         Arg        R
Asparagine       Asn        N
Aspartic Acid    Asp        D
Cysteine         Cys        C
Glutamine        Gln        Q
Glutamic Acid    Glu        E
Glycine          Gly        G
Histidine        His        H
Isoleucine       Ile        I
Leucine          Leu        L
Lysine           Lys        K
Methionine       Met        M
Phenylalanine    Phe        F
Proline          Pro        P
Serine           Ser        S
Threonine        Thr        T
Tryptophan       Trp        W
Tyrosine         Tyr        Y
Valine           Val        V

The costs are determined as follows: every inner node of the tree is marked with a sequence of length l, the cost of an edge of the tree is the number of positions at which the two sequences at the ends of the edge differ, the total cost is the sum of the costs at all edges. The sequence of a common ancestor of all sequences is then found at the root of the tree. An optimal common ancestor is a common ancestor with minimal total costs.
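To make the cost definition concrete, here is a minimal Java sketch. The class `EdgeCost` and the method `edgeCost` are hypothetical names used only for illustration, and the inner-node labels AAA and AGA are one hand-picked labeling for the first sample case, not something the problem supplies.

```java
// Illustrative sketch; EdgeCost and edgeCost are hypothetical names.
final class EdgeCost {
    /** Cost of one edge: the number of positions where the two sequences differ. */
    static int edgeCost(String parent, String child) {
        if (parent.length() != child.length()) {
            throw new IllegalArgumentException("sequences must have equal length");
        }
        int diff = 0;
        for (int i = 0; i < parent.length(); i++) {
            if (parent.charAt(i) != child.charAt(i)) {
                diff++;
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // First sample case: leaves AAG, AAA, GGA, AGA (left to right).
        // Assumed labeling: root AGA, left inner node AAA, right inner node AGA.
        int total = edgeCost("AGA", "AAA") + edgeCost("AGA", "AGA")   // root edges
                  + edgeCost("AAA", "AAG") + edgeCost("AAA", "AAA")   // left subtree
                  + edgeCost("AGA", "GGA") + edgeCost("AGA", "AGA");  // right subtree
        System.out.println(total); // 3, matching the sample output "AGA 3"
    }
}
```

Summing the six edges gives 1 + 0 + 1 + 0 + 1 + 0 = 3, the minimal total cost listed for the first sample case.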

Input

The input file contains several test cases. Each test case starts with two integers n and l, denoting the number of sequences at the leaves and their length, respectively. Input is terminated by n=l=0. Otherwise, 1<=n<=1024 and 1<=l<=1000. Then follow n words of length l over the amino acid alphabet. They represent the leaves of a complete binary tree, from left to right.

Output

For each test case, output a line containing some optimal common ancestor and the minimal total costs.

Sample Input

4 3
AAG
AAA
GGA
AGA

4 3
AAG
AGA
AAA
GGA

4 3
AAG
GGA
AAA
AGA

4 1
A
R
A
R

2 1
W
W

2 1
W
Y

1 1
Q

0 0

Sample Output

AGA 3
AGA 4
AGA 4
R 2
W 0
Y 1
Q 0

Source

Ulm Local 2000

 

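[Problem summary] We are given all the leaves of a complete binary tree; every leaf holds a string of the same fixed length. Compare the strings of two children and choose a suitable string for their parent. For example, if the two leaves are AAG and GAG, the parent can be either AAG or GAG, because both choices give a cost of 1 (the cost at a parent is the number of positions, compared letter by letter, at which it differs from its children, summed over both children). Work from the leaves up to the root to obtain the minimum total cost, and print the root string together with that cost. If several strings achieve the same minimum, printing any one of them is enough.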

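[Analysis] Because all strings have the same length, every letter position can be solved independently: handle one position at a time, loop over the length of the string, and add up the per-position costs. For a single position, each parent adds at most 1 to the cost, which happens when its two children have no letter in common; whenever the two children do agree, the parent must take the shared letter to keep the cost minimal. In the code, each node stores its candidate letters as a bitmask: a parent takes the intersection of its children's masks when that is non-empty (cost 0) and their union otherwise (cost 1).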
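The per-position rule above can be sketched in isolation. This is a minimal illustration, assuming letters are encoded as single bits in an `int`; `ColumnCombine` and `combine` are hypothetical names, not taken from the author's solution. The rule itself matches what is usually called Fitch's small-parsimony step.

```java
// Illustrative sketch; ColumnCombine and combine are hypothetical names.
final class ColumnCombine {
    /**
     * Combines the candidate-letter masks of two children for one letter position.
     * Returns {parentMask, addedCost}.
     */
    static int[] combine(int left, int right) {
        int common = left & right;
        if (common != 0) {
            // The children share at least one letter: keep only the shared ones, no extra cost.
            return new int[] {common, 0};
        }
        // No shared letter: either child's letters are candidates, and one mismatch is paid.
        return new int[] {left | right, 1};
    }

    public static void main(String[] args) {
        int a = 1 << ('A' - 'A'); // bit 0
        int g = 1 << ('G' - 'A'); // bit 6

        int[] same = combine(a, a);        // mask 1, cost 0
        int[] differ = combine(a, g);      // mask 65, cost 1
        int[] overlap = combine(a | g, g); // mask 64, cost 0

        System.out.println(same[0] + " " + same[1]);
        System.out.println(differ[0] + " " + differ[1]);
        System.out.println(overlap[0] + " " + overlap[1]);
    }
}
```

Keeping the whole set of candidate letters, rather than committing to one letter early, is what lets a later intersection higher in the tree come for free.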

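       A complete binary tree is naturally stored in an array, with the children of node j at indices 2j and 2j+1. What differs slightly from the earlier level-by-level problems is that on every level the state value of each parent node has to be computed.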
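The array layout can be sketched for a single letter position as follows. This is an assumption-laden illustration: `HeapLayout` and `fillColumn` are hypothetical names, and the loop runs over indices in descending order instead of level by level as in the author's code. Both orders visit children before their parent, because the children of node j sit at the larger indices 2j and 2j+1.

```java
// Illustrative sketch; HeapLayout and fillColumn are hypothetical names.
final class HeapLayout {
    /**
     * node[leafCount .. 2*leafCount-1] must already hold the leaf masks.
     * Fills node[1 .. leafCount-1] bottom-up and returns the cost of this position.
     */
    static int fillColumn(int[] node, int leafCount) {
        int cost = 0;
        for (int j = leafCount - 1; j >= 1; j--) {
            int left = node[2 * j];
            int right = node[2 * j + 1];
            int common = left & right;
            if (common != 0) {
                node[j] = common;       // children agree on some letter
            } else {
                node[j] = left | right; // children disagree: pay one mismatch
                cost++;
            }
        }
        return cost;
    }

    public static void main(String[] args) {
        // Fourth sample case: four leaves A, R, A, R with length 1.
        String leaves = "ARAR";
        int n = leaves.length();
        int[] node = new int[2 * n]; // index 0 is unused, the root is node[1]
        for (int i = 0; i < n; i++) {
            node[n + i] = 1 << (leaves.charAt(i) - 'A');
        }
        System.out.println(fillColumn(node, n)); // 2, as in the sample output "R 2"
    }
}
```

After the call, `node[1]` holds the bits of both A and R, so either letter is a valid root for this position.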

       Because the state values never have to be enumerated, the letters can simply keep their natural order: the bit index of a letter Z is Z - 'A'.
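A small sketch of this letter-to-bit mapping, and of reading one letter back out of a mask. `LetterBits`, `toMask` and `lowestLetter` are hypothetical names; `Integer.numberOfTrailingZeros` is a standard JDK method used here in place of the shift-and-test loop in the author's code.

```java
// Illustrative sketch; LetterBits, toMask and lowestLetter are hypothetical names.
final class LetterBits {
    /** Maps 'A'..'Z' to a single bit: 'A' -> bit 0, ..., 'Z' -> bit 25. */
    static int toMask(char letter) {
        return 1 << (letter - 'A');
    }

    /** Returns the alphabetically first letter contained in a non-empty mask. */
    static char lowestLetter(int mask) {
        if (mask == 0) {
            throw new IllegalArgumentException("mask must contain at least one letter");
        }
        return (char) ('A' + Integer.numberOfTrailingZeros(mask));
    }

    public static void main(String[] args) {
        int rootMask = toMask('R') | toMask('A');
        System.out.println(lowestLetter(rootMask));    // A
        System.out.println(lowestLetter(toMask('W'))); // W
    }
}
```

Since any letter in the root mask is optimal and the judge accepts any optimal answer, always taking the lowest set bit is just one convenient choice.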

 

The Java code is as follows:

import java.util.Arrays;
import java.util.Scanner;

public class Main {
    // dp[i][j]: bitmask of candidate letters for node j at letter position i.
    // The tree has at most 2*1024 - 1 nodes; index 0 is unused and the root is node 1.
    private int dp[][] = new int[1000][2 * 1024];
    private int arrayLen;
    private int stringLen;

    public void initial(int arrayLen, int stringLen) {
        this.arrayLen = arrayLen;
        this.stringLen = stringLen;
        for (int i = 0; i < stringLen; i++) {
            Arrays.fill(dp[i], 0, arrayLen, 0);
        }
    }

    public static void main(String[] args) {
        Scanner cin = new Scanner(System.in);
        int m, n; // number of leaves and string length
        m = cin.nextInt();
        n = cin.nextInt();
        Main ma = new Main();
        String str = null;
        while (!(m == 0 && n == 0)) {
            ma.initial(2 * m, n);
            // Leaves occupy indices m .. 2m-1.
            for (int i = 0; i < m; i++) {
                str = cin.next();
                for (int j = 0; j < n; j++) {
                    ma.dp[j][m + i] = 1 << ((int) (str.charAt(j) - 'A'));
                }
            }
            if (m == 1) {
                // A single leaf is its own ancestor.
                System.out.println(str + " 0");
            } else {
                ma.getMinValueAndPrint();
            }
            m = cin.nextInt();
            n = cin.nextInt();
        }
    }

    private void getMinValueAndPrint() {
        int temp;
        int countSum = 0;
        // Start from letter position 0.
        for (int i = 0; i < stringLen; i++) {
            // temp is the first index of the current level; begin with the parents of the leaves.
            temp = arrayLen / 4;
            while (temp >= 1) {
                for (int j = temp; j < temp * 2; j++) {
                    if ((dp[i][2 * j] & dp[i][2 * j + 1]) == 0) {
                        // No common letter: take the union and pay one mismatch.
                        dp[i][j] = dp[i][2 * j] | dp[i][2 * j + 1];
                        countSum++;
                    } else {
                        // Common letters exist: keep only those.
                        dp[i][j] = dp[i][2 * j] & dp[i][2 * j + 1];
                    }
                }
                temp = temp / 2;
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < stringLen; i++) {
            // Scan the 26 letter bits of the root mask and output the first one that is set.
            temp = dp[i][1];
            for (int count = 0; count < 26; count++) {
                if ((temp & 1) != 0) {
                    sb.append((char) (count + 'A'));
                    break;
                }
                temp = temp >> 1;
            }
        }
        System.out.println(sb.toString() + " " + countSum);
    }
}










