Programming Pearls, Chapter 10: Performance (Saving Space)

Source: Internet; Editor: 程序博客网; Time: 2024/06/18 04:59

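1. The sparse matrix problem.
Because it is more convenient to read a file line by line, this version differs slightly from the one in the book.
Idea: the last non-zero element of one row is stored immediately before the first non-zero element of the next row, with no other non-zero elements in between.
Elements are stored in row-major order.
count holds the number of non-zero matrix elements stored so far.
int firstInRow[] records, for each row, the position (among the stored non-zero elements) of that row's first non-zero element.
int col[] holds the column of each stored non-zero element.
int pointNum[] holds the value of each stored non-zero element.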

package chapter10;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class SparseMatrix {

    // Reads the matrix line by line and fills the three arrays in row-major order.
    public static void saveToSparseMatrix(int[] firstInRow, int[] col, int[] pointNum) {
        // try-with-resources closes the Scanner (and the file) when reading is done
        try (Scanner scanner = new Scanner(new File("E://MyEclipse/WorkSpace/Pearls/DataOfNeighbour.txt"))) {
            int rowIndex = 0; // index of the current matrix row
            int count = 0;    // number of non-zero elements stored so far (0-based start of the next row)
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines instead of failing on them
                }
                firstInRow[rowIndex] = count;
                String[] values = line.split("\\s+");
                for (int i = 0; i < values.length; i++) {
                    int value = Integer.parseInt(values[i]);
                    if (value != 0) {
                        col[count] = i;
                        pointNum[count] = value;
                        count++;
                    }
                }
                rowIndex++;
            }
            firstInRow[rowIndex] = count; // sentinel: makes the lookup loop below simpler
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    // Returns the value at row i, column j (both 0-based), or 0 if nothing is stored there.
    public static int findInSparseMatrix(int[] firstInRow, int[] col, int[] pointNum, int i, int j) {
        for (int index = firstInRow[i]; index < firstInRow[i + 1]; index++) {
            if (col[index] == j) {
                return pointNum[index];
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // A 6*6 matrix is used here; because it is sparse, we assume that
        // no more than 10 non-zero values are stored in it.
        int[] firstInRow = new int[7];
        int[] col = new int[10];
        int[] pointNum = new int[10];
        saveToSparseMatrix(firstInRow, col, pointNum);
        int i = 1;
        int j = 2;
        int result = findInSparseMatrix(firstInRow, col, pointNum, i, j);
        if (result != 0) {
            System.out.println("Element at row " + i + ", column " + j + " is " + result);
        } else {
            System.out.println("The element at this position is 0");
        }
    }
}

The file E://MyEclipse/WorkSpace/Pearls/DataOfNeighbour.txt contains:

0 2 0 0 0 0
0 0 1 0 0 7
3 4 0 0 0 0
0 0 0 0 6 0
0 9 0 0 0 8
0 0 0 5 0 0
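To make the layout concrete, here is a small sketch that builds the three arrays for this sample data and prints them. It reads the matrix from an in-memory array rather than from the file; the class name CsrDemo and the helper build are illustrative assumptions and are not part of the original program.

```java
import java.util.Arrays;

class CsrDemo {
    // In-memory copy of the 6x6 sample matrix (assumption: same values as the data file).
    static final int[][] DENSE = {
        {0, 2, 0, 0, 0, 0},
        {0, 0, 1, 0, 0, 7},
        {3, 4, 0, 0, 0, 0},
        {0, 0, 0, 0, 6, 0},
        {0, 9, 0, 0, 0, 8},
        {0, 0, 0, 5, 0, 0},
    };

    // Fills firstInRow, col and pointNum in row-major order; returns the non-zero count.
    static int build(int[][] dense, int[] firstInRow, int[] col, int[] pointNum) {
        int count = 0;
        for (int r = 0; r < dense.length; r++) {
            firstInRow[r] = count;
            for (int c = 0; c < dense[r].length; c++) {
                if (dense[r][c] != 0) {
                    col[count] = c;
                    pointNum[count] = dense[r][c];
                    count++;
                }
            }
        }
        firstInRow[dense.length] = count; // sentinel entry after the last row
        return count;
    }

    public static void main(String[] args) {
        int[] firstInRow = new int[7];
        int[] col = new int[10];
        int[] pointNum = new int[10];
        int count = build(DENSE, firstInRow, col, pointNum);
        System.out.println("count      = " + count);
        System.out.println("firstInRow = " + Arrays.toString(firstInRow));
        System.out.println("col        = " + Arrays.toString(Arrays.copyOf(col, count)));
        System.out.println("pointNum   = " + Arrays.toString(Arrays.copyOf(pointNum, count)));
    }
}
```

For this data the arrays come out as firstInRow = [0, 1, 3, 5, 6, 8, 9], col = [1, 2, 5, 0, 1, 4, 1, 5, 3] and pointNum = [2, 1, 7, 3, 4, 6, 9, 8, 5]. On a matrix this small the saving is modest (7 + 9 + 9 = 25 ints in use against 36 for the dense form); it grows as the matrix gets larger and sparser.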

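2. Data-space techniques
(1) Don't store, recompute. This only works when the object to be "stored" can be recomputed from its description. In that case, keep only the generator program and the parameters for the particular object, and rebuild the object when it is needed. Example: a table of primes versus a function that tests primality.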
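As a minimal sketch of the "recompute" side of that trade, the function below tests primality by trial division instead of consulting a stored table. The class and method names (PrimeCheck, isPrime) are hypothetical, chosen only for illustration.

```java
class PrimeCheck {
    // Trial division: answers "is n prime?" on demand, so no prime table has to be kept in memory.
    static boolean isPrime(int n) {
        if (n < 2) {
            return false;
        }
        if (n % 2 == 0) {
            return n == 2;
        }
        // Only odd divisors up to sqrt(n); the long cast avoids overflow in d * d.
        for (int d = 3; (long) d * d <= n; d += 2) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 30; n++) {
            if (isPrime(n)) {
                System.out.print(n + " ");
            }
        }
        System.out.println();
    }
}
```

The cost moves from space to time: every query pays for the divisions, so this fits best when lookups are rare or memory is the tighter constraint.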
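(2) Sparse data structures
If the key is used as the index into a table, there is no need to store the key itself; only its associated attributes need to be stored. In the first example, the firstInRow array is indexed by row, and the col and pointNum arrays hold the associated data. See Appendix A, applications of key indexing.
(3) Data compression
Encode objects in a compressed form to reduce the space they occupy.
Narrower types: depending on the range of the data, go from long to int, and then to short.
Special representations: c = 10*a + b stores both a and b in the single value c (this decodes unambiguously as long as b is a single decimal digit, 0-9).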
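The c = 10*a + b representation can be sketched as follows, under the assumption that both a and b are single decimal digits so that c fits in one byte. The class DigitPair and its method names are illustrative only.

```java
class DigitPair {
    // Packs two decimal digits into one value in 0..99 (assumption: 0 <= a, b <= 9).
    static byte pack(int a, int b) {
        if (a < 0 || a > 9 || b < 0 || b > 9) {
            throw new IllegalArgumentException("digits must be in 0..9");
        }
        return (byte) (10 * a + b);
    }

    // Recovers the first digit.
    static int high(byte c) {
        return c / 10;
    }

    // Recovers the second digit.
    static int low(byte c) {
        return c % 10;
    }

    public static void main(String[] args) {
        byte c = pack(4, 7);
        System.out.println(c + " -> " + high(c) + ", " + low(c));
    }
}
```

Decoding costs a division and a remainder per access, which is the usual price of this kind of packing.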
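(4) Allocation policies
Dynamic allocation: allocate only when needed and not otherwise, for example by using an ArrayList instead of a fixed-length array.
Garbage collection: set references to objects that are no longer needed to null, so that the JVM can reclaim and reuse the memory.
(5) Code-space techniques
Function definitions; interpreters; translation to machine language (the ultimate trick, ha).
(6) Principles
The cost of space.
The "hot spots" of space: the size of individual data structures.
Measuring space: performance monitors.
Tradeoffs: between memory and a program's performance, functionality, and maintainability.
The programming environment.
Simplicity, together with the methods above.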
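Exercises
2. Other data structures
Sort the entries of each row by column so that a row can be binary-searched, remembering to move each pointNum together with its column. For simplicity, I introduced a separate pair class that holds the column and pointNum together.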

package chapter10;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Scanner;

// A (column, value) pair for one stored non-zero element.
class TwoTuple implements Comparable<TwoTuple> {
    public int col;
    public int pointNum;

    @Override
    public int compareTo(TwoTuple other) {
        return Integer.compare(this.col, other.col); // order by column
    }

    @Override
    public String toString() {
        return pointNum + " ";
    }
}

public class t2 {

    public static void saveToSparseMatrix(int[] firstInRow, TwoTuple[] twoTuples) {
        try (Scanner scanner = new Scanner(new File("E://MyEclipse/WorkSpace/Pearls/DataOfNeighbour.txt"))) {
            int rowIndex = 0; // index of the current matrix row
            int count = 0;    // number of non-zero elements stored so far (0-based start of the next row)
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine().trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                firstInRow[rowIndex] = count;
                String[] values = line.split("\\s+");
                for (int i = 0; i < values.length; i++) {
                    int value = Integer.parseInt(values[i]);
                    if (value != 0) {
                        twoTuples[count].col = i;
                        twoTuples[count].pointNum = value;
                        count++;
                    }
                }
                System.out.println(Arrays.toString(twoTuples)); // before sorting
                // Sort this row's entries by column. The end index of Arrays.sort is
                // exclusive, so it must be count (count - 1 would leave out the last
                // entry and throw for a row with no non-zero elements).
                Arrays.sort(twoTuples, firstInRow[rowIndex], count);
                System.out.println(Arrays.toString(twoTuples)); // after sorting
                rowIndex++;
            }
            firstInRow[rowIndex] = count; // sentinel: makes the lookup below simpler
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    // Binary search for column `number` in twoTuples[l..r] (inclusive); returns 0 if absent.
    public static int binarySearch(TwoTuple[] twoTuples, int number, int l, int r) {
        while (l <= r) {
            int m = l + (r - l) / 2; // midpoint without int overflow
            if (twoTuples[m].col > number) {
                r = m - 1;
            } else if (twoTuples[m].col < number) {
                l = m + 1;
            } else {
                return twoTuples[m].pointNum;
            }
        }
        return 0;
    }

    // Returns the value at row i, column j (both 0-based).
    public static int findInSparseMatrix(int[] firstInRow, TwoTuple[] twoTuples, int i, int j) {
        return binarySearch(twoTuples, j, firstInRow[i], firstInRow[i + 1] - 1);
    }

    public static void main(String[] args) {
        // A 6*6 matrix is used here; because it is sparse, we assume that
        // no more than 10 non-zero values are stored in it.
        int[] firstInRow = new int[7];
        TwoTuple[] twoTuples = new TwoTuple[10];
        for (int i = 0; i < twoTuples.length; i++) {
            twoTuples[i] = new TwoTuple();
        }
        saveToSparseMatrix(firstInRow, twoTuples);
        int i = 1;
        int j = 2;
        int result = findInSparseMatrix(firstInRow, twoTuples, i, j);
        if (result != 0) {
            System.out.println("Element at row " + i + ", column " + j + " is " + result);
        } else {
            System.out.println("The element at this position is 0");
        }
    }
}

Output (each array line below shows 9 entries, apparently from a run with a 9-element array; with the 10-element array declared in main, each line has one more trailing 0)

[2 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ]
[2 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ]
[2 , 1 , 7 , 0 , 0 , 0 , 0 , 0 , 0 ]
[2 , 1 , 7 , 0 , 0 , 0 , 0 , 0 , 0 ]
[2 , 1 , 7 , 3 , 4 , 0 , 0 , 0 , 0 ]
[2 , 1 , 7 , 3 , 4 , 0 , 0 , 0 , 0 ]
[2 , 1 , 7 , 3 , 4 , 6 , 0 , 0 , 0 ]
[2 , 1 , 7 , 3 , 4 , 6 , 0 , 0 , 0 ]
[2 , 1 , 7 , 3 , 4 , 6 , 9 , 8 , 0 ]
[2 , 1 , 7 , 3 , 4 , 6 , 9 , 8 , 0 ]
[2 , 1 , 7 , 3 , 4 , 6 , 9 , 8 , 5 ]
[2 , 1 , 7 , 3 , 4 , 6 , 9 , 8 , 5 ]
Element at row 1, column 2 is 1

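4. For any year, January 1 can fall on any of 7 weekdays, and the year is either a leap year or not (2 cases), so 7 × 2 = 14 calendars are enough to represent every year. That is the space saving.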
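One way to picture this is a function that maps a year to one of the 14 calendar slots. The sketch below relies on java.time, which uses the proleptic ISO (Gregorian) calendar; the class name CalendarIndex and the particular numbering of the slots are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.Year;

class CalendarIndex {
    // Returns a slot in 0..13: (weekday of January 1, Monday = 0 .. Sunday = 6) * 2,
    // plus 1 if the year is a leap year.
    static int of(int year) {
        int weekday = LocalDate.of(year, 1, 1).getDayOfWeek().getValue() - 1;
        int leap = Year.isLeap(year) ? 1 : 0;
        return weekday * 2 + leap;
    }

    public static void main(String[] args) {
        for (int year = 2017; year <= 2024; year++) {
            System.out.println(year + " -> calendar " + of(year));
        }
    }
}
```

Years that map to the same slot share an identical calendar, so only 14 full calendars need to be stored, plus this small mapping from year to slot.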

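5. Can be skipped; this is only my own guess. Store a few key values or error-prone values, and use numerical analysis software such as MATLAB to check the program against that table.
6. Shifts and masks are usually faster than multiplication and division.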
8.
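1. Dates
For MM, the first M can only be 0 or 1, so 1 bit is enough; the second M can be 0-9, so 4 bits are enough.
For DD, the first D has only four possible values (0, 1, 2, 3), so 2 bits are enough; the second D can be 0-9, so 4 bits.
For YYYY, each Y can be 0-9 and takes 4 bits (in fact 0-7 is enough for the first Y, since a system that stays in use for thousands of years is unlikely; that saves one bit). So a date needs 1 + 4 + 2 + 4 + 4 × 4 = 27 bits, which fits in four bytes with room to spare. A Java int is 32 bits; even excluding the sign bit, 31 bits remain, which is plenty, so a single int can hold the date: just shift each field left into its position. Other languages are similar.
2. Social security numbers
Each D of a social security number ranges over 0-9 at most, so 4 bits per digit are enough (if you know that particular digits have a narrower range, the number of bits can be reduced further); 4 × 9 / 8 = 4.5, so at most 5 bytes are needed.
3. Names
Map each letter a-z to 0-25, so each letter takes 5 bits; with 25 letters per name, 25 × 5 / 8 = 15.625, so at most 16 bytes. I don't know whether names have any other regularities to exploit.
Summary: when space is extremely tight, use every known rule to narrow the range of possible data values, and tune the algorithm to take full advantage of those rules and of the storage available.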
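The 27-bit date layout described above can be sketched with shifts and masks. This is one possible arrangement of the fields inside an int, not the only one; the class name PackedDate, its method names, and the order of the fields are assumptions made for illustration.

```java
class PackedDate {
    // Layout, from high bits to low bits (27 bits in total):
    // M1 (1 bit) | M2 (4) | D1 (2) | D2 (4) | Y1 (4) | Y2 (4) | Y3 (4) | Y4 (4)
    static int pack(int year, int month, int day) {
        if (year < 0 || year > 9999 || month < 1 || month > 12 || day < 1 || day > 31) {
            throw new IllegalArgumentException("date field out of range");
        }
        int bits = month / 10;                   // M1: 0 or 1
        bits = (bits << 4) | (month % 10);       // M2: 0..9
        bits = (bits << 2) | (day / 10);         // D1: 0..3
        bits = (bits << 4) | (day % 10);         // D2: 0..9
        bits = (bits << 4) | (year / 1000);      // Y1
        bits = (bits << 4) | (year / 100 % 10);  // Y2
        bits = (bits << 4) | (year / 10 % 10);   // Y3
        bits = (bits << 4) | (year % 10);        // Y4
        return bits;
    }

    static int month(int packed) {
        return ((packed >>> 26) & 0x1) * 10 + ((packed >>> 22) & 0xF);
    }

    static int day(int packed) {
        return ((packed >>> 20) & 0x3) * 10 + ((packed >>> 16) & 0xF);
    }

    static int year(int packed) {
        return ((packed >>> 12) & 0xF) * 1000
             + ((packed >>> 8) & 0xF) * 100
             + ((packed >>> 4) & 0xF) * 10
             + (packed & 0xF);
    }

    public static void main(String[] args) {
        int p = pack(2024, 6, 18);
        System.out.println(year(p) + "-" + month(p) + "-" + day(p));
    }
}
```

Every packed value stays below 2^27, so the sign bit of the int is never touched. The same digit-by-digit idea carries over to the social security number (nine 4-bit digits) and to names (5 bits per letter), although those need a long or a byte array because they exceed 32 bits.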

0 0