我的rails项目从sqlserver到postgres迁移之路

来源：互联网发布：js 产生2 8随机数编辑：程序博客网时间：2024/05/29 16:42

近期想把我们的某些rails项目从sqlserver迁移到postgres，由于我们的rails项目里夹杂很多的find_by_sql，而sqlserver与postgreSQL里很多语法/方法不同，需要sql的翻译处理。没有找到特别合适的插件做这件事情，只好自己动手。

一. 环境：

rails版本：3.0.7； ruby版本：1.8.7；

二. 主要步骤和方法：

1.导数据

利用Navicat for PostgreSQL将数据从sqlserver拷贝到postgreSQL

2.导完数据发现表无id，添加id

添加id有各种方法，你都可以尝试。当然你也可以用我在下面写的ruby代码。

3.添加id后发现id不自增长，为id添加sequence。

具体sql操作方法参可以参考http://tian-wang.iteye.com/blog/1051227。

由于我们已有数据，所以设置sequence时要指定start with n,其中n = max(id) + 1 。

特别要注意的是postgreSQL表名、字段名、序列名等最好不要用大写；序列名大写时会报错： relation "sequence_name" does not exist。

还有一点需要ruby程序员注意的是postgreSQL中的单双引号与ruby中单双引号意义完全不同，一不小心容易出错。

由于项目表数目太多，我写了一段ruby代码来完成这件事情（第二步和第三步），没有优化，先将就用着，至于怎么调用，不用多说了吧，哈哈：

module CommonExtensions  module OneTimeUsedForMigrate    extend ActiveSupport::Concern    #注意：postgreSQL的单双引号   和   ruby的单双引号不同，要特别注意    #postgreSQL表名、字段名、序列名等最好不要用大写    #序列名大写时会报错：ActiveRecord::StatementInvalid: PG::Error: ERROR:  relation "tables_id_seq" does not exist    module ClassMethods      def fixed_for_pk        tables = fixed_all_table_names        Common.fixed_deal_with_primary_keys(tables)      end      def fixed_deal_with_primary_keys(tables)        tables.each do |table|          if fixed_has_primary_key?(table)            pk = fixed_get_pk_name(table)          else            pk = fixed_set_primary_key(table)          end          if pk            sequence_name = fixed_add_sequence(table, pk, true, true)            fixed_associate_pk_and_sequence(table, pk, sequence_name)          end        end      end      def fixed_add_sequence(table,pk,reset_seq=false,force=false)        sequence_name = fixed_get_pk_sequence_name(table,pk)        if reset_seq  #强制删除主键上的sequence？          begin          delete_sql = "DROP SEQUENCE \"#{sequence_name}\" CASCADE "          Article.find_by_sql(delete_sql)          rescue          end        end        if force #删除主键为空的列          delete_null_id_sql = "delete from \"public\".\"#{table}\" where \"#{pk}\" is null"          Article.find_by_sql(delete_null_id_sql)        end        #设置主键从多少开始自增长        base_num =  fixed_get_sequence_based_num(table,pk)        add_sql = "CREATE SEQUENCE \"#{sequence_name}\"    START WITH :base_num    INCREMENT BY 1    NO MINVALUE    NO MAXVALUE    CACHE 1 "        Article.find_by_sql([add_sql, {:base_num => base_num}])        sequence_name      end      #序列从？开始增长？      def fixed_get_sequence_based_num(table,pk)        base_num_record = Article.find_by_sql("select max(\"#{pk}\") as max_num from \"#{table}\"").first        base_num =  base_num_record ? (base_num_record.max_num.to_i + 1) : 1      end      def fixed_associate_pk_and_sequence(table,pk,sequence_name)        sql = "alter table \"#{table}\" alter column \"#{pk}\" set default nextval('#{sequence_name}') "        Article.find_by_sql(sql)        true      end      #生成序列名      def fixed_get_pk_sequence_name(table,pk = false)        adaptive_pk = pk || fixed_get_pk_name(table)        "#{table}_#{adaptive_pk}_seq".downcase      end      #获取主键所在的列的名字（并不是主键名）      def fixed_get_pk_name(table)        return all_primary_keys[table][:col_name] unless all_primary_keys[table].blank?        fixed_get_assumed_pk(table)        #{:col_name => pk_record.colname, :type => pk_record.typename, :pk_name => pk_record.pk_name}      end      #如果已有主键，会出错      def fixed_set_primary_key(table)        assumed_pk = fixed_get_assumed_pk(table)        return false if assumed_pk.nil?        sql = "alter table \"#{table}\" add primary key (\"#{assumed_pk}\") "        Article.find_by_sql(sql)        assumed_pk      end      #猜测可能会成为主键的名字      def fixed_get_assumed_pk(table)        sql = "select table_schema,table_name,column_name,data_type,column_default,is_nullablefrom information_schema.columns where table_name = '#{table}' "        columns = Article.find_by_sql(sql).map{|col| [col.column_name, col.data_type]}        possible_pk_cols = ["id","#{table}ID","#{table}_id"]        possible_model_name = table.singularize        possible_pk_cols << "#{possible_model_name}_id" << "#{possible_model_name}ID"        possible_pk_cols.uniq!        possible_pk_cols.each do |possible_pk_col|          return possible_pk_col if fixed_test_column_is_pk?(possible_pk_col,columns)        end        nil      end      def fixed_test_column_is_pk?(test_col,columns)        columns.select{|col| col[0] == test_col && col[1].downcase == "integer"}.size > 0      end      def fixed_has_primary_key?(table)        !Common.all_primary_keys[table].blank?      end      def fixed_all_table_names        system_tables = ["sysdiagrams",  #sqlserver系统表                         "othertables"]         special_tables = [] #特殊表，或者不适用/不想加入本次处理的表        sql = "SELECT    pg_catalog.pg_relation_filenode(c.oid) as \"Filenode\",    relname as \"TableName\"FROM    pg_class c    LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace    LEFT JOIN pg_catalog.pg_database d ON d.datname = 'postgres'WHERE    relkind IN ('r')    AND n.nspname NOT IN ('pg_catalog', 'information_schema')    AND n.nspname !~ '^pg_toast'ORDER BY     relname"        Article.find_by_sql(sql).map(&:TableName) - system_tables - special_tables      end      #{table_name => {col_name,type,pk_name}}      def all_primary_keys        sql = "select pg_constraint.conname as pk_name,pg_class.relname as table_name,pg_attribute.attname as colname,pg_type.typname as typename frompg_constraint  inner join pg_classon pg_constraint.conrelid = pg_class.oidinner join pg_attribute on pg_attribute.attrelid = pg_class.oidand  pg_attribute.attnum = pg_constraint.conkey[1]inner join pg_type on pg_type.oid = pg_attribute.atttypidwhere pg_constraint.contype='p'--and pg_class.relname = 'yourtablename'  "        res = {}        pk_records = Article.find_by_sql(sql)        pk_records.each do |pk_record|          res[pk_record.table_name] = {:col_name => pk_record.colname, :type => pk_record.typename, :pk_name => pk_record.pk_name}        end        res      end    end  endend

4.开始大头活，做翻译工作

a. 调整top n

top n => limit n

      #todo:目前对"top ?"支持不好,sql最好不要写成这种形式，尽量用 top :top 这种形式      def adjust_top_part(sql)        focus_reg = /\s+top\s+(\d+|:\w+|\?)/        focus_match_sql = sql.match focus_reg        return sql.gsub(focus_reg,"") + " limit #{focus_match_sql[1]} " if focus_match_sql        sql      end

b. 调整getdate

getdate => CURRENT_DATE

      def adjust_get_date(sql)        sql.gsub(/getdate\(\s*\)/i," CURRENT_DATE ")      end

c. 调整is_null

is_null => COALESCE

      def adjust_is_null(sql)        sql.gsub(/isnull/i,"COALESCE")      end

d. 调整datediff

datediff => 我采取的方法是调整rails中的sql的同时，为postgreSQL添加datediff函数

      def adjust_date_diff(sql)        focus_reg = /datediff\s*\(.*?(\,)/        sql.gsub(focus_reg){|match|          "datediff(" + self.adjust_date_str_for_date_diff(match.gsub(/datediff\s*\(/,"").gsub(/\s*\,$/,"")) + ","        }      end

-- Function: datediff(character varying, timestamp without time zone, timestamp without time zone)-- DROP FUNCTION datediff(character varying, timestamp without time zone, timestamp without time zone);CREATE OR REPLACE FUNCTION datediff(arg_mode character varying, arg_d2 timestamp without time zone, arg_d1 timestamp without time zone)  RETURNS integer AS$BODY$BEGINif arg_mode = 'dd' or arg_mode = 'd' or arg_mode = 'day' or arg_mode = 'dy' or arg_mode = 'w' then  return cast(arg_d1 as date) - cast(arg_d2 as date);elsif arg_mode = 'ww' or arg_mode = 'week' then  return ceil( ( cast(arg_d1 as date) - cast(arg_d2 as date) ) / 7.0);elsif arg_mode = 'mm' or arg_mode = 'm' or arg_mode = 'month' then  return 12 * (date_part('year',arg_d1) - date_part('year',arg_d2)) + date_part('month',arg_d1) - date_part('month',arg_d2) +    case when date_part('day',arg_d1) > date_part('day',arg_d2)    then 0    when date_part('day',arg_d1) = date_part('day',arg_d2) and cast(arg_d1 as time) >= cast(arg_d2 as time)    then 0    else -1    end;elsif arg_mode = 'yy' OR arg_mode = 'y' OR arg_mode = 'yyyy' or arg_mode = 'year' then  return (cast(arg_d1 as date) - cast(arg_d2 as date)) / 365;end if;END $BODY$ LANGUAGE plpgsql;

f. 调整column like ‘%condition%’

column like ‘%condition%’ => column::text like '%condition%' （这里用sql标准函数cast(column as type)可能会更好）

      #todo: to modify this method to accept more db_column_name      def adjust_like_str(sql)        db_column_name = "columncode"        focus_reg = /#{db_column_name}\s+like/        sql.gsub(focus_reg,"#{db_column_name}::text like" )      end

g. 重写存储过程

rails项目中尽量不要用存储过程,不便于维护,如果是处于性能考虑可以在采取各种优化手段，这个没啥好说的，动手干吧。

h. 重写标量值函数

和上一步一样，体力活。

i. 其他sql函数

根据项目中用到的sql情况添加，比如left():

-- Function: "left"(integer, integer)-- DROP FUNCTION "left"(integer, integer);CREATE OR REPLACE FUNCTION "left"(integer, integer)  RETURNS integer AS$BODY$DECLARE   num ALIAS FOR $1;  str text := '';  n ALIAS FOR $2;BEGINstr := num::text;if n >= length(str) thenreturn str::int;end if;return substring(str from 1 for n)::int;END;$BODY$  LANGUAGE plpgsql;

5.调整find_by_sql、count_by_sql等

将翻译sql的工作隐藏到find_by_sql、count_by_sql等方法中

module ActiveRecord  class Base    class << self      def find_by_sql(sql)        return nil if sql.blank?        adjusted_sql = adjust_sql_of_find_by_sql(sql) if need_adjusted?        connection.select_all(sanitize_sql(adjusted_sql || sql), "#{name} Load").collect! { |record| instantiate(record) }      end      def count_by_sql(sql)        return nil if sql.blank?        adjusted_sql = adjust_sql_of_find_by_sql(sql) if need_adjusted?        connection.select_value(sanitize_conditions(adjusted_sql || sql), "#{name} Count").to_i      end      private      def need_adjusted?        connection.class.to_s.downcase.include? 'postgres'      end      def adjust_sql_of_find_by_sql(sql)        case sql          when Array;            sql_dup = sql.dup            sql_dup[0] = Common.adjust_sql_to_postgre(sql[0])            sql_dup          when Hash; sql          else            Common.adjust_sql_to_postgre(sql)        end      end    end  endend

6.其他地方的小调整

目前已经通过本地测试，等待预上线测试，更期待测试报告，哈哈。

pp.prog.pro@hotmail.com (hanpp@fwxgx.com已停用)