Kaldi monophone training outline

Source: Internet | Editor: 程序博客网 | Time: 2024/05/17 08:50

This outline is based on the steps/train_mono.sh script.

Use a subset of the training set to init the monophone model

synopsis

gmm-init-mono [options] Topo-in Dim Model-out Tree-out 

commands

gmm-init-mono "--train-feats=$feats subset-feats --n=10 ark:- ark:-|" \
    $lang/topo $feat_dim $dir/0.mdl $dir/tree

Compiling training graphs

synopsis

compile-train-graphs [options] Tree-in Model-in L-fst-in Trans-rspec Graph-wspec

commands

trans_rspec="ark:sym2int.pl --map-oov $oov_sym -f 2- $lang/words.txt < $sdata/JOB/text|"
out_graphs="ark:|gzip -c >$dir/fsts.JOB.gz"
compile-train-graphs $dir/tree $dir/0.mdl $lang/L.fst "$trans_rspec" "$out_graphs"
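To make the transcription rspecifier more concrete, the sketch below imitates with awk what the sym2int.pl call does to each line of the text file: the utterance id in field 1 is kept, and every word from field 2 onward is replaced by its integer id, with unknown words mapped to the OOV id. The word list, the transcripts, and the variable names `tmp`, `oov_int` and `ints` are made up for illustration; this is not the Kaldi script itself.

```shell
# Hypothetical miniature of $lang/words.txt and $sdata/JOB/text.
tmp=$(mktemp -d)
cat > "$tmp/words.txt" <<'EOF'
<eps> 0
<unk> 1
hello 2
world 3
EOF
cat > "$tmp/text" <<'EOF'
utt1 hello world
utt2 hello kaldi
EOF

oov_int=1   # assumed integer id of the OOV symbol (<unk> above)

# First file: build the word -> id table. Second file: map fields 2..NF.
ints=$(awk -v oov="$oov_int" '
  NR == FNR { id[$1] = $2; next }
  {
    out = $1
    for (i = 2; i <= NF; i++) out = out " " (($i in id) ? id[$i] : oov)
    print out
  }' "$tmp/words.txt" "$tmp/text")

echo "$ints"
rm -rf "$tmp"
```

Here "kaldi" is not in the toy word list, so it comes out as the OOV id; compile-train-graphs then receives one line of integer word ids per utterance.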

Align data equally

synopsis

align-equal-compiled Graph-rspec Feats-rspec Align-wspec
gmm-acc-stats-ali [options] Model-in Feat-rspec Align-rspec Stats-out

commands

graph="ark:gunzip -c $dir/fsts.JOB.gz|"
align-equal-compiled "$graph" "$feats" ark,t:- | \
    gmm-acc-stats-ali --binary=true $dir/0.mdl "$feats" ark:- $dir/0.JOB.acc
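As a rough picture of what an "equal" alignment means, the sketch below spreads a fixed number of frames as evenly as possible over a fixed sequence of states. This is a deliberate simplification: align-equal-compiled works on the compiled training graphs rather than on a flat list, and the state names, frame count and variable names here are hypothetical.

```shell
num_frames=10
states="s1 s2 s3"        # hypothetical state sequence for one utterance

set -- $states           # load the states into the positional parameters
n=$#
base=$(( num_frames / n ))     # frames every state receives
extra=$(( num_frames % n ))    # leftover frames, given to the first states

alignment=""
i=0
for s in "$@"; do
  count=$base
  if [ "$i" -lt "$extra" ]; then
    count=$(( base + 1 ))
  fi
  # Emit the state label once per frame assigned to it.
  j=0
  while [ "$j" -lt "$count" ]; do
    alignment="$alignment$s "
    j=$(( j + 1 ))
  done
  i=$(( i + 1 ))
done
alignment=${alignment% }       # drop the trailing space

echo "$alignment"
```

With 10 frames and 3 states, the first state gets 4 frames and the other two get 3 each. Statistics accumulated from such a uniform segmentation are enough to estimate a first model, which later iterations refine with real alignments.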

Estimate GMM with equally aligned frames

synopsis

gmm-est [options] Model-in Stats-in Model-out

commands

gmm-est [options] $dir/0.mdl "gmm-sum-accs - $dir/0.*.acc|" $dir/1.mdl

Alternately align the data and update the model

  • realign the data on the iterations listed in $realign_iters
  • increase the number of Gaussians by (totgauss - numgauss) / max_iter_inc on each iteration, up to iteration $max_iter_inc
  • re-estimate the GMMs on every iteration
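The schedule described above can be sketched as a plain shell loop. No Kaldi binaries are called; the loop only prints what would happen on each iteration. All numeric values (`num_iters`, `max_iter_inc`, `totgauss`, the starting `numgauss`, and the `realign_iters` list) are illustrative assumptions, not the defaults of the script.

```shell
# Illustrative values only.
num_iters=10
max_iter_inc=6
totgauss=1000
numgauss=100               # hypothetical Gaussian count of the initial model
realign_iters="1 2 3 5 8"

# Gaussians added per iteration while the model is still growing.
incgauss=$(( (totgauss - numgauss) / max_iter_inc ))

x=1
while [ "$x" -lt "$num_iters" ]; do
  action="update"
  # Realign only on the listed iterations (whole-word match).
  if echo "$realign_iters" | grep -qw "$x"; then
    action="align + update"
  fi
  echo "iter $x: $action, target Gaussians = $numgauss"
  # Raise the target until iteration max_iter_inc, then hold it constant.
  if [ "$x" -le "$max_iter_inc" ]; then
    numgauss=$(( numgauss + incgauss ))
  fi
  x=$(( x + 1 ))
done
```

With these numbers the target grows by 150 per iteration and reaches totgauss after iteration 6, so the remaining iterations only re-estimate parameters at a fixed model size.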