Kaldi monophone training outline

This outline is based on the steps/train_mono.sh script.

Use a subset of the training set to initialize the monophone model

synopsis

gmm-init-mono [options] Topo-in Dim Model-out Tree-out

commands

gmm-init-mono "--train-feats=$feats subset-feats --n=10 ark:- ark:-|" \
    $lang/topo $feat_dim $dir/0.mdl $dir/tree

Compiling training graphs

synopsis

compile-train-graphs [options] Tree-in Model-in L-fst-in Trans-rspec Graph-wspec

commands

trans_rspec="ark:sym2int.pl --map-oov $oov_sym -f 2- $lang/words.txt < $sdata/JOB/text|"
out_graphs="ark:|gzip -c >$dir/fsts.JOB.gz"
compile-train-graphs $dir/tree $dir/0.mdl $lang/L.fst "$trans_rspec" "$out_graphs"
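The transcription rspecifier above pipes each utterance through sym2int.pl, which keeps the first field (the utterance ID) and replaces every later field with its integer ID from words.txt, substituting the OOV value for unknown words. The following is only a rough awk sketch of that mapping, not the Perl script itself. The file contents, the `map_text` helper, and the assumption that `$oov_sym` holds an integer ID are all illustrative.

```shell
# Toy stand-ins for $lang/words.txt and $sdata/JOB/text (hypothetical contents).
tmp=$(mktemp -d)
printf '%s\n' '<eps> 0' 'hello 1' 'world 2' '<unk> 3' > "$tmp/words.txt"
printf '%s\n' 'utt1 hello world' 'utt2 hello kaldi' > "$tmp/text"

# map_text SYMTAB TEXT OOV_INT (hypothetical helper)
# Maps fields 2.. of each line to integer IDs, roughly like "-f 2-",
# and falls back to OOV_INT for unknown words, roughly like "--map-oov".
map_text() {
  awk -v oov="$3" '
    NR == FNR { id[$1] = $2; next }   # first file: load the symbol table
    {
      out = $1                        # keep the utterance ID unchanged
      for (i = 2; i <= NF; i++)
        out = out " " (($i in id) ? id[$i] : oov)
      print out
    }' "$1" "$2"
}

mapped=$(map_text "$tmp/words.txt" "$tmp/text" 3)
echo "$mapped"
rm -rf "$tmp"
```

Here "kaldi" is not in the toy symbol table, so the second utterance ends with the OOV ID 3.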

Align data equally

synopsis

align-equal-compiled Graph-rspec Feats-rspec Align-wspec
gmm-acc-stats-ali [options] Model-in Feat-rspec Align-rspec Stats-out

commands

graph="ark:gunzip -c $dir/fsts.JOB.gz|"
align-equal-compiled "$graph" "$feats" ark,t:- | \
    gmm-acc-stats-ali --binary=true $dir/0.mdl "$feats" ark:- $dir/0.JOB.acc
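To give a feel for what "equal" alignment means, here is a toy sketch that spreads the frames of one utterance evenly over a fixed sequence of states. It is a simplification under stated assumptions: the real align-equal-compiled works on the compiled training graph, whereas the `equal_align` helper and the state names below are hypothetical and ignore the graph entirely.

```shell
# equal_align NUM_FRAMES STATE... (hypothetical helper)
# Prints one state label per frame, giving each state an
# (almost) equal share of the frames, in order.
equal_align() {
  num_frames=$1; shift
  num_states=$#
  out=""; i=0; t=0
  for state in "$@"; do
    i=$(( i + 1 ))
    # Exclusive end frame for this state, using integer division.
    end=$(( i * num_frames / num_states ))
    while [ "$t" -lt "$end" ]; do
      out="$out$state "
      t=$(( t + 1 ))
    done
  done
  echo "${out% }"
}

ali=$(equal_align 7 a b c)
echo "$ali"   # prints: a a b b c c c
```

With no trained model available yet, a uniform split like this is a cheap starting point; the statistics accumulated from it are only meant to bootstrap the first real model.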

Estimate GMM with equally aligned frames

synopsis

gmm-est [options] Model-in Stats-in Model-out

commands

gmm-est [options] $dir/0.mdl "gmm-sum-accs - $dir/0.*.acc|" $dir/1.mdl

Alternately align data and update the model

realign the data on the iterations listed in $realign_iters
increase the number of Gaussians on each iteration up to $max_iter_inc, adding (totgauss - numgauss) / max_iter_inc each time
re-estimate the GMMs on every iteration
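The schedule described above can be sketched with plain shell arithmetic. This is a minimal sketch, not the script itself: the Kaldi calls are left out, and every number below (including `num_iters` and the contents of `realign_iters`) is an illustrative assumption rather than a default taken from the script.

```shell
# Illustrative values only (assumptions, not script defaults).
numgauss=1000             # current number of Gaussians
totgauss=1600             # target number of Gaussians
num_iters=8               # loop runs for iterations 1 .. num_iters-1
max_iter_inc=6            # last iteration on which the model still grows
realign_iters="1 2 4 6"   # iterations on which the data is realigned

# Fixed increment per iteration (integer division).
incgauss=$(( (totgauss - numgauss) / max_iter_inc ))

schedule=""
x=1
while [ "$x" -lt "$num_iters" ]; do
  action="update"
  # Realign only on the listed iterations.
  if echo "$realign_iters" | grep -qw "$x"; then
    action="align+update"
  fi
  schedule="$schedule$x:$numgauss:$action "
  # Grow the Gaussian budget until max_iter_inc has passed.
  if [ "$x" -le "$max_iter_inc" ]; then
    numgauss=$(( numgauss + incgauss ))
  fi
  x=$(( x + 1 ))
done
echo "$schedule"
```

With these numbers the increment is 100, so the budget rises from 1000 to 1600 over the first six iterations and then stays fixed for the last one.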

0 0