V8 World Exploration (2) - Lexical and Syntax Analysis

Source: Internet. Published by: 小米电商平台知乎. Editor: 程序博客网. Time: 2024/05/18 09:13

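In the previous installment we surveyed the API; starting with this one, we follow the API down into the implementation.
For an interpreter or a compiler, the first thing we care about is naturally the compilation process.
As we saw last time, the API call that triggers compilation is Script::Compile: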

    // Compile the source code.
    Local<Script> script = Script::Compile(context, source).ToLocalChecked();

Script::Compile

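Most of the API implementation lives in src/api.cc, and Script::Compile is no exception.

If a ScriptOrigin object is supplied, it is used together with the source string to construct the ScriptCompiler::Source object; otherwise the Source is constructed from the String alone.
Either branch ends by calling ScriptCompiler::Compile to do the actual compilation.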

    MaybeLocal<Script> Script::Compile(Local<Context> context, Local<String> source,
                                       ScriptOrigin* origin) {
      if (origin) {
        ScriptCompiler::Source script_source(source, *origin);
        return ScriptCompiler::Compile(context, &script_source);
      }
      ScriptCompiler::Source script_source(source);
      return ScriptCompiler::Compile(context, &script_source);
    }

ScriptCompiler::Compile

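ScriptCompiler::Compile is also in api.cc.
As mentioned earlier, a compiled script that is not bound to a Context is called an UnboundScript. ScriptCompiler::Compile first calls CompileUnboundInternal to produce an UnboundScript, then binds it to the context with BindToCurrentContext().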

    MaybeLocal<Script> ScriptCompiler::Compile(Local<Context> context,
                                               Source* source,
                                               CompileOptions options) {
      auto isolate = context->GetIsolate();
      auto maybe = CompileUnboundInternal(isolate, source, options, false);
      Local<UnboundScript> result;
      if (!maybe.ToLocal(&result)) return MaybeLocal<Script>();
      v8::Context::Scope scope(context);
      return result->BindToCurrentContext();
    }
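The compile-then-bind flow can be mimicked in a few lines. This is only a toy sketch, not V8 code: `ToyUnboundScript`, `ToyScript`, `CompileUnbound`, and `Compile` are hypothetical names, `std::optional` stands in for `MaybeLocal`, and treating an empty source as a failure is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins for UnboundScript and Script.
struct ToyUnboundScript { std::string source; };
struct ToyScript { std::string source; std::string context; };

// Step 1: compile without any context. An empty source counts as a
// failure here purely for illustration.
std::optional<ToyUnboundScript> CompileUnbound(const std::string& source) {
  if (source.empty()) return std::nullopt;
  return ToyUnboundScript{source};
}

// Step 2: bind the unbound result to a context. The early return plays the
// role of the `if (!maybe.ToLocal(&result)) return ...` check.
std::optional<ToyScript> Compile(const std::string& context,
                                 const std::string& source) {
  std::optional<ToyUnboundScript> maybe = CompileUnbound(source);
  if (!maybe) return std::nullopt;
  return ToyScript{maybe->source, context};
}
```

Calling `Compile("ctx", "1+1")` yields a value bound to `"ctx"`, while `Compile("ctx", "")` yields an empty optional, so the caller has to check the result before using it, much like `ToLocalChecked()` or `ToLocal()` on the real API.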

ScriptCompiler::CompileUnboundInternal

Its main job is to call Compiler::CompileScript in the internal namespace (written i::Compiler::CompileScript in the code).

    MaybeLocal<UnboundScript> ScriptCompiler::CompileUnboundInternal(
        Isolate* v8_isolate, Source* source, CompileOptions options,
        bool is_module) {
    ...
        result = i::Compiler::CompileScript(
            str, name_obj, line_offset, column_offset, source->resource_options,
            source_map_url, isolate->native_context(), NULL, &script_data, options,
            i::NOT_NATIVES_CODE, is_module);
        has_pending_exception = result.is_null();
        if (has_pending_exception && script_data != NULL) {
          // This case won't happen during normal operation; we have compiled
          // successfully and produced cached data, and but the second compilation
          // of the same source code fails.
          delete script_data;
          script_data = NULL;
        }
        RETURN_ON_FAILED_EXECUTION(UnboundScript);
        if ((options == kProduceParserCache || options == kProduceCodeCache) &&
            script_data != NULL) {
          // script_data now contains the data that was generated. source will
          // take the ownership.
          source->cached_data = new CachedData(
              script_data->data(), script_data->length(), CachedData::BufferOwned);
          script_data->ReleaseDataOwnership();
        } else if (options == kConsumeParserCache || options == kConsumeCodeCache) {
          source->cached_data->rejected = script_data->rejected();
        }
        delete script_data;
      }
      RETURN_ESCAPED(ToApiHandle<UnboundScript>(result));
    }

Compiler::CompileScript

This function is defined in src/compiler.cc. Skipping the details for now, compilation proceeds into the CompileToplevel function.

    Handle<SharedFunctionInfo> Compiler::CompileScript(
        Handle<String> source, Handle<Object> script_name, int line_offset,
        int column_offset, ScriptOriginOptions resource_options,
        Handle<Object> source_map_url, Handle<Context> context,
        v8::Extension* extension, ScriptData** cached_data,
        ScriptCompiler::CompileOptions compile_options, NativesFlag natives,
        bool is_module) {
    ...
            static_cast<LanguageMode>(info.language_mode() | language_mode));
        result = CompileToplevel(&info);
        if (extension == NULL && !result.is_null()) {
          compilation_cache->PutScript(source, context, language_mode, result);
          if (FLAG_serialize_toplevel &&
              compile_options == ScriptCompiler::kProduceCodeCache) {
            HistogramTimerScope histogram_timer(
                isolate->counters()->compile_serialize());
            *cached_data = CodeSerializer::Serialize(isolate, result, source);
            if (FLAG_profile_deserialization) {
              PrintF("[Compiling and serializing took %0.3f ms]\n",
                     timer.Elapsed().InMillisecondsF());
            }
          }
        }
    ...

CompileToplevel

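This is a static function defined in compiler.cc.
It goes through two main steps:
* Parser::ParseStatic - lexical and syntax analysis, producing the abstract syntax tree
* CompileBaselineCode - code generation

Let's look at the first half: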

    static Handle<SharedFunctionInfo> CompileToplevel(CompilationInfo* info) {
      Isolate* isolate = info->isolate();
      PostponeInterruptsScope postpone(isolate);
      DCHECK(!isolate->native_context().is_null());
      ParseInfo* parse_info = info->parse_info();
      Handle<Script> script = parse_info->script();
    ...
      isolate->debug()->OnBeforeCompile(script);
    ...
      Handle<SharedFunctionInfo> result;
      { VMState<COMPILER> state(info->isolate());
        if (parse_info->literal() == NULL) {
          // Parse the script if needed (if it's already parsed, literal() is
          // non-NULL). If compiling for debugging, we may eagerly compile inner
          // functions, so do not parse lazily in that case.
          ScriptCompiler::CompileOptions options = parse_info->compile_options();
          bool parse_allow_lazy = (options == ScriptCompiler::kConsumeParserCache ||
                                   String::cast(script->source())->length() >
                                       FLAG_min_preparse_length) &&
                                  !info->is_debug();
          parse_info->set_allow_lazy_parsing(parse_allow_lazy);
          if (!parse_allow_lazy &&
              (options == ScriptCompiler::kProduceParserCache ||
               options == ScriptCompiler::kConsumeParserCache)) {
            // We are going to parse eagerly, but we either 1) have cached data
            // produced by lazy parsing or 2) are asked to generate cached data.
            // Eager parsing cannot benefit from cached data, and producing cached
            // data while parsing eagerly is not implemented.
            parse_info->set_cached_data(nullptr);
            parse_info->set_compile_options(ScriptCompiler::kNoCompileOptions);
          }
          if (!Parser::ParseStatic(parse_info)) {
            return Handle<SharedFunctionInfo>::null();
          }
        }

The second half is the compilation proper, which calls CompileBaselineCode.

        DCHECK(!info->is_debug() || !parse_info->allow_lazy_parsing());
        info->MarkAsFirstCompile();
        FunctionLiteral* lit = parse_info->literal();
        LiveEditFunctionTracker live_edit_tracker(isolate, lit);
    ...
        // Compile the code.
        if (!CompileBaselineCode(info)) {
          return Handle<SharedFunctionInfo>::null();
        }
    ...
      return result;
    }

Parser::ParseStatic

Now we enter the world of the Parser. The entry point is the static function Parser::ParseStatic, defined in src/parsing/parser.cc.

ParseStatic constructs a Parser object and then calls its Parse function to do the parsing.

    bool Parser::ParseStatic(ParseInfo* info) {
      Parser parser(info);
      if (parser.Parse(info)) {
        info->set_language_mode(info->literal()->language_mode());
        return true;
      }
      return false;
    }

Parser::Parse

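Parse starts parsing the script source. It takes one of two paths, ParseLazy when the ParseInfo is lazy and its shared_info is a function, and ParseProgram otherwise:
* ParseLazy
* ParseProgram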

    bool Parser::Parse(ParseInfo* info) {
      DCHECK(info->literal() == NULL);
      FunctionLiteral* result = NULL;
      // Ok to use Isolate here; this function is only called in the main thread.
      DCHECK(parsing_on_main_thread_);
      Isolate* isolate = info->isolate();
      pre_parse_timer_ = isolate->counters()->pre_parse();
      if (FLAG_trace_parse || allow_natives() || extension_ != NULL) {
        // If intrinsics are allowed, the Parser cannot operate independent of the
        // V8 heap because of Runtime. Tell the string table to internalize strings
        // and values right after they're created.
        ast_value_factory()->Internalize(isolate);
      }
      if (info->is_lazy()) {
        DCHECK(!info->is_eval());
        if (info->shared_info()->is_function()) {
          result = ParseLazy(isolate, info);
        } else {
          result = ParseProgram(isolate, info);
        }
      } else {
        SetCachedData(info);
        result = ParseProgram(isolate, info);
      }
      info->set_literal(result);
      Internalize(isolate, info->script(), result == NULL);
      DCHECK(ast_value_factory()->IsInternalized());
      return (result != NULL);
    }
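The branch structure in Parse boils down to a small decision table. The sketch below restates it in isolation; `ToyParseInfo`, `ParseEntry`, and `ChooseEntry` are hypothetical names introduced for illustration, with the two flags standing in for `info->is_lazy()` and `info->shared_info()->is_function()`.

```cpp
#include <cassert>

enum class ParseEntry { kParseLazy, kParseProgram };

// Hypothetical flags mirroring info->is_lazy() and
// info->shared_info()->is_function().
struct ToyParseInfo {
  bool is_lazy;
  bool is_function;
};

// ParseLazy is chosen only for a lazy parse of a function; every other
// combination goes through ParseProgram.
ParseEntry ChooseEntry(const ToyParseInfo& info) {
  if (info.is_lazy && info.is_function) return ParseEntry::kParseLazy;
  return ParseEntry::kParseProgram;
}
```

In the real code the non-lazy path additionally calls SetCachedData(info) before ParseProgram; the sketch leaves that side effect out.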

Parser::ParseProgram

Let's look at Parser::ParseProgram first. The real work is delegated to Parser::DoParseProgram.

    FunctionLiteral* Parser::ParseProgram(Isolate* isolate, ParseInfo* info) {
      // TODO(bmeurer): We temporarily need to pass allow_nesting = true here,
      // see comment for HistogramTimerScope class.
      // It's OK to use the Isolate & counters here, since this function is only
      // called in the main thread.
      DCHECK(parsing_on_main_thread_);
      HistogramTimerScope timer_scope(isolate->counters()->parse(), true);
      Handle<String> source(String::cast(info->script()->source()));
      isolate->counters()->total_parse_size()->Increment(source->length());
      base::ElapsedTimer timer;
      if (FLAG_trace_parse) {
        timer.Start();
      }
      fni_ = new (zone()) FuncNameInferrer(ast_value_factory(), zone());
      // Initialize parser state.
      CompleteParserRecorder recorder;
      if (produce_cached_parse_data()) {
        log_ = &recorder;
      } else if (consume_cached_parse_data()) {
        cached_parse_data_->Initialize();
      }
      source = String::Flatten(source);
      FunctionLiteral* result;
      if (source->IsExternalTwoByteString()) {
        // Notice that the stream is destroyed at the end of the branch block.
        // The last line of the blocks can't be moved outside, even though they're
        // identical calls.
        ExternalTwoByteStringUtf16CharacterStream stream(
            Handle<ExternalTwoByteString>::cast(source), 0, source->length());
        scanner_.Initialize(&stream);
        result = DoParseProgram(info);
      } else {
        GenericStringUtf16CharacterStream stream(source, 0, source->length());
        scanner_.Initialize(&stream);
        result = DoParseProgram(info);
      }
      if (result != NULL) {
        DCHECK_EQ(scanner_.peek_location().beg_pos, source->length());
      }
      HandleSourceURLComments(isolate, info->script());
      if (FLAG_trace_parse && result != NULL) {
        double ms = timer.Elapsed().InMillisecondsF();
        if (info->is_eval()) {
          PrintF("[parsing eval");
        } else if (info->script()->name()->IsString()) {
          String* name = String::cast(info->script()->name());
          base::SmartArrayPointer<char> name_chars = name->ToCString();
          PrintF("[parsing script: %s", name_chars.get());
        } else {
          PrintF("[parsing script");
        }
        PrintF(" - took %0.3f ms]\n", ms);
      }
      if (produce_cached_parse_data()) {
        if (result != NULL) *info->cached_data() = recorder.GetScriptData();
        log_ = NULL;
      }
      return result;
    }

Parser::DoParseProgram

The code below is long, but for now we only need to focus on two functions:

    if (info->is_module()) {
      ParseModuleItemList(body, &ok);
    } else {
      ParseStatementList(body, Token::EOS, &ok);
    }

ParseModuleItemList: parses the list of module items introduced in ES6
ParseStatementList: parses ordinary statements

    FunctionLiteral* Parser::DoParseProgram(ParseInfo* info) {
    ...
      Mode parsing_mode = FLAG_lazy && allow_lazy() ? PARSE_LAZILY : PARSE_EAGERLY;
      if (allow_natives() || extension_ != NULL) parsing_mode = PARSE_EAGERLY;
      FunctionLiteral* result = NULL;
      {
        // TODO(wingo): Add an outer SCRIPT_SCOPE corresponding to the native
        // context, which will have the "this" binding for script scopes.
        Scope* scope = NewScope(scope_, SCRIPT_SCOPE);
        info->set_script_scope(scope);
        if (!info->context().is_null() && !info->context()->IsNativeContext()) {
          scope = Scope::DeserializeScopeChain(info->isolate(), zone(),
                                               *info->context(), scope);
          // The Scope is backed up by ScopeInfo (which is in the V8 heap); this
          // means the Parser cannot operate independent of the V8 heap. Tell the
          // string table to internalize strings and values right after they're
          // created. This kind of parsing can only be done in the main thread.
          DCHECK(parsing_on_main_thread_);
          ast_value_factory()->Internalize(info->isolate());
        }
        original_scope_ = scope;
        if (info->is_eval()) {
          if (!scope->is_script_scope() || is_strict(info->language_mode())) {
            parsing_mode = PARSE_EAGERLY;
          }
          scope = NewScope(scope, EVAL_SCOPE);
        } else if (info->is_module()) {
          scope = NewScope(scope, MODULE_SCOPE);
        }
        scope->set_start_position(0);
        // Enter 'scope' with the given parsing mode.
        ParsingModeScope parsing_mode_scope(this, parsing_mode);
        AstNodeFactory function_factory(ast_value_factory());
        FunctionState function_state(&function_state_, &scope_, scope,
                                     kNormalFunction, &function_factory);
        // Don't count the mode in the use counters--give the program a chance
        // to enable script/module-wide strict/strong mode below.
        scope_->SetLanguageMode(info->language_mode());
        ZoneList<Statement*>* body = new(zone()) ZoneList<Statement*>(16, zone());
        bool ok = true;
        int beg_pos = scanner()->location().beg_pos;
        if (info->is_module()) {
          ParseModuleItemList(body, &ok);
        } else {
          ParseStatementList(body, Token::EOS, &ok);
        }
        // The parser will peek but not consume EOS.  Our scope logically goes all
        // the way to the EOS, though.
        scope->set_end_position(scanner()->peek_location().beg_pos);
        if (ok && is_strict(language_mode())) {
          CheckStrictOctalLiteral(beg_pos, scanner()->location().end_pos, &ok);
        }
        if (ok && is_sloppy(language_mode()) && allow_harmony_sloppy_function()) {
          // TODO(littledan): Function bindings on the global object that modify
          // pre-existing bindings should be made writable, enumerable and
          // nonconfigurable if possible, whereas this code will leave attributes
          // unchanged if the property already exists.
          InsertSloppyBlockFunctionVarBindings(scope, &ok);
        }
        if (ok && (is_strict(language_mode()) || allow_harmony_sloppy() ||
                   allow_harmony_destructuring_bind())) {
          CheckConflictingVarDeclarations(scope_, &ok);
        }
        if (ok && info->parse_restriction() == ONLY_SINGLE_FUNCTION_LITERAL) {
          if (body->length() != 1 ||
              !body->at(0)->IsExpressionStatement() ||
              !body->at(0)->AsExpressionStatement()->
                  expression()->IsFunctionLiteral()) {
            ReportMessage(MessageTemplate::kSingleFunctionLiteral);
            ok = false;
          }
        }
        if (ok) {
          ParserTraits::RewriteDestructuringAssignments();
          result = factory()->NewFunctionLiteral(
              ast_value_factory()->empty_string(), scope_, body,
              function_state.materialized_literal_count(),
              function_state.expected_property_count(), 0,
              FunctionLiteral::kNoDuplicateParameters,
              FunctionLiteral::kGlobalOrEval, FunctionLiteral::kShouldLazyCompile,
              FunctionKind::kNormalFunction, 0);
        }
      }
    ...
      return result;
    }

Parser::ParseModuleItemList

Modules are a new feature introduced in ES6. For each item in the module body, the ParseModuleItem function is called to parse it.

    void* Parser::ParseModuleItemList(ZoneList<Statement*>* body, bool* ok) {
      // (Ecma 262 6th Edition, 15.2):
      // Module :
      //    ModuleBody?
      //
      // ModuleBody :
      //    ModuleItem*
      DCHECK(scope_->is_module_scope());
      RaiseLanguageMode(STRICT);
      while (peek() != Token::EOS) {
        Statement* stat = ParseModuleItem(CHECK_OK);
        if (stat && !stat->IsEmpty()) {
          body->Add(stat, zone());
        }
      }
      // Check that all exports are bound.
      ModuleDescriptor* descriptor = scope_->module();
      for (ModuleDescriptor::Iterator it = descriptor->iterator(); !it.done();
           it.Advance()) {
        if (scope_->LookupLocal(it.local_name()) == NULL) {
          // TODO(adamk): Pass both local_name and export_name once ParserTraits
          // supports multiple arg error messages.
          // Also try to report this at a better location.
          ParserTraits::ReportMessage(MessageTemplate::kModuleExportUndefined,
                                      it.local_name());
          *ok = false;
          return NULL;
        }
      }
      scope_->module()->Freeze();
      return NULL;
    }

Parser::ParseModuleItem

Depending on whether the next token is import, export, or the start of an ordinary statement, it calls ParseImportDeclaration, ParseExportDeclaration, or ParseStatementListItem respectively.

    Statement* Parser::ParseModuleItem(bool* ok) {
      // (Ecma 262 6th Edition, 15.2):
      // ModuleItem :
      //    ImportDeclaration
      //    ExportDeclaration
      //    StatementListItem
      switch (peek()) {
        case Token::IMPORT:
          return ParseImportDeclaration(ok);
        case Token::EXPORT:
          return ParseExportDeclaration(ok);
        default:
          return ParseStatementListItem(ok);
      }
    }
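This one-token lookahead dispatch is the basic move of a hand-written recursive-descent parser: peek at the next token, then hand off to the routine for that production. A minimal standalone sketch follows; the `Token` enum and `RouteModuleItem` are hypothetical and only report which V8 routine the switch would select, without parsing anything.

```cpp
#include <cassert>
#include <string>

// A tiny hypothetical token set; V8's real Token::Value has many more.
enum class Token { IMPORT, EXPORT, VAR, IDENTIFIER, EOS };

// Returns the name of the routine that ParseModuleItem's switch would
// call for the given lookahead token.
std::string RouteModuleItem(Token next) {
  switch (next) {
    case Token::IMPORT:
      return "ParseImportDeclaration";
    case Token::EXPORT:
      return "ParseExportDeclaration";
    default:
      return "ParseStatementListItem";
  }
}
```

Because the decision needs only `peek()`, no token is consumed before the chosen routine runs, which is why each Parse function can start with its own `Expect(...)` call.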

Parser::ParseImportDeclaration

    Statement* Parser::ParseImportDeclaration(bool* ok) {
      // ImportDeclaration :
      //   'import' ImportClause 'from' ModuleSpecifier ';'
      //   'import' ModuleSpecifier ';'
      //
      // ImportClause :
      //   NameSpaceImport
      //   NamedImports
      //   ImportedDefaultBinding
      //   ImportedDefaultBinding ',' NameSpaceImport
      //   ImportedDefaultBinding ',' NamedImports
      //
      // NameSpaceImport :
      //   '*' 'as' ImportedBinding
      int pos = peek_position();
      Expect(Token::IMPORT, CHECK_OK);
      Token::Value tok = peek();
      // 'import' ModuleSpecifier ';'
      if (tok == Token::STRING) {
        const AstRawString* module_specifier = ParseModuleSpecifier(CHECK_OK);
        scope_->module()->AddModuleRequest(module_specifier, zone());
        ExpectSemicolon(CHECK_OK);
        return factory()->NewEmptyStatement(pos);
      }
      // Parse ImportedDefaultBinding if present.
      ImportDeclaration* import_default_declaration = NULL;
      if (tok != Token::MUL && tok != Token::LBRACE) {
        const AstRawString* local_name =
            ParseIdentifier(kDontAllowRestrictedIdentifiers, CHECK_OK);
        VariableProxy* proxy = NewUnresolved(local_name, IMPORT);
        import_default_declaration = factory()->NewImportDeclaration(
            proxy, ast_value_factory()->default_string(), NULL, scope_, pos);
        Declare(import_default_declaration, DeclarationDescriptor::NORMAL, true,
                CHECK_OK);
      }
      const AstRawString* module_instance_binding = NULL;
      ZoneList<ImportDeclaration*>* named_declarations = NULL;
      if (import_default_declaration == NULL || Check(Token::COMMA)) {
        switch (peek()) {
          case Token::MUL: {
            Consume(Token::MUL);
            ExpectContextualKeyword(CStrVector("as"), CHECK_OK);
            module_instance_binding =
                ParseIdentifier(kDontAllowRestrictedIdentifiers, CHECK_OK);
            // TODO(ES6): Add an appropriate declaration.
            break;
          }
          case Token::LBRACE:
            named_declarations = ParseNamedImports(pos, CHECK_OK);
            break;
          default:
            *ok = false;
            ReportUnexpectedToken(scanner()->current_token());
            return NULL;
        }
      }
      ExpectContextualKeyword(CStrVector("from"), CHECK_OK);
      const AstRawString* module_specifier = ParseModuleSpecifier(CHECK_OK);
      scope_->module()->AddModuleRequest(module_specifier, zone());
      if (module_instance_binding != NULL) {
        // TODO(ES6): Set the module specifier for the module namespace binding.
      }
      if (import_default_declaration != NULL) {
        import_default_declaration->set_module_specifier(module_specifier);
      }
      if (named_declarations != NULL) {
        for (int i = 0; i < named_declarations->length(); ++i) {
          named_declarations->at(i)->set_module_specifier(module_specifier);
        }
      }
      ExpectSemicolon(CHECK_OK);
      return factory()->NewEmptyStatement(pos);
    }

Parser::ParseStatementList

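At last we reach the lexical and syntax analysis of statements. ParseStatementList calls ParseStatementListItem to handle each statement; we skip some of the later details for now: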

    void* Parser::ParseStatementList(ZoneList<Statement*>* body, int end_token,
                                     bool* ok) {
      // StatementList ::
      //   (StatementListItem)* <end_token>
      // Allocate a target stack to use for this set of source
      // elements. This way, all scripts and functions get their own
      // target stack thus avoiding illegal breaks and continues across
      // functions.
      TargetScope scope(&this->target_stack_);
      DCHECK(body != NULL);
      bool directive_prologue = true;     // Parsing directive prologue.
      while (peek() != end_token) {
        if (directive_prologue && peek() != Token::STRING) {
          directive_prologue = false;
        }
        Scanner::Location token_loc = scanner()->peek_location();
        Scanner::Location old_this_loc = function_state_->this_location();
        Scanner::Location old_super_loc = function_state_->super_location();
        Statement* stat = ParseStatementListItem(CHECK_OK);
        if (is_strong(language_mode()) && scope_->is_function_scope() &&
            IsClassConstructor(function_state_->kind())) {
          Scanner::Location this_loc = function_state_->this_location();
          Scanner::Location super_loc = function_state_->super_location();
          if (this_loc.beg_pos != old_this_loc.beg_pos &&
              this_loc.beg_pos != token_loc.beg_pos) {
            ReportMessageAt(this_loc, MessageTemplate::kStrongConstructorThis);
            *ok = false;
            return nullptr;
          }
          if (super_loc.beg_pos != old_super_loc.beg_pos &&
              super_loc.beg_pos != token_loc.beg_pos) {
            ReportMessageAt(super_loc, MessageTemplate::kStrongConstructorSuper);
            *ok = false;
            return nullptr;
          }
        }
        if (stat == NULL || stat->IsEmpty()) {
          directive_prologue = false;   // End of directive prologue.
          continue;
        }
    ...
        body->Add(stat, zone());
      }
      return 0;
    }
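Nearly every call in these functions passes `CHECK_OK` as its last argument. The article does not show the macro's definition, so the following is only a sketch of how such an early-return idiom can be built, assuming it expands to the `ok` argument followed by a check of `*ok`. `ToyParser`, its tiny `Token` set, and the `-1` error value are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Token { SEMICOLON, IDENTIFIER, ILLEGAL, EOS };

// Sketch of the idiom: the macro supplies the `ok` argument, closes the
// call, tests *ok, and leaves an open expression so that the caller's own
// ");" still forms valid code.
#define CHECK_OK ok);  \
  if (!*ok) return -1; \
  ((void)0

struct ToyParser {
  std::vector<Token> tokens;
  std::size_t pos = 0;

  Token peek() const { return pos < tokens.size() ? tokens[pos] : Token::EOS; }

  // Consumes one token. Returns 0 for an empty statement, 1 for any other
  // statement, and signals failure through *ok.
  int ParseItem(bool* ok) {
    if (pos >= tokens.size() || tokens[pos] == Token::ILLEGAL) {
      *ok = false;
      return -1;
    }
    return tokens[pos++] == Token::SEMICOLON ? 0 : 1;
  }

  // Counts non-empty statements up to end_token, or returns -1 on error.
  int ParseStatementList(Token end_token, bool* ok) {
    int count = 0;
    while (peek() != end_token) {
      int stat = ParseItem(CHECK_OK);  // returns -1 right here on failure
      if (stat == 0) continue;         // empty statements are not added
      ++count;
    }
    return count;
  }
};
```

With this definition, `int stat = ParseItem(CHECK_OK);` expands to `int stat = ParseItem(ok); if (!*ok) return -1; ((void)0);`, so error propagation costs one token at each call site instead of an explicit `if` after every call.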

Parser::ParseStatement

ParseStatement itself handles only the empty statement and hands everything else to ParseSubStatement.

    Statement* Parser::ParseStatement(ZoneList<const AstRawString*>* labels,
                                      bool* ok) {
      // Statement ::
      //   EmptyStatement
      //   ...
      if (peek() == Token::SEMICOLON) {
        Next();
        return factory()->NewEmptyStatement(RelocInfo::kNoPosition);
      }
      return ParseSubStatement(labels, ok);
    }

AstNodeFactory::NewEmptyStatement

The output of syntax analysis is an AST. AstNodeFactory is the factory class whose helper functions create the AST nodes.
Let's look at its definition first:

    // ----------------------------------------------------------------------------
    // AstNode factory
    class AstNodeFactory final BASE_EMBEDDED {
     public:
      explicit AstNodeFactory(AstValueFactory* ast_value_factory)
          : local_zone_(ast_value_factory->zone()),
            parser_zone_(ast_value_factory->zone()),
            ast_value_factory_(ast_value_factory) {}
      AstValueFactory* ast_value_factory() const { return ast_value_factory_; }
      VariableDeclaration* NewVariableDeclaration(
          VariableProxy* proxy, VariableMode mode, Scope* scope, int pos,
          bool is_class_declaration = false, int declaration_group_start = -1) {
        return new (parser_zone_)
            VariableDeclaration(parser_zone_, proxy, mode, scope, pos,
                                is_class_declaration, declaration_group_start);
      }

Let's start with the simplest example, NewEmptyStatement:

      EmptyStatement* NewEmptyStatement(int pos) {
        return new (local_zone_) EmptyStatement(local_zone_, pos);
      }

The concrete AST classes are defined in src/ast/ast.h:

    class EmptyStatement final : public Statement {
     public:
      DECLARE_NODE_TYPE(EmptyStatement)
     protected:
      explicit EmptyStatement(Zone* zone, int pos): Statement(zone, pos) {}
    };
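The `new (local_zone_) EmptyStatement(...)` expression is placement-style allocation: the node's memory comes from a zone (an arena) rather than from the general heap, and all nodes are released together when the zone goes away. The sketch below illustrates the pattern with hypothetical types (`ToyZone`, `ToyEmptyStatement`, `ToyNodeFactory`). It is not how V8's Zone is implemented; in particular this toy allocates one heap block per node to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Toy arena: hands out memory and frees everything at once on destruction.
class ToyZone {
 public:
  ~ToyZone() {
    for (char* block : blocks_) delete[] block;
  }
  void* Allocate(std::size_t size) {
    blocks_.push_back(new char[size]);
    return blocks_.back();
  }
  std::size_t allocation_count() const { return blocks_.size(); }

 private:
  std::vector<char*> blocks_;
};

// Placement form, so that `new (zone) T(...)` takes its memory from the zone.
inline void* operator new(std::size_t size, ToyZone* zone) {
  return zone->Allocate(size);
}
// Matching placement delete; only used if a constructor throws.
inline void operator delete(void*, ToyZone*) {}

// Hypothetical AST node; trivially destructible, so it is never deleted
// individually.
struct ToyEmptyStatement {
  explicit ToyEmptyStatement(int pos) : position(pos) {}
  int position;
};

// The factory hides the allocation strategy behind NewXxx helpers.
class ToyNodeFactory {
 public:
  explicit ToyNodeFactory(ToyZone* zone) : zone_(zone) {}
  ToyEmptyStatement* NewEmptyStatement(int pos) {
    return new (zone_) ToyEmptyStatement(pos);
  }

 private:
  ToyZone* zone_;
};
```

A caller creates a `ToyZone`, passes its address to a `ToyNodeFactory`, and never deletes the returned nodes; they stay valid until the zone is destroyed. Keeping every `new` behind the factory means the parser code never has to mention the allocator.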

Parser::ParseSubStatement

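Each kind of statement has its own Parse function. We pick three of them to look at further:
* Block: ParseBlock
* if statement: ParseIfStatement
* do-while loop: ParseDoWhileStatement

For the rest, here is the complete implementation of sub-statement parsing. The code is short, clear, and self-explanatory, so we won't comment on it further: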

    Statement* Parser::ParseSubStatement(ZoneList<const AstRawString*>* labels,
                                         bool* ok) {
      // Statement ::
      //   Block
      //   VariableStatement
      //   EmptyStatement
      //   ExpressionStatement
      //   IfStatement
      //   IterationStatement
      //   ContinueStatement
      //   BreakStatement
      //   ReturnStatement
      //   WithStatement
      //   LabelledStatement
      //   SwitchStatement
      //   ThrowStatement
      //   TryStatement
      //   DebuggerStatement
      // Note: Since labels can only be used by 'break' and 'continue'
      // statements, which themselves are only valid within blocks,
      // iterations or 'switch' statements (i.e., BreakableStatements),
      // labels can be simply ignored in all other cases; except for
      // trivial labeled break statements 'label: break label' which is
      // parsed into an empty statement.
      switch (peek()) {
        case Token::LBRACE:
          return ParseBlock(labels, ok);
        case Token::SEMICOLON:
          if (is_strong(language_mode())) {
            ReportMessageAt(scanner()->peek_location(),
                            MessageTemplate::kStrongEmpty);
            *ok = false;
            return NULL;
          }
          Next();
          return factory()->NewEmptyStatement(RelocInfo::kNoPosition);
        case Token::IF:
          return ParseIfStatement(labels, ok);
        case Token::DO:
          return ParseDoWhileStatement(labels, ok);
        case Token::WHILE:
          return ParseWhileStatement(labels, ok);
        case Token::FOR:
          return ParseForStatement(labels, ok);
        case Token::CONTINUE:
        case Token::BREAK:
        case Token::RETURN:
        case Token::THROW:
        case Token::TRY: {
          // These statements must have their labels preserved in an enclosing
          // block
          if (labels == NULL) {
            return ParseStatementAsUnlabelled(labels, ok);
          } else {
            Block* result =
                factory()->NewBlock(labels, 1, false, RelocInfo::kNoPosition);
            Target target(&this->target_stack_, result);
            Statement* statement = ParseStatementAsUnlabelled(labels, CHECK_OK);
            if (result) result->statements()->Add(statement, zone());
            return result;
          }
        }
        case Token::WITH:
          return ParseWithStatement(labels, ok);
        case Token::SWITCH:
          return ParseSwitchStatement(labels, ok);
        case Token::FUNCTION: {
          // FunctionDeclaration is only allowed in the context of SourceElements
          // (Ecma 262 5th Edition, clause 14):
          // SourceElement:
          //    Statement
          //    FunctionDeclaration
          // Common language extension is to allow function declaration in place
          // of any statement. This language extension is disabled in strict mode.
          //
          // In Harmony mode, this case also handles the extension:
          // Statement:
          //    GeneratorDeclaration
          if (is_strict(language_mode())) {
            ReportMessageAt(scanner()->peek_location(),
                            MessageTemplate::kStrictFunction);
            *ok = false;
            return NULL;
          }
          return ParseFunctionDeclaration(NULL, ok);
        }
        case Token::DEBUGGER:
          return ParseDebuggerStatement(ok);
        case Token::VAR:
          return ParseVariableStatement(kStatement, NULL, ok);
        case Token::CONST:
          // In ES6 CONST is not allowed as a Statement, only as a
          // LexicalDeclaration, however we continue to allow it in sloppy mode for
          // backwards compatibility.
          if (is_sloppy(language_mode()) && allow_legacy_const()) {
            return ParseVariableStatement(kStatement, NULL, ok);
          }
        // Fall through.
        default:
          return ParseExpressionOrLabelledStatement(labels, ok);
      }
    }

Constructing a block - Parser::ParseBlock

    Block* Parser::ParseBlock(ZoneList<const AstRawString*>* labels,
                              bool finalize_block_scope, bool* ok) {
      // The harmony mode uses block elements instead of statements.
      //
      // Block ::
      //   '{' StatementList '}'

Next, on reaching the left brace, the parser calls AstNodeFactory's NewBlock function to create an AST node of class Block, and opens a new block scope.

      // Construct block expecting 16 statements.
      Block* body =
          factory()->NewBlock(labels, 16, false, RelocInfo::kNoPosition);
      Scope* block_scope = NewScope(scope_, BLOCK_SCOPE);
      // Parse the statements and collect escaping labels.
      Expect(Token::LBRACE, CHECK_OK);
      block_scope->set_start_position(scanner()->location().beg_pos);
      { BlockState block_state(&scope_, block_scope);
        Target target(&this->target_stack_, body);

Then, as long as the right brace has not been reached, it keeps parsing statement list items, recursively:

        while (peek() != Token::RBRACE) {
          Statement* stat = ParseStatementListItem(CHECK_OK);
          if (stat && !stat->IsEmpty()) {
            body->statements()->Add(stat, zone());
          }
        }
      }
      Expect(Token::RBRACE, CHECK_OK);
      block_scope->set_end_position(scanner()->location().end_pos);
      if (finalize_block_scope) {
        block_scope = block_scope->FinalizeBlockScope();
      }
      body->set_scope(block_scope);
      return body;
    }
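The recursion in ParseBlock is easiest to see on a stripped-down grammar. The sketch below is a toy, not V8 code: the input is a string in which `{` and `}` delimit blocks, `;` is an empty statement, and any other character counts as one simple statement. `ToyBlock` and `ToyBlockParser` are hypothetical names, and scopes, labels, and positions are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy AST: a block records nested blocks; simple statements are only counted.
struct ToyBlock {
  int simple_statements = 0;
  std::vector<std::unique_ptr<ToyBlock>> children;
};

class ToyBlockParser {
 public:
  explicit ToyBlockParser(std::string src) : src_(std::move(src)) {}

  // Block :: '{' StatementList '}'
  std::unique_ptr<ToyBlock> ParseBlock(bool* ok) {
    auto body = std::make_unique<ToyBlock>();
    if (!Expect('{')) { *ok = false; return nullptr; }
    while (peek() != '}') {
      if (peek() == '\0') { *ok = false; return nullptr; }  // missing '}'
      if (peek() == '{') {
        auto child = ParseBlock(ok);  // a nested block re-enters ParseBlock
        if (!*ok) return nullptr;
        body->children.push_back(std::move(child));
      } else if (peek() == ';') {
        ++pos_;  // empty statement: consumed but not added to the body
      } else {
        ++pos_;
        ++body->simple_statements;
      }
    }
    Expect('}');
    return body;
  }

 private:
  char peek() const { return pos_ < src_.size() ? src_[pos_] : '\0'; }
  bool Expect(char c) {
    if (peek() != c) return false;
    ++pos_;
    return true;
  }

  std::string src_;
  std::size_t pos_ = 0;
};
```

For the input `{s;{s}s}` the outer block ends up with two simple statements and one child block, and the empty statement is dropped, in the same spirit as the `!stat->IsEmpty()` filter in the real loop.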

Here is the implementation of AstNodeFactory::NewBlock in src/ast/ast.h:

      Block* NewBlock(ZoneList<const AstRawString*>* labels, int capacity,
                      bool ignore_completion_value, int pos) {
        return new (local_zone_)
            Block(local_zone_, labels, capacity, ignore_completion_value, pos);
      }

Block is a BreakableStatement:

    class Block final : public BreakableStatement {
     public:
      DECLARE_NODE_TYPE(Block)
      ZoneList<Statement*>* statements() { return &statements_; }
      bool ignore_completion_value() const { return ignore_completion_value_; }
      static int num_ids() { return parent_num_ids() + 1; }
      BailoutId DeclsId() const { return BailoutId(local_id(0)); }
      bool IsJump() const override {
        return !statements_.is_empty() && statements_.last()->IsJump()
            && labels() == NULL;  // Good enough as an approximation...
      }
      void MarkTail() override {
        if (!statements_.is_empty()) statements_.last()->MarkTail();
      }
      Scope* scope() const { return scope_; }
      void set_scope(Scope* scope) { scope_ = scope; }
     protected:
      Block(Zone* zone, ZoneList<const AstRawString*>* labels, int capacity,
            bool ignore_completion_value, int pos)
          : BreakableStatement(zone, labels, TARGET_FOR_NAMED_ONLY, pos),
            statements_(capacity, zone),
            ignore_completion_value_(ignore_completion_value),
            scope_(NULL) {}
      static int parent_num_ids() { return BreakableStatement::num_ids(); }
     private:
      int local_id(int n) const { return base_id() + parent_num_ids() + n; }
      ZoneList<Statement*> statements_;
      bool ignore_completion_value_;
      Scope* scope_;
    };

if statement - Parser::ParseIfStatement

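The if statement is even simpler than Block. Its condition is an expression, which is parsed with ParseExpression; the then branch and the else branch are then parsed with ParseSubStatement. When there is no else, an empty statement takes its place. Nothing fancy here.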

    IfStatement* Parser::ParseIfStatement(ZoneList<const AstRawString*>* labels,
                                          bool* ok) {
      // IfStatement ::
      //   'if' '(' Expression ')' Statement ('else' Statement)?
      int pos = peek_position();
      Expect(Token::IF, CHECK_OK);
      Expect(Token::LPAREN, CHECK_OK);
      Expression* condition = ParseExpression(true, CHECK_OK);
      Expect(Token::RPAREN, CHECK_OK);
      Statement* then_statement = ParseSubStatement(labels, CHECK_OK);
      Statement* else_statement = NULL;
      if (peek() == Token::ELSE) {
        Next();
        else_statement = ParseSubStatement(labels, CHECK_OK);
      } else {
        else_statement = factory()->NewEmptyStatement(RelocInfo::kNoPosition);
      }
      return factory()->NewIfStatement(
          condition, then_statement, else_statement, pos);
    }
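The same shape can be reproduced over a list of string tokens. The sketch below is a toy under simplifying assumptions: `ToyStmt` and `ToyIfParser` are hypothetical, the condition is a single token instead of a full expression, and any token other than `if` is treated as an expression statement. What it keeps from the original is the order of the `Expect` calls and the empty-statement default for a missing else.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ToyStmt {
  std::string kind;  // "if", "empty", or "expr"
  std::string text;  // condition for "if", the token itself for "expr"
  std::unique_ptr<ToyStmt> then_stmt;
  std::unique_ptr<ToyStmt> else_stmt;
};

class ToyIfParser {
 public:
  explicit ToyIfParser(std::vector<std::string> toks)
      : toks_(std::move(toks)) {}

  std::unique_ptr<ToyStmt> ParseStatement(bool* ok) {
    if (peek() == "if") return ParseIf(ok);
    if (peek().empty()) { *ok = false; return nullptr; }  // out of tokens
    auto stmt = std::make_unique<ToyStmt>();
    stmt->kind = "expr";
    stmt->text = toks_[pos_++];
    return stmt;
  }

  // IfStatement :: 'if' '(' Expression ')' Statement ('else' Statement)?
  std::unique_ptr<ToyStmt> ParseIf(bool* ok) {
    auto node = std::make_unique<ToyStmt>();
    node->kind = "if";
    if (!Expect("if") || !Expect("(") || peek().empty()) {
      *ok = false;
      return nullptr;
    }
    node->text = toks_[pos_++];  // single-token condition in this toy
    if (!Expect(")")) { *ok = false; return nullptr; }
    node->then_stmt = ParseStatement(ok);
    if (!*ok) return nullptr;
    if (peek() == "else") {
      ++pos_;
      node->else_stmt = ParseStatement(ok);
      if (!*ok) return nullptr;
    } else {
      // No else branch: substitute an empty statement, never a null pointer.
      node->else_stmt = std::make_unique<ToyStmt>();
      node->else_stmt->kind = "empty";
    }
    return node;
  }

 private:
  std::string peek() const { return pos_ < toks_.size() ? toks_[pos_] : ""; }
  bool Expect(const std::string& t) {
    if (peek() != t) return false;
    ++pos_;
    return true;
  }

  std::vector<std::string> toks_;
  std::size_t pos_ = 0;
};
```

Filling in an empty statement means later passes can always dereference the else branch without a null check, which appears to be the point of the `NewEmptyStatement` call in the original.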

do-while loop - Parser::ParseDoWhileStatement

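It first calls AstNodeFactory's NewDoWhileStatement to create the AST node object, then parses the statement between do and while, and finally parses the expression inside the while condition.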

    DoWhileStatement* Parser::ParseDoWhileStatement(
        ZoneList<const AstRawString*>* labels, bool* ok) {
      // DoStatement ::
      //   'do' Statement 'while' '(' Expression ')' ';'
      DoWhileStatement* loop =
          factory()->NewDoWhileStatement(labels, peek_position());
      Target target(&this->target_stack_, loop);
      Expect(Token::DO, CHECK_OK);
      Statement* body = ParseSubStatement(NULL, CHECK_OK);
      Expect(Token::WHILE, CHECK_OK);
      Expect(Token::LPAREN, CHECK_OK);
      Expression* cond = ParseExpression(true, CHECK_OK);
      Expect(Token::RPAREN, CHECK_OK);
      // Allow do-statements to be terminated with and without
      // semi-colons. This allows code such as 'do;while(0)return' to
      // parse, which would not be the case if we had used the
      // ExpectSemicolon() functionality here.
      if (peek() == Token::SEMICOLON) Consume(Token::SEMICOLON);
      if (loop != NULL) loop->Initialize(cond, body);
      return loop;
    }
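The optional trailing semicolon is the one subtle point in this function. A small sketch makes the behaviour concrete; `ParseDoWhile` here is a hypothetical free function over string tokens, with the body and the condition each reduced to a single token, so it only demonstrates the token sequence and the consume-if-present rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Parses: 'do' <stmt> 'while' '(' <cond> ')' [';']
// On success returns true and stores the index of the first unconsumed token.
bool ParseDoWhile(const std::vector<std::string>& toks, std::size_t* next) {
  std::size_t pos = 0;
  auto expect = [&](const std::string& t) {
    if (pos < toks.size() && toks[pos] == t) {
      ++pos;
      return true;
    }
    return false;
  };
  auto skip_one = [&]() {  // single-token body or condition in this toy
    if (pos < toks.size()) {
      ++pos;
      return true;
    }
    return false;
  };
  if (!expect("do") || !skip_one() || !expect("while") || !expect("(") ||
      !skip_one() || !expect(")")) {
    return false;
  }
  expect(";");  // optional: consumed if present, not an error if absent
  *next = pos;
  return true;
}
```

For the tokens of `do;while(0)return`, parsing succeeds and stops right at `return`; with an extra `;` after the closing parenthesis, that semicolon is consumed as well and parsing again stops at `return`.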

Let's review the process above with a UML diagram:
[Figure: v8 parsing]

0 0