Clang Source Code: CompilerInstance and Preprocessor (Part 2)

Source: Internet  Editor: 程序博客网  Date: 2024/06/08 17:13

Continuing from last time, let's take a detailed look at how the Preprocessor is initialized.

  PP = std::make_shared<Preprocessor>(
      Invocation->getPreprocessorOptsPtr(), getDiagnostics(), getLangOpts(),
      getSourceManager(), *HeaderInfo, *this, PTHMgr,
      /*OwnsHeaderSearch=*/true, TUKind);

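The first argument comes from Invocation, which, as we know, holds the data and options of the CompilerInstance. So when is it fully constructed?
CompilerInvocation has a constructor with an empty body, which is called when cc1_main constructs the Clang pointer variable: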

  CompilerInvocation() : AnalyzerOpts(new AnalyzerOptions()) {}

cc1_main then calls CompilerInvocation::CreateFromArgs to fill in its contents:

  bool Success = CompilerInvocation::CreateFromArgs(
      Clang->getInvocation(), Argv.begin(), Argv.end(), Diags);

The part of CreateFromArgs that parses the arguments:

  std::unique_ptr<OptTable> Opts(createDriverOptTable());
  const unsigned IncludedFlagsBitmask = options::CC1Option;
  unsigned MissingArgIndex, MissingArgCount;
  InputArgList Args =
      Opts->ParseArgs(llvm::makeArrayRef(ArgBegin, ArgEnd), MissingArgIndex,
                      MissingArgCount, IncludedFlagsBitmask);
  LangOptions &LangOpts = *Res.getLangOpts();
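
The two-step pattern described above (default-construct an empty invocation, then fill it in from the command line) can be sketched roughly as follows. This is only an illustration: ToyInvocation, ToyLangOptions, createFromArgs, and the flag "-x-c++" are hypothetical names invented for this sketch, not clang's types or options.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for LangOptions; not clang's real class.
struct ToyLangOptions {
  bool CPlusPlus = false;
  std::string Std;
};

// Hypothetical stand-in for CompilerInvocation.
class ToyInvocation {
  std::shared_ptr<ToyLangOptions> LangOpts;

public:
  // Step 1: the constructor only allocates empty option objects.
  ToyInvocation() : LangOpts(std::make_shared<ToyLangOptions>()) {}

  ToyLangOptions &getLangOpts() {
    assert(LangOpts && "allocated by the constructor");
    return *LangOpts;
  }

  // Step 2: fill the already-constructed object from the arguments.
  // Returns false if any unrecognized argument was seen.
  static bool createFromArgs(ToyInvocation &Res,
                             const std::vector<std::string> &Args) {
    bool Success = true;
    for (const std::string &Arg : Args) {
      if (Arg == "-x-c++") // made-up flag for this sketch
        Res.getLangOpts().CPlusPlus = true;
      else if (Arg.rfind("-std=", 0) == 0) // Arg starts with "-std="
        Res.getLangOpts().Std = Arg.substr(5);
      else
        Success = false;
    }
    return Success;
  }
};
```

Separating allocation from parsing means the object can be handed to other components before the command line has been examined, which is roughly what happens between the construction of Clang and the CreateFromArgs call in cc1_main.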

The definition of the Preprocessor constructor:

Preprocessor::Preprocessor(std::shared_ptr<PreprocessorOptions> PPOpts,
                           DiagnosticsEngine &diags, LangOptions &opts,
                           SourceManager &SM, HeaderSearch &Headers,
                           ModuleLoader &TheModuleLoader,
                           IdentifierInfoLookup *IILookup, bool OwnsHeaders,
                           TranslationUnitKind TUKind)
    : PPOpts(std::move(PPOpts)), Diags(&diags), LangOpts(opts), Target(nullptr),
      AuxTarget(nullptr), FileMgr(Headers.getFileMgr()), SourceMgr(SM),
      ScratchBuf(new ScratchBuffer(SourceMgr)), HeaderInfo(Headers),
      TheModuleLoader(TheModuleLoader), ExternalSource(nullptr),
      Identifiers(opts, IILookup),
      PragmaHandlers(new PragmaNamespace(StringRef())),
      IncrementalProcessing(false), TUKind(TUKind), CodeComplete(nullptr),
      CodeCompletionFile(nullptr), CodeCompletionOffset(0),
      LastTokenWasAt(false), ModuleImportExpectsIdentifier(false),
      CodeCompletionReached(false), CodeCompletionII(nullptr),
      MainFileDir(nullptr), SkipMainFilePreamble(0, true), CurPPLexer(nullptr),
      CurDirLookup(nullptr), CurLexerKind(CLK_Lexer), CurSubmodule(nullptr),
      Callbacks(nullptr), CurSubmoduleState(&NullSubmoduleState),
      MacroArgCache(nullptr), Record(nullptr), MIChainHead(nullptr),
      DeserialMIChainHead(nullptr) {

That is a huge member initializer list.

  OwnsHeaderSearch = OwnsHeaders;

  CounterValue = 0; // __COUNTER__ starts at 0.

  // Clear stats.
  NumDirectives = NumDefined = NumUndefined = NumPragma = 0;
  NumIf = NumElse = NumEndif = 0;
  NumEnteredSourceFiles = 0;
  NumMacroExpanded = NumFnMacroExpanded = NumBuiltinMacroExpanded = 0;
  NumFastMacroExpanded = NumTokenPaste = NumFastTokenPaste = 0;
  MaxIncludeStackDepth = 0;
  NumSkipped = 0;

  // Default to discarding comments.
  KeepComments = false;
  KeepMacroComments = false;
  SuppressIncludeNotFoundError = false;

  // Macro expansion is enabled.
  DisableMacroExpansion = false;
  MacroExpansionInDirectivesOverride = false;
  InMacroArgs = false;
  InMacroArgPreExpansion = false;
  NumCachedTokenLexers = 0;
  PragmasEnabled = true;
  ParsingIfOrElifDirective = false;
  PreprocessedOutput = false;

  CachedLexPos = 0;

  // We haven't read anything from the external source.
  ReadMacrosFromExternalSource = false;

  // "Poison" __VA_ARGS__, which can only appear in the expansion of a macro.
  // This gets unpoisoned where it is allowed.
  (Ident__VA_ARGS__ = getIdentifierInfo("__VA_ARGS__"))->setIsPoisoned();
  SetPoisonReason(Ident__VA_ARGS__,diag::ext_pp_bad_vaargs_use);

  // Initialize the pragma handlers.
  RegisterBuiltinPragmas();

  // Initialize builtin macros like __LINE__ and friends.
  RegisterBuiltinMacros();
  ...

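The constructor body then assigns the remaining member variables.
Once the constructor has returned, Preprocessor's Initialize function and the InitializePreprocessor function are called; both were covered last time.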
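
The __VA_ARGS__ poisoning done in the constructor body can be illustrated with a small sketch: the identifier is marked as poisoned up front, and the mark is lifted only where its use is legal. ToyIdentifierInfo, ToyIdentifierTable, ToyPreprocessor, and all of their members are hypothetical names for this sketch; clang's real IdentifierInfo and diagnostics machinery are considerably richer.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical per-identifier record; not clang's IdentifierInfo.
struct ToyIdentifierInfo {
  bool IsPoisoned = false;
};

// Hypothetical identifier table: one unique record per spelling.
class ToyIdentifierTable {
  std::unordered_map<std::string, ToyIdentifierInfo> Table;

public:
  // Returns the record for Name, creating it on first use.
  ToyIdentifierInfo *get(const std::string &Name) { return &Table[Name]; }
};

class ToyPreprocessor {
  ToyIdentifierTable Identifiers;
  ToyIdentifierInfo *IdentVaArgs = nullptr;
  unsigned NumPoisonErrors = 0;

public:
  ToyPreprocessor() {
    // Poison __VA_ARGS__ up front, mirroring the constructor body above.
    IdentVaArgs = Identifiers.get("__VA_ARGS__");
    IdentVaArgs->IsPoisoned = true;
  }

  // Called for each identifier token; returns true if the use is accepted.
  bool handleIdentifier(const std::string &Name) {
    if (Identifiers.get(Name)->IsPoisoned) {
      ++NumPoisonErrors; // a real preprocessor would emit a diagnostic here
      return false;
    }
    return true;
  }

  // Unpoison while the identifier is legal (for example inside a variadic
  // macro body in this toy model), and poison it again afterwards.
  void setVaArgsAllowed(bool Allowed) {
    assert(IdentVaArgs && "set up by the constructor");
    IdentVaArgs->IsPoisoned = !Allowed;
  }

  unsigned numPoisonErrors() const { return NumPoisonErrors; }
};
```

Storing the flag on the identifier record makes the check a single lookup on the normal identifier-handling path, so no separate scan for forbidden names is needed.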


After that, several more functions fill in further members. Once createPreprocessor has finished, control returns to the caller:

    if (Act.BeginSourceFile(*this, FIF)) {
      Act.Execute();
      Act.EndSourceFile();

Act.Execute now runs:

bool FrontendAction::Execute() {
  CompilerInstance &CI = getCompilerInstance();

  if (CI.hasFrontendTimer()) {
    llvm::TimeRegion Timer(CI.getFrontendTimer());
    ExecuteAction();
  }
  else ExecuteAction();

  // If we are supposed to rebuild the global module index, do so now unless
  // there were any module-build failures.
  if (CI.shouldBuildGlobalModuleIndex() && CI.hasFileManager() &&
      CI.hasPreprocessor()) {
    StringRef Cache =
        CI.getPreprocessor().getHeaderSearchInfo().getModuleCachePath();
    if (!Cache.empty())
      GlobalModuleIndex::writeIndex(CI.getFileManager(),
                                    CI.getPCHContainerReader(), Cache);
  }

  return true;
}

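Execute calls ExecuteAction.
ExecuteAction first checks whether the input is an intermediate-language (IR) file, then checks a few other flags. For an ordinary AST input it executes: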

  this->ASTFrontendAction::ExecuteAction();

After making sure that everything it needs is in place, it executes:

  ParseAST(CI.getSema(), CI.getFrontendOpts().ShowStats,
           CI.getFrontendOpts().SkipFunctionBodies);

ParseAST creates a Parser instance:

  std::unique_ptr<Parser> ParseOP(
      new Parser(S.getPreprocessor(), S, SkipFunctionBodies));
  Parser &P = *ParseOP.get();

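It then calls Initialize, which in turn calls ConsumeToken.
The comment on ConsumeToken reads:
ConsumeToken - Consume the current 'peek token' and lex the next one.
This is where what we usually think of as lexical analysis begins, and here we meet the Preprocessor we have been following all along: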

  SourceLocation ConsumeToken() {
    assert(!isTokenSpecial() &&
           "Should consume special tokens with Consume*Token");
    PrevTokLocation = Tok.getLocation();
    PP.Lex(Tok);
    return PrevTokLocation;
  }

void Preprocessor::Lex(Token &Result) {
  // We loop here until a lex function returns a token; this avoids recursion.
  bool ReturnedToken;
  do {
    switch (CurLexerKind) {
    case CLK_Lexer:
      ReturnedToken = CurLexer->Lex(Result);
      break;
    case CLK_PTHLexer:
      ReturnedToken = CurPTHLexer->Lex(Result);
      break;
    case CLK_TokenLexer:
      ReturnedToken = CurTokenLexer->Lex(Result);
      break;
    case CLK_CachingLexer:
      CachingLex(Result);
      ReturnedToken = true;
      break;
    case CLK_LexAfterModuleImport:
      LexAfterModuleImport(Result);
      ReturnedToken = true;
      break;
    }
  } while (!ReturnedToken);

  if (Result.is(tok::code_completion))
    setCodeCompletionIdentifierInfo(Result.getIdentifierInfo());

  LastTokenWasAt = Result.is(tok::at);
}
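
The loop in Preprocessor::Lex keeps dispatching on the current lexer kind until one of the lexers actually hands back a token. A stripped-down sketch of that idea is shown here. It assumes a toy model in which the "source" is a queue of words and any word starting with '#' is a directive that is consumed without producing a token. ToyToken, ToyLexerKind, ToyPreprocessor, lexFromSource, and pushBack are all hypothetical names for this sketch.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>

struct ToyToken {
  std::string Text;
  bool IsEOF;
};

enum ToyLexerKind { TLK_Lexer, TLK_CachingLexer };

class ToyPreprocessor {
  std::deque<std::string> Source; // words still to be lexed from the "file"
  std::deque<ToyToken> Cached;    // tokens that were pushed back
  ToyLexerKind CurLexerKind = TLK_Lexer;

  // Returns false when input was consumed but no token was produced
  // (a directive in this toy model), so the caller has to loop again.
  bool lexFromSource(ToyToken &Result) {
    if (Source.empty()) {
      Result.Text.clear();
      Result.IsEOF = true;
      return true;
    }
    std::string Word = std::move(Source.front());
    Source.pop_front();
    if (!Word.empty() && Word[0] == '#')
      return false; // directive handled internally, nothing to return
    Result.Text = std::move(Word);
    Result.IsEOF = false;
    return true;
  }

public:
  explicit ToyPreprocessor(std::deque<std::string> Words)
      : Source(std::move(Words)) {}

  // Queue a token so that it is returned before any further source text.
  void pushBack(ToyToken Tok) {
    Cached.push_back(std::move(Tok));
    CurLexerKind = TLK_CachingLexer;
  }

  void lex(ToyToken &Result) {
    // Loop until some lexer returns a token; this avoids recursion.
    bool ReturnedToken = false;
    do {
      switch (CurLexerKind) {
      case TLK_Lexer:
        ReturnedToken = lexFromSource(Result);
        break;
      case TLK_CachingLexer:
        assert(!Cached.empty() && "caching mode with an empty cache");
        Result = std::move(Cached.front());
        Cached.pop_front();
        if (Cached.empty())
          CurLexerKind = TLK_Lexer;
        ReturnedToken = true;
        break;
      }
    } while (!ReturnedToken);
  }
};
```

From the caller's point of view every call to lex yields exactly one token; directives and changes of token source stay hidden inside the loop.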

CurLexer is an important member variable of Preprocessor:

  /// \brief The current top of the stack that we're lexing from if
  /// not expanding a macro and we are lexing directly from source code.
  ///
  /// Only one of CurLexer, CurPTHLexer, or CurTokenLexer will be non-null.
  std::unique_ptr<Lexer> CurLexer;

This member starts out as a null pointer.
It is set on the line just before the call to the Parser's Initialize:

  S.getPreprocessor().EnterMainSourceFile();

/// EnterMainSourceFile - Enter the specified FileID as the main source file,
/// which implicitly adds the builtin defines etc.
void Preprocessor::EnterMainSourceFile() {
  // We do not allow the preprocessor to reenter the main file.  Doing so will
  // cause FileID's to accumulate information from both runs (e.g. #line
  // information) and predefined macros aren't guaranteed to be set properly.
  assert(NumEnteredSourceFiles == 0 && "Cannot reenter the main file!");
  FileID MainFileID = SourceMgr.getMainFileID();

  // If MainFileID is loaded it means we loaded an AST file, no need to enter
  // a main file.
  if (!SourceMgr.isLoadedFileID(MainFileID)) {
    // Enter the main file source buffer.
    EnterSourceFile(MainFileID, nullptr, SourceLocation());

/// EnterSourceFile - Add a source file to the top of the include stack and
/// start lexing tokens from it instead of the current buffer.
bool Preprocessor::EnterSourceFile(FileID FID, const DirectoryLookup *CurDir,
                                   SourceLocation Loc) {
  assert(!CurTokenLexer && "Cannot #include a file inside a macro!");
  ++NumEnteredSourceFiles;

  if (MaxIncludeStackDepth < IncludeMacroStack.size())
    MaxIncludeStackDepth = IncludeMacroStack.size();

  if (PTH) {
    if (PTHLexer *PL = PTH->CreateLexer(FID)) {
      EnterSourceFileWithPTH(PL, CurDir);
      return false;
    }
  }

  // Get the MemoryBuffer for this FID, if it fails, we fail.
  bool Invalid = false;
  const llvm::MemoryBuffer *InputFile =
    getSourceManager().getBuffer(FID, Loc, &Invalid);
  if (Invalid) {
    SourceLocation FileStart = SourceMgr.getLocForStartOfFile(FID);
    Diag(Loc, diag::err_pp_error_opening_file)
      << std::string(SourceMgr.getBufferName(FileStart)) << "";
    return true;
  }

  if (isCodeCompletionEnabled() &&
      SourceMgr.getFileEntryForID(FID) == CodeCompletionFile) {
    CodeCompletionFileLoc = SourceMgr.getLocForStartOfFile(FID);
    CodeCompletionLoc =
        CodeCompletionFileLoc.getLocWithOffset(CodeCompletionOffset);
  }

  EnterSourceFileWithLexer(new Lexer(FID, InputFile, *this), CurDir);
  return false;
}

Just before returning, EnterSourceFile calls EnterSourceFileWithLexer, passing a newly constructed Lexer as the first argument:

/// EnterSourceFileWithLexer - Add a source file to the top of the include stack
///  and start lexing tokens from it instead of the current buffer.
void Preprocessor::EnterSourceFileWithLexer(Lexer *TheLexer,
                                            const DirectoryLookup *CurDir) {

  // Add the current lexer to the include stack.
  if (CurPPLexer || CurTokenLexer)
    PushIncludeMacroStack();

  CurLexer.reset(TheLexer);
  ...

This newly constructed Lexer becomes the object that CurLexer points to.
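
The ownership hand-off can be sketched in isolation: the current lexer, if there is one, is saved on a stack, and the unique_ptr member takes ownership of the new one. ToyLexer, ToyPreprocessor, enterSourceFile, and leaveSourceFile are hypothetical names for this sketch; in particular, leaveSourceFile is only an assumed counterpart added to show the stack being unwound and is not taken from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lexer that only remembers which file it reads.
struct ToyLexer {
  std::string FileName;
  explicit ToyLexer(std::string Name) : FileName(std::move(Name)) {}
};

class ToyPreprocessor {
  std::unique_ptr<ToyLexer> CurLexer; // null until a file is entered
  std::vector<std::unique_ptr<ToyLexer>> IncludeStack;

public:
  bool hasCurLexer() const { return CurLexer != nullptr; }

  const std::string &currentFile() const {
    assert(CurLexer && "no source file has been entered yet");
    return CurLexer->FileName;
  }

  std::size_t includeDepth() const { return IncludeStack.size(); }

  // Save the lexer we were using (if any), then take ownership of a new one.
  void enterSourceFile(const std::string &Name) {
    if (CurLexer)
      IncludeStack.push_back(std::move(CurLexer));
    CurLexer.reset(new ToyLexer(Name));
  }

  // Assumed counterpart: resume the including file when the current one ends.
  // Returns false if there was no outer file to go back to.
  bool leaveSourceFile() {
    if (IncludeStack.empty()) {
      CurLexer.reset();
      return false;
    }
    CurLexer = std::move(IncludeStack.back());
    IncludeStack.pop_back();
    return true;
  }
};
```

Because the member is a unique_ptr, the previous lexer has to be moved out before reset is called; otherwise reset would destroy it.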


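This shows why the Preprocessor is said to be the core of clang's lexical analysis. The Preprocessor repeatedly calls the lexer's Lex, producing one Token at a time, and that is how lexical analysis gets done. The process begins during the Parser's initialization.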
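
The pull-style relationship summarized above can be sketched as a toy parser that primes its current token in initialize and then asks the preprocessor for one more token each time it consumes one. ToyToken, ToyPreprocessor, ToyParser, and parseAll are hypothetical names for this sketch; the "source" is simply a list of words.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct ToyToken {
  std::string Text;
  bool IsEOF;
};

// Hypothetical preprocessor: hands out one word per call to lex.
class ToyPreprocessor {
  std::vector<std::string> Words;
  std::size_t Pos = 0;
  unsigned NumLexed = 0;

public:
  explicit ToyPreprocessor(std::vector<std::string> W) : Words(std::move(W)) {}

  void lex(ToyToken &Result) {
    ++NumLexed;
    if (Pos == Words.size()) {
      Result.Text.clear();
      Result.IsEOF = true;
      return;
    }
    Result.Text = Words[Pos++];
    Result.IsEOF = false;
  }

  unsigned numLexed() const { return NumLexed; }
};

// Hypothetical parser: holds one current token and pulls the next on demand.
class ToyParser {
  ToyPreprocessor &PP;
  ToyToken Tok;
  bool Initialized = false;

public:
  explicit ToyParser(ToyPreprocessor &P) : PP(P), Tok{std::string(), true} {}

  // Prime the current token, as Parser::Initialize does via ConsumeToken.
  void initialize() {
    consumeToken();
    Initialized = true;
  }

  void consumeToken() { PP.lex(Tok); }

  const ToyToken &currentToken() const { return Tok; }

  // Stand-in for real parsing: record every token until end of file.
  std::vector<std::string> parseAll() {
    assert(Initialized && "call initialize() first");
    std::vector<std::string> Seen;
    while (!Tok.IsEOF) {
      Seen.push_back(Tok.Text);
      consumeToken();
    }
    return Seen;
  }
};
```

In this model no token is produced until the parser asks for it, so lexing and parsing are interleaved rather than run as two separate passes.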
