After reading Mono

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 10:40
By reading the token () method in cs-tokenizer.cs, I can see how the C# compiler works. Following it leads to the most important part of the driver: static void tokenize_file (SourceFile file).

First, mcs uses hand-written conditional expressions in place of the general parser described in books (for example, " if ( is_identifier || is_identifier_numeric){...} ").

Each time a token is returned, the location for the token is recorded into the `Location' property, which can be accessed by the parser. The parser retrieves the Location properties as it builds its internal representation, to allow the semantic analysis phase to produce error messages that can pinpoint the location of the problem.

Some tokens have values associated with them. For example, when the tokenizer encounters a string, it will return a LITERAL_STRING token, and the actual string parsed will be available in the `Value' property of the tokenizer. The same mechanism is used to return integers and floating point numbers.

// I do not understand why the location is designed this way.

** Locations

Locations are encoded as a 32-bit number (the Location struct) that maps each input source line to a linear number. As new files are parsed, the Location manager is informed of the new file, to allow it to map back from an int constant to a file + line number.

Prior to parsing/tokenizing any source files, the compiler generates a list of all the source files and then reserves the low N bits of the location to hold the source file, where N is large enough to hold at least twice as many source files as were specified on the command line (to allow for a #line in each file). The upper 32-N bits are the line number in that file.

The token 0 is reserved for ``anonymous'' locations, i.e. if we don't know the location (Location.Null).

The tokenizer also tracks the column number for a token, but this is currently not being used or encoded. It could probably be encoded in the low 9 bits, allowing for columns from 1 to 512 to be encoded.
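
To make the Locations description above more concrete, here is a minimal sketch of how a file index and a line number might be packed into one 32-bit integer. It is not the actual mcs Location struct. The names LocationCodec, Initialize, Encode, FileOf and LineOf are hypothetical, and reserving file slot 0 so that the value 0 means "unknown location" is an assumption made for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the packing scheme; NOT the real mcs Location struct.
public static class LocationCodec
{
    public const int Null = 0;                 // "anonymous" location (assumed encoding)

    static readonly List<string> files = new List<string>();
    static int fileBits;                       // N: number of low bits holding the file index
    static int fileMask;                       // (1 << N) - 1

    // Call once, before tokenizing, with every file named on the command line.
    public static void Initialize(IReadOnlyList<string> sourceFiles)
    {
        files.Clear();
        files.Add("<anonymous>");              // slot 0 is reserved, so real files start at 1
        files.AddRange(sourceFiles);

        // Leave room for twice as many files (one possible #line per file) plus slot 0.
        int capacity = 2 * sourceFiles.Count + 1;
        fileBits = 1;
        while ((1 << fileBits) < capacity)
            fileBits++;
        fileMask = (1 << fileBits) - 1;
    }

    // Pack: the upper 32-N bits hold the line, the low N bits hold the file index.
    public static int Encode(int fileIndex, int line)
    {
        if (fileIndex < 1 || fileIndex >= files.Count)
            throw new ArgumentOutOfRangeException(nameof(fileIndex));
        if (line < 0 || line > (int.MaxValue >> fileBits))
            throw new ArgumentOutOfRangeException(nameof(line));
        return (line << fileBits) | fileIndex;
    }

    // Unpack: mask off the low bits for the file, shift them away for the line.
    public static string FileOf(int location) => files[location & fileMask];
    public static int LineOf(int location) => location >> fileBits;
}

public static class Demo
{
    public static void Main()
    {
        LocationCodec.Initialize(new[] { "driver.cs", "cs-tokenizer.cs", "location.cs" });
        int loc = LocationCodec.Encode(2, 120);   // file #2, line 120
        // prints "cs-tokenizer.cs:120"
        Console.WriteLine($"{LocationCodec.FileOf(loc)}:{LocationCodec.LineOf(loc)}");
    }
}
```

This also suggests one answer to the question of why the location is designed this way: every token and tree node only has to carry a single int, and the file name and line are recovered only when an error message actually needs them.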