Clang AST parsing for automated code generation
Original post: http://www.seethroughskin.com/blog/?p=2172
Syntax traversal is a powerful tool. With it you can automate repetitive tasks, search for semantic errors, generate wrappers, and so much more. A few months ago I hit a hump (read: a f***ing mountain) of an issue with some legacy code that has been on my plate for a while now.
Having killed a small forest’s worth of paper, I decided that manually tracing paths through code was an inefficient use of my time. Instead I went in search of an automatic method for generating an abstract syntax tree (AST) for C++ code. My idea was that I could use the AST to generate something like a directed graph to better visualize code flow.
There are a few flavors of readable syntax generation out there (and likely more):
- pycparser (supports only C, I believe)
- gccxml
- Clang (via AST dump)
I’ve been a fan of Clang for a while now, and it has a very robust and active community, making it a natural choice for my AST generation needs. Clang also has decent articles on getting started in both Windows and Linux. If you don’t have Clang installed, I suggest reading that linked article. You’ll need compiled versions of clang.exe and libclang.dll to follow along with the Python binding below.
[Caveat]
Clang at revision 183352 (2013-06-05) has a slight issue in that it won’t identify linkage specifications (e.g. extern "C" void foo()). To fix this issue, follow these steps from my SO answer:
Bit of a necroanswer, but if you go into \llvm\tools\clang\lib\Sema\SemaCodeComplete.cpp and add the following line:

    case Decl::LinkageSpec: return CXCursor_LinkageSpec;

to the switch in:

    CXCursorKind clang::getCursorKindForDecl(const Decl *D)

it should resolve the issue of Clang's Python binder returning UNEXPOSED_DECL instead of the correct LINKAGE_SPEC. This change was made at revision 183352 (2013-06-05).

Example from my version:

    CXCursorKind clang::getCursorKindForDecl(const Decl *D) {
      if (!D)
        return CXCursor_UnexposedDecl;

      switch (D->getKind()) {
        case Decl::Enum: return CXCursor_EnumDecl;
        case Decl::LinkageSpec: return CXCursor_LinkageSpec;
        // ......
[Libclang]
Libclang is Clang’s C interface library. The Python bindings load it dynamically, which lets you walk the AST from an interpreted script. Eli Bendersky has a great post on using libclang that I referenced frequently while writing code. Clang documentation can be very lacking in some areas, and Eli’s post does a good job of explaining the steps to getting libclang working with Python. If you follow his steps, the basic pipeline is:
- Compile libclang
- Add the directory containing libclang to your library search path
  - On *Nix it’s LD_LIBRARY_PATH
  - On Windows it’s the standard PATH
  - Or do it in Python: os.environ['PATH'] += os.pathsep + '/path/to/libclang'
- Copy the Clang/Python bindings from /llvm/tools/clang/bindings/python to your python installation or however you’d prefer to install it.
- Verify it works by opening a Python console and typing: import clang.cindex
- Squee when it works
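The verification step can also be scripted, which is handy when the setup has to be repeated on another machine. The following is only a sketch: `check_libclang` and its `lib_dir` parameter are names I made up for illustration, and it assumes a version of the bindings that provides `clang.cindex.Config` (older checkouts may not). It reports failure instead of raising, so it is safe to run before anything is installed.

```python
def check_libclang(lib_dir=None):
    """Return (ok, message) describing whether the bindings can load libclang.

    lib_dir is a hypothetical optional argument: a directory that contains
    the libclang shared library, used only if nothing has been loaded yet.
    """
    try:
        import clang.cindex as cindex
    except ImportError as exc:
        # The Python bindings themselves are not on sys.path.
        return False, "clang.cindex is not importable: %s" % exc

    # Assumption: this version of the bindings exposes Config.
    if lib_dir and not cindex.Config.loaded:
        cindex.Config.set_library_path(lib_dir)

    try:
        # Creating an index forces the shared library to be loaded.
        cindex.Index.create()
    except Exception as exc:
        return False, "libclang could not be loaded: %s" % exc

    return True, "libclang loaded"


if __name__ == '__main__':
    ok, message = check_libclang()
    print(message)
```

Importing `clang.cindex` alone does not prove much, because the bindings only load the shared library on first use. That is why the sketch goes one step further and creates an index.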
[Example]
Once libclang is tied to Python it’s time to test your code. When I got to this step I had trouble finding any good examples. There are really only two, and they can be found in your Clang installation folder: llvm\tools\clang\bindings\python\examples\cindex. Others can be gleaned from blog posts and StackOverflow. Here is a simple example I adapted that looks specifically for the LINKAGE_SPEC cursor type. LINKAGE_SPEC refers to code like `extern "C"`.
linkage_dump.py:

    #!/usr/bin/env python
    import os
    from pprint import pprint

    import clang.cindex

    # Let the bindings find libclang in the current working directory.
    os.environ['PATH'] = os.environ['PATH'] + os.pathsep + os.getcwd()


    def get_info(node, depth=0):
        return {
            'kind': node.kind,
            'usr': node.get_usr(),
            'spelling': node.spelling,
            'location': node.location,
            'extent.start': node.extent.start,
            'extent.end': node.extent.end,
            'is_definition': node.is_definition(),
        }


    def output_cursor_and_children(cursor, level=0):
        # LINKAGE_SPEC (http://clang.llvm.org/doxygen/classclang_1_1LinkageSpecDecl.html)
        # Represents code of the type: extern "C" void foo()
        if cursor.kind == clang.cindex.CursorKind.LINKAGE_SPEC:
            pprint(('nodes', get_info(cursor)))

        # Recurse for children of this cursor
        for c in cursor.get_children():
            output_cursor_and_children(c, level + 1)


    def main():
        from optparse import OptionParser

        parser = OptionParser("usage: %prog {filename} [clang-args*]")
        parser.disable_interspersed_args()
        (opts, args) = parser.parse_args()

        if len(args) == 0:
            # Print the usage string and exit instead of carrying on.
            parser.error('invalid number arguments')

        index = clang.cindex.Index.create()
        tu = index.parse(None, args)
        if not tu:
            parser.error('unable to load input')

        output_cursor_and_children(tu.cursor)


    if __name__ == '__main__':
        main()
test.cpp:

    #include "test.h"

    int main()
    {
        Foo f;
        return 0;
    }

test.h:

    #ifndef TEST_H
    #define TEST_H

    class Foo
    {
        int data_;
    public:
        Foo(){}

        void bar(int data){ data_ = data; }
    };

    extern "C" __declspec( dllexport ) void test1(){}

    #endif
How to run:
python linkage_dump.py test.cpp
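The same recursive walk works for any cursor kind, not just LINKAGE_SPEC. Here is a rough sketch of a generic version. The names `iter_cursors`, `find_by_kind`, `FakeKind`, `FakeCursor` and `demo_root` are all mine, not part of the bindings. The only things it relies on are `get_children()`, `spelling` and `kind.name`, so it can be tried on stand-in objects without libclang installed. The demo tree is hand-built for illustration and is not actual Clang output for test.cpp.

```python
def iter_cursors(cursor, depth=0):
    """Yield (depth, cursor) for a cursor and all of its descendants,
    depth-first, parents before children."""
    yield depth, cursor
    for child in cursor.get_children():
        for item in iter_cursors(child, depth + 1):
            yield item


def find_by_kind(root, kind_name):
    """Return the spellings of every cursor whose kind is named kind_name,
    e.g. 'LINKAGE_SPEC' or 'FUNCTION_DECL'."""
    return [c.spelling for _, c in iter_cursors(root)
            if c.kind.name == kind_name]


# Hypothetical stand-ins that mimic the small part of the cursor
# interface used above, so the walk can be exercised without libclang.
class FakeKind(object):
    def __init__(self, name):
        self.name = name


class FakeCursor(object):
    def __init__(self, kind_name, spelling, children=()):
        self.kind = FakeKind(kind_name)
        self.spelling = spelling
        self._children = list(children)

    def get_children(self):
        return iter(self._children)


# A hand-built tree loosely shaped like test.h (illustrative only).
demo_root = FakeCursor('TRANSLATION_UNIT', 'test.cpp', [
    FakeCursor('CLASS_DECL', 'Foo', [FakeCursor('CXX_METHOD', 'bar')]),
    FakeCursor('LINKAGE_SPEC', '', [FakeCursor('FUNCTION_DECL', 'test1')]),
])

if __name__ == '__main__':
    for depth, cursor in iter_cursors(demo_root):
        print('  ' * depth + cursor.kind.name + ' ' + cursor.spelling)
```

With a real translation unit you would pass `tu.cursor` instead of `demo_root`. Keeping the walk separate from the filter means the same generator could later feed something like a graph builder, which is where I was hoping to end up anyway.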
[Conclusion]
There are so many other ways to make use of ASTs and I wish I had more time to include some of them. Suffice it to say I’ll probably end up posting about ASTs a few more times. At least until I work through enough examples to meet my immediate needs.