Debugging WebKit with GDB 7.x

Posted by admin on May 7, 2010 in Browsers

If you repost this article, please credit the source.
WebSite: http://www.jjos.org/
Author: Jiang Jiang (姜江) linuxemacs@gmail.com
QQ: 457283

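Starting with version 7.0, GDB has gradually added support for Python scripting (Python code can directly access Frame, Block, and other debugger objects). I recently used this feature to write a debugging script for WebKit.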

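1. Download and install the latest GDB

According to the official GDB documentation
(http://sourceware.org/gdb/current/onlinedocs/gdb/Python.html#Python), version 7.1 already supports
all of the Python objects defined in the manual. In practice, however, some methods and attributes of those objects are not implemented. gdb.Frame.block(), for example,
does not exist in the official GDB 7.1 release. To get the full feature set, download and build the CVS version: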

a) Get the source

cvs -d :pserver:anoncvs@sourceware.org:/cvs/src login
{enter "anoncvs" as the password}
cvs -d :pserver:anoncvs@sourceware.org:/cvs/src co gdb

b) Build GDB with Python support

./configure --with-python
make && make install
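
Since a given GDB build may lack some of the documented Python methods, it can help to check for them before relying on the script. The following is only a sketch: the helper name missing_attributes and the list REQUIRED_FRAME_METHODS are made up for illustration (the list names the gdb.Frame methods that webkit-tool.py calls), and the gdb module is only importable inside a Python-enabled GDB.

```python
def missing_attributes(cls, required):
    """Return the names in `required` that `cls` does not provide."""
    return [name for name in required if not hasattr(cls, name)]

# gdb.Frame methods used by webkit-tool.py (illustrative list).
REQUIRED_FRAME_METHODS = [
    "block", "function", "find_sal", "older", "pc", "read_var", "is_valid",
]

try:
    import gdb  # only available when running inside GDB
except ImportError:
    gdb = None

if gdb is not None:
    missing = missing_attributes(gdb.Frame, REQUIRED_FRAME_METHODS)
    if missing:
        print("This GDB lacks gdb.Frame methods: %s" % ", ".join(missing))
    else:
        print("All required gdb.Frame methods are available")
```

Sourcing this from within GDB prints which methods, if any, are missing. Outside GDB the import fails and nothing is reported.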

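2. Using webkit-tool.py (see the attachment)

There are three ways to load webkit-tool.py.

a) After starting gdb, use the source command

source /home/jelly/Work/gdb/webkit-tool.py

b) Rename the script to xxx-gdb.py in the directory of the debug target, then start gdb

For example, to debug GTKLauncher, rename webkit-tool.py to GTKLauncher-gdb.py and put it in the same directory as the binary.

c) Import the script from .gdbinit

python
import sys
sys.path.insert(0, "/path/to/tools/gdb/")
# the hyphen in the file name rules out a plain "import webkit-tool"
__import__("webkit-tool")
end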
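
A file name containing a hyphen cannot be named in a plain import statement, because Python parses webkit-tool as a subtraction. One possible way around this is to load the module by its name as a string. In the sketch below, load_script is a hypothetical helper, and the temporary directory and stub file stand in for the real script location.

```python
import importlib
import os
import sys
import tempfile

def load_script(directory, module_name):
    """Import `module_name` from `directory`, even if the name has a hyphen."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return importlib.import_module(module_name)

# Stand-in for /path/to/tools/gdb/webkit-tool.py, created only for the demo.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "webkit-tool.py"), "w") as stub:
    stub.write("LOADED = True\n")

module = load_script(demo_dir, "webkit-tool")
print(module.LOADED)
```

Renaming the script to a valid identifier such as webkit_tool.py would avoid the issue altogether, at the cost of diverging from the file name used in this post.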

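3. Debugging methods

a) webkit-dump-document

Dumps the specified Document object.
A Document object has many member variables, and not all of them are dumped. The WEBCORE_DOCUMENT_MEMBER_VAR list determines
which ones are. To dump additional members, add their names as strings to WEBCORE_DOCUMENT_MEMBER_VAR.
Most of these members are objects defined in WebCore, and GDB itself does not know how to describe or parse them. Each class name therefore has to be mapped to a description of its structure. For example, to support
dumping WebCore::Node, first create a list of WebCore::Node member names, then associate the class name string with that list in the
DUMP_CLASS_TABLE global variable.
usage: webkit-dump-document this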
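
The registration step described above can be sketched as follows. The table here is a cut-down stand-in for the script's DUMP_CLASS_TABLE, and register_dump_class is a hypothetical helper that is not part of webkit-tool.py. The member names are taken from the script's own lists.

```python
# Cut-down stand-in for the script's global table.
DUMP_CLASS_TABLE = {
    "WebCore::Document": ["m_frame", "m_url", "m_docLoader", "m_docID"],
}

def register_dump_class(table, class_name, member_names):
    """Map class_name to its member names, keeping order and dropping repeats."""
    members = table.setdefault(class_name, [])
    for name in member_names:
        if name not in members:
            members.append(name)
    return members

# Teach the table about WebCore::Node; the repeated name is ignored.
register_dump_class(
    DUMP_CLASS_TABLE,
    "WebCore::Node",
    ["m_document", "m_previous", "m_next", "m_document"],
)

for class_name in sorted(DUMP_CLASS_TABLE):
    print("%s: %s" % (class_name, ", ".join(DUMP_CLASS_TABLE[class_name])))
```

Dropping repeated names keeps a member from being dumped twice when the same name is listed more than once.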

b) webkit-dump-node

Dumps the specified Node object.
usage: webkit-dump-node this

c) webkit-read-breakpoint-from-file

Reads breakpoints from the specified file and sets them.
usage:  webkit-read-breakpoint-from-file /tmp/bt.txt
Two formats are currently supported:
namespace::class::function
file:line
e.g.
WebCore::Node::hasTagName()
/home/jelly/work/WebKit-git/WebCore/dom/Node.cpp:222
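
To illustrate the two formats, here is a small sketch that turns such lines into GDB break commands without needing GDB itself. The function breakpoint_commands and the "function"/"location" labels are invented for this example. The attached script simply passes each stripped line to break, so the blank-line handling here is an assumption about what is desirable, not a description of the script.

```python
import re

# A trailing ":<digits>" marks the file:line form; anything else is
# treated as a function name.
FILE_LINE_RE = re.compile(r"^.+:\d+$")

def breakpoint_commands(lines):
    """Turn breakpoint specs into (kind, gdb_command) pairs, skipping blanks."""
    commands = []
    for raw in lines:
        spec = raw.strip()
        if not spec:
            continue  # a bare "break" names neither a function nor a line
        kind = "location" if FILE_LINE_RE.match(spec) else "function"
        commands.append((kind, "break " + spec))
    return commands

sample = [
    "WebCore::Node::hasTagName()\n",
    "\n",
    "/home/jelly/work/WebKit-git/WebCore/dom/Node.cpp:222\n",
]
for kind, command in breakpoint_commands(sample):
    print("%-8s %s" % (kind, command))
```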

d) webkit-backtrace

Similar to backtrace, but prints the frames in reverse order in a tidier layout.
usage: webkit-backtrace
main [../../WebKitTools/GtkLauncher/main.c:209]
0x4aa8a419
0x4a699b9f
0x4a69956b
0x4a6a654b
0x43ab4ba6
0xdb3422
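
The layout the script aims for is outermost frame first, with one extra space of indentation per level. A minimal sketch of that formatting step, independent of GDB, is shown below. The function name and the sample frame descriptions are made up for illustration.

```python
def format_reversed_backtrace(frames):
    """Format frame descriptions given innermost-first as an indented,
    outermost-first listing (one extra space per nesting level)."""
    lines = []
    for depth, description in enumerate(reversed(frames)):
        lines.append(" " * depth + description)
    return "\n".join(lines)

# Hypothetical frame descriptions, innermost first, as a debugger walks them.
frames = ["inner [c.cpp:30]", "middle [b.cpp:20]", "outer [a.cpp:10]"]
print(format_reversed_backtrace(frames))
```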

e) webkit-setup-collect-node

Initializes Node object collection. It only needs to be run once per GDB session.
usage: webkit-setup-collect-node

f) webkit-get-node-objects

Lists all Node objects that currently exist.
usage: webkit-get-node-objects [filename]
If filename is given, the data is written to that file; otherwise it is only printed to the console.
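
The bookkeeping behind these two commands can be pictured as a dictionary keyed by object address: an entry is added when the Node constructor runs and removed when the destructor runs. The sketch below models only that idea in plain Python. LiveObjectRegistry and the sample addresses are hypothetical. In the attached script the same role is played by the NODE_LIVE_OBJECTS dictionary, updated from breakpoint conditions that always evaluate to false so that execution never actually stops.

```python
class LiveObjectRegistry(object):
    """Track objects that have been created but not yet destroyed."""

    def __init__(self):
        self._objects = {}

    def created(self, address, info):
        # Record only the first sighting of an address.
        if address not in self._objects:
            self._objects[address] = info

    def deleted(self, address):
        # Ignore addresses that were never recorded.
        self._objects.pop(address, None)

    def reset(self):
        self._objects.clear()

    def addresses(self):
        return sorted(self._objects)

registry = LiveObjectRegistry()
registry.created("0x1000", "Node created at hypothetical site A")
registry.created("0x2000", "Node created at hypothetical site B")
registry.deleted("0x1000")
print(registry.addresses())
```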

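4. Example

Suppose I want to know how many Nodes the engine creates when the browser opens http://www.google.com.hk (every HTMLElement inherits from Node). First,
start GDB and load the Python debugging script. Then set a breakpoint at FrameLoader::finishedLoading(), or at any other point you consider worth observing. Next,
run webkit-setup-collect-node to initialize node collection. Finally, run the program.
When the breakpoint is hit, or when you interrupt the program with CTRL + C, use webkit-get-node-objects or webkit-get-node-objects /tmp/2.log
to retrieve information about all nodes that currently exist in the engine.
The Node object information I captured this way while opening the Google and Baidu home pages is available at the following link:
If you run into any problems or have suggestions, please let me know :)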

###################################################################
# The author of this script is Jiang Jiang <linuxemacs@gmail.com>
# Copyright (C) 2010 by http://www.jjos.org
#                       http://blog.csdn.net/jznsmail
#
# Permission to use, copy, modify, and distribute this script for
# any purpose without fee this hereby granted, provided that this
# entire notice is included in all copies of any software which is
# or includes a copy or modification of this script and in all
# copies of the supporting documentation for such software.
#
# Version: 0.1
#
# Note:
# How to use this gdb script for WebKit debugging?
# 1. you must checkout the latest gdb source code from cvs as below
# cvs -d :pserver:anoncvs@sourceware.org:/cvs/src login
# {enter "anoncvs" as the password}
# cvs -d :pserver:anoncvs@sourceware.org:/cvs/src co gdb
# 2. build latest gdb with python support
# cd gdb-src
# ./configure --with-python
# 3. set up gdb environment
# there are three methods for setup
# a) rename this script to binary-gdb.py (binary replace with your
#    target for debugging)
# b) add this to ~/.gdbinit file as follows (the module name contains
#    a hyphen, so a plain import statement cannot be used):
#    python
#    import sys
#    sys.path.insert(0, "/path/to/tools/gdb/")
#    __import__("webkit-tool")
#    end
# c) start gdb and read this file with source command as follows:
#    source /path/to/tools/gdb/webkit-tool.py
#
# Todo:
# 1. support dump object info to file
# 2. support graphviz for drawing diagram
# 3. support automatically record dump information with file
###################################################################
import gdb
import sys
import re

# WebCore::KURL
WEBCORE_KURL_MEMBER_VAR = [
    'm_string'
]

# WebCore::Frame
WEBCORE_FRAME_MEMBER_VAR = [
    'm_view',
    'm_doc'
]

# WebCore::Document
WEBCORE_DOCUMENT_MEMBER_VAR = [
    'm_frame',
    'm_url',
    'm_docLoader',
    'm_docID'
]

# WebCore::FrameView
WEBCORE_FRAMEVIEW_MEMBER_VAR = [
#    'm_size',
#    'm_margins',
    'm_frame',
#    'm_doFullRepaint',
#    'm_canHaveScrollbars',
#    'm_useSlowRepaints',
#    'm_isOverlapped',
#    'm_contentIsOpaque',
#    'm_slowRepaintObjectCount',
#    'm_layoutTimer',
#    'm_delayedLayout',
#    'm_layoutSchedulingEnabled',
#    'm_midLayout',
#    'm_layoutCount',
#    'm_nestedLayoutCount',
#    'm_postLayoutTasksTimer',
#    'm_firstLayoutCallbackPending',
#    'm_firstLayout',
#    'm_isTransparent',
#    'm_baseBackgroundColor',
#    'm_lastLayoutSize',
#    'm_lastZoomFactor',
#    'm_mediaType',
#    'm_enqueueEvents',
#    'm_overflowStatusDirty',
#    'm_horizontalOverflow',
#    'm_verticalOverflow',
#    'm_wasScrolledByUser',
#    'm_inProgrammaticScroll',
#    'm_deferringRepaints',
#    'm_repaintCount',
#    'm_repaintRects',
#    'm_deferredRepaintTimer',
#    'm_deferredRepaintDelay',
#    'm_lastPaintTime',
#    'm_shouldUpdateWhileOffscreen',
#    'm_deferSetNeedsLayouts',
#    'm_setNeedsLayoutWasDeferred',
#    'm_nodeToDraw',
#    'm_paintBehavior',
#    'm_isPainting',
#    'm_isVisuallyNonEmpty',
#    'm_firstVisuallyNonEmptyLayoutCallbackPending',
#    'm_maintainScrollPositionAnchor',
]

# WebCore::Node
WEBCORE_NODE_MEMBER_VAR = [
    "m_document",
    "m_inDocument",
    "m_previous",
    "m_next",
    "m_renderer",
    "m_styleChange",
    "m_hasId",
    "m_hasClass",
    "m_attached",
    "m_childNeedsStyleRecalc",
    "m_isLink",
    "m_active",
    "m_hovered",
    "m_inActiveChain",
    "m_inDetach",
    "m_hasRareData",
    "m_parsingChildrenFinished",
]

DUMP_CLASS_TABLE = {
    "WebCore::Frame":WEBCORE_FRAME_MEMBER_VAR,
    "WebCore::Document":WEBCORE_DOCUMENT_MEMBER_VAR,
    "WebCore::FrameView":WEBCORE_FRAMEVIEW_MEMBER_VAR,
    "WebCore::KURL":WEBCORE_KURL_MEMBER_VAR,
    "WebCore::Node":WEBCORE_NODE_MEMBER_VAR,
}

#===========================================================
# Get the pointer of class member variable
# @param thisPtr - the this pointer of specified class
# @param varName - the name of variable
# @return if success then return the pointer of variable
#         else return None
#===========================================================
def getMemberPtr(thisPtr, varName=""):
    try:
        ptr = thisPtr[varName]
        if ptr.type.code == gdb.TYPE_CODE_STRUCT and \
           re.match(r'WTF::(Ref|Own)Ptr<.*>', str(ptr.type)):
            return ptr['m_ptr']
        return ptr
    except RuntimeError:
        if thisPtr.type.code == gdb.TYPE_CODE_STRUCT:
            return thisPtr.address
#        print "No such member variables: %s [%s]" % (varName, thisPtr.type)
        return None

#===========================================================
# Dump the variable of specified class
# @param varName - the variable name
# @param memberPtr - the pointer of variable
# @param prefixStr - the format string for output
# @param depth - the max depth for recursed dump
#===========================================================
def dumpMemberVariable(varName, memberPtr, prefixStr="\t", depth=10):

    if depth <= 0 or str(memberPtr) == "0x0":
        return

    className = hasClassSymbols(memberPtr)
    if None == className:
        return

    if className == "WebCore::String":
        print "%s%s: %s [%s]" %(prefixStr, varName, memberPtr['m_impl']['m_ptr']['m_data'], className)
        return
    elif className == "int" or \
         className == "bool":
        print "%s%s: %s [%s]" % (prefixStr, varName, memberPtr, className)
        return

    if None == getMemberPtr(memberPtr):
        print "%s%s (%s) [%s]" % (prefixStr, varName, memberPtr, str(memberPtr.type))
    else:
        print "%s%s (%s) [%s]" % (prefixStr, varName, getMemberPtr(memberPtr), str(memberPtr.type))
    depth -= 1
    symbols = DUMP_CLASS_TABLE[className]
    for item in symbols:
        ptr = getMemberPtr(memberPtr, item)
        if (None != ptr):
            dumpMemberVariable(item, ptr, prefixStr+"\t", depth)
    pass

#===========================================================
# Dump specified object
# @param className - the class name
# @param thisPtr - the this pointer of specified class
# @param prefixStr - the format string for output
#===========================================================
def dumpObject(className, thisPtr, prefixStr="\t", depth=5):
    symbols = DUMP_CLASS_TABLE[className]
    for item in symbols:
        memberPtr = getMemberPtr(thisPtr, item)
        if None != memberPtr:
            dumpMemberVariable(item, memberPtr, prefixStr, depth)
    pass

#===========================================================
# Utility function for check symbols table
# @param ptr - the pointer of object
# @return if success then return the name of symbol else
#         return None
#===========================================================
def hasClassSymbols(ptr):
    className = str(ptr.type)
    if ptr.type.code == gdb.TYPE_CODE_PTR: # pointer type
        className = str(ptr.type.target())
        if className.startswith("const "):
            className = className[6:]
    if True == DUMP_CLASS_TABLE.has_key(className):
        return className
    if className == "WebCore::String":
        return className
    if className == "int":
        return className
    if className == "bool":
        return className
#    print "@"+className
    return None

#===========================================================
# webkit-dump-document commands for WebKit debugging
# usage: webkit-dump-document pointer
#===========================================================
class WebKitDumpDocument(gdb.Command):
    """
    Dump document object
    """
    def __init__(self):
        super(WebKitDumpDocument, self).__init__("webkit-dump-document",
                                                 gdb.COMMAND_SUPPORT,
                                                 gdb.COMPLETE_NONE,
                                                 True);
    def invoke(self, arg, from_tty):
        try:
            callFrame = gdb.selected_frame()
            if arg == "": # use default value
                arg = 'this'
            ptr = callFrame.read_var(arg)
            print "%s (%s) [%s]" % ("WebCore::Document", ptr, callFrame.find_sal())
            # bail out if no symbols are known for this class
            className = hasClassSymbols(ptr)
            if None == className:
                return
            dumpObject(className, ptr, "\t")
        except ValueError:
            print "No such variables: %s" % arg
            return

#===========================================================
# webkit-dump-node commands for WebKit debugging
# usage: webkit-dump-node pointer
#===========================================================
class WebKitDumpNode(gdb.Command):
    """
    Dump Node object
    """
    def __init__(self):
        super(WebKitDumpNode, self).__init__("webkit-dump-node",
                                             gdb.COMMAND_SUPPORT,
                                             gdb.COMPLETE_NONE,
                                             True);
    def invoke(self, arg, from_tty):
        try:
            callFrame = gdb.selected_frame()
            if arg == "": # use default value
                arg = 'this'
            ptr = callFrame.read_var(arg)
            print "%s (%s) [%s]" % ("WebCore::Node", ptr, callFrame.find_sal())
            # bail out if no symbols are known for this class
            className = hasClassSymbols(ptr)
            if None == className:
                return
            dumpObject(className, ptr, "\t")
        except ValueError:
            print "No such variables: %s" % arg
            return

#===========================================================
# webkit-read-breakpoint-from-file
# Read breakpoint information from file
# usage: webkit-read-breakpoint-from-file file
#===========================================================
class WebKitReadBreakpointFile(gdb.Command):
    """
    Read breakpoint from file
    """
    def __init__(self):
        super(WebKitReadBreakpointFile, self).__init__("webkit-read-breakpoint-from-file",
                                             gdb.COMMAND_SUPPORT,
                                             gdb.COMPLETE_NONE,
                                             True);
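    def invoke(self, arg, from_tty):
        if arg == "":
            print "no input file"
            return
        try:
            btFile = open(arg, "r")
        except IOError:
            # only a failure to open the file is reported here; errors
            # raised while setting a breakpoint surface as GDB errors
            print "Can not open %s file" % arg
            return
        try:
            for line in btFile:
                # skip blank lines so that no bare "break" is issued
                if line.strip():
                    self.parse(line)
        finally:
            btFile.close()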

    def parse(self, line):
        gdb.execute("break " + line.strip())

#===========================================================
# Dump backtrace info
#===========================================================
class WebKitBacktrace(gdb.Command):
    """
    Print the backtrace in reverse order, indented one space per frame
    """
    def __init__(self):
        super(WebKitBacktrace, self).__init__("webkit-backtrace",
                                             gdb.COMMAND_SUPPORT,
                                             gdb.COMPLETE_NONE,
                                             True);
    def invoke(self, arg, from_tty):
        callFrame = gdb.selected_frame()
        tmpCalls = [""]
        while (callFrame != None):
            if callFrame.is_valid():
                if callFrame.function() == None:
                    tmpCalls.append("0x%x" % (callFrame.pc()))
                else:
                    tmpCalls.append("%s [%s:%s]" % (callFrame.function(), callFrame.find_sal().symtab.filename, callFrame.find_sal().line))
            callFrame = callFrame.older()
        callLen = len(tmpCalls)
        calls = []
        i = callLen - 1
        while i >= 1:
            tabsN = callLen - i - 1
            fmt = ""
            while tabsN > 0:
                fmt += " "
                tabsN -= 1;
            calls.append(fmt + tmpCalls[i])
            i -= 1
        for item in calls:
            print item

NODE_LIVE_OBJECTS = {}

#===========================================================
# utility function to print stack
#===========================================================
def getBacktraceStr(frame):
    callFrame = frame
    tmpCalls = [""]
    while (callFrame != None):
        if callFrame.is_valid():
            if callFrame.function() == None:
                tmpCalls.append("0x%x" % (callFrame.pc()))
            else:
                tmpCalls.append("%s [%s:%s]" % (callFrame.function(), callFrame.find_sal().symtab.filename, callFrame.find_sal().line))
            callFrame = callFrame.older()
        else:
            # stop at an invalid frame; it cannot be walked any further
            break
    callLen = len(tmpCalls)
    calls = []
    i = callLen - 1
    while i >= 1:
        tabsN = callLen - i - 1
        fmt = ""
        while tabsN > 0:
            fmt += " "
            tabsN -= 1;
        calls.append(fmt + tmpCalls[i])
        i -= 1
    return_str = ""
    for item in calls:
        return_str += item + "\n"
    return return_str

#===========================================================
# Utility function to get this pointer from frame
#===========================================================
def getThisPointer(frame):
    """
    Find the this pointer of the frame
    """
    for sym in frame.block():
        if not sym.is_argument:
            continue
        if sym.print_name == "this":
            return frame.read_var(sym)
    return None

#===========================================================
# Live Node object
#===========================================================
class WebKitNodeLiveObject:
    """
    Runtime Node object
    """
    def __init__(self, backtrace, addr):
        self.addr = addr
        self.backtrace = backtrace
    def __repr__(self):
        gdb.execute("print (WebCore::Node*)%s" % self.addr)
        return_val = ""
        node_obj = gdb.parse_and_eval("(WebCore::Node*) %s" % self.addr)
        return_val += "\tprevious: %s\n" % node_obj.dereference()['m_previous']
        return_val += "\tnext: %s\n" % node_obj.dereference()['m_next']
        return_val += "\trender: %s\n" % node_obj.dereference()['m_renderer']
        return_val += "\tdocument: %s\n" % node_obj.dereference()['m_document']
        return_val += "\tactive: %s\n" % node_obj.dereference()['m_active']
        return return_val

#===========================================================
# webkit-setup-collect-node command to prepare recording
# initialization
#===========================================================
class WebKitNodeCollectSetup(gdb.Command):
    def __init__(self):
        super(WebKitNodeCollectSetup, self).__init__("webkit-setup-collect-node",
                                                     gdb.COMMAND_SUPPORT,
                                                     gdb.COMPLETE_NONE,
                                                     True)
    def invoke(self, arg, from_tty):
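        # clear the module-level table in place; assigning a new dict
        # here would only bind a local name
        NODE_LIVE_OBJECTS.clear()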
        gdb.execute("break WebCore::Node::Node() if $_webkit_node_object_create_object()")
        gdb.execute("break WebCore::Node::~Node() if $_webkit_node_object_delete_object()")

#===========================================================
# webkit-get-node-objects command to get all recorded nodes
#===========================================================
class WebKitGetNodeObjects(gdb.Command):
    """
    List all recorded Node objects
    """
    def __init__(self):
        super(WebKitGetNodeObjects, self).__init__("webkit-get-node-objects",
                                             gdb.COMMAND_SUPPORT,
                                             gdb.COMPLETE_NONE,
                                             True);
    def invoke(self, arg, from_tty):
        if arg != "":
            gdb.execute("set logging file %s" % arg)
            gdb.execute("set logging on")
        for obj in NODE_LIVE_OBJECTS.keys():
            print NODE_LIVE_OBJECTS[obj]
        if arg != "":
            gdb.execute("set logging off")

#===========================================================
# internal function for recording created node object
#===========================================================
class WebKitNodeObjectCreated(gdb.Function):
    def __init__(self):
        super(WebKitNodeObjectCreated, self).__init__("_webkit_node_object_create_object")
    def invoke(self):
        thisPtr = getThisPointer(gdb.selected_frame())
        if thisPtr and NODE_LIVE_OBJECTS.has_key(str(thisPtr)) == False:
            NODE_LIVE_OBJECTS[str(thisPtr)] = WebKitNodeLiveObject(gdb.selected_frame(), thisPtr)
        return False

#===========================================================
# internal function for recording removed node object
#===========================================================
class WebKitNodeObjectDeleted(gdb.Function):
    def __init__(self):
        super(WebKitNodeObjectDeleted, self).__init__("_webkit_node_object_delete_object")
    def invoke(self):
        thisPtr = getThisPointer(gdb.selected_frame())
        if thisPtr and NODE_LIVE_OBJECTS.has_key(str(thisPtr)) == True:
            print "DEL: %s" % (str(thisPtr))
            del NODE_LIVE_OBJECTS[str(thisPtr)]
        return False

def initialize():
    gdb.execute("set print object on")
    WebKitNodeCollectSetup()
    WebKitGetNodeObjects()
    WebKitNodeObjectCreated()
    WebKitNodeObjectDeleted()
    WebKitBacktrace()
    WebKitReadBreakpointFile()
    WebKitDumpDocument()
    WebKitDumpNode()

initialize()
~~~ END ~~~


Copyright © 2010 Jelly's Blog All rights reserved.