Android Debugging


http://omappedia.org/wiki/Android_Debugging



There are many ways to debug the various parts of the Android software stack (bootloader, kernel, applications, and so on). This page covers a few tools that we have used. Please feel free to update this list or to add information about other methods that may be available.

Contents

  • 1 Eclipse ADT
    • 1.1 Note on Installing Eclipse Plugins
    • 1.2 Debugging on Zoom2 with Eclipse ADT
    • 1.3 Troubleshooting
  • 2 Debugging with GDB and DDD
    • 2.1 GDB (the GNU Debugger)
    • 2.2 DDD (Data Display Debugger)
  • 3 Lauterbach TRACE32
  • 4 CodeComposer
  • 5 Selectively Enable Opencore Debug Print
  • 6 Experimenting CpuFreq Governors to profile at different ARM-MHz
  • 7 Profiling
  • 8 OProfile on OMAP3
  • 9 OProfile on OMAP4 K35 (pre-ICS)
    • 9.1 Note for OProfile on Pandaboard
    • 9.2 Changing to Performance Governor
    • 9.3 Steps to rebuild Kernel
    • 9.4 Steps to rebuild USERSPACE Component
    • 9.5 Work-around for recognizing CPU
    • 9.6 Installation
    • 9.7 Execution
    • 9.8 Starting and Stopping the profiler
    • 9.9 Generating the Results
    • 9.10 Post-process OProfile results
    • 9.11 Steps to Re-run OProfile
    • 9.12 Other Pointers - OProfile in OMAP4
    • 9.13 IDLE Mhz Calculation
    • 9.14 Manually Post-process the results (for advanced users)
  • 10 OProfile on OMAP4 ICS (K3.0)
    • 10.1 Kernel Build Changes
    • 10.2 Linux Host Setup
    • 10.3 Setup Oprofile on the OMAP4 Target Device
    • 10.4 Run Oprofile on the OMAP4 Target Device
    • 10.5 Pull the opreport to the Linux Host
  • 11 Strace
  • 12 Using INST2 for Performance Measurement on DSP
  • 13 Debugging segmentation fault
  • 14 OPP Level Measurement
  • 15 Other Tools & references

Eclipse ADT

Note on Installing Eclipse Plugins

Before installing the Android Development Tools, enter your proxy server (if applicable) under Network Connections in the General preferences, and add the software update site that matches your version of Eclipse (for Eclipse 3.5 it is http://download.eclipse.org/releases/galileo/).

You will likely have to install several required plugins from Eclipse before ADT will install (possibly GEF and WST plugins).

Debugging on Zoom2 with Eclipse ADT

The Android Development Tools (ADT) plugin for Eclipse adds powerful extensions to the Eclipse integrated development environment. It lets you create and debug Android applications more easily and quickly. Details on ADT can be obtained from http://developer.android.com/guide/developing/eclipse-adt.html.

It is assumed that the ADT plugin has already been set up to work with the Eclipse environment as described at http://developer.android.com/sdk/1.1_r1/installing.html#installingplugin.


Step 1: Installing the ADT plugin for Eclipse should also have set up the Dalvik Debug Monitor Service (DDMS). Change the DDMS configuration as follows:

 Click on Window -> Preferences
 Select Android -> DDMS
 Change ADB debugger base port: 8700
 Change Logging Level: Verbose
 Click on Apply

Step 2: The DDMS perspective can now be opened from the Eclipse menu via:

 Window -> Open Perspective -> Other -> DDMS;  Click on OK

Step 3: Get Eclipse to attach to your Zoom2 board.

Boot the Zoom2 board and find its IP address. If you have not added ip=dhcp to the bootargs, you can bring up Ethernet and obtain an IP address over DHCP with the following commands:

   # netcfg eth0 up
   # netcfg eth0 dhcp

Using the command below you can verify that the board did obtain an IP address

  # netcfg

NOTE: If you boot via NFS, U-Boot will typically print the board's IP address to the console.


On the host machine run the following commands from terminal shell:

  $ export ADBHOST=<IP_ADDRESS_OF_YOUR_ZOOM2_BOARD>
  $ adb kill-server
  $ adb start-server

Check if you are now connected to the Zoom2 device by running the following command on the Host Terminal console:

  $ adb devices

It should output something like:

  emulator-5554 device

This confirms that the Zoom2 board is connected. With this setup, you should be able to use Android Debug Bridge, Logcat, DDMS and other tools directly from the Eclipse ADT environment when creating your applications for Android on Zoom2.
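The connection check above can also be scripted. The following is a minimal sketch, not part of adb: the helper name count_attached is hypothetical, and it assumes the usual `adb devices` layout of one header line followed by one "serial state" pair per line.

```shell
# Hypothetical helper, not part of adb: count entries in the "device" state.
# Reads `adb devices` output on stdin and skips the header line.
count_attached() {
    awk 'NR > 1 && $2 == "device" { n++ } END { print n + 0 }'
}

# Example use on the host (commented out so the sketch has no side effects):
#   adb devices | count_attached
```

A result of 0 suggests that the board is not reachable yet, in which case the Troubleshooting notes may apply.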

Troubleshooting

Issue: ADB is not in the path, where should I find it?

Resolution: ADB command line tool is found at: <MYDROID_PATH>/out/host/linux-x86/bin/

Issue: ADB is having a problem connecting over Ethernet.

Resolution: This is because the ADB stub on the target defaults to USB. To fix this, run the following in the Zoom console:

      # setprop service.adb.tcp.port 5555


This prevents ADBD from defaulting to the USB transport. Restart ADBD on the Zoom board so that the changed setting takes effect:


      # stop adbd
      # start adbd

Alternatively, the setprop command can be included in init.rc so that the system property is set at startup, before the ADB stub starts.

Debugging with GDB and DDD

User space programs can be debugged with the GNU tools described here, which ease the debugging of binaries on the Android platform. GDB allows you to see what is going on inside another program while it executes, or what another program was doing at the moment it crashed.


GDB (the GNU Debugger)

Following are the instructions to enable GDB on Android:

1. Obtain the IP address of the target. This can be found by adding “ip=dhcp” in the bootargs, which will obtain and print the IP automatically during boot. Alternatively if you have the busybox command line tools available on the target you can type "ifconfig eth0" to obtain the IP address of the target.

2. On the host, perform the following (once per new console window). Go to the mydroid directory and run:

       source build/envsetup.sh
       setpaths
       export ADBHOST=<ip addr of target obtained above>

Ensure that the above setup works by running:

       adb kill-server ; adb shell

You should see a command prompt of the target on your host. Verify this by running "ps" or similar commands. Exit the adb shell by typing "exit".

3. Start GDB using the following command

       gdbclient <executable name> <port number> <task name>

       executable name: file name in system/bin dir
       port number: default is :5039 (need the colon before the number)
       task name: obtained by running "ps" on the target. GDB uses it to identify the PID internally.

E.g. for video playback, use the following (note the space after mediaserver and the colon before the port number):

       gdbclient mediaserver :5039 mediaserver

Then you can run commands like “info threads”, “break”, “step” etc.

For a full listing of GDB commands refer to: http://www.yolinux.com/TUTORIALS/GDB-Commands.html
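The task-name-to-PID lookup mentioned in step 3 can be approximated by hand. This is a sketch under stated assumptions, not the code gdbclient itself runs: the helper name pid_of is hypothetical, and it assumes that the PID is the second column of the "ps" output and that the process name (possibly a full path) is the last column.

```shell
# Hypothetical helper: print the PID of the first process whose name matches $1.
# Reads `ps` output on stdin. Assumes PID in column 2 and the name in the last column.
pid_of() {
    awk -v name="$1" '
        NR > 1 {
            sub(/\r$/, "")                  # adb shell output may end lines with CR
            n = split($NF, parts, "/")      # compare against the basename only
            if (parts[n] == name) { print $2; exit }
        }'
}

# Example use on the host (commented out so the sketch has no side effects):
#   adb shell ps | pid_of mediaserver
```

The printed PID is the value that "gdbserver :5039 --attach <PID>" would need if gdbserver is started manually on the target.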


You may have to run the following after each target reboot:

       adb kill-server


DDD (Data Display Debugger)

DDD is a graphical front-end for GDB and other command-line debuggers.

Following are the instructions to enable DDD on Android:

DDD.jpg

The steps are almost the same as for GDB:

1. Obtain the IP address of the target. This can be found by adding "ip=dhcp" in the bootargs, which will obtain and print the IP automatically during boot. Alternatively if you have the busybox command line tools available on the target you can type "ifconfig eth0" to obtain the IP address of the target.

2. Install DDD: in the shell run:

  sudo apt-get install ddd

3. Add the following function to build/envsetup.sh:

 function dddclient()
 {
     local OUT_ROOT=$(get_abs_build_var PRODUCT_OUT)
     local OUT_SYMBOLS=$(get_abs_build_var TARGET_OUT_UNSTRIPPED)
     local OUT_SO_SYMBOLS=$(get_abs_build_var TARGET_OUT_SHARED_LIBRARIES_UNSTRIPPED)
     local OUT_EXE_SYMBOLS=$(get_abs_build_var TARGET_OUT_EXECUTABLES_UNSTRIPPED)
     local PREBUILTS=$(get_abs_build_var ANDROID_PREBUILTS)

     if [ "$OUT_ROOT" -a "$PREBUILTS" ]; then
         # Executable to debug; defaults to app_process
         local EXE="$1"
         if [ "$EXE" ] ; then
             EXE=$1
         else
             EXE="app_process"
         fi

         # gdbserver port; defaults to :5039
         local PORT="$2"
         if [ "$PORT" ] ; then
             PORT=$2
         else
             PORT=":5039"
         fi

         # If a task name is given, attach gdbserver to it on the target
         local PID
         local PROG="$3"
         if [ "$PROG" ] ; then
             PID=`pid $3`
             adb forward "tcp$PORT" "tcp$PORT"
             adb shell gdbserver $PORT --attach $PID &
             sleep 2
         else
             echo ""
             echo "If you haven't done so already, do this first on the device:"
             echo "    gdbserver $PORT /system/bin/$EXE"
             echo " or"
             echo "    gdbserver $PORT --attach $PID"
             echo ""
         fi

         # Write the GDB startup commands, then launch DDD on top of arm-eabi-gdb
         echo >|"$OUT_ROOT/gdbclient.cmds" "set solib-absolute-prefix $OUT_SYMBOLS"
         echo >>"$OUT_ROOT/gdbclient.cmds" "set solib-search-path $OUT_SO_SYMBOLS"
         echo >>"$OUT_ROOT/gdbclient.cmds" "target remote $PORT"
         echo >>"$OUT_ROOT/gdbclient.cmds" ""
         ddd --debugger arm-eabi-gdb -x "$OUT_ROOT/gdbclient.cmds" "$OUT_EXE_SYMBOLS/$EXE"
     else
         echo "Unable to determine build system output dir."
     fi
 }

4. On the host, perform the following (once per new console window). Go to the mydroid directory and run:

       source build/envsetup.sh
       setpaths
       export ADBHOST=<ip addr of target obtained above>

Ensure that the above setup works by running:

       adb kill-server ; adb shell

You should see a command prompt of the target on your host. Verify this by running "ps" or similar commands. Exit the adb shell by typing "exit".

5. Start DDD using the following command

       dddclient <executable name> <port number> <task name>

       executable name: file name in system/bin dir
       port number: default is :5039 (need the colon before the number)
       task name: obtained by running "ps" on the target. GDB uses it to identify the PID internally.

E.g. for video playback, use the following (note the space after mediaserver and the colon before the port number):

       dddclient mediaserver :5039 mediaserver

For the DDD manual, refer to: http://www.gnu.org/manual/ddd/html_mono/ddd.html

You may have to run the following after each target reboot:

       adb kill-server

Lauterbach TRACE32

Lauterbach TRACE32 can be used to debug bootloaders, the kernel and user space.
Instructions for using Lauterbach TRACE32 to debug on Zoom2:


Install the Lauterbach TRACE32 software on your PC (the screenshot below is from the Oct 10 2008 release). Connect the emulator cable to J5 (20-pin header) on the Zoom2 debug board and power the emulator. Connect the USB cable from the emulator to the PC.

T32 startup.jpg


Run the zoom2_startup.cmm script from File -> Run Batchfile to select OMAP3430 as your target and attach to it. If the script is not run, some of the settings have to be selected manually from CPU -> System Settings.

Ensure that the emulator is "running", as shown by the green status indicator (at the bottom of the screenshot below), before exercising any use cases that need to be debugged.

T32 emu status.jpg


Run the use case (e.g. audio/video playback). Halt the processor by clicking the "pause" button, then view registers (View -> Registers), list source (View -> List Source), etc.

T32 view.jpg


Make sure to load the symbols for the files that you are interested in debugging, and set the source path so that source code correlation works correctly. You may also have to ensure that options such as -g are added when compiling your code, to generate symbolic debugging information. In some instances, consider reducing the optimization level: the compiler rearranges instructions, which can make it difficult to match the order of execution to the source code.

Examples of setting the source search path and loading symbols:

        sYmbol.SourcePATH.Set "V:\mydroid\kernel\"
        data.load.elf V:\mydroid\kernel\vmlinux /nocode /strippart "kernel"

These commands can be entered directly at the debugger command prompt or run from a *.cmm script.
Use the "/strippart" option to adapt to changed base directories; do not use recursive directory search, both for performance reasons and because source file names may be duplicated.

For user space debugging, TRACE32 needs some help: it has to be told where the modules you are interested in debugging are loaded. To do this, run "ps" on the target and get the PID of the application.

Then run "cat /proc/PID/maps > logfile", where PID is the process ID retrieved from "ps" in the step above. The attached avplayback_symbols.cmm file shows how to do this. The screenshot below shows the processor halted in user space while running an AV playback use case.
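The load address of a module can be read out of the saved maps file with a short filter. This is a sketch only: the helper name load_address is hypothetical, it assumes the standard /proc/<PID>/maps column layout (address range, permissions, offset, device, inode, path), and how the address is then passed to TRACE32 is left to the attached avplayback_symbols.cmm.

```shell
# Hypothetical helper: print the start address of the executable mapping of a module.
# Reads /proc/<PID>/maps text on stdin; $1 is the file name to look for.
load_address() {
    awk -v mod="$1" '
        {
            sub(/\r$/, "")                  # tolerate CR line endings from adb shell
            want = "/" mod
            tail = substr($6, length($6) - length(want) + 1)
            # Only mappings with execute permission hold the code segment.
            if ($2 ~ /x/ && tail == want) {
                split($1, range, "-")
                print "0x" range[1]
                exit
            }
        }'
}

# Example use on the host, with the logfile saved in the step above:
#   load_address libfoo.so < logfile
```

Here libfoo.so is a placeholder name; substitute the library or executable you want to debug.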

T32 us halt.jpg


zoom2_startup.cmm
avplayback_symbols.cmm

CodeComposer

This can be used to debug bootloaders. Previous versions of CCS (v3.3 and older) did not contain Linux awareness, but it is currently being added to CCSv4. It should be possible to debug the kernel and user space once CCSv4 is released. See Linux_Aware_Debug for more information.

Selectively Enable Opencore Debug Print

To use the existing log statements without rebuilding the whole PV library, do the following:
1. At the beginning of the file, after the last "#include" line, add the following:

  #include <utils/Log.h>
  #undef LOG_TAG
  #define LOG_TAG "YOUR_MODULE_NAME"
  #undef PVLOGGER_LOGMSG
  #define PVLOGGER_LOGMSG(IL, LOGGER, LEVEL, MESSAGE) JJLOGE MESSAGE
  #define JJLOGE(id, ...) LOGE(__VA_ARGS__)


2. At the end of the file, add the following:

  #undef PVLOGGER_LOGMSG
  #define PVLOGGER_LOGMSG(IL, LOGGER, LEVEL, MESSAGE) OSCL_UNUSED_ARG(LOGGER);

You can play with the macro to filter based on level too.

Experimenting CpuFreq Governors to profile at different ARM-MHz

The governors below are functional in the L27x K35 kernel.

Performance governor (system in turbo mode. 1008MHz ARM, 200MHz L3)

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo 1 > /sys/devices/system/cpu/cpu1/online

Ondemand governor (Cpu MHz as per system load)

echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Hotplug (ondemand+cpu1 hotplug handling based on cpu0 MHz. enabled by default in L27x)

echo hotplug > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

UserSpace governor (can be used to profile usecase in different ARM MHz)

echo userspace > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

In the userspace governor, the following commands are useful for switching the MPU MHz:

# Check available frequencies
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
300000 600000 800000 1008000
# e.g. switch to 800 MHz
echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
# e.g. check CPU0 frequency
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
800000

To control CPU1 manually (0 takes it offline, 1 brings it back online):

echo 0 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu1/online
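When profiling the same use case at several frequencies, the userspace-governor steps can be wrapped in one function. The sketch below is an assumption-laden convenience, not an existing tool: the name set_cpu_khz and its optional directory argument are hypothetical, and on the target it needs a shell with grep available (for example via busybox).

```shell
# Hypothetical helper: pin cpu0 to one frequency through the userspace governor.
# $1 = frequency in kHz, $2 = cpufreq directory (defaults to the sysfs path used above).
set_cpu_khz() {
    khz=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    # Refuse values that the driver does not list as available.
    grep -qw -- "$khz" "$dir/scaling_available_frequencies" || return 1
    echo userspace > "$dir/scaling_governor"
    echo "$khz" > "$dir/scaling_setspeed"
}

# Example use on the target (commented out so the sketch has no side effects):
#   set_cpu_khz 800000 && cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
```

The directory argument exists only so the function can be tried against a scratch directory before running it on a board.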

Profiling

The kernel implements a simple profiling mechanism that stores the current instruction pointer at each clock tick.

To enable profiling, pass the boot argument

           profile=N

where N is a number that determines the granularity of profiling. The smaller the number, the finer the granularity.

A busybox utility named 'readprofile' is available to process the profiled data. 'readprofile' requires the kernel symbol table file 'System.map' to resolve the symbols.

To clear the profiled data:

    $ readprofile -r

To display the profiled data:

    $ readprofile -m /System.map | sort -nr

You should see output similar to the following:

    1415 total                                      0.0003
    1153 omap3_enter_idle                           3.3132
      28 schedule                                   0.0304
      15 omap_i2c_isr                               0.0179
      12 v7wbi_flush_user_tlb_range                 0.1579
      11 copy_page                                  0.1146
      10 __memzero                                  0.0781
       8 __copy_to_user                             0.0085
       6 update_mmu_cache                           0.0341
       5 sub_preempt_count                          0.0260
       5 mmc_queue_map_sg                           0.0305
       5 handle_IRQ_event                           0.0431
       4 unmap_vmas                                 0.0027
       4 filemap_fault                              0.0038
       4 __do_fault                                 0.0042
       3 vsnprintf                                  0.0013
       3 up_read                                    0.1500
       3 omap_hsmmc_enable_clks                     0.0069
       3 kmem_cache_alloc                           0.0208
       3 get_page_from_freelist                     0.0025
      ....

The first column gives the number of ticks and the last column gives the number of ticks divided by function size.


This profiler covers only the kernel. For system wide profiling and advanced options, OProfile can be used.
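The raw tick counts are easier to compare as a share of the total. The following filter is a sketch: the name tick_share is hypothetical, and it assumes the three-column readprofile output shown above (ticks, symbol, normalized load), with a "total" line somewhere in the input.

```shell
# Hypothetical helper: convert readprofile tick counts into percentages of the total.
# Reads readprofile output on stdin; the "total" line may come first or last.
tick_share() {
    awk '
        $2 == "total" { total = $1; next }
        { n++; ticks[n] = $1; name[n] = $2 }
        END {
            if (total == 0) exit 1          # no usable "total" line found
            for (i = 1; i <= n; i++)
                printf "%.1f%% %s\n", 100 * ticks[i] / total, name[i]
        }'
}

# Example use on the target (commented out so the sketch has no side effects):
#   readprofile -m /System.map | sort -nr | tick_share
```

In the sample output above, omap3_enter_idle accounts for 1153 of 1415 ticks, which is roughly 81% of the samples.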


OProfile on OMAP3

OProfile is a system-wide profiler for Linux systems, capable of profiling all running code at low overhead. It consists of a kernel driver and a daemon for collecting sample data, and several post-profiling tools for turning data into information.

OProfile is an optional component of the kernel build, and it may already be enabled by default. To confirm that the kernel has OProfile support, look for the following lines in the <mydroid_folder>/kernel/.config file:

       CONFIG_OPROFILE_OMAP_GPTIMER=y
       CONFIG_OPROFILE=y
       CONFIG_HAVE_OPROFILE=y

Hardware Configuration
The hardware configuration required to execute the test cases includes:

       Linux machine (any distro)
       TCP/IP configuration on the Zoom2 board
       Zoom2 board

Software Configuration
The software configuration required to execute the test cases includes:

       Tera Term (or any terminal program)
       Graphviz on the Linux machine. Install it from a host terminal:
         $ sudo apt-get install graphviz
       gprof2dot Python script. Copy the script to any location in your path (e.g. ~/bin on your Linux machine), ensure that ~/bin is exported in PATH, and run the following command:
         $ cd ~/bin && chmod 777 gprof2dot.py

Installation
This step should be done after the Android file system has been built.

$MYDROID is the location where the Android SDK is installed, e.g.:

      export MYDROID=/home/$user/Lxx.x/mydroid


Edit the $MYDROID/external/oprofile/opimport_pull script as follows:

Remove the Python version number from the first line, e.g. change

      #!/usr/bin/python2.4 -E

to

      #!/usr/bin/python -E

Append the following lines at the end of the file to generate cpuloads.txt and callgraph.png for further analysis:

       os.system(oprofile_event_dir + "/bin/opreport --session-dir=. >> cpuloads.txt")
       os.system(oprofile_event_dir + "/bin/opreport --session-dir=. -p $OUT/symbols -l -t 0.1 >> cpuloads.txt")
       os.system(oprofile_event_dir + "/bin/opreport -cg --session-dir=. -p $OUT/symbols > callgraph.txt")
       os.system("cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 0.1 -e 0.1 | dot -Tpng -o callgraph.png")

On Eclair we have seen that the Android tools opannotate, oparchive, opimport and opreport in the prebuilt/linux-x86/oprofile/bin/ folder do not work properly.

Working binaries from Donut are available as a tarball, Oprofile.tar.gz. Download this file to the $MYDROID folder and extract it:

     $ cd $MYDROID
     $ tar xvf Oprofile.tar.gz

Since we perform the post-processing on host, we don't need the actual vmlinux file (~40 MB) on target. Make sure that you create a dummy file named "vmlinux" in the root directory to satisfy opcontrol arguments.

      # echo 0 > /vmlinux

Execution

Set-up OProfile directories

Make sure that you have created an empty file and named it vmlinux as described in above section. Run the following command on the target

      # opcontrol --setup

By default there should be no output.

If you see "Cannot create directory /dev/oprofile: File exists" followed by "do_setup failed", OProfile is not built into the kernel. Verify that you selected OProfile in the make menuconfig step of the kernel build (see the kernel configuration options listed at the start of this section).

Initialize the OProfile daemon

The kernel range start and end addresses need to be verified on the setup for each release using:

          # grep " _text" /proc/kallsyms
          c0030000 T _text
          # grep " _etext" /proc/kallsyms
          c03e1000 A _etext

Note: You need busybox installed for this command to work. Set up busybox first if you have not already done so.

Using the above addresses, run the following command

      # opcontrol --vmlinux=/vmlinux --kernel-range=0xC0030000,0xC03e1000 --event=CPU_CYCLES:64
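Since the addresses change between releases, the --kernel-range argument can be derived from the symbol table instead of being typed by hand. This is a sketch: the helper name kernel_range_arg is hypothetical, and it assumes the "address type symbol" layout of /proc/kallsyms shown above.

```shell
# Hypothetical helper: build the --kernel-range argument for opcontrol.
# Reads /proc/kallsyms text on stdin and uses the _text and _etext symbols.
kernel_range_arg() {
    awk '
        { sub(/\r$/, "") }                  # tolerate CR line endings from adb shell
        $3 == "_text"  { lo = $1 }
        $3 == "_etext" { hi = $1 }
        END {
            if (lo == "" || hi == "") exit 1
            printf "--kernel-range=0x%s,0x%s\n", lo, hi
        }'
}

# Example use on the host (commented out so the sketch has no side effects):
#   adb shell cat /proc/kallsyms | kernel_range_arg
```

The printed string can then be pasted into the opcontrol command line in place of the hard-coded range.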

You should see the following output on your terminal:

          Cannot open /dev/oprofile/1/enabled: No such file or directory
          Cannot open /dev/oprofile/2/enabled: No such file or directory
          Using 2.6+ OProfile kernel interface. Reading module info.
          Using log file /data/oprofile/samples/oprofiled.log
          # init: untracked pid 914 exited

Increase the Back trace depth, so that more details can be captured in the log

          # echo 16 > /dev/oprofile/backtrace_depth      

To ensure that everything is ready, you can run the following command

          # opcontrol --status      

The following output should be seen. Note that the PID will change depending on your system.

          Driver directory: /dev/oprofile
          Session directory: /data/oprofile
          Counter 0:
              name: CPU_CYCLES
              count: 64
          Counter 1 disabled
          Counter 2 disabled
          oprofiled pid: 915 profiler is not running
              0 samples received
              0 samples lost overflow

Starting and Stopping the profiler

Run the following command to start the profiler

          # opcontrol --start

and use the command below to stop the profiler

          # opcontrol --stop

Generating the Results
Run the following steps on the host machine (the one that has the Android SDK/build) to generate the results:

          $ cd $MYDROID
          $ source build/envsetup.sh
          $ setpaths
          $ export ADBHOST=<ip address of ZOOM2 board>

Note: This must be done at the $MYDROID level (where the build was set up); otherwise it will not work.

If ADB over Ethernet is not working, refer to the Troubleshooting section under Eclipse ADT.

You can use the following commands to start DHCP on the Zoom board and set up ADB over Ethernet:

          # netcfg eth0 up ; sleep 3 ; netcfg eth0 dhcp ; sleep 2 ; netcfg
          # setprop service.adb.tcp.port 5555 ; stop adbd ; start adbd

Post-process OProfile results

This needs to be done from the PC where the Android SDK is installed. Go to a terminal on the host PC and do the following.

If you are using the OProfile package with pre-built binaries, symbol files and vmlinux, follow the steps below. If you are using OProfile binaries that were not built on your machine, you might have to create a symbolic link to the zoom2 folder (the last command below), since OProfile looks there.

          $ mkdir ~/oprofilepackage && cd ~/oprofilepackage
          $ tar xvjf <Path_to_Oprofile package>
          $ cd mydroid
          $ mv ../kernel .
          $ source build/envsetup.sh
          $ sed -i -e 's_$(call inherit-product, frameworks/base/data/sounds/OriginalAudio.mk)_#$(call inherit-product, frameworks/base/data/sounds/OriginalAudio.mk)_g' build/target/product/full.mk
          $ setpaths
          $ export MYDROID=${PWD}
          $ ln -s $MYDROID/out/target/product/zoom2 $MYDROID/out/target/product/generic

NOTE: The Kernel path needs to be updated for your KERNEL build

          $ cp $MYDROID/kernel/android-2.6.32/vmlinux $OUT/symbols/vmlinux 

Generate the OPROFILE results using the command below

          $ opimport_pull <new_dir_to_store_dump_and_results_on_Linux_machine> 


The following files and the callgraph image contain the OProfile results. They are generated in the <new_dir_to_store_dump_and_results_on_Linux_machine> given in the step above:

          cpuloads.txt
          callgraph.txt

Note: If some binaries that were compiled on Windows are linked into your build, you will see the message below:

          Traceback (most recent call last):
            File "/home/user/bin/gprof2dot.py", line 1965, in <module>
              Main().main()
            File "/home/user/bin/gprof2dot.py", line 1890, in main
              self.profile = parser.parse()
            File "/home/user/bin/gprof2dot.py", line 1062, in parse
              self.parse_entry()
            File "/home/user/bin/gprof2dot.py", line 1112, in parse_entry
              function = self.parse_subentry()
            File "/home/user/bin/gprof2dot.py", line 1136, in parse_subentry
              filename, lineno = source.split(':')
          ValueError: too many values to unpack
          cat: write error: Broken pipe

In this case, open callgraph.txt in <new_dir_to_store_dump_and_results_on_Linux_machine> in your favourite editor. Search for ":\" and delete the drive letter and colon in front of the backslash on each matching line. The extra colon in the Windows path is what makes the split on ':' in the traceback fail.

For example, if you have E:\workspaces\, change it to \workspaces\

Then cd to <new_dir_to_store_dump_and_results_on_Linux_machine> and run the following command to generate callgraph.png manually:

         $ cd <new_dir_to_store_dump_and_results_on_Linux_machine>
         $ cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 0.1 -e 0.1 | dot -Tpng -o callgraph.png
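The manual edit of callgraph.txt can be replaced by a one-line filter. This is a sketch: the name strip_drive_letters is hypothetical, and it assumes that every occurrence of a single letter followed by ":\" in the file is a Windows drive prefix.

```shell
# Hypothetical helper: drop Windows drive prefixes such as "E:" in front of a backslash.
# Reads text on stdin and writes the cleaned text to stdout.
strip_drive_letters() {
    sed 's/[A-Za-z]:\\/\\/g'
}

# Example use on the host (commented out so the sketch has no side effects):
#   strip_drive_letters < callgraph.txt | gprof2dot.py -s -w -f oprofile -n 0.1 -e 0.1 | dot -Tpng -o callgraph.png
```

Filtering on the fly leaves the original callgraph.txt untouched, which helps if the assumption about drive prefixes turns out to be wrong for your build.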

Guidelines and caveats for interpreting OProfile results are available on the OProfile SourceForge wiki.

Callgraph.jpg



OProfile on OMAP4 K35 (pre-ICS)

We forward-ported the K32 patches for OProfile to K35. Please pull in patches 8192 (for the kernel) and 8193 (for Android), per the instructions in the following sections.

Note that the instructions for using OProfile on OMAP4 with a K35 release (pre-ICS) and a K3.0 release (ICS) are slightly different due to the patches needed. Thus, please follow the instructions in the appropriate section below.

Note for OProfile on Pandaboard

To use OProfile on Pandaboard, you must use a kernel with CONFIG_PM enabled. Otherwise, any access to the CTI unit will hang the system. The root cause is that prcm_setup_regs, which enables autoidle and autogating on all the DPLLs, is called only when PM is enabled; once it has run, CTI can be initialized normally.

The kernel source is in the TI Ubuntu git tree, at ti-ubuntu-2.6.35-ti903.8+ti+release3.pm or newer. To get the patch, check the topic "Oprofile on Pandaboard / Omap4" in the Pandaboard Google group.

Changing to Performance Governor

It is recommended that you use the performance governor so that the OPP does not change during profiling.

To check the current governor, type the following command on the target console:

  # cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

If the output of the above command is "ondemand" or "hotplug", change to the performance governor with the commands below:

  # echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
  # echo 1 > /sys/devices/system/cpu/cpu1/online

Steps to rebuild Kernel

Step 1: Apply the patch 8192

Step 2: Clean up the KERNEL build

 ~/omap4/mydroid/kernel/android-2.6.35$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- distclean
 ~/omap4/mydroid/kernel/android-2.6.35$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- android_4430_defconfig

Step 3: Enable these options in the .config file (via menuconfig) to enable oprofile:

 CONFIG_PROFILING=y
 CONFIG_OPROFILE=y

Step 4: Rebuild the KERNEL

 ~/omap4/mydroid/kernel/android-2.6.35$ make ARCH=arm CROSS_COMPILE=arm-none-linux-gnueabi- uImage

Steps to rebuild USERSPACE Component

Step 1: Apply the patch 8193

Step 2: Source the build environment (assuming that you have built Android previously, go to your MYDROID folder)

 ~/omap4/mydroid$ cd $MYDROID
 ~/omap4/mydroid$ source build/envsetup.sh
 ~/omap4/mydroid$ setpaths

Step 3: Build the OPCONTROL source

 ~/omap4/mydroid$ cd $MYDROID/external/oprofile
 ~/omap4/mydroid/external/oprofile$ mm

Step 4: Push the built binaries to the file system

 ~/omap4/mydroid$ adb push $OUT/system/xbin/oprofiled system/xbin/
 ~/omap4/mydroid$ adb push $OUT/system/xbin/opcontrol system/xbin/

PS: If the system partition is read-only (for eMMC builds), you can remount it read-write using the command below on the console:

 # mount -o rw,remount -t ext3 /dev/block/mmcblk0p1 /system/

Work-around for recognizing CPU

The post-processing utilities do not recognize the CPU for some reason, so apply the following one-time workaround.

Step 1: Extract the unit_masks.gz and events.gz files into the "invalid cpu type" directory. Please copy and paste the commands as-is:

  ~/omap4/mydroid$ mkdir $MYDROID/prebuilt/linux-x86/oprofile/invalid\ cpu\ type/
  ~/omap4/mydroid$ cd $MYDROID/prebuilt/linux-x86/oprofile/invalid\ cpu\ type/
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ wget http://omappedia.org/images/b/b6/Unit_masks.gz
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ wget http://omappedia.org/images/d/d7/Events.gz
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ gzip -d Events.gz
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ gzip -d Unit_masks.gz
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ mv Unit_masks unit_masks
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ mv Events events

Step 2: Create a folder path and soft link on the target

  # mkdir -p /system/usr/local/share/oprofile/arm/armv7

If you are using an eMMC file system, this step needs to be done every time the board is rebooted:

  # ln -s /system/usr /usr

Step 3: Push the events and unit_masks files to the target

  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ adb push events /system/usr/local/share/oprofile/arm/armv7/
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/invalid cpu type$ adb push unit_masks /system/usr/local/share/oprofile/arm/armv7/

Installation

These steps should be done after the Android file system has been built.

Step 1: Edit the $MYDROID/external/oprofile/opimport_pull script as follows:

Remove the Python version number from the first line, e.g. change

      #!/usr/bin/python2.4 -E

to

      #!/usr/bin/python -E

Replace

   stream = os.popen("find raw_samples -type f -name \*all")

by

   stream = os.popen("find raw_samples -type f -name \*\.\*\.\*")

Replace

   os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=.")

by

   os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=.")
   os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=. -m tgid >> cpuloads.txt")
   os.system(oprofile_event_dir + arch_path + "/bin/opreport --session-dir=. -m tgid -p $OUT/symbols -lg -t 0.1 >> cpuloads.txt")
   os.system(oprofile_event_dir + "/bin/opreport -c --session-dir=. -m tgid -p $OUT/symbols > callgraph.txt")
   os.system("cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 1 -e 1 | dot -Tpng -o callgraph.png")

Step 2: Install the package 'graphviz' needed by above script 'opimport_pull'.

   $ sudo apt-get install graphviz

Step 3: Save the Python script gprof2dot.py to any location in your path (e.g. ~/bin on your Linux machine). Ensure that ~/bin is exported in PATH, and make the script executable.

   $ chmod 777 ~/bin/gprof2dot.py

Step 4: We have seen that the Android tools opannotate, oparchive, opimport and opreport in the prebuilt/linux-x86/oprofile/bin/ folder do not work properly. The corresponding binaries from Android Donut work properly and are available as Oprofile.tar.gz. Download this file to the $MYDROID folder and extract it:

     ~/omap4/mydroid$ cd $MYDROID
     ~/omap4/mydroid$ wget http://omapedia.org/images/3/32/Oprofile.tar.gz
     ~/omap4/mydroid$ tar xvf Oprofile.tar.gz

[edit]Execution

Step 1: Set-up OProfile directories

      # opcontrol --setup

You should see the output below

  Unable to open cpu_type file for reading  Make sure you have done opcontrol --init  Please ignore the above error if running opcontrol --setup

Step 2: Ensure that the EVENTS can be listed

      # opcontrol --list-events      

You should see the output below. If you don't, the patches in Step 1 were not applied correctly.

        CPU Type: ARM V7 PMNC
        name                : meaning
        ------------------------------------------------------------------------------
        PMNC_SW_INCR        : Software increment of PMNC registers
        IFETCH_MISS         : Instruction fetch misses from cache or normal cacheable memory
        ITLB_MISS           : Instruction fetch misses from TLB
        DCACHE_REFILL       : Data R/W operation that causes a refill from cache or normal cacheable memory
        DCACHE_ACCESS       : Data R/W from cache
        DTLB_REFILL         : Data R/W that causes a TLB refill
        DREAD               : Data read architecturally executed (note: architecturally executed = for instructions that are unconditional or that pass the condition code)
        DWRITE              : Data write architecturally executed
        INSTR_EXECUTED      : All executed instructions
        EXC_TAKEN           : Exception taken
        EXC_EXECUTED        : Exception return architecturally executed
        CID_WRITE           : Instruction that writes to the Context ID Register architecturally executed
        PC_WRITE            : SW change of PC, architecturally executed (not by exceptions)
        PC_IMM_BRANCH       : Immediate branch instruction executed (taken or not)
        PC_PROC_RETURN      : Procedure return architecturally executed (not by exceptions)
        UNALIGNED_ACCESS    : Unaligned access architecturally executed
        PC_BRANCH_MIS_PRED  : Branch mispredicted or not predicted. Counts pipeline flushes because of misprediction
        PC_BRANCH_MIS_USED  : Branch or change in program flow that could have been predicted
        CPU_CYCLES          : Number of CPU cycles
        JAVA_BC_EXEC        : Number of Java bytecodes decoded, including speculative ones
        JAVA_SFTBC_EXEC     : Number of software Java bytecodes decoded, including speculative ones
        JAVA_BB_EXEC        : Number of Jazelle taken branches executed, including those flushed due to a previous load/store which aborts late
        CO_LF_MISS          : Number of coherent linefill requests which miss in all other CPUs, meaning that the request is sent to external memory
        CO_LF_HIT           : Number of coherent linefill requests which hit in another CPU, meaning that the linefill data is fetched directly from the relevant cache
        IC_DEP_STALL        : Number of cycles where CPU is ready to accept new instructions but does not receive any because of the instruction side not being able to provide any and the instruction cache is currently performing at least one linefill
        DC_DEP_STALL        : Number of cycles where CPU has some instructions that it cannot issue to any pipeline and the LSU has at least one pending linefill request but no pending TLB requests
        STREX_PASS          : Number of STREX instructions architecturally executed and passed
        STREX_FAILS         : Number of STREX instructions architecturally executed and failed
        DATA_EVICT          : Number of eviction requests due to a linefill in the data cache
        ISS_NO_DISP         : Number of cycles where the issue stage does not dispatch any instruction
        ISS_EMPTY           : Number of cycles where the issue stage is empty
        INS_RENAME          : Number of instructions going through the Register Renaming stage
        PRD_FN_RET          : Number of procedure returns whose condition codes do not fail, excluding all exception returns
        INS_MAIN_EXEC       : Number of instructions being executed in main execution pipeline of the CPU, the multiply pipeline and the ALU pipeline
        INS_SND_EXEC        : Number of instructions being executed in the second execution pipeline (ALU) of the CPU
        INS_LSU             : Number of instructions being executed in the Load/Store unit
        INS_FP_RR           : Number of floating-point instructions going through the Register Rename stage
        INS_NEON_RR         : Number of NEON instructions going through the Register Rename stage
        STALL_PLD           : Number of cycles where CPU is stalled because PLD slots are all full
        STALL_WRITE         : Number of cycles where CPU is stalled because data side is full and executing writes to external memory
        STALL_INS_TLB       : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the instruction side
        STALL_DATA_TLB      : Number of cycles where CPU is stalled because of main TLB misses on requests issued by the data side
        STALL_INS_UTLB      : Number of cycles where CPU is stalled because of micro TLB misses on the instruction side
        STALL_DATA_ULTB     : Number of cycles where CPU is stalled because of micro TLB misses on the data side
        STALL_DMB           : Number of cycles where CPU is stalled due to executed of a DMB memory barrier
        CLK_INT_EN          : Number of cycles during which the integer core clock is enabled
        CLK_DE_EN           : Number of cycles during which the Data Engine clock is enabled
        INS_ISB             : Number of ISB instructions architecturally executed
        INS_DSB             : Number of DSB instructions architecturally executed
        INS_DMB             : Number of DMB instructions speculatively executed
        EXT_IRQ             : Number of external interrupts executed by the processor
        PLE_CL_REQ_CMP      : PLE cache line request completed
        PLE_CL_REQ_SKP      : PLE cache line request skipped
        PLE_FIFO_FLSH       : PLE FIFO flush
        PLE_REQ_COMP        : PLE request completed
        PLE_FIFO_OF         : PLE FIFO overflow
        PLE_REQ_PRG         : PLE request programmed

Step 3: The kernel range start and end addresses need to be verified on the setup for each release using:

          # grep " _text" /proc/kallsyms
          c0043000 T _text
          # grep " _etext" /proc/kallsyms
          c05da000 A _etext

Using the above addresses, run the following command. Note that the events and their associated cycle counts can be specified on this command line. Reduce the event counts to get more samples.

          # opcontrol --vmlinux=/vmlinux --kernel-range 0xc0043000,0xc05da000 --event CPU_CYCLES:25000000

You should see the following output on your terminal

   CPU Type: ARM V7 PMNC
   Using 2.6+ OProfile kernel interface
   Using log file /data/oprofile
   init: untracked pid 1868 exited
   file/samples/oprofiled.log

Increase the backtrace depth so that more detail is captured in the log:

          # echo 16 > /dev/oprofile/backtrace_depth
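
Step 3 above can be scripted. The following is a minimal Python sketch, assuming the contents of /proc/kallsyms have already been captured as text (for example through an adb shell). The helper names kernel_range and opcontrol_command are hypothetical and not part of OProfile; the command string simply mirrors the one shown above.

```python
def kernel_range(kallsyms_text):
    """Return (start, end) hex strings for the _text and _etext symbols."""
    addrs = {}
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Each kallsyms line is: <address> <type> <symbol> [module]
        if len(parts) >= 3 and parts[2] in ("_text", "_etext"):
            addrs.setdefault(parts[2], parts[0])
    return addrs["_text"], addrs["_etext"]


def opcontrol_command(start, end, event="CPU_CYCLES:25000000"):
    """Build the opcontrol invocation from Step 3 for a given kernel range."""
    return ("opcontrol --vmlinux=/vmlinux "
            "--kernel-range 0x%s,0x%s --event %s" % (start, end, event))


# Sample input taken from the grep output shown in Step 3.
sample = "c0043000 T _text\nc0043000 T stext\nc05da000 A _etext\n"
start, end = kernel_range(sample)
print(opcontrol_command(start, end))
```

Matching on the exact symbol name, rather than a substring, avoids picking up symbols such as stext that share the same address.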

Starting and Stopping the profiler

Step 1: Run the following command to start the profiler

   # opcontrol --start          

You should see output below

    CPU Type: ARM V7 PMNC
    PMNC registers dump CPU 0:
    PMNC  =0x41093000
    CNTENS=0x80000003
    INTENS=0x80000003
    FLAGS =0x00000000
    SELECT=0x00000001
    CCNT  =0xfa0a1f00
    CNT[0] count =0xfffb6c20
    CNT[0] evtsel=0x00000060
    CNT[1] count =0xfffb6c20
    CNT[1] evtsel=0x00000061
    CNT[2] count =0x00000000
    CNT[2] evtsel=0x000000d1
    CNT[3] count =0x00000000
    CNT[3] evtsel=0x00000061
    CNT[4] count =0x00000000
    CNT[4] evtsel=0x000000e5
    CNT[5] count =0x00000000
    CNT[5] evtsel=0x000000a3
    PMNC registers dump CPU 1:
    PMNC  =0x41093000
    CNTENS=0x80000003
    INTENS=0x80000003
    FLAGS =0x00000000
    SELECT=0x00000001
    CCNT  =0xfa0a1f00
    CNT[0] count =0xfffb6c20
    CNT[0] evtsel=0x00000060
    CNT[1] count =0xfffb6c20
    CNT[1] evtsel=0x00000061
    CNT[2] count =0x00000000
    CNT[2] evtsel=0x000000ba
    CNT[3] count =0x00000000
    CNT[3] evtsel=0x000000a5
    CNT[4] count =0x00000000
    CNT[4] evtsel=0x0000001e
    CNT[5] count =0x00000000
    CNT[5] evtsel=0x000000ec

Step 2: Check the status of profiler

   # opcontrol --status   

You should see output below:

     CPU Type: ARM V7 PMNC
     Driver directory: /dev/oprofile
     Session directory: /data/oprofile
     Counter 0:
         name: CPU_CYCLES
         count: 100000000
     Counter 1 disabled
     Counter 2 disabled
     Counter 3 disabled
     Counter 4 disabled
     Counter 5 disabled
     Counter 6 disabled
     oprofiled pid: 1869
     profiler is running
               4 samples received
               0 samples lost overflow


Note: Repeat this step several times and make sure you are receiving samples. The "samples received" count should keep increasing until profiling is stopped.
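
The check described in the note can be automated by comparing two snapshots of the status output. The following is a minimal Python sketch, assuming the text of two successive "opcontrol --status" runs has been captured; the function names are hypothetical, and the second snapshot (57 samples) is made-up illustration data.

```python
import re


def samples_received(status_text):
    """Sum every 'N samples received' figure in opcontrol --status output."""
    counts = re.findall(r"(\d+)\s+samples received", status_text)
    return sum(int(n) for n in counts)


def is_collecting(earlier_status, later_status):
    """True if the received-sample count grew between two snapshots."""
    return samples_received(later_status) > samples_received(earlier_status)


# First snapshot matches the output above; the second is hypothetical.
first = "profiler is running\n      4 samples received\n      0 samples lost overflow\n"
second = "profiler is running\n     57 samples received\n      0 samples lost overflow\n"
print(is_collecting(first, second))
```

Summing all matches also covers status output that reports one "samples received" line per CPU.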

Step 3: Run the following command to stop the profiler (after accumulating around 1k samples or 1-2 minutes)

   # opcontrol --stop          

You should see output below:

    CPU Type: ARM V7 PMNC
    PMNC registers dump CPU 1:
    PMNC  =0x41093001
    CNTENS=0x80000003
    INTENS=0x80000003
    FLAGS =0x00000000
    SELECT=0x00000000
    CCNT  =0xfad259d9
    CNT[0] count =0x11bb2d59
    CNT[0] evtsel=0x00000060
    CNT[1] count =0x132dab33
    CNT[1] evtsel=0x00000061
    CNT[2] count =0x00000000
    CNT[2] evtsel=0x000000ba
    CNT[3] count =0x00000000
    CNT[3] evtsel=0x000000a5
    CNT[4] count =0x00000000
    CNT[4] evtsel=0x0000001e
    CNT[5] count =0x00000000
    CNT[5] evtsel=0x000000ec
    PMNC registers dump CPU 0:
    PMNC  =0x41093001
    CNTENS=0x80000003
    INTENS=0x80000003
    FLAGS =0x00000000
    SELECT=0x00000000
    CCNT  =0xfbcb5c68
    CNT[0] count =0x13326d28
    CNT[0] evtsel=0x00000060
    CNT[1] count =0x1e33060c
    CNT[1] evtsel=0x00000061
    CNT[2] count =0x00000000
    CNT[2] evtsel=0x000000d1
    CNT[3] count =0x00000000
    CNT[3] evtsel=0x00000061
    CNT[4] count =0x00000000
    CNT[4] evtsel=0x000000e5
    CNT[5] count =0x00000000
    CNT[5] evtsel=0x000000a3

Generating the Results

Run the following steps on the host machine (the one that has the Android SDK/build) to generate the results. At the host command prompt:

          $ cd $MYDROID
          $ source build/envsetup.sh
          $ setpaths
          $ export ADBHOST=<ip address of Blaze board>

Note: This must be done at the $MYDROID level (where the build was set up); otherwise it will not work.

If ADB over Ethernet is not working, refer to the Troubleshooting section.

You can use the following commands to start DHCP on the Zoom board and set up ADB over Ethernet:

          # netcfg eth0 up ; sleep 3 ; netcfg eth0 dhcp ; sleep 2 ; netcfg
          # setprop service.adb.tcp.port 5555 ; stop adbd ; start adbd

Post-process OProfile results

This needs to be done from the PC where the Android SDK is installed. Go to the terminal on the host PC and do the following:

If you are using the OProfile package with pre-built binaries, symbol files and vmlinux, follow the steps below. If you are using OProfile binaries that were not built on your machine, you might have to create a symbolic link to the zoom2 folder, since OProfile looks there.

NOTE: The Kernel path needs to be updated for your KERNEL build

          $ cp $MYDROID/kernel/android-2.6.35/vmlinux $OUT/symbols/vmlinux 

The 'opimport_pull' script expects the following environment variable to be defined:

          $ export OPROFILE_EVENTS_DIR=$MYDROID/prebuilt/linux-x86/oprofile

Generate the OPROFILE results using the command below

          $ opimport_pull -r <new_dir_to_store_dump_and_results_on_Linux_machine>           

FIXME: The above command might hang. If it does, press "Ctrl+C" to end it and post-process the results manually.

Note: If you see the message below, it is because you have not downloaded the Android Donut oprofile tools (refer to Installation, http://omappedia.org/wiki/Android_Debugging#Installation).

  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/bin/opreport: error while loading shared libraries: libbfd-2.18.0.20080103.so: cannot open shared object file: No such file or directory
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/bin/opreport: error while loading shared libraries: libbfd-2.18.0.20080103.so: cannot open shared object file: No such file or directory
  ~/omap4/mydroid/prebuilt/linux-x86/oprofile/bin/opreport: error while loading shared libraries: libbfd-2.18.0.20080103.so: cannot open shared object file: No such file or directory
  Traceback (most recent call last):

Steps to Re-run OProfile

To capture another OProfile log in the same session (i.e. without rebooting the board), run the following on the target:

 # opcontrol --shutdown
 # opcontrol --reset

After this, repeat all the steps in the "Execution" section above.

Other Pointers - OProfile in OMAP4

1. OProfile on OMAP4 is slightly different from OMAP3, as it profiles only the CPU usage. The CPU idle time is not considered. This leads to minimal logs which focus on CPU utilization.

2. To take more samples, the CPU_CYCLES count in the opcontrol configuration can be changed (e.g. CPU_CYCLES:10000000), but we need to be careful, as the GP Timer is not being used now.

IDLE Mhz Calculation

NOTE: The above analysis is valid only for the time the CPU is active. The cycle counter should not be used to calculate IDLE MHz, since it is paused during idle. Use the cpuidle statistics to compute the idle time for your tests.
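
As an illustration of the cpuidle approach, the following Python sketch estimates the idle share of a test interval from per-state residency counters. It assumes the kernel exposes the standard cpuidle sysfs counters (typically /sys/devices/system/cpu/cpu<N>/cpuidle/state<M>/time, in microseconds) and that they were sampled before and after the test. The function name and all the numbers are hypothetical.

```python
def idle_fraction(times_before_us, times_after_us, interval_us):
    """Fraction of the interval one CPU spent in any cpuidle state.

    times_before_us / times_after_us: per-state residency counters in
    microseconds, sampled before and after the test.
    interval_us: wall-clock length of the test in microseconds.
    """
    idle_us = sum(times_after_us) - sum(times_before_us)
    return idle_us / float(interval_us)


# Hypothetical counters for two idle states over a 10 second test.
before = [1000000, 5000000]
after = [3000000, 9000000]
frac = idle_fraction(before, after, 10000000)
print("idle share of the interval: %.0f%%" % (frac * 100))
```

Multiplying this fraction by the CPU clock frequency gives one possible estimate of IDLE MHz for the interval; this is an interpretation, not a formula stated by the original page.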

Manually Post-process the results (for advanced users)

To capture additional events, separate them across cores, etc., configure opcontrol similar to the following:

       opcontrol --kernel-range 0xc0043000,0xc05da000 --event CPU_CYCLES:10000000 --event IC_DEP_STALL:30000 --event DC_DEP_STALL:30000 --separate-cpu=1

To display consolidated samples for all captured events

   ~/omap4/mydroid$ cd <new_dir_to_store_dump_and_results_on_Linux_machine>
   ~/omap4/mydroid$ $MYDROID/prebuilt/linux-x86/oprofile/bin/opreport --session-dir=. -m all

The output should look like this (assuming that you captured at least 1k samples):

       CPU: invalid cpu type, speed 0 MHz (estimated)
       Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 10000000
       Counted DC_DEP_STALL events (Number of cycles where CPU has some instructions that it cannot issue to any pipeline and LSU has at least one pending linefill request but no pending TLB requests) with a unit mask of 0x00 (No unit mask) count 30000
       Counted IC_DEP_STALL events (Number of cycles where CPU is ready to accept new instructions but does not receive any because of the instruction side not being able to provide any and the instruction cache is currently performing at least one linefill) with a unit mask of 0x00 (No unit mask) count 30000
       CPU_CYCLES:100...|DC_DEP_STALL:3...|IC_DEP_STALL:3...|
         samples|      %|  samples|      %|  samples|      %|
       ------------------------------------------------------
            1584 75.0000        16 32.6531        10 29.4118 no-vmlinux
             169  8.0019        12 24.4898         3  8.8235 libdvm.so
             136  6.4394        13 26.5306         9 26.4706 dalvik-jit-code-cache
              86  4.0720         4  8.1633         7 20.5882 libGLESv1_CM_POWERVR_SGX540_120.so.1.1.16.3924
              61  2.8883         2  4.0816         1  2.9412 libc.so
              12  0.5682         0       0         0       0 libcutils.so
              12  0.5682         0       0         1  2.9412 libutils.so
               8  0.3788         0       0         0       0 libsrv_um.so.1.1.16.3924
               6  0.2841         0       0         0       0 libandroid_runtime.so
               6  0.2841         0       0         0       0 libbinder.so
               5  0.2367         0       0         0       0 libhardware_legacy.so
               4  0.1894         0       0         0       0 libui.so
               3  0.1420         0       0         0       0 sensors.omap4.so
               3  0.1420         0       0         1  2.9412 libIMGegl.so.1.1.16.3924
               3  0.1420         0       0         0       0 libpvrANDROID_WSEGL.so.1.1.16.3924
               3  0.1420         0       0         0       0 libsurfaceflinger.so
               2  0.0947         0       0         0       0 libm.so
               2  0.0947         0       0         0       0 libskia.so
               2  0.0947         0       0         0       0 libsurfaceflinger_client.so
               1  0.0473         0       0         0       0 busybox
               1  0.0473         0       0         0       0 linker
               1  0.0473         0       0         0       0 libandroid_servers.so
               1  0.0473         0       0         0       0 libnativehelper.so
               1  0.0473         0       0         0       0 libz.so
               0       0         2  4.0816         1  2.9412 gralloc.omap4430.so.1.1.16.3924
               0       0         0       0         1  2.9412 libGLESv1_CM.so

To display only the CPU_CYCLES event:

   ~/omap4/mydroid$ cd <new_dir_to_store_dump_and_results_on_Linux_machine>
   ~/omap4/mydroid$ $MYDROID/prebuilt/linux-x86/oprofile/bin/opreport --session-dir=. event:CPU_CYCLES

The output should look like this (assuming that you captured at least 1k samples):

       CPU: invalid cpu type, speed 0 MHz (estimated)
       Counted CPU_CYCLES events (Number of CPU cycles) with a unit mask of 0x00 (No unit mask) count 10000000
       Samples on CPU 0
       Samples on CPU 1
                   cpu:0|            cpu:1|
         samples|      %|  samples|      %|
       ------------------------------------
             814 76.5757       770 73.4032 no-vmlinux
              71  6.6792        98  9.3422 libdvm.so
              67  6.3029        69  6.5777 dalvik-jit-code-cache
              42  3.9511        44  4.1945 libGLESv1_CM_POWERVR_SGX540_120.so.1.1.16.3924
              29  2.7281        32  3.0505 libc.so
              10  0.9407         2  0.1907 libutils.so
               6  0.5644         2  0.1907 libsrv_um.so.1.1.16.3924
               4  0.3763         2  0.1907 libbinder.so
               3  0.2822         9  0.8580 libcutils.so
               3  0.2822         0       0 libsurfaceflinger.so
               2  0.1881         1  0.0953 libIMGegl.so.1.1.16.3924
               2  0.1881         4  0.3813 libandroid_runtime.so
               2  0.1881         1  0.0953 libpvrANDROID_WSEGL.so.1.1.16.3924
               2  0.1881         0       0 libsurfaceflinger_client.so
               2  0.1881         2  0.1907 libui.so
               1  0.0941         2  0.1907 sensors.omap4.so
               1  0.0941         1  0.0953 libm.so
               1  0.0941         0       0 libnativehelper.so
               1  0.0941         1  0.0953 libskia.so
               0       0         1  0.0953 busybox
               0       0         1  0.0953 linker
               0       0         1  0.0953 libandroid_servers.so
               0       0         5  0.4766 libhardware_legacy.so
               0       0         1  0.0953 libz.so

OProfile on OMAP4 ICS (K3.0)

Kernel Build Changes

1. Enable these options in the .config file (via menuconfig) to enable oprofile:

    CONFIG_PROFILING=y
    CONFIG_OPROFILE=y

2. Apply this patch for oprofile support on kernel 3.0 [1] and build the kernel.

3. Flash the device with the new boot.img.

Linux Host Setup

Note: There are some issues using the oprofile binaries that are part of the current Android release by default. Thus, you should pull in an older version of the oprofile binaries and a modified version of the opimport_pull script that are known to work with ICS on OMAP4.

1. Pull the oprofile binaries and copy them to $MYDROID

    $ cd ~/tmp
    $ wget http://omapedia.org/images/3/32/Oprofile.tar.gz
    $ tar xzvf Oprofile.tar.gz
    $ cp -rf ~/tmp/prebuilt/linux-x86/oprofile/bin/* $MYDROID/out/host/linux-x86/bin

The oprofile binaries are opimport, opannotate, oparchive, and opreport.

2. Replace the $MYDROID/external/oprofile/opimport_pull script with this: [2].

Then add these two lines at the end of the opimport_pull script for call graph support:

  os.system(oprofile_bin_dir + "/opreport -c --session-dir=. -m tgid -p $OUT/symbols > callgraph.txt")
  os.system("cat callgraph.txt | gprof2dot.py -s -w -f oprofile -n 1 -e 1 | dot -Tpng -o callgraph.png")

Install the graphviz package:

  $ sudo apt-get install graphviz

Setup Oprofile on the OMAP4 Target Device

1. Use adb to push the busybox binaries to /data/local on the target device. The busybox binaries can be found here: [3]. You will need to run adb as root:

    $ cd $MYDROID/out/host/linux-x86/bin
    $ sudo ./adb root
    $ sudo ./adb remount
    $ sudo ./adb push busybox /data/local
    $ sudo ./adb shell
    root@android: # export PATH=/data/busybox/bin:/data/busybox/sbin:/data/sbin:$PATH

2. Use adb to push the oprofile binaries to the target device (if they are not already present).

    $ cd $MYDROID/out/target/product/blaze_tablet/system/xbin
    $ sudo $MYDROID/out/host/linux-x86/bin/adb push opcontrol /system/xbin
    $ sudo $MYDROID/out/host/linux-x86/bin/adb push oprofiled /system/xbin

If the system is read-only (such as for an eMMC boot), you can first remount with the commands:

    $ sudo ./adb shell
    root@android: # mount -o remount rw /
    root@android: # mount -o remount rw /system
    root@android: # ln -s /proc/mounts /etc/mtab

3. Follow the same steps as for OMAP4 pre-K3.0 to push the events and unit_masks files to the target: [4]

Run Oprofile on the OMAP4 Target Device

1. Verify the kernel range start and end addresses:

    root@android:/data/local # KERNEL_BEG=`/data/busybox/bin/grep " _text" /proc/kallsyms | /data/busybox/bin/awk '{print $1}'`
    root@android:/data/local # echo $KERNEL_BEG
    c004f000
    root@android:/data/local # KERNEL_END=`/data/busybox/bin/grep " _etext" /proc/kallsyms | /data/busybox/bin/awk '{print $1}'`
    root@android:/data/local # echo $KERNEL_END
    c07e4000

2. Reset oprofile:

    root@android:/data/local # opcontrol --shutdown
    root@android:/data/local # opcontrol --reset
    root@android:/data/local # opcontrol --setup

Note: Run this command using the kernel range start and end addresses that you found earlier. You can reduce the event counts to get more samples.

    root@android:/data/local # opcontrol --vmlinux=/vmlinux --callgraph=16 --kernel-range=0x$KERNEL_BEG,0x$KERNEL_END --event=CPU_CYCLES:100000
    Cannot open /dev/oprofile/1/enabled: No such file or directory
    Cannot open /dev/oprofile/2/enabled: No such file or directory
    Starting oprofiled...
    Using 2.6+ OProfile kernel interface.
    Reading module info.
    Using log file /data/oprofile/samples/oprofiled.log
    Ready

3. Check opcontrol status:

    root@android:/data/local # opcontrol --status
    Driver directory: /dev/oprofile
    Session directory: /data/oprofile
    Counter 0:
        name: CPU_CYCLES
        count: 100000
    Counter 1 disabled
    Counter 2 disabled
    oprofiled pid: 1414
    profiler is not running
      cpu1         0 samples received
      cpu1         0 samples lost overflow
      cpu1         0 samples invalid eip
      cpu1         0 backtrace aborted
      cpu0         0 samples received
      cpu0         0 samples lost overflow
      cpu0         0 samples invalid eip
      cpu0         0 backtrace aborted

4. Ensure that the events can be listed:

    root@android:/data/local # opcontrol --list-events
    name                : meaning
    ------------------------------------------------------------------------------
    IFU_IFETCH_MISS     : number of instruction fetch misses
    CYCLES_IFU_MEM_STALL: cycles instruction fetch pipe is stalled
    CYCLES_DATA_STALL   : cycles stall occurs for due to data dependency
    ITLB_MISS           : number of Instruction MicroTLB misses
    DTLB_MISS           : number of Data MicroTLB misses
    BR_INST_EXECUTED    : branch instruction executed w/ or w/o program flow change
    BR_INST_MISS_PRED   : branch mispredicted
    INSN_EXECUTED       : instructions executed
    DCACHE_ACCESS       : data cache access, cacheable locations
    DCACHE_ACCESS_ALL   : data cache access, all locations
    DCACHE_MISS         : data cache miss
    DCACHE_WB           : data cache writeback, 1 event for every half cacheline
    PC_CHANGE           : number of times the program counter was changed without a mode switch
    TLB_MISS            : Main TLB miss
    EXP_EXTERNAL        : Explicit external data access
    LSU_STALL           : cycles stalled because Load Store request queue is full
    WRITE_DRAIN         : Times write buffer was drained
    CPU_CYCLES          : clock cycles counter

5. Run start and stop to receive samples for approximately 30 seconds of profiling:

    root@android:/data/local # opcontrol --start;sleep 30;opcontrol --stop

Pull the opreport to the Linux Host

1. Set the paths that are used by opimport:

    $ export ANDROID_HOST_OUT=$MYDROID/out/host/linux-x86
    $ export OUT=$MYDROID/out/target/product/blaze_tablet

2. Run the opimport_pull script, which then calls opreport:

    $ $MYDROID/external/oprofile/opimport_pull -r ./test_output
    CPU: CPU with timer interrupt, speed 0 MHz (estimated)
    Profiling through timer interrupt
              TIMER:0|
      samples|      %|
    ------------------
         7671 99.8308 vmlinux
            4  0.0521 oprofiled
            3  0.0390 dalvik-jit-code-cache
            3  0.0390 libdvm.so
            2  0.0260 libc.so
            1  0.0130 libandroid_runtime.so

The oprofile binaries opimport, opannotate, oparchive, and opreport available at http://omapedia.org/images/3/32/Oprofile.tar.gz are 32-bit binaries. If you are using a 64-bit machine, download the 32-bit libraries that they need (e.g. libpopt.so.0):

    $ sudo aptitude install ia32-libs
    # Note: this link is for Intel x86 machines
    $ wget http://ftp.us.debian.org/debian/pool/main/p/popt/libpopt0_1.16-1_i386.deb
    # Extract it into your 32-bit library tree
    $ sudo dpkg -X libpopt0_1.16-1_i386.deb ~/tmp/lib

Strace

Recommended Usage on target

       # strace -ff -F -tt -s 200 -o /sqlite_stmt_journals/strace -p <process_id_that_needs_to_be_traced>


Note

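-ff makes strace follow fork() and, together with -o, write each forked process's trace to a separate file (the output file name with the process ID appended).

-F means strace also tries to follow any vfork()s.

-tt prefixes each system call with the time of day, including microseconds.

-s 200 raises the maximum printed string length to 200 characters, so that more detail is visible in any strings that are used.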

Using INST2 for Performance Measurement on DSP

INST2 is a tool that helps us measure the DSP MHz used for a particular use case, e.g. video playback or record.

To obtain accurate results, it is recommended to lock the VDD1 OPP before starting this tool.

To lock the OPP use the following commands:

    # echo n > /sys/power/vdd1_lock


(where n stands for the OPTIMAL OPP the use case should run at)

Refer to the omap3-opp.h file for the OPP table corresponding to your chip.

After executing your use case, make sure that the OPP lock is removed using the command below:

    # echo 0 > /sys/power/vdd1_lock


Step 1: Install busybox on the target filesystem.

Copy the pre-built busybox to /data/busybox/

On the target, do the following:

 # cd /data/busybox/
 # chmod 777 busybox
 # busybox  --install
 # export PATH=/data/busybox/:$PATH

In case "busybox --install" fails with a "Busybox not found" error, use the ls command to confirm that busybox is actually present, and then try "./busybox --install".

Note: This step is optional for Froyo (2.6.32 kernel).

Step 2: Download the Dsp_load_measurement_tool.tar.gz file to your host machine if you are on Eclair (2.6.29 kernel), or Dsp_load_measurement_tool_froyo.tar.gz for Froyo (2.6.32 kernel).

Untar this file using the commands below:

   $ tar xvf Dsp_load_measurement_tool.tar.gz
   $ cd dsp_load_measurement_tool
   $ tar xvf inst2.tar

Copy the /dsp_load_measurement_tool/inst_log file to the root directory of your file system, using an SD card or adb push:

   # cp inst_log .<file system's root>
        OR
   $ adb push inst_log .

Step 3: Give permissions to the file copied:

    # cd /
    # chmod 777 inst_log

Step 4: Now run the use case i.e. start playback or record

Step 5: Run the instrumentation

    # ./inst_log

Following messages should appear

       DSP device detected !!
       DSPProcessor_Attach succeeded.

Step 6: Once the use case is complete (i.e. playback or record is done), wait for the "INST: Log file written. Waiting for INST DSP side cleanup" message.

Step 7: Now wait for the "INST: DSP Cleanup done. Exiting" message. If this message does not appear, start the use case once more and wait again for the "INST: DSP Cleanup done. Exiting" message. This flushes out the previous result; you do not need to run the full use case again.

Step 8: Bring "log.bin" over to the host PC. It is generated in the /sqlite* folder of the target file system on Eclair (2.6.29 kernel) and in the /mnt folder on Froyo (2.6.32 kernel).

Important Note: busybox should be installed before executing the cp command. If you get the message "cp: not found" while copying the log, use the adb pull command instead, before unmounting the SD card.

For Eclair (2.6.29 kernel)

     # cp /sqlite*/*.bin /
     # cp log.bin /sdcard OR <adb pull ...>
     # sync

For Froyo (2.6.32 kernel)

     # cp /mnt/*.bin /
     # cp log.bin /sdcard OR <adb pull ...>
     # sync

Step 9: On the host PC, use the following command line to generate the results.

Go to the following folder and run the script:

     $ cd dsp_load_measurement_tool/inst2/tools
     $ perl inst_load.pl -c<dsp freq> log.bin

for example

     $ perl inst2/tools/inst_load.pl -c660 log.bin

On the screen you will see information about the file for example:

   Number of records: 40731
   splice() offset past end of array at inst2/tools/inst_load.pl line 123.
   Range #0 - beg: 0, end: 40731, length: 40731
   Range #1 - beg: 0, end: 40731, length: 40731
   Ignoring IDLE traces 0-22 and 40714-40730
   Clock Freq:   660 MHz
   Total Cycles: 25383675998
   Event |  Handle_Event |    Cycles |  Evts/s | ms/Evt |   MHz |30 Evt/s MHz
   ------+---------------+-----------+---------+--------+-------+------------
   unkwn | 00000000_0000 |     11568 |       0 |      0 |     0 |
   IDLE  | 00000000_2001 | 1.396e+10 |   48.99 |  11.23 |   220 |
   UALG  | 2025fad4_2009 |  70257552 |      39 |   0.07 |   1.1 |   0.85
   CTRL  | 2025fad4_2011 |     40768 |    0.19 |   0.01 |     0 |
   RESET | 2025fad4_2021 |    762453 |     0.3 |    0.1 |     0 |
   USN   | 2025fad4_2041 |  47083524 |   50.17 |   0.04 |   0.7 |
   ALGO  | 2025fad4_2101 | 560448611 |   19.68 |   1.12 |   8.8 |  13.46
   UALG  | 20260ee4_2009 | 2.320e+09 |   25.56 |   3.58 |  36.6 |  42.91
   CTRL  | 20260ee4_2011 |     21475 |    0.13 |   0.01 |     0 |
   RESET | 20260ee4_2021 |    286269 |    0.05 |   0.24 |     0 |
   USN   | 20260ee4_2041 | 116914279 |   76.21 |   0.06 |   1.8 |
   ALGO  | 20260ee4_2101 | 8.305e+09 |   25.54 |  12.81 | 130.9 |  153.7


Look at the last row of the results above. In this case the algorithm consumed 130.9 MHz while running at 25.54 fps. Scaled linearly to 30 fps, the effective figure would be 153.7 MHz.
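
The 30 fps figure is a simple linear scaling of the measured MHz by the ratio of frame rates. The following is a minimal Python sketch; the function name mhz_at_rate is hypothetical, and the inputs are the rounded values printed in the last row of the table.

```python
def mhz_at_rate(measured_mhz, measured_rate, target_rate=30.0):
    """Scale a measured MHz figure linearly to a target event rate."""
    return measured_mhz * target_rate / measured_rate


# Last ALGO row of the table: 130.9 MHz measured at 25.54 events per second.
print(round(mhz_at_rate(130.9, 25.54), 1))
```

Working from the rounded table values gives roughly 153.8, slightly different from the 153.7 printed by the tool, which presumably uses the unrounded cycle counts.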

Step 10: Finally, apply the following formula:

       DSP MHz consumption = Clock Freq - IDLE MHz

For example, with a DSP clock of 400 MHz and the IDLE figure of 220 MHz from the table above, the DSP CPU load is 400 - 220 = 180 MHz.
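
The same subtraction as a small Python sketch. The function name dsp_load_mhz is hypothetical, and the values are the ones used in the example above.

```python
def dsp_load_mhz(clock_mhz, idle_mhz):
    """DSP MHz consumption: clock frequency minus the IDLE MHz figure."""
    return clock_mhz - idle_mhz


# Values from the example: 400 MHz clock, 220 MHz reported for the IDLE event.
print(dsp_load_mhz(400, 220))
```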

Debugging segmentation fault

Type 1: The seg fault message includes a backtrace of shared objects (this is the case we face most of the time).

The PC holds an offset, which has to be looked up in the topmost 'so' file in the backtrace. This can be done with the addr2line or objdump tool.

a. Using addr2line:

  Syntax:   <path to arm-eabi-addr2line> -f -e <path to so file> <offset which is the PC value>
  Example:  #00  pc 0000e234  /system/lib/libc.so
            ./prebuilt/linux-x86/toolchain/arm-eabi-4.3.1/bin/arm-eabi-addr2line -f -e ./out/target/product/generic/symbols/system/lib/libc.so 0x0000e234
  Returns:  strlen
            bionic/libc/arch-arm/bionic/strlen.c:62
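When a backtrace has many frames, building the addr2line command by hand gets tedious. The sketch below turns one backtrace line into the matching command. It assumes the backtrace format shown in the example above and that the unstripped libraries live under the symbols directory of the build output; the function name and the regular expression are illustrative only.

```python
import re

# Matches backtrace lines such as: "#00  pc 0000e234  /system/lib/libc.so"
BACKTRACE_RE = re.compile(r"#(\d+)\s+pc\s+([0-9a-fA-F]+)\s+(\S+)")


def addr2line_command(backtrace_line, symbols_root, addr2line):
    """Build the addr2line argument list for one backtrace line."""
    match = BACKTRACE_RE.search(backtrace_line)
    if match is None:
        raise ValueError(f"not a backtrace line: {backtrace_line!r}")
    _frame, pc, lib_path = match.groups()
    # Assumption: the debug copy of the library mirrors the on-device path
    # below the symbols directory (e.g. symbols/system/lib/libc.so).
    symbol_file = symbols_root.rstrip("/") + lib_path
    return [addr2line, "-f", "-e", symbol_file, "0x" + pc]


cmd = addr2line_command(
    "#00  pc 0000e234  /system/lib/libc.so",
    symbols_root="./out/target/product/generic/symbols",
    addr2line="./prebuilt/linux-x86/toolchain/arm-eabi-4.3.1/bin/arm-eabi-addr2line",
)
print(" ".join(cmd))
```

The returned list can be passed directly to subprocess.run() on the build host to obtain the function name and source line.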

b. Using objdump:

  Syntax:   <path to arm-eabi-objdump> -S <path to so file>
  Example:  prebuilt/linux-x86/toolchain/arm-eabi-4.3.1/bin/arm-eabi-objdump -S out/target/product/zoom2/symbols/system/lib/libOMX.TI.Video.encoder.so > ObjDumpFile.txt
  The output is redirected to ObjDumpFile.txt, which now contains the source code intermixed with the disassembly. Search it for the PC offset to find the exact source line.
  Tip: Use the debug versions of the 'so' files, found in the symbols directory under the out folder; otherwise the objdump output will not contain source code.
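Searching the dump for the PC offset can also be scripted. The sketch below looks for the instruction whose address label matches the offset and returns it with a few preceding lines of context. It relies on objdump labelling each instruction with its hexadecimal address followed by a colon; the sample text is a made-up excerpt for illustration, not real output from any library.

```python
import re


def find_in_objdump(dump_text, offset, context=3):
    """Return the disassembly line for `offset` plus `context` lines before it."""
    lines = dump_text.splitlines()
    # objdump labels each instruction with its address in hex followed by ':'.
    label = re.compile(rf"^\s*0*{offset:x}:\s")
    for index, line in enumerate(lines):
        if label.match(line):
            return lines[max(0, index - context): index + 1]
    return []


# Hypothetical excerpt, invented for illustration only.
SAMPLE = """\
0000e230 <example_function>:
int example_function(const char *s)
{
    e230:\te1a03000 \tmov\tr3, r0
    return s[0];
    e234:\te5d00000 \tldrb\tr0, [r0]
"""

hit = find_in_objdump(SAMPLE, 0xe234, context=1)
for line in hit:
    print(line)
```

For a real dump, read ObjDumpFile.txt into a string and pass it in place of SAMPLE; the context lines usually include the source statement that precedes the faulting instruction.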

Type 2: the segmentation fault message shows the thread name and the register values, but no backtrace of shared objects.

1. The PC holds a virtual address

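2. The first three hexadecimal digits of the PC identify the virtual memory region where the shared object ('so') is loaded, and the remaining digits give the offset within the 'so'. This holds when the 'so' is mapped at a base address whose lower five digits are zero, as in the example below.

3. Use the 'ps' command to get the process ID (PID) of the process in which the failing thread runs.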

4. Get the memory map of the process using the following:

  In the device console or adb shell, print the 'maps' file of the process with ID <pid>.
  Syntax: cat /proc/<pid>/maps
  It is recommended to capture the memory map during normal execution, before the failure occurs; otherwise the process may already have been killed, and a map obtained afterwards may not match the one at the time of the failure.

5. Use the first three digits of the PC value to identify, in the memory map, the shared object that caused the failure.

6. Get the objdump output of that 'so' and search it for the offset to find the exact line of failure. (This is done in the same way as for the Type 1 seg fault described above.)

Example:

  Seg Fault Message:
  PV author: unhandled page fault (11) at 0x00000004, code 0x017
  pgd = ccfc8000
  [00000004] *pgd=8cbe1031, *pte=00000000, *ppte=00000000
  Pid: 9691, comm:            PV author
  CPU: 0    Not tainted  (2.6.29-omap1 #20)
  PC is at 0x81b23140
  LR is at 0x81b23049
  ...

  Based on the PC, the shared object is in the range 0x81bxxxxx and the offset within it is 0x23140.
  PV author runs in the context of the Media server process (PID: 944).
  Using cat /proc/944/maps we can identify the 'so' loaded in this region:
  81b00000-81b2a000 r-xp 00000000 b3:02 34574      /system/lib/libOMX.TI.Video.encoder.so
  So the fault appears to be in the OMX TI Video encoder.
  Now, with the offset, we can use addr2line or objdump to find the line causing this issue.
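The lookup in the example above can be sketched as a small Python helper that scans the contents of a maps file for the mapping containing the PC and derives the offset. This is a minimal sketch: the function name is made up for illustration, it assumes the standard /proc/<pid>/maps column layout (address range, permissions, offset, device, inode, path), and it ignores paths that contain spaces.

```python
def locate_pc(maps_text, pc):
    """Return (path, offset) for the file-backed mapping containing pc, else None."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        start_s, end_s = fields[0].split("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= pc < end:
            # Add the mapping's file offset so the result is relative to
            # the start of the 'so' file, not just to this mapping.
            file_offset = int(fields[2], 16)
            return fields[5], pc - start + file_offset
    return None


# Mapping line and PC value taken from the example above.
MAPS_LINE = ("81b00000-81b2a000 r-xp 00000000 b3:02 34574      "
             "/system/lib/libOMX.TI.Video.encoder.so")
path, offset = locate_pc(MAPS_LINE, 0x81B23140)
print(path, hex(offset))
```

Unlike the "first three digits" shortcut, this range check also works when a library is not loaded at a base address ending in five zeros.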

OPP Level Measurement

The OPP level (VDD1) at which a use case runs can be measured easily using a script available in Mydroid (FroYo codebase).

Script path: <MYDROID_PATH>/device/ti/support-tools/scripts/measurement/get_opp_levels.sh

Alternatively, you can access it from the Git web interface.


Prerequisites

a. The Busybox binary is present in the /data/busybox folder (for instructions, refer to Step 1 under the section "Using INST2 for Performance Measurement on DSP" above).

b. Copy the script file (get_opp_levels.sh) to the file system of the device.


Steps for measurement: (in device console)

a. Change the permissions of the script file

b. Start the desired use case

c. Execute the script

  chmod 777 get_opp_levels.sh
  <start use case>
  ./get_opp_levels.sh

d. Output would look like this:

   OPP Level,Initial Time(cs),Second Time(cs),Time Spent in OPP(cs),Measurement Time(s),Percentage Time in OPP
   OPP1G,2332,2332,0,20,0
   OPP130,131,131,0,20,0
   OPP100,410,410,0,20,0
   OPP50,43145,45151,2006,20,100

The percentage column shows the OPP at which the use case ran. In the example above, it ran at OPP50 for 100% of the 20-second measurement window.
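If the output is saved to a file, the dominant OPP can be picked out programmatically. The sketch below parses the comma-separated output and recomputes each level's share from the two time samples. It is an illustration only: the function name is made up, and the percentage is recomputed relative to the total counted time, which is an assumption and not necessarily how the script itself derives its last column.

```python
import csv
import io

# Output copied from the example above.
SAMPLE_OUTPUT = """\
OPP Level,Initial Time(cs),Second Time(cs),Time Spent in OPP(cs),Measurement Time(s),Percentage Time in OPP
OPP1G,2332,2332,0,20,0
OPP130,131,131,0,20,0
OPP100,410,410,0,20,0
OPP50,43145,45151,2006,20,100
"""


def opp_residency(report_text):
    """Map each OPP level to its share (in percent) of the counted time."""
    rows = list(csv.DictReader(io.StringIO(report_text)))
    spent = {
        row["OPP Level"]: int(row["Second Time(cs)"]) - int(row["Initial Time(cs)"])
        for row in rows
    }
    total = sum(spent.values())
    if total == 0:
        return {level: 0.0 for level in spent}
    return {level: 100.0 * cs / total for level, cs in spent.items()}


residency = opp_residency(SAMPLE_OUTPUT)
dominant = max(residency, key=residency.get)
print(dominant, residency[dominant])
```

For the sample output this reports OPP50 as the dominant level, in agreement with the percentage column printed by the script.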


Other pointers:

a. By default, the script runs for 20 seconds and then prints the results.

b. The measurement duration can be made shorter or longer by passing an additional argument.

For example, the command below will measure OPP transitions over a 100-second window:

  ./get_opp_levels.sh 100

Other Tools & references

  • Bootchart, Bootchart on Android
  • Smem on Android
  • Android memory usage guide
  • Tim Bird's android tips at ABS 2011 (ppt)