
The system's results were not what I expected,
and to optimize a system you really ought to have a proper way of measuring it in the first place.

I was stuck for a while and asked around everywhere.
I also looked up a lot of material, so here is a summary.

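Sometimes a simple approach is a perfectly good option, even if people criticize it for having no data behind it.
But as long as the program ends up running fast, who cares?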

Q1:

What can I use to profile C++ code in Linux?
Q2:

I am developing a rather large piece of software on Android with a lot of native code. It's working now but has some performance issues.

I am hoping I can profile each module (function call) of the software for CPU cycles, memory usage, etc., on several real Android phones. Is there a simple C library to do that?

I see people using OProfile, but it seems to be overkill for my case, since it is a system-wide profiler and requires rebuilding the kernel and system image.

As I have the full source code of my app, all I really need is a simple C library that I can embed in my code to do some profiling while the app runs several test cases.
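For what it's worth, the "embed something small in my own code" idea could look roughly like the sketch below. This is only my own illustration, not a real library: `SectionStats`, `profile_table` and `ScopedTimer` are hypothetical names, it only measures wall-clock time (not CPU cycles or memory usage), and it assumes single-threaded test runs.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not part of any library: accumulated wall-clock
// time and call count for one named section of code.
struct SectionStats {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

// One global table keyed by section name.
// Assumption: single-threaded use; there is no locking here.
inline std::map<std::string, SectionStats>& profile_table() {
    static std::map<std::string, SectionStats> table;
    return table;
}

// RAII timer: starts on construction, records into the table on destruction.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string name)
        : name_(std::move(name)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        SectionStats& stats = profile_table()[name_];
        ++stats.calls;
        stats.total +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};
```

To use it, put `ScopedTimer t("decode");` at the top of the function or block you care about, then dump `profile_table()` after the test cases finish. The catch is that you have to guess in advance which sections to instrument, which is exactly what the answer quoted below tries to avoid.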

BTW, what is the Linux way of doing this?


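This is exactly my situation = =... The people I asked in the computer science department also suggested OProfile. If the method below doesn't work out when I try it over the next few days,

I will probably give in and reflash the kernel to support OProfile.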

Online, the questions above generally point to the same answer:

http://stackoverflow.com/questions/375913/what-can-i-use-to-profile-c-code-in-linux/378024#378024

Reposted from an answer in the thread above:

OK, downvote time...

If your goal is to use a profiler, use one of the suggested ones.

However, if you're in a hurry and you can manually interrupt your program under the debugger while it's being subjectively slow, there's a simple way to find performance problems.

Just halt it several times, and each time look at the call stack. If there is some code that is wasting some percentage of the time, 20% or 50% or whatever, that is the probability that you will catch it in the act on each sample. So that is roughly the percentage of samples on which you will see it. There is no educated guesswork required. If you do have a guess as to what the problem is, this will prove or disprove it.
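To put rough numbers on the paragraph above (my own back-of-the-envelope notation, not from the answer): assume each pause is an independent random sample, and the wasteful code is on the call stack for a fraction p of the running time. Then over n pauses:

```latex
\[
P(\text{seen on a given sample}) = p, \qquad
\mathbb{E}[\text{samples showing it}] = np, \qquad
P(\text{seen at least once in } n \text{ samples}) = 1 - (1 - p)^n
\]
\[
p = 0.2,\ n = 10:\quad 1 - 0.8^{10} \approx 0.893
\qquad\qquad
p = 0.5,\ n = 10:\quad 1 - 0.5^{10} \approx 0.999
\]
```

So under those assumptions, ten pauses are very likely to catch anything that costs 20% of the time or more at least once, and a 50% problem should show up on about half of them.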

You may have multiple performance problems of different sizes. If you clean out any one of them, the remaining ones will take a larger percentage, and be easier to spot, on subsequent passes.
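A small worked example of why the remaining problems get easier to spot (again my own notation, and assuming the problems do not overlap in time): if problem j takes a fraction f_j of the original running time and problem k is removed completely, then

```latex
\[
f_j' = \frac{f_j}{1 - f_k}, \qquad
\text{speedup from removing } k = \frac{1}{1 - f_k}
\]
\[
(f_1, f_2, f_3) = (0.5,\ 0.25,\ 0.125)
\;\xrightarrow{\text{remove } 1}\;
(f_2', f_3') = (0.5,\ 0.25)
\;\xrightarrow{\text{remove } 2}\;
f_3'' = 0.5
\]
```

In this made-up case every pass leaves a new problem that takes half of the remaining time, and the cumulative speedup goes 2x, then 4x.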

Caveat: programmers tend to be skeptical of this technique unless they've used it themselves. They will say that profilers give you this information, but that is only true if they sample the entire call stack. Call graphs don't give you the same information, because 1) they don't summarize at the instruction level, and 2) they give confusing summaries in the presence of recursion. They will also say it only works on toy programs, when actually it works on any program, and it seems to work better on bigger programs, because they tend to have more problems to find.

P.S. This can also be done on multi-thread programs if there is a way to collect call-stack samples of the thread pool at a point in time, as there is in Java.

P.P.S As a rough generality, the more layers of abstraction you have in your software, the more likely you are to find that that is the cause of performance problems (and the opportunity to get speedup).

Added: It might not be obvious, but the stack sampling technique works equally well in the presence of recursion. The reason is that the time that would be saved by removal of an instruction is approximated by the fraction of samples containing it, regardless of the number of times it may occur within a sample.
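The recursion point can be sketched as a small tally over stack samples. This is only my illustration under some assumptions: the samples have already been copied out of the debugger by hand, I use function names rather than individual instructions as the answer does, and `Stack` and `inclusive_fraction` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// One call-stack sample: function names, outermost frame first.
using Stack = std::vector<std::string>;

// For each function, return the fraction of samples whose stack contains
// it at least once. A function that appears several times in one sample
// (recursion) still counts only once for that sample.
std::map<std::string, double> inclusive_fraction(
        const std::vector<Stack>& samples) {
    std::map<std::string, std::size_t> hits;
    for (const Stack& stack : samples) {
        // Deduplicate the frames of this sample before counting.
        std::set<std::string> seen(stack.begin(), stack.end());
        for (const std::string& fn : seen) {
            ++hits[fn];
        }
    }

    std::map<std::string, double> fractions;
    if (samples.empty()) {
        return fractions;
    }
    for (const auto& entry : hits) {
        fractions[entry.first] =
            static_cast<double>(entry.second) / samples.size();
    }
    return fractions;
}
```

For example, with four samples where a recursive `walk` shows up three times in one stack, twice in another and once in a third, `walk` gets 3/4 = 0.75, not 6/4. That fraction is the rough estimate of the time that removing it would save.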

Another objection I often hear is: "It will stop someplace random, and it will miss the real problem". This comes from having a prior concept of what the real problem is. A key property of performance problems is that they defy expectations. Sampling tells you something is a problem, and your first reaction is disbelief. That is natural, but you can be sure if it finds a problem it is real, and vice-versa.


Posted by angledark0123 (CONY的世界) on 痞客邦 (PIXNET)