profiler
A PHP 5.3 profiler based off of Laravel 3's Anbu.
A Hacker's Tale
The date is 12/02/10. The days before Christmas are dripping away and I've pretty much hit a major road block as a windows programmer. I've been using AQTime, I've tried sleepy, shiny, and very sleepy, and as we speak, VTune is installing. I've tried to use the VS2008 profiler, and it's been positively punishing as well as often insensible. I've used the random pause technique. I've examined call-trees. I've fired off function traces. But the sad painful fact of the matter is that the app I'm working with is over a million lines of code, with probably another million lines worth of third-party apps.
I need better tools. I've read the other topics. I've tried out each profiler listed in each topic. There simply has to be something better than these junky and expensive options, or ludicrous amounts of work for almost no gain. To further complicate matters, our code is heavily threaded, and runs a number of Qt Event loops, some of which are so fragile that they crash under heavy instrumentation due to timing delays. Don't ask me why we're running multiple event loops. No one can tell me.
Are there any options more along the lines of Valgrind in a windows environment?
Is there anything better than the long swath of broken tools I've already tried?
Is there anything designed to integrate with Qt, perhaps with a useful display of events in queue?
A full list of the tools I tried, with the ones that were really useful in italics:
- AQTime: Rather good! Has some trouble with deep recursion, but the call graph is correct in these cases, and can be used to clear up any confusion you might have. Not a perfect tool, but worth trying out. It might suit your needs, and it certainly was good enough for me most of the time.
- Random Pause attack in debug mode: Not enough information enough of the time.
A good tool but not a complete solution.
- Parallel Studios: The nuclear option. Obtrusive, weird, and crazily powerful. I think you should hit up the 30 day evaluation, and figure out if it's a good fit. It's just darn cool, too.
- AMD Codeanalyst: Wonderful, easy to use, very crash-prone, but I think that's an environment thing. I'd recommend trying it, as it is free.
- Luke Stackwalker: Works fine on small projects, it's a bit trying to get it working on ours. Some good results though, and it definitely replaces Sleepy for my personal tasks.
- PurifyPlus: No support for Win-x64 environments, most prominently Windows 7. Otherwise excellent. A number of my colleagues in other departments swear by it.
- VS2008 Profiler: Produces output in the 100+gigs range in function trace mode at the required resolution. On the plus side, produces solid results.
- GProf: Requires GCC to be even moderately effective.
- VTune: VTune's W7 support borders on criminal. Otherwise excellent
- PIN: I'd need to hack up my own tool, so this is sort of a last resort.
- Sleepy\VerySleepy: Useful for smaller apps, but failing me here.
- EasyProfiler: Not bad if you don't mind a bit of manually injected code to indicate where to instrument.
- Valgrind: *nix only, but very good when you're in that environment.
- OProfile: Linux only.
- Proffy: They shoot wild horses.
Suggested tools that I haven't tried:
- XPerf:
- Glowcode:
- Devpartner:
Notes:
Intel environment at the moment. VS2008, boost libraries. Qt 4+. And the wretched humdinger of them all: Qt/MFC integration via trolltech.
Now: Almost two weeks later, it looks like my issue is resolved. Thanks to a variety of tools, including almost everything on the list and a couple of my personal tricks, we found the primary bottlenecks. However, I'm going to keep testing, exploring, and trying out new profilers as well as new tech. Why? Because I owe it to you guys, because you guys rock. It does slow the timeline down a little, but I'm still very excited to keep trying out new tools.
Synopsis
Among many other problems, a number of components had recently been switched to the incorrect threading model, causing serious hang-ups due to the fact that the code underneath us was suddenly no longer multithreaded. I can't say more because it violates my NDA, but I can tell you that this would never have been found by casual inspection or even by normal code review. Without profilers, callgraphs, and random pausing in conjunction, we'd still be screaming our fury at the beautiful blue arc of the sky. Thankfully, I work with some of the best hackers I've ever met, and I have access to an amazing 'verse full of great tools and great people.
Gentlefolk, I appreciate this tremendously, and only regret that I don't have enough rep to reward each of you with a bounty. I still think this is an important question to get a better answer to than the ones we've got so far on SO.
As a result, each week for the next three weeks, I'll be putting up the biggest bounty I can afford, and awarding it to the answer with the nicest tool that I think isn't common knowledge. After three weeks, we'll hopefully have accumulated a definitive profile of the profilers, if you'll pardon my punning.
Take-away
Use a profiler. They're good enough for Ritchie, Kernighan, Bentley, and Knuth. I don't care who you think you are. Use a profiler. If the one you've got doesn't work, find another. If you can't find one, code one. If you can't code one, or it's a small hang up, or you're just stuck, use random pausing. If all else fails, hire some grad students to bang out a profiler.
A Longer View
So, I thought it might be nice to write up a bit of a retrospective. I opted to work extensively with Parallel Studios, in part because it is actually built on top of the PIN Tool. Having had academic dealings with some of the researchers involved, I felt that this was probably a mark of some quality. Thankfully, I was right. While the GUI is a bit dreadful, I found IPS to be incredibly useful, though I can't comfortably recommend it for everyone. Critically, there's no obvious way to get line-level hit counts, something that AQT and a number of other profilers provide, and I've found very useful for examining rate of branch-selection among other things. In net, I've enjoyed using AQTime as well, and I've found their support to be really responsive. Again, I have to qualify my recommendation: A lot of their features don't work that well, and some of them are downright crash-prone on Win7x64. XPerf also performed admirably, but is agonizingly slow for the sampling detail required to get good reads on certain kinds of applications.
Right now, I'd have to say that I don't think there's a definitive option for profiling C++ code in a W7x64 environment, but there are certainly options that simply fail to perform any useful service.
Source: (StackOverflow)
I urgently need a C# profiler.
Although I'm not averse to paying for one, something which is free or at least with a trial version would be ideal since it takes time to raise a purchase order.
Any recommendations?
Source: (StackOverflow)
What can you guys recommend to use with Java?
Only requirement is it should be open source, or has not too expensive academic licence .
Source: (StackOverflow)
Does anyone have experience profiling a Python/SQLAlchemy app? And what are the best way to find bottlenecks and design flaws?
We have a Python application where the database layer is handled by SQLAlchemy. The application uses a batch design, so a lot of database requests is done sequentially and in a limited timespan. It currently takes a bit too long to run, so some optimization is needed. We don't use the ORM functionality, and the database is PostgreSQL.
Source: (StackOverflow)
Want to know what the stackoverflow community feels about the various free and non-free Java Profilers and profiling tools available.
Source: (StackOverflow)
I'm trying to find a free profiler for Eclipse that works well. I would like a graphical breakdown of execution time, in particular. I've tried TPTP but have had no luck at all with GUI apps (it took almost a minute for a GUI app to start and was virtually unusable on screen - it uses a lot of Java OpenGL, so I'm not sure if it has to do with that). I liked YourKit, but unfortunately it's not free. I even tried switching to NetBeans since they have a built in profiler.
If anyone has had success with particular profilers (even if it was TPTP), I'd like to hear about it. Any recommendations would be greatly appreciated.
Note: I know this has been asked before, but I have not found anything recent that really gives a good answer.
Source: (StackOverflow)
I use cProfile now but I find it tedious to write pstats code just to query the statistics data.
I'm looking for a visual tool that shows me what my Python code is doing in terms of CPU time and memory allocation.
Some examples from the Java world are visualvm and JProfiler.
- Does something like this exist?
- Is there an IDE that does this?
- Would dtrace help?
I know about KCachegrind for Linux, but I would prefer something that I can run on Windows/Mac without installing KDE.
Source: (StackOverflow)
After watching the presentation "Performance Anxiety" of Joshua Bloch, I read the paper he suggested in the presentation "Evaluating the Accuracy of Java Profilers". Quoting the conclusion:
Our results are disturbing because they indicate that profiler incorrectness is pervasive—occurring in most of our seven benchmarks and in two production JVM—-and significant—all four of
the state-of-the-art profilers produce incorrect profiles. Incorrect
profiles can easily cause a performance analyst to spend time optimizing cold methods that will have minimal effect on performance.
We show that a proof-of-concept profiler that does not use yield
points for sampling does not suffer from the above problems
The conclusion of the paper is that we cannot really believe the result of profilers. But then, what is the alternative of using profilers. Should we go back and just use our feeling to do optimization?
UPDATE: A point that seems to be missed in the discussion is observer effect. Can we build a profiler that really 'observer effect'-free?
Source: (StackOverflow)
How do I limit a SQL Server Profiler trace to a specific database? I can't see how to filter the trace to not see events for all databases on the instance I connect to.
Source: (StackOverflow)
I need to see the queries submitted to a PostgreSQL server. Normally I would use SQL Server profiler to perform this action in SQL Server land, but I'm yet to find how to do this in PostgreSQL. There appears to be quite a few pay-for tools, I am hoping there is an open source variant.
Source: (StackOverflow)
Possibly some methods to turn on and turn off profiling from code?
Or can you select a specific function to profile ?
Source: (StackOverflow)
This could be a borderline advertisement, not to mention subjective, but the question is an honest one. For the last two months, I've been developing a new open source profiler for .NET called SlimTune Profiler (http://code.google.com/p/slimtune/).
It's a relatively new effort, but when I looked at the range of profilers available I was not terribly impressed. I've done some initial work based on existing products, but I felt this would be a good place to ask: what exactly do you WANT from a profiler?
I come from real time graphics and games, so it's important to me that a profiler be as fast as possible. Otherwise, the game becomes unplayable and profiling an unplayable slow game tends not to be very enlightening. I'm willing to sacrifice some accuracy as a result. I don't even care about exceptions. But I'm not very familiar with what developers for other types of applications are interested in. Are there any make or break features for you? Where do existing tools fall down?
Again, I apologize if this is just way off base for StackOverflow, but it's always been an incredibly useful resource to me and there's a very wide range of developers here.
Source: (StackOverflow)
I've been trying to use Firebug's profiler to better understand the source of some JavaScript performance issues we are seeing, but I'm a little confused by the output.
When I profile some code the profiler reports Profile (464.323 ms, 26,412 calls). I suspect that the 464.323 ms is the sum of the execution time for those 26,412 calls.
However, when I drill down into the detailed results I see individual results with an average execution time greater than 464.323 ms, e.g. the result with the highest average time reports the following details:
Calls: **1**
Percent: **0%**
Own Time: **0.006 ms**
Time: **783.506 ms**
Avg: **783.506 ms**
Min: **783.506 ms**
Max: **783.506 ms**
Another result reports:
Calls: **4**
Percent: **0.01%**
Own Time: **0.032 ms**
Time: **785.279 ms**
Avg: **196.32 ms**
Min: **0.012 ms**
Max: **783.741 ms**
Between these two results the sum of the Time results is a lot more than 464.323.
So, what do these various numbers mean? Which ones should I trust?
Source: (StackOverflow)