EzDevInfo.com

pycallgraph

pycallgraph is a Python module that creates call graphs for Python programs. Documentation: Python Call Graph 1.0.1.

PyCallGraph insufficient profiling depth

So I've taken PyCallGraph for a spin recently, as a more visually appealing and clear alternative to using cProfile.

However, it doesn't perform as advertised. To show you what I mean, I've taken the regex sample code (modified slightly to take its words from a text file I quickly created by copying and pasting words from the online Oxford Learner's Dictionary).

Instead of getting: Expected Result

I'm getting: Actual Result

Not quite the same level of detail, you'll agree.

Here is the code I'm using. It's also available on the PyCallGraph page, albeit without the modification I made to read from a text file.

Does anybody know of a configuration step I might have missed?

I'm running Python 2.7.8 on Windows 7 and installed the latest versions of PyCallGraph (1.0.1) and Graphviz (2.38) yesterday.

import argparse
import re

from pycallgraph import PyCallGraph
from pycallgraph import Config
from pycallgraph.output import GraphvizOutput


class RegExp(object):

    def main(self):
        parser = argparse.ArgumentParser()
        parser.add_argument('--grouped', action='store_true')
        conf = parser.parse_args()

        if conf.grouped:
            self.run('regexp_grouped.png', Config(groups=True))
        else:
            self.run('regexp_ungrouped.png', Config(groups=False))

    def run(self, output, config):
        graphviz = GraphvizOutput()
        graphviz.output_file = output
        self.expression = r'^([^s]).*(.)\2.*\1$'

        with PyCallGraph(config=config, output=graphviz):
            self.precompiled()
            self.onthefly()

    def words(self):
        a = 200
        for word in open('test.txt'):
            yield word.strip()
            a -= 1
            if not a:
                return

    def precompiled(self):
        reo = re.compile(self.expression)
        for word in self.words():
            reo.match(word)

    def onthefly(self):
        for word in self.words():
            re.match(self.expression, word)


if __name__ == '__main__':
    RegExp().main()

Source: (StackOverflow)

Non-graphical output from pycallgraph

I've started writing a small Python utility to cache functions. The available caching tools (lru_cache, Beaker) do not detect changes of sub-functions.

For this, I need a Call Graph. There exists an excellent tool in pycallgraph by Gerald Kaszuba. However, so far I've only got it to output function-name strings. What I need are either function-objects or function-code-hashes.

What I mean by these two terms: given def foo(x): return x, foo is the function object and hash(foo.__code__.co_code) is the function-code hash.
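
A minimal sketch of the two terms, using only the standard library. The helper name code_hash and the example functions are made up for illustration; the helper just wraps the expression from the question. One caveat: co_code holds only the raw bytecode, so functions that differ only in their variable names produce the same hash.

```python
def code_hash(func):
    """Hash of the raw bytecode of a function (hypothetical helper)."""
    return hash(func.__code__.co_code)


def foo(x):
    return x


def bar(y):
    # Same bytecode as foo; only the local variable name differs.
    return y


def double(x):
    # Different body, hence different bytecode.
    return 2 * x


print(code_hash(foo) == code_hash(bar))     # same bytecode
print(code_hash(foo) == code_hash(double))  # different bytecode
```

Note that the hash is only meaningful within one interpreter version, since bytecode changes between Python releases.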

What I have

You can see what I have here, but below is a minimal example. The problem in this example is that I can't get from a function name (the string) back to the function definition. I'm trying with eval(func).

So, I guess there are two ways of solving this:

  1. A proper pycallgraph.output, or some other way to get what I want directly from pycallgraph.
  2. Dynamically loading the function from the function.__name__ string.

import unittest
from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput

class Callgraph:
    def __init__(self, output_file='callgraph.png'):
        self.graphviz = GraphvizOutput()
        self.graphviz.output_file = output_file

    def execute(self, function, *args, **kwargs):
        with PyCallGraph(output=self.graphviz):
            ret = function(*args, **kwargs)

        self.graph = dict()
        for node in self.graphviz.processor.nodes():
            if node.name != '__main__':
                f = eval(node.name)
                self.graph[node.name] = hash(f.__code__.co_code)
        return ret

    def unchanged(self):
        '''Checks whether any function in the call graph has changed.
        Returns True if all functions still have their original code hash, False otherwise.
        '''
        for func, codehash in self.graph.iteritems():
            f = eval(func)
            if hash(f.__code__.co_code) != codehash:
                return False
        return True

def func_inner(x):
    return x
def func_outer(x):
    return 2*func_inner(x)

class CallgraphTest(unittest.TestCase):
    def testChanges(self):
        cg = Callgraph()
        y = cg.execute(func_outer, 3)
        self.assertEqual(6, y)
        self.assertTrue(cg.unchanged())
        # Change one of the functions
        def func_inner(x):
            return 3+x
        self.assertFalse(cg.unchanged())
        # Change back!
        def func_inner(x):
            return x
        self.assertTrue(cg.unchanged())


if __name__ == '__main__':
    unittest.main()
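
Option 2 from the list above could be sketched without eval, assuming the names reported by pycallgraph are dotted import paths such as module.function or module.Class.method (an assumption, not something confirmed here). The helper name resolve is hypothetical.

```python
import importlib
import json  # only used for the usage example below


def resolve(dotted_name):
    """Turn a dotted name into the object it refers to (hypothetical helper).

    Tries the longest importable module prefix first, then walks the
    remaining parts with getattr.
    """
    parts = dotted_name.split('.')
    for i in range(len(parts), 0, -1):
        try:
            obj = importlib.import_module('.'.join(parts[:i]))
        except ImportError:
            continue  # prefix is not a module; try a shorter one
        for attr in parts[i:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError('cannot resolve %r' % dotted_name)


func = resolve('json.dumps')
print(func is json.dumps)
```

Unlike eval, this only imports modules and reads attributes, so it does not depend on which names happen to be in scope where the lookup runs.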

Source: (StackOverflow)

pycallgraph with pycharm does not work on windows

I'm using Windows 7, Python 3.4.1, Anaconda 2.0.1, PyCharm 3.4.
Graphviz and dot work normally in the console.

However, when trying to use pycallgraph it finishes with an error.

"C:\Users\John\Anaconda3\python.exe" C:/PycharmProjects/myprojectname/abilities.py
Traceback (most recent call last):
  File "C:/PycharmProjects/myprojectname/abilities.py", line 1247, in <module>
    with PyCallGraph(output=GraphvizOutput()):
  File "C:\Users\John\Anaconda3\lib\site-packages\pycallgraph\pycallgraph.py", line 32, in __init__
    self.reset()
  File "C:\Users\John\Anaconda3\lib\site-packages\pycallgraph\pycallgraph.py", line 53, in reset
    self.prepare_output(output)
  File "C:\Users\John\Anaconda3\lib\site-packages\pycallgraph\pycallgraph.py", line 97, in prepare_output
    output.sanity_check()
  File "C:\Users\John\Anaconda3\lib\site-packages\pycallgraph\output\graphviz.py", line 63, in sanity_check
    self.ensure_binary(self.tool)
  File "C:\Users\John\Anaconda3\lib\site-packages\pycallgraph\output\output.py", line 97, in ensure_binary
    'The command "{}" is required to be in your path.'.format(cmd))
pycallgraph.exceptions.PyCallGraphException: The command "dot" is required to be in your path.

Process finished with exit code 1

What can I do to fix this?
I checked this, but it's for Mac.
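
The error message says that "dot" must be on the path, so one thing worth checking is whether the Python process started by the IDE sees the same PATH as the console. A minimal sketch, assuming pycallgraph finds dot through the process PATH (as the message suggests); the helper name ensure_on_path and the Graphviz directory below are hypothetical and need adjusting to the actual install location.

```python
import os
import shutil


def ensure_on_path(tool, extra_dir):
    """Prepend extra_dir to PATH if tool cannot be found (hypothetical helper).

    Returns the resolved location of the tool, or None if it is still missing.
    """
    if shutil.which(tool) is None:
        os.environ['PATH'] = extra_dir + os.pathsep + os.environ.get('PATH', '')
    return shutil.which(tool)


# Assumed install directory; replace with wherever dot.exe actually lives.
GRAPHVIZ_BIN = r'C:\Program Files (x86)\Graphviz2.38\bin'

print('dot found at:', ensure_on_path('dot', GRAPHVIZ_BIN))
```

This would have to run before PyCallGraph is constructed, since the traceback shows the check happening in __init__.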


Source: (StackOverflow)

PyCallGraph and Long Running Processes

I just discovered PyCallGraph and it is quite a wonderful tool. I've noticed, though, that it has problems with long-running processes, slowly consuming huge amounts of memory (possibly since /tmp is a ramdisk) and eventually killing the system.

I'm trying to determine if it is possible to somehow have a checkpointing thread/process that will use the PyCallGraph.PickleOutput to save a partial profile every XXXX seconds. Looking through the docs does not provide enlightenment.

Another question is whether it is possible to have the names of functions automatically abbreviated? Reading a graph full of MyBigProgram.CoreFunctionality.Utility.baz.Foo() and MyBigProgram.CoreFunctionality.Utility.baz.Bar() gets crazy. :)
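
For the abbreviation part, the string manipulation itself is simple. The sketch below only shows shortening a dotted name to its last components; the helper name abbreviate is made up, and how to hook such a function into pycallgraph's rendering is not covered here.

```python
def abbreviate(qualified_name, keep=2):
    """Keep only the last `keep` dotted components (hypothetical helper)."""
    parts = qualified_name.rstrip('()').split('.')
    return '.'.join(parts[-keep:])


print(abbreviate('MyBigProgram.CoreFunctionality.Utility.baz.Foo()'))
print(abbreviate('MyBigProgram.CoreFunctionality.Utility.baz.Bar()', keep=1))
```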


Source: (StackOverflow)

Profiling in Python with Pycallgraph: Colour Coding of Nodes and Time per Call

I use pycallgraph to have a nice visualisation of profiling my program. This produces nice plots like the following:

[Image: pycallgraph profile graph]

I call pycallgraph directly from within my python script:

if __name__ == '__main__':
    graphviz = GraphvizOutput()
    graphviz.output_file = './profile.png'
    service_filter = GlobbingFilter(include=['*storageservice.*',
                                             '*ptcompat.*'])
    config = Config(groups=True, verbose=True)
    config.trace_filter = service_filter

    print('RUN PROFILE')
    with PyCallGraph(config=config, output=graphviz):
        test_run()
    print('DONE RUN PROFILE')

Just by eyeballing the graph you can identify function calls that may be worth optimising. The nodes list the number of calls as well as the total runtime, and are coloured according to their performance. However, this does not quite work as expected: the colour scheme does not reflect total runtime alone but a mixture of runtime and number of calls.

For instance, one node in the graph above appears in darker blue although it runs for only 0.0424 seconds; it was called 412 times. This is surprising because a node that runs for 0.1524 seconds, with only 138 calls, has a much lighter colour. Identifying functions with a large total runtime is therefore quite hard when the number of calls also influences the node colour. It would also be easier to spot potential bottlenecks if the time per call were listed as well.

A: How do I change the node colouring so that the number of calls is ignored and only the total runtime is considered?

B: How do I make pycallgraph list the time per call in addition to the number of calls and the total runtime?
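
To make the time-per-call point concrete, here is the arithmetic for the two nodes mentioned above. The labels "dark node" and "light node" are placeholders, since the real function names are not given.

```python
def time_per_call(total_seconds, calls):
    """Average runtime of a single call, in seconds."""
    return total_seconds / calls


# (total runtime in seconds, number of calls) taken from the question
nodes = [
    ('dark node', 0.0424, 412),
    ('light node', 0.1524, 138),
]

for label, total, calls in nodes:
    ms = 1000 * time_per_call(total, calls)
    print('%s: %.3f ms per call' % (label, ms))
# dark node: 0.103 ms per call
# light node: 1.104 ms per call
```

The lighter node is roughly ten times slower per call, which is exactly the kind of information the colouring hides.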


Source: (StackOverflow)

PyCallGraph middleware in django

I'm trying to implement a middleware in Django (1.4) to create a call graph using PyCallGraph. I've based it on two different snippets found online. This is what it looks like:

import time
from django.conf import settings
from pycallgraph import Config
from pycallgraph import PyCallGraph
from pycallgraph.output import GraphvizOutput

class CallgraphMiddleware(object):
    def process_view(self, request, callback, callback_args, callback_kwargs):
        if settings.DEBUG and 'graph' in request.GET:
            config = Config()
            config.trace_filter = GlobbingFilter(exclude=['pycallgraph.*','*.secret_function',], include=['reports.*'])
            graphviz = GraphvizOutput(output_file='callgraph-' + str(time.time()) + '.png')
            pycallgraph = PyCallGraph(output=graphviz, config=config)
            pycallgraph.start()
            self.pycallgraph = pycallgraph

    def process_response(self, request, response):
        if settings.DEBUG and 'graph' in request.GET:
            self.pycallgraph.done()
        return response

I added it to the middleware list in settings.py and then started the server.
It seems to trigger when process_view is called, but when it gets to process_response Django complains that 'CallgraphMiddleware' object has no attribute 'pycallgraph'. How is that possible? Apparently the line

self.pycallgraph = pycallgraph

is not taken into account. Why?
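
This sketch does not explain why the attribute is missing in the setup above. It only illustrates a more defensive pattern: Django reuses one middleware instance across requests, so per-request state is safer on the request object than on self, and process_response can tolerate the case where process_view never stored anything. FakeTracer and FakeRequest are stand-ins invented for this example so it runs without Django or pycallgraph.

```python
class FakeTracer(object):
    """Stand-in for a tracer with start() and done() (hypothetical)."""

    def __init__(self):
        self.finished = False

    def start(self):
        self.finished = False

    def done(self):
        self.finished = True


class FakeRequest(object):
    """Stand-in for a request object exposing a GET mapping (hypothetical)."""

    def __init__(self, GET):
        self.GET = GET


class RequestScopedMiddleware(object):
    def process_view(self, request, callback, callback_args, callback_kwargs):
        if 'graph' in request.GET:
            # Store the tracer on the request, not on the shared instance.
            request._tracer = FakeTracer()
            request._tracer.start()

    def process_response(self, request, response):
        # Works even if process_view was skipped or failed early.
        tracer = getattr(request, '_tracer', None)
        if tracer is not None:
            tracer.done()
        return response


middleware = RequestScopedMiddleware()
request = FakeRequest({'graph': '1'})
middleware.process_view(request, None, (), {})
print(middleware.process_response(request, 'response'))
```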


Source: (StackOverflow)

How do you install PyCallGraph / use pip?

Thanks to anyone who gives some of their time to consider my problem.

What I need is a simple and accessible explanation of how to install that module. I have never used anything from PyPI before, and I only heard of pip after looking up PyCallGraph.

I'm not a programmer by training; I'm doing an accounting internship and am using Python to write scripts that speed up some processes, at the urging of a colleague who uses Python himself. I write scripts in Notepad++ and execute them through IDLE.

I'm currently working on optimizing a script I wrote and came upon PyCallGraph while checking this very site for tips on how to do so.

I tried the very minimalistic instruction of just doing "pip install pycallgraph" just about anywhere I could think of, including cmd.exe, to no avail. Running get-pip.py directly seems to have worked for installing pip, though.

Otherwise I can always stick with the cProfile printout and write off any modules that need such an install, although sadly that seems to be quite a few.
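
Since the scripts here are run from IDLE, one way to sidestep the question of where to type the pip command is to launch pip through the same Python interpreter that runs the script. A minimal sketch, assuming pip is installed for that interpreter; the helper name build_pip_command is made up for illustration.

```python
import subprocess
import sys


def build_pip_command(package):
    """Command line that runs pip with the current interpreter (hypothetical helper)."""
    return [sys.executable, '-m', 'pip', 'install', package]


command = build_pip_command('pycallgraph')
print(' '.join(command))

# Uncomment to actually run the installation:
# subprocess.check_call(command)
```

The printed line can also be pasted into cmd.exe as is, which avoids relying on pip being on the PATH.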


Source: (StackOverflow)

Pycallgraph not generating graphd output in debug mode

I'm using Pycallgraph to generate output, but I want to save the intermediate graphd output (instead of generating an image) because I want to make some small modifications to it.

I'm running as:

PYTHONPATH=. pycallgraph -d graphviz -- ./ab_ndh_graph.py > out.graphd

This generates two things:

  1. pycallgraph.png -- this is the entire call graph (graphd output in out.graphd)
  2. filter_max_depth.png -- this is the code based call graph (correct, but no graphd output)

How can I get the graphd output to be generated for "filter_max_depth" instead?

File contents:

config = Config(max_depth=2)
config.trace_filter = GlobbingFilter(exclude=[
    'pycallgraph.*',
])
graphviz = GraphvizOutput(output_file='filter_max_depth.png')

with PyCallGraph(output=graphviz, config=config):
    o = AB_NDH()
    o.run()

Source: (StackOverflow)

How to get useful information from pycallgraph tracing?

In my question about tracing Python execution I was advised to use pycallgraph, so I decided to try it on the following code:

#!/bin/env python

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2, 100)

# The first call to plt.plot will automatically create
# the necessary figure and axes to achieve the desired plot.
plt.plot(x, x, label='linear')

# Subsequent calls to plt.plot re-use
# the current axes and each add another line.
plt.plot(x, x**2, label='quadratic')
plt.plot(x, x**3, label='cubic')

# Setting the title, legend, and axis labels also automatically
# use the current axes and set the title,
# create the legend, and label the axis respectively.
plt.xlabel('x label')
plt.ylabel('y label')

plt.title("Simple Plot")

plt.legend()

plt.show()

This code is from another question of mine, where I asked about the object hierarchy in matplotlib and got the following answer:

If you're really keen to see how the auto creation is done - it's all open source. You can see the call to plot() creates an Axes instance by a call to gca() in the code here. This in turn calls gcf(), which looks for a FigureManager (which is what actually maintains the state). If one exists, it returns the figure it's managing, otherwise it creates a new one using plt.figure(). Again, this process to some degree inherits from matlab, where the initial call is usually figure before any plotting operation.

First, I tried pycallgraph with the graphviz option, but it gives me the following error:

$ pycallgraph graphviz -- matplotlib.py
libpath/shortest.c:324: triangulation failed
libpath/shortest.c:192: source point not in any triangle
Error: in routesplines, Pshortestpath failed
Segmentation fault
Traceback (most recent call last):
  File "/usr/local/bin/pycallgraph", line 26, in <module>
    exec(__file_content)
  File "/usr/local/lib/python2.7/dist-packages/pycallgraph/pycallgraph.py", line 38, in __exit__
    self.done()
  File "/usr/local/lib/python2.7/dist-packages/pycallgraph/pycallgraph.py", line 81, in done
    self.stop()
  File "/usr/local/lib/python2.7/dist-packages/pycallgraph/pycallgraph.py", line 90, in generate
    output.done()
  File "/usr/local/lib/python2.7/dist-packages/pycallgraph/output/graphviz.py", line 112, in done
    'code %(ret)i.' % locals())
pycallgraph.exceptions.PyCallGraphException: The command "dot -Tpng -opycallgraph.png /tmp/tmpObsZGK" failed with error code 35584.

Second, I tried generating the gephi format, and that worked:

$ pycallgraph gephi -- matplotlib.py

When I opened this in Gephi I got a huge graph (Nodes: 1062, Edges: 1362) in which I could not see anything useful. So I tried to limit the output:

$ pycallgraph --max-depth 5 gephi -- traceme.py

This gives me a graph with 254 nodes and 251 edges, which was still too large to see anything useful in. So I decided to try the following:

egrep -i 'gcf|gca|plot' pycallgraph.gdf

But it returns nothing. Then I started wondering what exactly pycallgraph is tracing, and I wrote this hello world:

#!/bin/env python
print "This line will be printed."

I ran it with graphviz:

pycallgraph graphviz -- hello_world.py

The output (read from the generated PNG file) was:

__main__ ---> <module>

My questions:

  1. What tool can I use to get an answer like the one I quoted? In other words, how can I find out that the functions are called in the order plot() -> gca() -> gcf()?
  2. What is pycallgraph actually tracing?
  3. Why does the hello world example work with the graphviz option, while the more complex example works only with the gephi option?
  4. How can I preserve the DOT output that pycallgraph generates and passes to Graphviz?
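
For the first question, the basic idea of recording caller-to-callee edges can be sketched with the standard library's sys.setprofile hook. This is not pycallgraph's implementation, and the three functions below are toy stand-ins that only mimic the plot() -> gca() -> gcf() chain; they are not matplotlib's functions.

```python
import sys


def trace_edges(func, *args, **kwargs):
    """Run func and return (caller, callee) pairs for every Python-level call."""
    edges = []

    def profiler(frame, event, arg):
        # 'call' fires for Python functions only; C functions emit 'c_call'.
        if event == 'call':
            caller = frame.f_back.f_code.co_name if frame.f_back else None
            edges.append((caller, frame.f_code.co_name))

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return edges


# Toy stand-ins for the call chain described in the quoted answer.
def gcf():
    return 'figure'


def gca():
    return gcf()


def plot():
    return gca()


for caller, callee in trace_edges(plot):
    print('%s -> %s' % (caller, callee))
# trace_edges -> plot
# plot -> gca
# gca -> gcf
```

On a real library the same hook records every Python-level call, so the output usually needs filtering by module or function name before the interesting chain becomes visible.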

Source: (StackOverflow)