Profiling Python Flask Web Apps with Werkzeug Middleware Application Profiler

There will be times when one of your Flask application routes slows down and you want to find out exactly which parts of the request flow consume the most time. One could use time.time() to measure the elapsed time between two points in the codebase, but that still doesn't tell us exactly which part of the code is hogging the time.

Generally, we want to know how often the functions, methods, etc. in the entire script or code flow ran and how long they took to execute. With the Werkzeug application profiler middleware, we can easily profile each request served by our Flask app. Werkzeug is a library that Flask already depends on, so you won't need to (pip) install anything extra.

Usage

Let’s see how to use this middleware profiler. Here’s what the implementation looks like:

from flask import Flask
# Import the middleware app profiler
from werkzeug.middleware.profiler import ProfilerMiddleware

app = Flask(__name__)
# Initialise the profiler
app.wsgi_app = ProfilerMiddleware(app.wsgi_app)
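
To have something to hit, assume a hypothetical /search route on the same app (any existing route works just as well):

# Hypothetical route, purely for illustration
@app.route('/search')
def search():
    return {'results': []}

if __name__ == '__main__':
    app.run()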

Now when you send a request to one of your endpoints/routes, you’ll see a dump of stats like this in your console/terminal logs:

----------------------------------------
PATH: '/search'
         7508936 function calls (7492431 primitive calls) in 3.824 seconds

   Ordered by: internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.701    0.701    2.152    2.152 /home/codingshower/.pyenv/lib/python/site-packages/nltk/corpus/reader/wordnet.py:1213(_load_lemma_pos_offset_map)
   101132    0.485    0.000    0.485    0.000 {method 'read' of '_io.BufferedReader' objects}
   166457    0.351    0.000    1.258    0.000 /home/codingshower/.pyenv/lib/python/site-packages/nltk/data.py:1072(readline)
  1440656    0.226    0.000    0.347    0.000 /home/codingshower/.pyenv/lib/python/site-packages/nltk/corpus/reader/wordnet.py:1224(_next_token)
5048/4702    0.165    0.000    0.327    0.000 /home/codingshower/.pyenv/lib/python/site-packages/nltk/corpus/reader/wordnet.py:1429(_synset_from_pos_and_line)
1635771/1635600    0.139    0.000    0.156    0.000 {built-in method builtins.next}
       29    0.113    0.004    0.113    0.004 {method 'recv_into' of '_socket.socket' objects}
       ...

----------------------------------------

If you have used the standard Python profilers (cProfile or profile) before, you'll notice that this output looks just like theirs. These are the columns reported:

  • ncalls: number of times the function was called.
  • tottime: total time spent inside the function itself (excluding time spent in calls to sub-functions).
  • percall: tottime divided by ncalls.
  • cumtime: cumulative time spent in the function and all the sub-functions it calls (from invocation till exit); accurate even for recursive calls.
  • percall: cumtime divided by the number of primitive calls.
  • filename:lineno(function): file name, line number and function name for the respective entry.

But why is the output similar to that of the standard profilers? Because the Werkzeug profiler middleware uses those same standard profilers under the hood. This is what it does internally:

try:
    from cProfile import Profile
except ImportError:
    from profile import Profile  # type: ignore

One important thing to note here is that cProfile and profile provide deterministic profiling of Python programs, i.e., all function calls, function returns and exception events are monitored while the code executes (to derive the precise time taken by each). This kind of profiling adds a bit of overhead to your code execution.
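
To see what that means in isolation, here is a tiny standalone sketch (plain cProfile/pstats, not the middleware) that produces the same kind of table described above:

import cProfile
import pstats

def slow_function():
    # Arbitrary CPU-bound work to profile
    return sum(i * i for i in range(1_000_000))

profiler = cProfile.Profile()
profiler.enable()
slow_function()
profiler.disable()

# Same columns as the middleware output: ncalls, tottime, percall, cumtime, ...
pstats.Stats(profiler).sort_stats('tottime', 'calls').print_stats(10)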

Hence, if you do plan to use this in production, make sure you test beforehand how much overhead is added so that you can decide whether that is acceptable or not. I’ve happily used it in production environments after running my tests and determining that the overhead is acceptable (probably in tens of ms in my case).
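
If you want to keep profiling behind a switch so that normal production traffic doesn't pay the cost, one approach (a sketch, using a hypothetical FLASK_PROFILE environment variable) is to attach the middleware conditionally:

import os

from flask import Flask
from werkzeug.middleware.profiler import ProfilerMiddleware

app = Flask(__name__)

# Only attach the profiler when explicitly requested
if os.environ.get('FLASK_PROFILE') == '1':
    app.wsgi_app = ProfilerMiddleware(app.wsgi_app)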

That said, I've recently been experimenting with pyinstrument, which does statistical profiling (sampling) instead of deterministic profiling.
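
For reference, here is a rough sketch of how pyinstrument can be wired into a Flask app with before_request/after_request hooks, based on its documented Profiler API (adjust to your setup):

from flask import Flask, g
from pyinstrument import Profiler

app = Flask(__name__)

@app.before_request
def start_profiler():
    g.profiler = Profiler()
    g.profiler.start()

@app.after_request
def stop_profiler(response):
    g.profiler.stop()
    # Print a sampled call tree for this request to the console
    print(g.profiler.output_text(unicode=True, color=True))
    return response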

Alright, so now let's discuss the different options (arguments) that we can pass to ProfilerMiddleware to tweak its behaviour according to our needs.

stream: IO[str]

Use this to dump the stats to a stream instead of sys.stdout (default). In the example below, we’ll write the stats to a file, disabling printing in the console.

file = open('stats.txt', 'a')
ProfilerMiddleware(app.wsgi_app, stream=file)

sort_by: Iterable[str]

A tuple of columns to sort the stats by. Self-explanatory, but see pstats.Stats.sort_stats() for the valid sort keys (e.g. 'tottime', 'cumulative', 'calls', 'stdname').

ProfilerMiddleware(app.wsgi_app, sort_by=('tottime', 'stdname'))

restrictions: Iterable[Union[str, int, float]]

Limits the entries to be printed in the stats. See pstats.Stats.print_stats() for more info.

# Restrict the stats output to only the top 5 lines
ProfilerMiddleware(app.wsgi_app, restrictions=(5,))

# Restrict the stats output to 10% of the lines
# (a float between 0 and 1 is treated as a percentage)
ProfilerMiddleware(app.wsgi_app, restrictions=(0.1,))

# Restrict the stats output to the first 10 lines and then
# further filter those entries whose standard name
# (last col with filename and lineno) matches
# 'foo' as a regexp (not a plain str)
ProfilerMiddleware(app.wsgi_app, restrictions=(10, 'foo'))

For the last example in the snippet above, you’ll see something like this in the stats output:

Ordered by: internal time, call count
   List reduced from 1991 to 10 due to restriction <10>
   List reduced from 10 to 2 due to restriction <'foo'>

profile_dir: Optional[str]

Saves the profiled stats data into the specified directory as binary files, one file per request. For example, run mkdir ./profs and then use the following code to dump the profiled data into the ./profs directory.

ProfilerMiddleware(app.wsgi_app, profile_dir='./profs')
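
Since these files are standard cProfile dumps, you can also load one back programmatically with pstats later on (a quick sketch; the file name below is just illustrative):

import pstats

# Load a single per-request dump from the profile_dir
stats = pstats.Stats('./profs/GET.search.123ms.1655544631.prof')
stats.sort_stats('cumulative').print_stats(10)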

The file names per request in the ./profs (profile_dir) directory are controlled by the next option.

filename_format: str

A format string for the profile data file names, or a callable (which receives the WSGI environ) that returns a name. These files are stored in the profile_dir path (previous option). The default value for this option is:

# Sample file name - "POST.search.4258ms.1655544631.prof"
"{method}.{path}.{elapsed:.0f}ms.{time:.0f}.prof"

This is how you can specify your own file format:

ProfilerMiddleware(app.wsgi_app, filename_format="{method}.{path}.prof")
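
filename_format can also be a callable; per the Werkzeug docs it receives the WSGI environ dict and returns the file name. A hedged sketch (the naming scheme here is just an example):

import time

def profile_filename(environ):
    # e.g. "GET.search.1655544631.prof" - a hypothetical naming scheme
    path = environ.get('PATH_INFO', '/').strip('/').replace('/', '.') or 'root'
    return f"{environ.get('REQUEST_METHOD', 'GET')}.{path}.{time.time():.0f}.prof"

ProfilerMiddleware(app.wsgi_app, profile_dir='./profs', filename_format=profile_filename)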

Visualization

Once the profiled data has been dumped to files, reading the stats dump (text) or the profile dumps (binary format) directly is not the most effective way to analyse them. We should be able to visualise the entire dump in a nice user interface for better analysis.

This is where the SnakeViz tool comes into the picture, and it is super easy to use. SnakeViz is a browser-based graphical viewer (visualizer) for the output of cProfile.

$ pip install snakeviz
$ snakeviz <profile_dir>

Just install the tool/module with pip and run it with the profile_dir path that we covered before. It'll automatically boot up a web server that you can load in your browser to visually analyse the profiled data, which is so much better.

This is what I did to run SnakeViz on my server where I had my cProfile data dumps stored:

# -s: by default, SnakeViz attempts to open the browser;
#     with this option, it won't
#
# -H 0.0.0.0: bind to all network interfaces
#
# profs/ is my profile_dir path

$ snakeviz -s -H 0.0.0.0 profs/
snakeviz web server started on 0.0.0.0:8080; enter Ctrl-C to exit
http://0.0.0.0:8080/snakeviz/%2Fhome%2Fcodingshower%2Fprofs

Here’s a screenshot of the tool loaded in the browser:

SnakeViz

The icicle representation above makes it easy to figure out which function calls which, and how much time each one takes to execute.
