Dynamic Variant Analysis with Python
In this post I will present a way to find performance issues within Python code with the help of pytest by doing variant analysis.
Variant analysis is the process of using a known vulnerability as a seed to find similar problems in your code.
After looking at some of the work Semmle is doing on variant analysis, I started wondering whether the same technique could be used for non-security problems in code. Even more, I wondered whether there was an easier way to reach the same goal without having to parse the Python code, generate an AST and then run the analysis on it.
That’s when I remembered the monkeypatch example for preventing remote operations with the requests library in the pytest monkeypatch docs, and realized that I could perform a kind of dynamic variant analysis by instrumenting functions, classes or methods with a custom implementation via monkeypatching. This way I could add security, performance or other kinds of checks to the original functions.
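The idea of instrumenting a function through monkeypatching can be sketched without pytest or Django. The following is only a minimal illustration, not the code used later in this post: `instrumented`, `no_huge_payloads` and the `fake_client` namespace are hypothetical names made up for the example, standing in for a real library function and for the check you want to add.

```python
import contextlib
import functools
import types

# Hypothetical stand-in for a third-party module we want to instrument.
fake_client = types.SimpleNamespace(send=lambda payload: len(payload))


@contextlib.contextmanager
def instrumented(owner, name, check):
    """Temporarily replace owner.<name> with a wrapper that runs check first."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        check(*args, **kwargs)  # the extra security/performance check
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    try:
        yield
    finally:
        setattr(owner, name, original)  # always restore the original


def no_huge_payloads(payload):
    """Example check: flag calls that pass a suspiciously large payload."""
    if len(payload) > 10:
        raise AssertionError("send() called with a huge payload")
```

Inside `with instrumented(fake_client, "send", no_huge_payloads):` every call to `fake_client.send` goes through the check first, and the original function is put back on exit. pytest's `monkeypatch` fixture takes care of this replace-and-restore bookkeeping for you.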
Doing dynamic variant analysis has a nice benefit over static variant analysis: the number of false positives will be close to 0, because every finding comes from a call that actually happened while the code was running.
To make this dynamic analysis feasible we need a way to automatically exercise all the potentially vulnerable places in the code. This might look like a problem at first sight, but for most projects it is not: these days most projects have a test suite that executes a good percentage of the codebase, and they usually rely on a CI system to run that test suite effortlessly. If the test coverage is high, doing the analysis this way should not be a problem at all.
The goal
I’m going to focus on performance issues here, in particular on finding performance issues in Django applications. Although this example targets Django, the technique can be used by any Python project with pytest support.
The concrete bug I will be chasing is calls to Django’s length template filter with a queryset. As can be seen in the Django documentation, calling length on a queryset triggers a full evaluation of the queryset, which might be a performance hit when the queryset is big enough. This kind of bug might go unnoticed because:
- The database used for local development is usually a subset of the one used in production, so evaluating a small queryset is fast enough to go unnoticed.
- Most projects at an early stage have a small database, but as time passes and the database grows, the evaluation gets slower and slower.
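To make the cost concrete, here is a minimal sketch that does not use Django at all. `LazyRows` is a hypothetical stand-in for a lazily evaluated queryset: `len()` has to load every row, while `count()` only asks the backend for a number, which is roughly the difference between `len(queryset)` and `queryset.count()`.

```python
class LazyRows:
    """Hypothetical stand-in for a lazy queryset (not Django's QuerySet)."""

    def __init__(self, total_rows):
        self.total_rows = total_rows
        self.rows_loaded = 0  # how many rows were pulled into memory
        self._cache = None

    def _fetch_all(self):
        # Simulates "SELECT * FROM table": every row is materialised.
        if self._cache is None:
            self._cache = list(range(self.total_rows))
            self.rows_loaded = len(self._cache)
        return self._cache

    def __len__(self):
        # What a generic length helper ends up calling.
        return len(self._fetch_all())

    def count(self):
        # Simulates "SELECT COUNT(*)": no rows are loaded.
        return self.total_rows


rows = LazyRows(total_rows=100_000)
print(rows.count(), rows.rows_loaded)  # count only, nothing loaded yet
print(len(rows), rows.rows_loaded)     # full evaluation of all rows
```

In a real template, the usual way to avoid the full evaluation when only the number of rows is needed is to call the queryset's `count` method instead, for example `{{ data.count }}`.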
Show me the code
The buggy template
<html>
...
Total elements {{ data|length }}
...
</html>
The view calling the template
def buggy_template(request):
    # This might return 2**32 elements
    data = TestModel.objects.all()
    return render(request, 'buggy.html', locals())
The test triggering the call to the template filter
def test_buggy(client):
    response = client.get('/buggy')
    assert response.status_code == 200
The instrumentation
import django.template.defaultfilters
import pytest
from django.db.models import QuerySet


def queryset_check_length(value):
    """
    New |length implementation which checks for a queryset instance value type and raises an error
    """
    if isinstance(value, QuerySet):
        raise Exception('Calling length with a QuerySet')
    else:
        # call the default length filter
        return django.template.defaultfilters.length(value)


@pytest.fixture(autouse=True)
def template_length_check(monkeypatch):
    # Replace the length filter implementation with a custom one
    monkeypatch.setitem(django.template.defaultfilters.register.filters, 'length', queryset_check_length)
Let’s explain what this little piece of code is doing. First of all I’m defining a pytest fixture, template_length_check, with autouse=True. This causes the fixture to be called before the execution of each test, which means that each test runs with our custom implementation of the length filter, queryset_check_length.
What do we want our custom implementation to do?
We want it to raise an exception when the argument passed to the length filter is a QuerySet; otherwise it calls the default length filter implementation. Raising an exception makes the test fail, so we can find all the buggy templates by checking the failing tests.
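The same replace-one-entry-in-a-registry mechanics can be tried outside Django. The sketch below is only an illustration under assumptions: `FILTERS`, `Expensive` and `render_length` are hypothetical names standing in for Django's filter registry, `QuerySet` and the template engine, and the standard library's `unittest.mock.patch.dict` plays the role of `monkeypatch.setitem`.

```python
from unittest import mock


class Expensive(list):
    """Hypothetical stand-in for QuerySet: a type we do not want measured."""


def default_length(value):
    return len(value)


# Hypothetical registry, playing the role of register.filters.
FILTERS = {"length": default_length}


def checked_length(value):
    """Reject the expensive type, delegate everything else to the default."""
    if isinstance(value, Expensive):
        raise TypeError("length called with an Expensive value")
    return default_length(value)


def render_length(value):
    # Looks the filter up at call time, so a patched entry takes effect.
    return FILTERS["length"](value)


with mock.patch.dict(FILTERS, {"length": checked_length}):
    print(render_length([1, 2, 3]))  # plain lists still work
    try:
        render_length(Expensive([1, 2, 3]))
    except TypeError as exc:
        print("caught:", exc)

# Outside the block the original entry is back in place.
print(render_length(Expensive([1, 2, 3])))
```

Note that the checking function delegates to the untouched default rather than looking itself up in the registry, which mirrors how queryset_check_length calls the module-level length function instead of the patched registry entry.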
Keep in mind that we are doing this only to find bugs. Keeping this kind of fixture in your test suite permanently is discouraged, since it modifies the original behaviour of the function.
Results
When running the test suite, this is what the error looks like:
ERROR django.request:log.py:228 Internal Server Error: /buggy
Traceback (most recent call last):
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
response = get_response(request)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/core/handlers/base.py", line 115, in _get_response
response = self.process_exception_by_middleware(e, request)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/core/handlers/base.py", line 113, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/davida/dinamic_variant_analysis/example/variant/views.py", line 8, in buggy_template
return render(request, 'buggy.html', locals())
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/shortcuts.py", line 36, in render
content = loader.render_to_string(template_name, context, request, using=using)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/loader.py", line 62, in render_to_string
return template.render(context, request)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/backends/django.py", line 61, in render
return self.template.render(context)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/base.py", line 171, in render
return self._render(context)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/test/utils.py", line 96, in instrumented_test_render
return self.nodelist.render(context)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/base.py", line 937, in render
bit = node.render_annotated(context)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/base.py", line 904, in render_annotated
return self.render(context)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/base.py", line 987, in render
output = self.filter_expression.resolve(context)
File "/home/davida/dinamic_variant_analysis/env/lib/python3.6/site-packages/django/template/base.py", line 698, in resolve
new_obj = func(obj, *arg_vals)
File "/home/davida/dinamic_variant_analysis/example/variant/conftest.py", line 11, in queryset_check_length
raise Exception('Calling length with a QuerySet')
Exception: Calling length with a QuerySet
The full code example can be found here