
Trial tests with DNS queries leave the reactor in an unclean state


I am trying to write a test that performs DNS queries. I created a minimal test that starts a listening DNS server and uses the Twisted resolver to query it:

from twisted.trial import unittest

from twisted.internet import reactor, defer
from twisted.names import client, dns, error, server


class Tester(unittest.TestCase):
    def setUp(self):
        self.resolver = client.Resolver(resolv='/etc/resolv.conf')
        self.resolver = client.Resolver(servers=[('127.0.0.1', 1025)])
        self.factory = server.DNSServerFactory(clients=[self.resolver])
        self.protocol = dns.DNSDatagramProtocol(controller=self.factory)
        self.port = reactor.listenUDP(1025, self.protocol)

    def tearDown(self):
        self.port.stopListening()

    def test_test(self):
        def callback(ignore):
            print("Received callback!")
        res = client.createResolver(servers=[('127.0.0.1', 1025)], resolvconf='/dev/null', hosts='/dev/null')
        d = res.lookupAddress('foobar.com')
        d.addCallback(callback)

Running this test results in the following error:

 [ERROR]
 Traceback (most recent call last):
 Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
 DelayedCalls: (set twisted.internet.base.DelayedCall.debug = True to debug)
 <DelayedCall 0x7f44c69042e8 [0.9992678165435791s] called=0 cancelled=0 DNSMixin._clearFailed(<Deferred at 0x7f44c6904358>, 28457)>
 <DelayedCall 0x7f44c68f3e10 [59.99872899055481s] called=0 cancelled=0 Resolver.maybeParseConfig()>

test.Tester.test_test

==================================================================
[ERROR]
Traceback (most recent call last):
Failure: twisted.trial.util.DirtyReactorAggregateError: Reactor was unclean.
Selectables:
<<class 'twisted.names.dns.DNSDatagramProtocol'> on 34529>

test.Tester.test_test
-------------------------------------------------------------------------------
Ran 1 tests in 0.003s

So it seems that the reactor is not cleaned up after the query that the resolver sends in test_test.

I don't understand why that happens. The documentation says that trial runs the reactor and that I should not touch it. Am I using the testing framework incorrectly?


Solution

  • You probably shouldn't generate real network traffic in your test suite. The real network is flaky, and test suites which rely on it tend to be error-prone and frustration-provoking. You don't really want your test run to fail just because systemd-resolved got updated and started doing something wacky to some DNS traffic it noticed passing around your system.

    The main strategy for avoiding real network traffic is to have an alternate implementation of the interface one level below the one you're testing - an implementation which simply doesn't use the real network. My go-to strategy is to emulate the network behavior using simple in-memory objects. If you want, you can then run the test suite for that lower level against both the real and in-memory implementations and verify both implementations are "the same" at least up to some point.

    That said, there's a simple bug in your tearDown. It calls stopListening, which returns a Deferred, but it doesn't return that Deferred itself. Thus, trial decides cleanup is done as soon as tearDown returns, even though the port may not have finished closing yet. Return the stopListening Deferred and you may avoid one of the unclean errors.

    There's a similar bug in test_test. It doesn't return d so trial decides the test is over (successfully) as soon as the method returns. Return d and it will decide the test is over when d fires (and only pass the test if it fires with a success result).