
Fuzzer for Python dictionaries


I am currently looking for a fuzzer for Python dictionaries. I am already aware of some fuzzing tools such as:

However, they seem broader than what I am looking for. My goal is to pass a Python dictionary to a tool and get back a new dictionary that is very similar to the input one, but with some values changed.

For instance, providing

{k1: "aaa", k2: "bbb", k3: "ccc"}

I would like to obtain new dictionaries such as:

{k1: "aaj", k2: "bbb", k3: "ccc"}
{k1: "aaa", k2: "bbr", k3: "ccc"}
{k1: "aaa", k2: "bbb", k3: "ccp"}
...

Are you aware of any tools of this kind? Any suggestions are welcome.

Ideally, this would be an open source tool.

EDIT1: Here is the code I have tried so far:

  def change_randomly(self, v):
    from random import randint
    import string

    new_v = list(v)
    pos_value = randint(0, len(v)-1)
    # string.ascii_letters exists in Python 2 and 3 (string.letters is Python 2 only)
    random_char = string.ascii_letters[randint(0, len(string.ascii_letters)-1)]

    new_v[pos_value] = str(random_char)
    return ''.join(new_v)

It can surely be improved, so I look forward to any thoughts on it.

Thanks!


Solution

  • Based on the comments on the question, why not simply write a fixed-length, template-based fuzzer like this:

    #! /usr/bin/env python
    """Minimal template based dict string value fuzzer."""
    from __future__ import print_function
    
    import random
    import string
    
    
    def random_string(rng, length, chars=string.printable):
        """A random string with given length."""
        return ''.join(rng.choice(chars) for _ in range(length))
    
    
    def dict_string_template_fuzz_gen(rng, dict_in):
        """Given a random number generator rng, and starting from
        template dict_in expected to have only strings as values,
        this generator function yields derived dicts with random
        variations in the string values keeping the length of
        those identical."""
    
        while True:
            yield dict((k, random_string(rng, len(v))) for k, v in dict_in.items())
    
    
    def main():
        """Drive a test run of minimal template fuzz."""
    
        k1, k2, k3 = 'ka', 'kb', 'kc'
        template = {k1: "aaa", k2: "bbb", k3: "ccc"}
    
        print("# Input(template):")
        print(template)
    
        rng = random.SystemRandom()
        print("# Output(fuzz):")
        for n, fuzz in enumerate(dict_string_template_fuzz_gen(rng,
                                 template), start=0):
            print(fuzz)
            if n > 3:
                break
    
    if __name__ == '__main__':
        main()
    

    On the use case input it might yield this:

    # Input(template):
    {'kc': 'ccc', 'kb': 'bbb', 'ka': 'aaa'}
    # Output(fuzz):
    {'kc': '6HZ', 'kb': 'zoD', 'ka': '5>b'}
    {'kc': '%<\r', 'kb': 'g>v', 'ka': 'Mo0'}
    {'kc': 'Y $', 'kb': '4z.', 'ka': '0".'}
    {'kc': '^M.', 'kb': 'QY1', 'ka': 'P0)'}
    {'kc': 'FK4', 'kb': 'oZW', 'ka': 'G1q'}
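
The sample output above contains characters such as '\r' and ' ' because string.printable includes whitespace, whereas the question's example only swaps in letters. A minimal sketch of how the alphabet could be restricted, assuming lowercase ASCII letters are what is wanted; the name dict_string_alphabet_fuzz_gen and the chars parameter on the generator are hypothetical additions, not part of the code above:

```python
import random
import string


def random_string(rng, length, chars=string.printable):
    """A random string with given length, drawn from chars."""
    return ''.join(rng.choice(chars) for _ in range(length))


def dict_string_alphabet_fuzz_gen(rng, dict_in,
                                  chars=string.ascii_lowercase):
    """Hypothetical variant of the template fuzzer: still keeps the
    length of every value, but draws replacement characters from
    chars only."""
    while True:
        yield dict(
            (k, random_string(rng, len(v), chars))
            for k, v in dict_in.items())


if __name__ == '__main__':
    template = {'ka': "aaa", 'kb': "bbb", 'kc': "ccc"}
    gen = dict_string_alphabet_fuzz_gen(random.SystemRandom(), template)
    for _ in range(3):
        print(next(gen))  # values consist of lowercase ASCII letters only
```

Passing a different chars string (for example string.digits) would change the alphabet without touching the generator itself.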
    

    This should give the OP something to start from, since it might be a bootstrapping problem for someone whose Python knowledge is only just starting to build up.

    I hacked it together quickly, though it is PEP8 compliant, and it should work on both Python 2 and Python 3.

    There are many open ends to work on, but this should be enough to evaluate whether a library is needed or whether some slightly enhanced code will suffice. Only the OP will know; comments on this answer or an update to the question are welcome.

    Hints: I nearly always use SystemRandom so you can parallelize more robustly. There may be faster ways, but performance was not part of the stated requirements. The print calls are of course only sprinkled in because this is meant as an educational example. HTH
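
One trade-off worth noting: SystemRandom cannot be seeded, so a fuzz run that happens to trigger a failure cannot be replayed. Since the functions above only call rng.choice, a seeded random.Random instance should work as a drop-in replacement. A minimal sketch, where the helper first_n and the seed value are illustrative assumptions:

```python
import itertools
import random
import string


def random_string(rng, length, chars=string.printable):
    """A random string with given length."""
    return ''.join(rng.choice(chars) for _ in range(length))


def dict_string_template_fuzz_gen(rng, dict_in):
    """Same generator as in the answer, repeated to keep this runnable."""
    while True:
        yield dict(
            (k, random_string(rng, len(v))) for k, v in dict_in.items())


def first_n(seed, template, n):
    """Hypothetical helper: the first n fuzzed dicts for a given seed."""
    rng = random.Random(seed)  # seedable, unlike random.SystemRandom
    gen = dict_string_template_fuzz_gen(rng, template)
    return list(itertools.islice(gen, n))


if __name__ == '__main__':
    template = {'ka': "aaa", 'kb': "bbb", 'kc': "ccc"}
    # Same seed, same sequence: a failing input can be regenerated later.
    print(first_n(1234, template, 3) == first_n(1234, template, 3))
```

Logging the seed alongside each run would be enough to reproduce any generated dictionary later, at the cost of giving up the OS-provided randomness.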

    Update: Having read the OP's comment about changing only part of each string to preserve some similarity, one could replace the fuzzer function above with, e.g.:

    def dict_string_template_fuzz_len_gen(rng, dict_in, f_len=1):
        """Given a random number generator rng, and starting from
        template dict_in expected to have only strings as values,
        this generator function yields derived dicts with random
        variations in the string values keeping the length of
        those identical.
        Added as hack the f_len parameter that counts the
        characters open to be fuzzed from the end of the string."""
    
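        r_s = random_string  # shorten for line readability below
        while True:
            # Keep the prefix, re-randomize only the last f_len characters
            # (the slice is clamped so strings shorter than f_len still work).
            yield dict(
                (k, v[:max(len(v) - f_len, 0)] + r_s(rng, min(f_len, len(v))))
                for k, v in dict_in.items())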
    

    and then have as sample output:

    # Input(template):
    {'kc': 'ccc', 'kb': 'bbb', 'ka': 'aaa'}
    # Output(fuzz):
    {'kc': 'cc\t', 'kb': 'bbd', 'ka': 'aa\\'}
    {'kc': 'cc&', 'kb': 'bbt', 'ka': 'aa\\'}
    {'kc': 'ccg', 'kb': 'bb_', 'ka': 'aaJ'}
    {'kc': 'ccc', 'kb': 'bbv', 'ka': 'aau'}
    {'kc': 'ccw', 'kb': 'bbs', 'ka': "aa'"}
    

    This is the output when calling this function instead of the other.
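
The example in the question changes exactly one character in exactly one value per output dictionary, which neither generator above does. A minimal sketch of that behaviour, assuming all values are strings and that lowercase ASCII letters are the desired alphabet; mutate_one_char and dict_single_mutation_gen are hypothetical names introduced here:

```python
import random
import string


def mutate_one_char(rng, value, chars=string.ascii_lowercase):
    """Return value with one position replaced by a different character.
    chars is assumed to contain at least two distinct characters."""
    if not value:
        return value  # nothing to mutate in an empty string
    pos = rng.randrange(len(value))
    # Exclude the current character so the result always differs.
    candidates = [c for c in chars if c != value[pos]]
    return value[:pos] + rng.choice(candidates) + value[pos + 1:]


def dict_single_mutation_gen(rng, dict_in, chars=string.ascii_lowercase):
    """Yield copies of dict_in in which exactly one value has exactly
    one character changed; all other entries are left untouched."""
    keys = [k for k, v in dict_in.items() if v]  # skip empty strings
    while keys:
        key = rng.choice(keys)
        fuzzed = dict(dict_in)  # shallow copy, the template stays intact
        fuzzed[key] = mutate_one_char(rng, dict_in[key], chars)
        yield fuzzed


if __name__ == '__main__':
    template = {'ka': "aaa", 'kb': "bbb", 'kc': "ccc"}
    gen = dict_single_mutation_gen(random.SystemRandom(), template)
    for _ in range(3):
        print(next(gen))
```

Because each output differs from the template in a single character, a failure seen with one of these dictionaries points directly at the key and position that caused it.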