Search code examples
javaenumeration

Enumerate all words of a language in a sequence


I've successfully enumerated all words for the languages Σ={a*} and Σ={a^n b^n} with following approach:

public class AStar {
    public static int n = 0;
    public static void main(String[] args) {
        for(int i=0;i<10;i++){
            System.out.println(next());
        }
    }

    public static String next(){
        String as="" , bs="";
        String s= "";
        for(int i=0; i<n; i++) {
            as+="a"; bs+="b";
            s=as+bs;
        }
        n++;
        return s;
    }
}

On each call to next(), it prints a word like this(using for loop to invoke next multiple times): ab, aabb, aaabbb, ......, a^n b^n

Current Problem

Now I'm working on a similar class that could enumerate words of language Σ*={a,b}

On each call to next(), it should return a new word that comes after the previously returned word, for instance, it should return these words:

'', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', ...... upto words having n-literals

The ordering of literals does not matter, I've tried this logic but no success so far

public class ABStar {
    static int col=0, row=0, cur=0, rlen=0, flag=1, set=0;
    public static void main(String[] args) {
        for(int i=0;i<5;i++){
            System.out.println(next());
        }
    }
    public static String next(){
        String s="";
        row = 1 << col;
        rlen = row/2;
        for(int i=1; i<=row; i++){
            for(int j=1 ; j<=col ; j++){
                if(col==1 && flag ==1){
                    return s+="a";
                }
                else if(col==1 && flag ==2){
                    return s+="b";
                }
                rlen --;
                if(rlen==0){
                    flag = 3-flag;
                    rlen=row/2;
                }
            }
        }
        col++;
        return "";
    }
}

I'm using flags col(columns), row(rows), cur(current), flag(1 for a's, 2 for b's), rlen (length of rows for each flag), I'm using these attributes to preserve the states between different calls.

I've tried to solve this problem with almost 2 dozen logics but all in vain, please help me solve this problem, I'll be highly obligated.


Solution

  • Elegant solution for a two characters alphabet

    Since you only have two letters in your alphabet, an elegant solution could be to use binary decomposition of integers (a represents a binary 0 and b represents a 1). Here's the algorithm you could implement:

    private static final int MAX_LENGTH = 3;
    
    public static void main(String[] args) {
        int wordLength = 1;
        int current = 0;
        int max = (1 << wordLength) - 1;
    
        while (wordLength <= MAX_LENGTH) {
            System.out.println(makeWord(current, wordLength));
    
            if (current < max) current++;
            else {
                wordLength++;
                current = 0;
                max = (1 << wordLength) - 1;
            }
        }
    }
    
    private static String makeWord(int current, int wordLength) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wordLength; i++) {
            sb.append((char) ((current % 2) + 'a'));
            current >>= 1;
        }
        return sb.reverse().toString();
    }
    

    Output:

    a
    b
    aa
    ab
    ba
    bb
    aaa
    aab
    aba
    abb
    baa
    bab
    bba
    bbb
    

    General solution for any alphabet

    You can generalize the above solution by counting in base k (where k is the size of the alphabet) instead of 2. It could look like this:

    public static void main(String[] args) {
        listWords(new char[]{ 'e', 'z', 'f' }, 3);
    }
    
    private static void listWords(char[] alphabet, int maxWordLength) {
        Arrays.sort(alphabet);
        int base = alphabet.length;
    
        int wordLength = 1;
        int current = 0;
        int max = (int) Math.pow(base, wordLength) - 1;
    
        while (wordLength <= maxWordLength) {
            System.out.println(makeWord(alphabet, current, wordLength));
    
            if (current < max) current++;
            else {
                wordLength++;
                current = 0;
                max = (int) Math.pow(base, wordLength) - 1;
            }
        }
    }
    
    private static String makeWord(char[] alphabet, int current, int wordLength) {
        int base = alphabet.length;
    
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wordLength; i++) {
            sb.append(alphabet[current % base]);
            current /= base;
        }
        return sb.reverse().toString();
    }
    

    Output:

    e
    f
    z
    ee
    ef
    ez
    fe
    ff
    fz
    ze
    zf
    zz
    eee
    eef
    eez
    efe
    eff
    efz
    eze
    ezf
    ezz
    fee
    fef
    fez
    ffe
    fff
    ffz
    fze
    fzf
    fzz
    zee
    zef
    zez
    zfe
    zff
    zfz
    zze
    zzf
    zzz
    

    Note that this implementation will be much slower than the previous one because divisions and Math.pow are slow operations compared to binary shifts.

    Non-arithmetic general solution

    The logic of counting in base k can also be implemented manually. It might be more efficient but will definitely take more time and code. Below is the general idea of the algorithm:

    • you start by filling the string with the lowest value
    • then, you increment the value of the last character
    • when you can't increment anymore, you reset all the characters at the right of your current position
    • you increment your position, and increment the value of this position
    • you start all over again doing the same thing from the right of the string

    What's cool about this is that if you use a StringBuilder, you can minimize the number of array allocation to only one per iteration since each state of the StringBuilder can be reached in a few operations (proportional the the length of the string) from the previous state.