I want to implement Hopcroft's algorithm to minimize a DFA WIKIPEDIA. Until now I can remove unreachable states. The problem is that I don't understand this algorithm. I don't know how to implement it. Can somebody explain it? Or maybe expand on the algorithm so that its more understandable to implement. I don't get the following part of the algorithm at all:
let X be the set of states for which a transition on c leads to a state in A
for each set Y in P for which X ∩ Y is nonempty and Y \ X is nonempty do
replace Y in P by the two sets X ∩ Y and Y \ X
The algorithm is as following:
what I have implemented so far(poorly written, will do clean up once i finish it):
package dRegAut;
import java.io.*;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
public class dfamin {
// Global variables to hold data from the file
private int numStates,numAlphabets,numFinalStates;
private char alphabets[];
private int finalStates[];
private int [][] transitionTable;
/**
* @param args
* @throws IOException
* @throws NumberFormatException
*/
public static void main(String[] args) throws NumberFormatException, IOException {
int numStates,numAlphabets,numFinalStates;
char alphabets[];
int finalStates[];
int [][] transitionTable;
/*
* INPUT FILE FORMAT: numberOfStates numberofAlphabets transitions1 transtions2 ... numberOfFianlStates FinalState(s)
* Example:
* 8 2 1 5 6 2 0 2 2 6 7 5 2 6 6 4 6 2 2 2 6
* 5 2 0 1 0 1 3 4 3 4 2 4 3 0 2 3
* 8 2 1 0 0 2 3 1 3 0 3 5 6 4 5 6 6 3 1 3
* 9 2 1 4 2 5 3 7 4 7 5 8 6 1 7 1 8 2 0 4 3 2 5 8
*/
// Take file name and open a stream to read it
FileInputStream fileStream = new FileInputStream("/path/to/file");
BufferedReader br = new BufferedReader(new InputStreamReader(fileStream));
// Store each line from the file
String line;
// Read each line from file
while((line = br.readLine()) != null){
// Read single spaced data from each line
String [] splittedLine = line.split(" ");
// Read numStates,numAlphabets from the line
numStates = Integer.parseInt(splittedLine[0]);
numAlphabets = Integer.parseInt(splittedLine[1]);
//for(int a=0;a<numAlphabets;a++){
//alphabets[a] = '0';
//}
transitionTable = new int[numStates][numAlphabets];
int tt= 2;
// Loop thorough the line and read transition table
for(int row=0;row<numStates;row++){
for(int col=0;col<numAlphabets;col++){
transitionTable[row][col] = Integer.parseInt(splittedLine[tt]);
tt++;
}// End of for-loop to go thorough alphabets
}// End of for-loop to go thorough states
// Read number of final states
numFinalStates = Integer.parseInt(splittedLine[2+numStates*numAlphabets]);
//System.out.println(numFinalStates);
// Read final states
int z=0;
finalStates = new int[numFinalStates];
int start = 3+numStates*numAlphabets ;
int end = (3+(numStates*numAlphabets))+numFinalStates;
for(int fs=start;fs<end;fs++){
finalStates[z] = Integer.parseInt(splittedLine[fs]);
//System.out.println(finalStates[z]);
z++;
}// End of for-loop to read all final states
dfamin x = new dfamin(numStates,numAlphabets,numFinalStates,finalStates,transitionTable);
x.minimizer();
System.out.println(x);
}// End of while-loop to read file
// Close the stream
br.close();
}
dfamin(int nS,int nA,int nFS,int fS[], int [][] tT){
numStates = nS;
numAlphabets = nA;
numFinalStates = nFS;
//alphabets = a;
finalStates = fS;
transitionTable = tT;
}// End of DFAMinimizer constructor
/*
* A method to minmize the dfa
*/
public void minimizer(){
// Remove unreachable States
ArrayList<Integer> reachableStates = reachableStates(numStates, numAlphabets,transitionTable);
// Store all final states
ArrayList<Integer> fStates = new ArrayList<Integer>();
// Loop thorugh finalStates array and transfer its data to array list
for(int fs:finalStates){
fStates.add(fs);
}// End of for-loop
// Store all non final states
ArrayList<Integer> nonFStates = new ArrayList<Integer>();
// Store only non final states in nonFstates
nonFStates = nonFinalStates(reachableStates,fStates);
//TODO: IMPLEMENT HOPCROFT's ALGORITHM
}// End of minimizer method
/*
* unreachableStates - A method to find unreachable states of a DFA
*
*/
public ArrayList<Integer> reachableStates(int numStates, int numAlphabets, int [][] transitionTable){
// Initialize a list to hold temporary list of states in it
ArrayList<Integer> reachableStates =new ArrayList();
ArrayList<Integer> newStates = new ArrayList();
// Start from the state zero
reachableStates.add(0);
newStates.add(0);
// Temporary array to hold reachable states
ArrayList<Integer> temp = new ArrayList();
// Loop until there is data in newStates
do{
// Empty temp array
temp.clear();
// Loop thorough all the states, and check for {p such that p=δ(q,c)};
for(int j=0;j<newStates.size();j++){
for(int i=0; i<numAlphabets;i++){
// If found add it to the temp set
temp.add(transitionTable[newStates.get(j)][i]);
} // End of for-loop to go thorough all characters
}// End of for-loop to go thorough all elements of the newStates array list
// Clear newStates list
newStates.clear();
// Add the elements that are in temp, but are not in reachableStates to newStates
// new_states := temp \ reachable_states;
for(int z=0;z<temp.size();z++){
for(int z1=0; z1<reachableStates.size();z1++){
// If the state was already present, don't add
if(temp.get(z) == reachableStates.get(z1)){
break;
}
if(temp.get(z) != reachableStates.get(z1) && z1 == reachableStates.size()-1){
// Only from temp to newstates if its not in reachablestates currently
newStates.add(temp.get(z));
}
}// End of for-loop to go thorough all reachableStates elements and check if a match
}// End of for-loop thorugh all temp states
// If newStates list is not empty then add it to the reachableStates
if(!newStates.isEmpty()){
// Add the newStates elements to reachable states
for(int y=0;y<newStates.size();y++){
//System.out.printf("newStates:%d newStatesSize:%d in %d",newStates.get(y),newStates.size(),y);
reachableStates.add(newStates.get(y));
}
}
}while(!newStates.isEmpty());
reachableStates = removeDuplicate(reachableStates);
return reachableStates;
}// End of unreachableStates method
/*
* removeDuplicate - a function to remove duplicate entries from an ArrayList
*
*/
ArrayList<Integer> removeDuplicate(ArrayList<Integer> input){
// Remove duplicate entries from reachableStates list
// Compare the first index, with all other indexes, compare the second with all other indexes
for(int i=0;i<input.size()-1;i++){
for(int j=i+1;j<input.size();j++){
// If dupblicate states remove it
if(input.get(i) == input.get(j)){
input.remove(j);
}
}
}// End of for-loop to remove duplicate entries from reachableList
// Sort the list before returning
Collections.sort(input);
// Return the list
return input;
}// End of removeDuplicate method
/*
* nonFinalStates - a method to return an array list of nonfinal states, given all and final states
*
*/
ArrayList<Integer> nonFinalStates(ArrayList<Integer> allStates, ArrayList<Integer> finalStates){
// All non final States
ArrayList<Integer> nonFinalStates = new ArrayList<Integer>();
// Loop thorough allStates, and compare each state with the list of finalstates
for(int i=0; i<allStates.size();i++){
// Loop thorough list of final states
for(int j=0; j<finalStates.size();j++){
// If a state is final state
if(allStates.get(i) == finalStates.get(j)){
// Then remove it from the list
allStates.remove(i);
}
}// End of for-loop to travers finalstates
}// End of for-loop to traverse allstates
return nonFinalStates;
}
// returns a string that is compatible with our input file specification
public String toString() {
StringBuffer buf = new StringBuffer();
//buf.append(" "+ numStates +" ");
//buf.append ( numAlphabets + " " );
buf.append("Transition Table: ");
for ( int i = 0; i < numStates; i++ ) {
for ( int j = 0; j < numAlphabets; j++ ) {
buf.append ( " "+ transitionTable[i][j] + " " );
}
}
buf.append ( "Number of Final State(s): "+numFinalStates + " Final State(s): " );
for ( int i = 0; i < numFinalStates; i++ )
buf.append ( finalStates[i] + " " );
return buf.toString();
}
}
Call two DFA states equivalent if it's impossible to tell from accepting/rejecting behavior alone which of them the DFA is in. For each language, the minimum DFA accepting that language has no equivalent states.
Hopcroft's algorithm for DFA minimization works by computing the equivalence classes of the states of the unminimized DFA. The nub of this computation is an iteration where, at each step, we have a partition of the states that is coarser than equivalence (i.e., equivalent states always belong to the same set of the partition).
The initial partition is accepting states and rejecting states. Clearly these are not equivalent.
Suppose that we have states q1 and q2 in the same set of the current partition. Letting the transition function be delta, if there exists a symbol sigma such that delta(q1, sigma) and delta(q2, sigma) are in different sets of the partition, then we know by the coarseness invariant that these states are not equivalent, i.e., there exists a string x such that delta(delta(q1, sigma), x) and delta(delta(q2, sigma), x) differ in whether they accept/reject. But delta(delta(q1, sigma), x) = delta(q1, sigma x), so the string sigma x differentiates between q1 and q2. The logic that you've quoted is to split one of the partition sets appropriately.
The interesting part of the proof of correctness is that, when Step 2 is not possible, we have arrived at the true equivalence classes.
The pseudocode looks more complicated than this because it's concerned with the efficiency with which we can find applications of Step 2.