I faced a scenario where the task was to read a file containing 3 million IP addresses.
There is a MySQL table with two columns, Id and PrimaryIP. A PrimaryIP value can hold multiple IPs separated by #, and any of those entries can also be a CIDR range.
In total there are 8,000 records, each record holding multiple IPs and CIDR ranges.
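For illustration, a single PrimaryIP value might look like this (a made-up sample, not actual data):

    10.1.2.3#192.168.10.0/24#172.16.0.5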
Now, my task was to read that file, check each IP against the database, and write the matching IP,ID pairs to an output file.
Initially, when I ran my program, it failed with java.lang.OutOfMemoryError: Java heap space, so I increased the heap by 3 GB. It still failed, so I later split the file into 6 subfiles of 0.5 million addresses each.
To expand a CIDR range into its list of addresses, I used SubnetUtils from Apache Commons Net.
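For example, a small range expands like this (a minimal standalone sketch; the class lives in org.apache.commons.net.util):

    import org.apache.commons.net.util.SubnetUtils;

    public class SubnetDemo {
        public static void main(String[] args) {
            // Expands 192.168.10.0/30 into its usable host addresses.
            String[] all = new SubnetUtils("192.168.10.0/30").getInfo().getAllAddresses();
            for (String ip : all) {
                System.out.println(ip); // prints 192.168.10.1 and 192.168.10.2
            }
        }
    }

Note that getAllAddresses() materializes the whole range as a String[], so a wide range such as a /8 produces roughly 16 million strings at once.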
Below is my code:
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.commons.io.FileUtils;
import org.apache.commons.net.util.SubnetUtils;

public class IPMatch { // enclosing class name is arbitrary

    public static void main(String[] args) {
        String sqlQuery = "SELECT id,PrimaryIP from IPTable where PrimaryIP != '' limit 100000;";
        Connection connection = null;
        Statement statement = null;
        File oFile = new File("output.txt");
        System.out.println("Start Time : " + new Date());
        try {
            // Reads the whole input file into memory -- this is what fills the heap.
            List<String> fileData = FileUtils.readLines(new File("input.txt"));
            System.out.println("File Data Size : " + fileData.size());
            Class.forName("com.mysql.jdbc.Driver");
            connection = DriverManager.getConnection("jdbc:mysql://localhost/db?user=root&password=pwd");
            statement = connection.createStatement();
            ResultSet resultSet = statement.executeQuery(sqlQuery);
            System.out.println("Started with MySQL Querying");
            Map<String, Integer> primaryIPIDMap = new HashMap<String, Integer>();
            while (resultSet.next()) {
                primaryIPIDMap.clear();
                int recordID = resultSet.getInt(1);
                // A PrimaryIP field may hold several entries separated by '#',
                // and any single entry may be a CIDR range.
                for (String entry : resultSet.getString(2).split("#")) {
                    if (entry.contains("/")) {
                        // Expand the CIDR range into individual addresses.
                        for (String ip : getAllIP(entry)) {
                            // No .intern() -- interning millions of strings only adds
                            // memory pressure; HashMap compares keys with equals().
                            primaryIPIDMap.put(ip, recordID);
                        }
                    } else {
                        primaryIPIDMap.put(entry, recordID);
                    }
                }
                // Scan all 3 million file lines once per DB record.
                for (String t : fileData) {
                    if (primaryIPIDMap.containsKey(t)) {
                        // Append (third argument), otherwise each match overwrites the file.
                        FileUtils.writeStringToFile(oFile, recordID + "," + t + "\n", true);
                    }
                }
            }
            resultSet.close();
            statement.close();
            connection.close();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (statement != null)
                    statement.close();
            } catch (Exception se2) {
            }
            try {
                if (connection != null)
                    connection.close();
            } catch (Exception se) {
                se.printStackTrace();
            }
        }
        System.out.println("Finished");
        System.out.println("End Time : " + new Date());
    }

    private static String[] getAllIP(String ip) {
        return new SubnetUtils(ip).getInfo().getAllAddresses();
    }
}
Can someone tell me the best practice to solve this? Today it is just 3 million addresses; tomorrow it may be 5 million. I can't keep creating subfiles.
Update: I fixed the problem by reading the input file line-by-line instead of loading it fully into memory. (The table was not designed by me.)
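For reference, the line-by-line version looks roughly like this (a minimal sketch; matchLineByLine is a made-up helper name, and it assumes the IP-to-ID map has already been built once from all 8,000 records as in the code above):

    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.util.Map;

    static void matchLineByLine(Map<String, Integer> primaryIPIDMap) throws Exception {
        BufferedReader reader = new BufferedReader(new FileReader("input.txt"));
        BufferedWriter writer = new BufferedWriter(new FileWriter("output.txt"));
        try {
            String ip;
            // Only one line of the input file is in memory at a time, so the
            // file can grow to 5 million lines (or more) without heap issues.
            while ((ip = reader.readLine()) != null) {
                Integer recordID = primaryIPIDMap.get(ip);
                if (recordID != null) {
                    writer.write(recordID + "," + ip);
                    writer.newLine();
                }
            }
        } finally {
            reader.close();
            writer.close();
        }
    }

This also makes a single pass over the file instead of one pass per database record.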