I have a CSV file with 200k rows containing MAC addresses in three formats: dash-separated, colon-separated, and with no separator at all.
My goal is to end up with only the colon-separated form.
Converting - to : is no big deal:
mac = mac.replace("-", ":");
But how do I convert ECE1A9312000 to EC:E1:A9:31:20:00?
I thought about using a regex, but I suspect capture groups are too expensive for this much data (~80k rows).
Do I need to loop over each character and append a : like this:
    String tmp = "";
    for (int i = 0; i < mac.length(); i++) {
        char ch = mac.charAt(i);
        // insert a separator before every even index except the first
        if (i % 2 == 0 && i != 0) {
            tmp += ':';
        }
        tmp += ch;
    }
Or is there a more efficient way?
Thank you,
I threw together a totally unoptimized program based on your discarded regex approach and timed it. It completed in 650 ms (250 ms with warmup). The slowest part isn't the regex but String.format. If we replace it with a plain StringBuilder approach, the time drops to 40 ms.
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Random;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Test {
        // six groups of two characters each
        static Pattern regex = Pattern.compile("(..)(..)(..)(..)(..)(..)");

        public static void main(String[] args) {
            final List<String> inMacs = new ArrayList<>(), outMacs = new ArrayList<>();
            for (int i = 0; i < 80_000; i++) inMacs.add(mac());
            final long start = System.nanoTime();
            for (String mac : inMacs) {
                final Matcher m = regex.matcher(mac);
                m.matches(); // always matches here: the input is 12 hex digits
                outMacs.add(String.format("%s:%s:%s:%s:%s:%s",
                    m.group(1), m.group(2), m.group(3), m.group(4), m.group(5), m.group(6)));
            }
            System.out.println("Took " + (System.nanoTime() - start)/1_000_000 + " milliseconds");
            // print a sample of the results
            final Iterator<String> it = outMacs.iterator();
            for (int i = 0; i < 100; i++) System.out.println(it.next());
        }

        static Random rnd = new Random();

        // random 48-bit value as 12 uppercase hex digits, no separators
        static String mac() {
            final long mac = (long) (rnd.nextDouble()*(1L<<48));
            return String.format("%012x", mac).toUpperCase();
        }
    }
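The StringBuilder variant of the regex loop mentioned above is not shown in the program. A minimal sketch of what it could look like follows; the class name RegexMacFormatter and the helper toColonForm are made up for illustration, and returning non-matching input unchanged is an assumption on my part, not something the benchmark did.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexMacFormatter {
    // Same six-group pattern as in the benchmark program.
    static final Pattern REGEX = Pattern.compile("(..)(..)(..)(..)(..)(..)");

    // Hypothetical helper: joins the six captured groups with ':'
    // using a StringBuilder instead of String.format.
    static String toColonForm(String mac) {
        final Matcher m = REGEX.matcher(mac);
        if (!m.matches()) return mac; // assumption: leave unexpected input untouched
        final StringBuilder b = new StringBuilder(17);
        for (int g = 1; g <= 6; g++) {
            if (g > 1) b.append(':');
            b.append(m.group(g));
        }
        return b.toString();
    }

    public static void main(String[] args) {
        System.out.println(toColonForm("ECE1A9312000")); // prints EC:E1:A9:31:20:00
    }
}
```

In the benchmark loop this would replace the String.format call, i.e. outMacs.add(toColonForm(mac)). Timings will of course vary by machine and JVM.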
If you are really looking for a fast solution, then avoid the regex and use a simple test to detect your MAC format:
    static List<String> fixMacs(List<String> inMacs) {
        final List<String> outMacs = new ArrayList<>(inMacs.size());
        for (String mac : inMacs) outMacs.add(
              mac.charAt(2) == '-' ? mac.replace("-", ":")
            : mac.charAt(2) != ':' ? fixMac(mac)
            : mac);
        return outMacs;
    }

    static String fixMac(String inMac) {
        final StringBuilder b = new StringBuilder(18);
        for (int i = 0; i < inMac.length(); i++) {
            b.append(inMac.charAt(i));
            if (i%2 == 1 && i != inMac.length()-1) b.append(':');
        }
        return b.toString();
    }
With this approach I measured just 8 ms for your 80,000 MACs.
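To see the third-character test handle all three formats, here is a small self-contained sketch of the same idea. The names MacNormalizerDemo, normalize and insertColons are hypothetical, the sample values are made up, and the length guard is an extra assumption: the code above would throw on a field shorter than three characters.

```java
import java.util.Arrays;
import java.util.List;

class MacNormalizerDemo {
    // Classifies the input by its third character, as fixMacs does.
    static String normalize(String mac) {
        if (mac.length() < 3) return mac; // assumption: too short to classify, pass through
        final char sep = mac.charAt(2);
        if (sep == '-') return mac.replace('-', ':');
        if (sep == ':') return mac;
        return insertColons(mac);
    }

    // Inserts ':' before every even index except the first.
    static String insertColons(String mac) {
        final StringBuilder b = new StringBuilder(mac.length() + mac.length() / 2);
        for (int i = 0; i < mac.length(); i++) {
            if (i > 0 && i % 2 == 0) b.append(':');
            b.append(mac.charAt(i));
        }
        return b.toString();
    }

    public static void main(String[] args) {
        final List<String> samples = Arrays.asList(
            "EC-E1-A9-31-20-00", "EC:E1:A9:31:20:00", "ECE1A9312000");
        // Each of the three lines printed is EC:E1:A9:31:20:00
        for (String s : samples) System.out.println(normalize(s));
    }
}
```

Note that this only checks the separator position; it does not validate that the remaining characters are hex digits.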