Creating a Map from a string based on the provided Keys when the string doesn't have delimiters in Java 8


Given the following String:

String s = "DIMENSION 24cm 34cm 12cm DETAILED SPECIFICATION Twin/Twin XL Flat Sheet: 105\"l x 74\"w. CARE For best results, machine wash warm with like colors. COLOURS red blue green";

The keys are: DIMENSION, DETAILED SPECIFICATION, CARE, COLOURS

We need to create a Map<String,String> in which the keys are the ones listed above and each value is the text that follows its key.

The map's contents will look like:

DIMENSION: 24cm 34cm 12cm,
DETAILED SPECIFICATION: Twin/Twin XL Flat Sheet: 105"l x 74"w,
CARE:  For best results, machine wash warm with like colors,
COLOURS: red blue green 

Not all of these keys (and their values) are necessarily present in the string.

Suppose the key CARE is not present in the input String:

String s = "DIMENSION 24cm 34cm 12cm DETAILED SPECIFICATION Twin/Twin XL Flat Sheet: 105\"l x 74\"w. COLOURS red blue green";

The map's contents will look like:

DIMENSION: 24cm 34cm 12cm,
DETAILED SPECIFICATION: Twin/Twin XL Flat Sheet: 105"l x 74"w,
COLOURS: red blue green 

That is, if a key is absent from the given string, the corresponding entry is absent from the map as well. Any key may be missing: for instance, if DIMENSION is absent, the string starts with "DETAILED SPECIFICATION ... ".

As the string doesn't have delimiters, I am unable to create a map from it.

With plain Java, I can do it like this:

String colours = null, care = null, spec = null, dimension = null;
// Work from the last key to the first, cutting each value off the end of the string
if (s.contains("COLOURS")) {
    colours = s.substring(s.indexOf("COLOURS") + "COLOURS".length()).trim();
    s = s.substring(0, s.indexOf("COLOURS"));
}
if (s.contains("CARE")) {
    care = s.substring(s.indexOf("CARE") + "CARE".length()).trim();
    s = s.substring(0, s.indexOf("CARE"));
}
if (s.contains("DETAILED SPECIFICATION")) {
    spec = s.substring(s.indexOf("DETAILED SPECIFICATION") + "DETAILED SPECIFICATION".length()).trim();
    s = s.substring(0, s.indexOf("DETAILED SPECIFICATION"));
}
if (s.contains("DIMENSION")) {
    dimension = s.substring(s.indexOf("DIMENSION") + "DIMENSION".length()).trim();
    s = s.substring(0, s.indexOf("DIMENSION"));
}

If the string had a delimiter, I could do it like this:

Map<String, String> map = Stream.of(s)
    .map(str -> str.split("="))
    .collect(Collectors.toMap(pair -> pair[0], pair -> pair[1]));

Solution

  • You can generate a regular expression from the given keys and compile it into a pattern.

    I approached this problem using lookahead and lookbehind:

    public static Pattern getPattern(Collection<String> keys) {
        String joinedKeys = String.join("|", keys);
        // For the sample keys this creates a regex such as
        // "(?<=DIMENSION|DETAILED SPECIFICATION|CARE|COLOURS)|(?=DIMENSION|DETAILED SPECIFICATION|CARE|COLOURS)"
        String regex = String.format("(?<=%s)|(?=%s)", joinedKeys, joinedKeys);
        return Pattern.compile(regex);
    }
    
    • (?=foo) - Lookahead - matches a position that is immediately followed by foo (i.e. the position right before foo);
    • (?<=foo) - Lookbehind - matches a position that is immediately preceded by foo (i.e. the position right after foo).

    For more information have a look at this tutorial
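
To illustrate why the two lookarounds are combined, here is a minimal sketch with a single hard-coded key. The class name LookaroundSplitDemo and the helper split are made up for this illustration; only Pattern.split() from the standard library is used. Because both alternatives match zero-width positions, the key is not consumed by the split and survives as its own token.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class LookaroundSplitDemo {

    // Hypothetical helper: splits right before and right after every occurrence of "CARE"
    static List<String> split(String input) {
        Pattern pattern = Pattern.compile("(?<=CARE)|(?=CARE)");
        return Arrays.asList(pattern.split(input));
    }

    public static void main(String[] args) {
        // The key is kept as a separate element because lookarounds consume no characters
        System.out.println(split("wash CARE warm")); // prints [wash , CARE,  warm]
    }
}
```

With a plain pattern such as "CARE" the key itself would be swallowed by the split, and there would be no way to tell which value belongs to which key.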

    When we have a pattern, we can generate a stream using Pattern.splitAsStream().

    public static Map<String, String> toMap(String source, Set<String> keys) {
        
        return getPattern(keys).splitAsStream(source)
            .collect(combineByKey(keys));
    }
    

    Each element of this stream is either a key or a value. To obtain a map as the result of the stream execution, we need a collector that can distinguish keys from values.

    We can create such a collector using Collector.of(). As its mutable container, I'll use a Deque of map entries.

    The accumulator function below contains a case where an empty string is used as a key. It covers a situation that can occur when the stream runs in parallel: a thread receives a chunk of data that starts with a value rather than a key, so the key and its value end up in different containers. This is fixed when the containers are merged in the combiner function.

    StringBuilder is used as the intermediate value type because an entry returned by Map.entry() is immutable.

    Note: Map.entry() and Set.of() were introduced in Java 9, and String.strip() in Java 11. To make this solution compliant with Java 8, use new AbstractMap.SimpleEntry<>(), a HashSet and trim() instead (see the code for JDK 8).

    public static Collector<String, ?, Map<String, String>> combineByKey(Set<String> keys) {
    
        return Collector.of(
            ArrayDeque::new,
            (Deque<Map.Entry<String, StringBuilder>> deque, String next) -> {
                if (keys.contains(next)) deque.add(Map.entry(next, new StringBuilder()));
                else {
                    if (deque.isEmpty()) deque.add(Map.entry("", new StringBuilder(next)));
                    else deque.getLast().getValue().append(next);
                }
            },
            (left, right) -> {
                if (!right.isEmpty() && !left.isEmpty() && right.getFirst().getKey().isEmpty()) {
                    left.getLast().getValue().append(right.pollFirst().getValue());
                }
                left.addAll(right);
                return left;
            },
            deque -> deque.stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                entry -> entry.getValue().toString().strip()
            ))
        );
    }
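
To make the parallel case more concrete, the merge step of the combiner can be tried in isolation. The following is only a sketch: the class name CombinerSketch and the method merge are hypothetical, the two containers are built by hand rather than by a real parallel stream, and AbstractMap.SimpleEntry is used so that it also compiles on Java 8.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

class CombinerSketch {

    // Same logic as the combiner above: an empty-key entry at the head of 'right'
    // is a value whose key is the last entry of 'left'
    static Deque<Map.Entry<String, StringBuilder>> merge(
            Deque<Map.Entry<String, StringBuilder>> left,
            Deque<Map.Entry<String, StringBuilder>> right) {
        if (!right.isEmpty() && !left.isEmpty() && right.getFirst().getKey().isEmpty()) {
            left.getLast().getValue().append(right.pollFirst().getValue());
        }
        left.addAll(right);
        return left;
    }

    public static void main(String[] args) {
        // First chunk ended right after the key "CARE", so its value is still empty
        Deque<Map.Entry<String, StringBuilder>> left = new ArrayDeque<>();
        left.add(new SimpleEntry<>("CARE", new StringBuilder()));

        // Second chunk started with the orphaned value, stored under an empty key
        Deque<Map.Entry<String, StringBuilder>> right = new ArrayDeque<>();
        right.add(new SimpleEntry<>("", new StringBuilder(" For best results. ")));
        right.add(new SimpleEntry<>("COLOURS", new StringBuilder(" red blue green")));

        // After merging, the orphaned value is attached to CARE and two entries remain
        merge(left, right).forEach(e -> System.out.println(e.getKey() + " ->" + e.getValue()));
    }
}
```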
    

    main()

    public static void main(String[] args) {
        String source = "DIMENSION 24cm 34cm 12cm DETAILED SPECIFICATION Twin/Twin XL Flat Sheet: 105l x 74w. CARE For best results, machine wash warm with like colors. COLOURS red blue green";
    
        Set<String> keys = Set.of("DIMENSION", "DETAILED SPECIFICATION", "CARE", "COLOURS");
    
        Map<String, String> result = toMap(source, keys); // converting the source string to map
        
        result.forEach((k, v) -> System.out.println(k + " -> " + v)); // printing the result
    }
    

    Output:

    COLOURS -> red blue green
    DIMENSION -> 24cm 34cm 12cm
    DETAILED SPECIFICATION -> Twin/Twin XL Flat Sheet: 105l x 74w.
    CARE -> For best results, machine wash warm with like colors.
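
Since the question targets Java 8, here is a sketch of how the pieces above might be put together using only Java 8 APIs. It is an adaptation, not the code behind the original link: the class name KeyedTextParserJava8 is made up, Map.entry() is replaced with AbstractMap.SimpleEntry, Set.of() with a HashSet, and strip() with trim(). It also assumes that the keys contain no regex metacharacters, as in the original getPattern().

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class KeyedTextParserJava8 {

    // Builds "(?<=K1|K2|...)|(?=K1|K2|...)"; assumes keys contain no regex metacharacters
    static Pattern getPattern(Collection<String> keys) {
        String joinedKeys = String.join("|", keys);
        return Pattern.compile(String.format("(?<=%s)|(?=%s)", joinedKeys, joinedKeys));
    }

    static Collector<String, ?, Map<String, String>> combineByKey(Set<String> keys) {
        return Collector.of(
            ArrayDeque::new,
            (Deque<Map.Entry<String, StringBuilder>> deque, String next) -> {
                if (keys.contains(next)) {
                    // a key starts a new entry with an empty value
                    deque.add(new SimpleEntry<>(next, new StringBuilder()));
                } else if (deque.isEmpty()) {
                    // a value whose key ended up in another container (parallel execution)
                    deque.add(new SimpleEntry<>("", new StringBuilder(next)));
                } else {
                    // a value belonging to the most recently seen key
                    deque.getLast().getValue().append(next);
                }
            },
            (left, right) -> {
                if (!right.isEmpty() && !left.isEmpty() && right.getFirst().getKey().isEmpty()) {
                    left.getLast().getValue().append(right.pollFirst().getValue());
                }
                left.addAll(right);
                return left;
            },
            deque -> deque.stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                entry -> entry.getValue().toString().trim() // trim() instead of strip() on Java 8
            ))
        );
    }

    static Map<String, String> toMap(String source, Set<String> keys) {
        return getPattern(keys).splitAsStream(source).collect(combineByKey(keys));
    }

    public static void main(String[] args) {
        String source = "DIMENSION 24cm 34cm 12cm DETAILED SPECIFICATION Twin/Twin XL Flat Sheet: 105l x 74w. CARE For best results, machine wash warm with like colors. COLOURS red blue green";
        Set<String> keys = new HashSet<>(Arrays.asList("DIMENSION", "DETAILED SPECIFICATION", "CARE", "COLOURS"));

        // Iteration order of the resulting map is not guaranteed
        toMap(source, keys).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

As in the original version, a key that does not occur in the source string simply produces no entry in the map.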
    
