TL;DR
I have a large set of data that looks like arrays of value-only JSON objects. I'm wondering if regex can concisely handle this structure:
"[]" --> List.of(List.of())
"[{'v1','v2','v3'}]" --> List.of(List.of("v1","v2","v3"))
"[{'v4','v5'},{'v6','v7'}]" --> List.of(List.of("v4","v5"),List.of("v6","v7"))
The values are ordered; a POJO will be constructed for each inner list with the ordered arguments, and each value is a primitive (int, long, or String) as defined in each POJO.
Details
This parser is part of a Jackson CSV serializer/deserializer for POJOs that hold a container of POJOs. Unfortunately, CsvMapper only supports POJOs with a container of primitives, hence the need for a custom parser (as far as I can tell). As an example, take a structure like this:
record Person(
        String name,
        List<Pet> pets) {
}

record Pet(
        String name,
        String type) {
}
so the following:
new Person("Jan", List.of(new Pet("Mr. Bubbles", "dog"), new Pet("Lilly", "cat")));
is serialized to CSV as two columns:
Jan,"[{'Mr. Bubbles','dog'}, {'Lilly','cat'}]"
where the second column is a container of POJOs. To deserialize this, my custom Jackson deserializer does this:
public static class PersonDeserializer extends StdDeserializer<Person> {
    private static final long serialVersionUID = 1L;

    public PersonDeserializer() {
        this(Person.class);
    }

    public PersonDeserializer(Class<Person> type) {
        super(type);
    }

    @Override
    public Person deserialize(JsonParser p, DeserializationContext ctxt)
            throws IOException, JsonProcessingException {
        JsonNode node = p.getCodec().readTree(p);
        String name = node.get("name").asText();
        List<Pet> pets = deserialize(node.get("pets").asText());
        return new Person(name, pets);
    }

    private static List<Pet> deserialize(String serializedPets) {
        List<Pet> pets = new ArrayList<>();
        // messy custom parser, ATM
        // ...
        return pets;
    }
}
where a very messy custom parser in deserialize(String) builds the POJOs. I'm hoping someone with more regex experience can help.
Here's a pure Java way:
List<List<String>> result = Arrays.stream(input.replaceAll("^.\\{?|}?.$", "").split("},\\{"))
        .map(inner -> Arrays.stream(inner.replaceAll("^.'?|'?.$", "").split("','")).toList())
        .toList();
which:
- strips [{ from the front and }] from the back (curlys optional)
- splits on },{
- for each piece, strips the leading and trailing ' and splits on ','
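Note that the split on },{ in the snippet above will not match if there is whitespace between groups, as in the Person example ("}, {"). If that can occur in your data, a Pattern/Matcher loop may be more forgiving. The following is only a minimal sketch: the class name NestedListParser is made up for illustration, and it assumes values never contain ' or }. For "[]" it returns an empty outer list rather than a list containing one empty list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jackson or of the original question.
final class NestedListParser {
    // One {...} group; group(1) is everything between the braces.
    // Assumption: values never contain '}'.
    private static final Pattern GROUP = Pattern.compile("\\{([^}]*)\\}");
    // One 'value'; group(1) is the text between the single quotes.
    // Assumption: values never contain a single quote.
    private static final Pattern VALUE = Pattern.compile("'([^']*)'");

    static List<List<String>> parse(String input) {
        List<List<String>> result = new ArrayList<>();
        Matcher groups = GROUP.matcher(input);
        while (groups.find()) {
            // Collect the quoted values of this group in order.
            List<String> values = new ArrayList<>();
            Matcher value = VALUE.matcher(groups.group(1));
            while (value.find()) {
                values.add(value.group(1));
            }
            result.add(values);
        }
        return result;
    }
}
```

With the records from the question, each inner list could then be mapped to a POJO, for example parse(serializedPets).stream().map(v -> new Pet(v.get(0), v.get(1))).toList().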
You could also use a JSON parser:
ObjectMapper om = new ObjectMapper().configure(JsonParser.Feature.ALLOW_SINGLE_QUOTES, true);
List<List<String>> result = om.readValue(input.replace("{", "[").replace("}","]"), new TypeReference<List<List<String>>>() {});
but you must: