I need to parse a log line whose fields are separated by spaces (though not strictly: the bracketed timestamp below contains a space itself) into a Pydantic model. The field names are known, and types are not important for this task, so keeping everything as str
is fine.
I'm unfamiliar with Pydantic, but I assume there is a way to leverage BaseModel.model_validate
or a similar method to make the parsing more native.
For the sake of the example, I trimmed the log and the model.
Example log file:
79a59df900b949e55d DOC-EXAMPLE-BUCKET1 [06/Feb/2019:00:00:38 +0000]
The model:
class S3AccessLogEntry(BaseModel):
    owner: str
    bucket: str
    timestamp: str
Pydantic has no built-in mechanism for parsing a line format like this. In fact, I'm not sure you even need a Pydantic model here. But if having one is important, you can do it the following way:
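from pydantic import BaseModel


class S3AccessLogEntry(BaseModel):
    owner: str
    bucket: str
    timestamp: str

    @classmethod
    def from_log(cls, line: str) -> "S3AccessLogEntry":
        """Build a model instance from a single log line."""
        try:
            # maxsplit=2 keeps the space inside the bracketed timestamp intact.
            owner, bucket, timestamp = line.split(sep=" ", maxsplit=2)
        except ValueError as exc:
            # Raised when the line has fewer than three fields;
            # substitute whichever exception type suits your application.
            raise ValueError(f"Malformed log line: {line!r}") from exc
        return cls(owner=owner, bucket=bucket, timestamp=timestamp)


S3AccessLogEntry.from_log(line="79a59df900b949e55d DOC-EXAMPLE-BUCKET1 [06/Feb/2019:00:00:38 +0000]")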
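If you would rather call model_validate directly on the raw line, one option is a custom "before" validator that turns the string into a dict. This is only a sketch and assumes Pydantic v2; the regular expression LINE_RE and the validator name parse_line are made up for this example and only cover the three trimmed fields. Note that, unlike from_log, this variant strips the square brackets from the timestamp.

```python
import re
from typing import Any

from pydantic import BaseModel, model_validator

# Hypothetical pattern for the trimmed three-field line; the real log
# format has more fields and would need a longer expression.
LINE_RE = re.compile(r"(?P<owner>\S+) (?P<bucket>\S+) \[(?P<timestamp>[^\]]+)\]")


class S3AccessLogEntry(BaseModel):
    owner: str
    bucket: str
    timestamp: str

    @model_validator(mode="before")
    @classmethod
    def parse_line(cls, data: Any) -> Any:
        """Convert a raw log line into a field dict before validation."""
        if isinstance(data, str):
            match = LINE_RE.fullmatch(data.strip())
            if match is None:
                # Pydantic wraps this in a ValidationError.
                raise ValueError(f"Malformed log line: {data!r}")
            return match.groupdict()
        # Dicts and other inputs pass through to normal validation.
        return data


entry = S3AccessLogEntry.model_validate(
    "79a59df900b949e55d DOC-EXAMPLE-BUCKET1 [06/Feb/2019:00:00:38 +0000]"
)
```

The upside is that callers use the standard model_validate entry point; the downside is that the parsing logic is hidden inside the model, so an explicit from_log classmethod may still be easier to read.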
Or, if the log is in a CSV-like format, you can do something like this:
import csv

# csv_data is assumed to hold the log text, with a header row naming the fields.
csv_logs = csv.DictReader(csv_data.splitlines())
for line in csv_logs:
    log_entry = S3AccessLogEntry.model_validate(obj=line)
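For completeness, here is a self-contained version of the CSV variant. The contents of csv_data are invented for illustration: a comma-separated header row whose column names match the model fields, followed by one record. It is a sketch, assuming Pydantic v2, rather than a parser for real S3 access logs.

```python
import csv

from pydantic import BaseModel


class S3AccessLogEntry(BaseModel):
    owner: str
    bucket: str
    timestamp: str


# Hypothetical sample data: the header names must match the model fields.
csv_data = (
    "owner,bucket,timestamp\n"
    "79a59df900b949e55d,DOC-EXAMPLE-BUCKET1,06/Feb/2019:00:00:38 +0000\n"
)

# DictReader takes the field names from the first row and yields
# one dict per record, which model_validate accepts directly.
reader = csv.DictReader(csv_data.splitlines())
entries = [S3AccessLogEntry.model_validate(row) for row in reader]
```

If the data has no header row, csv.DictReader also accepts a fieldnames argument listing the column names explicitly.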