c# json.net openapi jsonschema jsonconvert

Get the JSON Schemas from a large OpenAPI document, OR use Newtonsoft and resolve refs


I'm currently looking at extracting all of the JSON Schemas from a large OpenAPI spec. I've been using the following NuGet packages:

Microsoft.OpenApi v1.3.1
Microsoft.OpenApi.Readers v1.3.1

I was hoping to use these to parse a large OpenAPI spec and extract all of the JSON Schemas. I can parse them into 'Microsoft.OpenApi.Models.OpenApiSchema' objects, but I can't seem to create a JSON Schema from these objects and write it to file.

As it stands at the moment I have the following:

using (FileStream fs = File.Open(file.FullName, FileMode.Open))
{
    var openApiDocument = new OpenApiStreamReader().Read(fs, out var diagnostic);
    foreach (var schema in openApiDocument.Components.Schemas)
    {
        var schemaName = schema.Key;
        var schemaContent = schema.Value;

        var outputDir = Path.Combine(outputDirectory.FullName, fileNameWithoutExtension);
        if (!Directory.Exists(outputDir))
        {
            Directory.CreateDirectory(outputDir);
        }
        var outputPath = Path.Combine(outputDir, schemaName + "-Schema.json");
        var outputString = schemaContent.Serialize(OpenApiSpecVersion.OpenApi3_0, OpenApiFormat.Json);
        using (TextWriter sw = new StreamWriter(outputPath, true))
        {
            sw.Write(outputString);
            sw.Close();
        }
    }
}

The schemaContent appears to have all of the relevant properties for the schema, but I don't seem to be able to identify the next step in getting it from that object to a JSON Schema. I'm sure I'm missing something simple so any insight would be appreciated.

UPDATED

I had a bit of a think and took a slightly different approach, using Newtonsoft.Json instead.

var OpenApitext = File.ReadAllText(file.FullName, Encoding.UTF8);
var settings = new JsonSerializerSettings
{
    PreserveReferencesHandling = PreserveReferencesHandling.Objects,
    MetadataPropertyHandling = MetadataPropertyHandling.Ignore, //ign
    Formatting = Newtonsoft.Json.Formatting.Indented
};

dynamic openApiJson = JsonConvert.DeserializeObject<ExpandoObject>(OpenApitext, settings);

if (openApiJson?.components?.schemas != null)
{
    foreach (var schema in openApiJson.components.schemas)
    {
        var schemaString = JsonConvert.SerializeObject(schema, settings);

        var outputDir = Path.Combine(outputDirectory.FullName, fileNameWithoutExtension);
        if (!Directory.Exists(outputDir))
        {
            Directory.CreateDirectory(outputDir);
        }
        var outputPath = Path.Combine(outputDir, schema.Name + "-Schema.json");

        using (TextWriter sw = new StreamWriter(outputPath, true))
        {
            sw.Write(schemaString);
            sw.Close();
        }
    }
}

Now this lets me create the JSON Schema and write it to file, but it doesn't resolve references. Looking at the API spec, all references appear to be local to the spec. What do I need to do to resolve all the references in the OpenAPI spec before I cycle through the schemas and write them to file? I've done a bit of research, and a few people seem to build this capability themselves, but they always rely on a class object to support it, which I can't do here.


Solution

  • I reached out through the microsoft/OpenAPI.NET GitHub repo in the end. By coincidence I got a response from the same person both there and here. So thank you, Darrel: you've helped me solve the above scenario, which I was getting rather confused over. In the end the problem was that I hadn't quite implemented it correctly.

    For reference, the use case was to take a sizeable OpenAPI spec (JSON) and extract the JSON Schemas it references, ensuring that the references ($ref, $id, etc.) were resolved when the schemas were written out to file.

    I wanted to take this approach because the OpenAPI specs I had to work with were so large that pre-built tooling which can extract schemas, such as Postman, was incredibly difficult to use.

    Final code snippet for my implementation. It's a little rough on a couple of lines; I'll neaten that up over the weekend.

    Console.WriteLine($"Processing file: {file.FullName}");
    var fileNameWithoutExtension = Path.GetFileNameWithoutExtension(file.FullName);
    var fileExtension = Path.GetExtension(file.FullName);

    // Parse the spec; the input stream is disposed when it goes out of scope.
    var reader = new OpenApiStreamReader();
    using var input = new FileStream(file.FullName, FileMode.Open);
    var result = await reader.ReadAsync(input);

    // One output folder per spec; CreateDirectory does nothing if it already exists.
    var outputDir = Path.Combine(outputDirectory.FullName, fileNameWithoutExtension);
    Directory.CreateDirectory(outputDir);

    // Inline both local and external references when each schema is written.
    var writerSettings = new OpenApiWriterSettings() { InlineLocalReferences = true, InlineExternalReferences = true };

    foreach (var schemaEntry in result.OpenApiDocument.Components.Schemas)
    {
        // File name is "<schema name>-Schema.json".
        var schemaFileName = schemaEntry.Key + "-Schema.json";
        Console.WriteLine("Creating " + schemaFileName);
        var outputPath = Path.Combine(outputDir, schemaFileName);

        // FileMode.CreateNew throws if the file already exists.
        using var fileStream = new FileStream(outputPath, FileMode.CreateNew);
        using var writer = new StreamWriter(fileStream);
        // Writes the schema body in OpenAPI v2 form; SerializeAsV3WithoutReference is the v3 counterpart.
        schemaEntry.Value.SerializeAsV2WithoutReference(new OpenApiJsonWriter(writer, writerSettings));
    }