
Create structured JS object based on unstructured JS object


I have this js object data:

const content = 
  [ { h1 : 'This is title number 1'          } 
  , { h2 : 'Description'                     }
  , { p  : 'Description unique content text' }
  , { h2 : 'Content'                         }
  , { p  : 'Content unique text here 1'      }
  , { ul : [ 
           'string value 1',
           'string value 2'
       ]                       
  , } 
  , { p  : 'Content unique text here 2'      }
  , { h2 : 'CTA message'                     }
  , { p  : 'CTA message unique content here' }
  , { h2 : 'CTA button'                      }
  , { p  : 'CTA button unique content here'  }
  , { p  : ''                                }
  , { h1 : 'This is title number 2'          } 
  , { h2 : 'Description'                     }
  , { p  : 'Description unique content text' }
  , { h2 : 'Content'                         }
  , { p  : 'Content unique text here 1'      }
  , { h2 : 'CTA message'                     }
  , { p  : 'CTA message unique content here' }
  , { h2 : 'CTA button'                      }
  , { p  : 'CTA button unique content here'  }
  , { p  : ''                                }
  ]

Each h1 starts a new object; the ul is optional.

I want to map it to this structure:

interface Content {
  title: string;
  description: string;
  content: any[];
  cta: {
    message: string;
    button: string;
  }
}

I am wondering what is the best way of doing that?

I think I have to loop through the items and populate a new object based on my interface. The first element is always the title; after that, if an h2 says "Description", the next item holds the description value.

const json: Content[] = [];
content.forEach(element => {
  if (element.h1) {
    // add props to new object
    // add object to json array
  }
});

I just wonder how you would create multiple Content objects based on that original content array?

Here is the result I am expecting:

json = [
  {
    title: 'This is title number 1',
    description: 'Description unique content text',
    content: [
      {
        p: 'Content unique text here 1',
      },
      {
        ul: [
            'string value 1',
            'string value 2'
        ]
      },
      {
        p: 'Content unique text here 2'
      } 
    ],
    cta: {
      message: 'CTA message unique content here',
      button: 'CTA button unique content here'
    }
  },
  ...
]

UPDATE: Based on the comments below, I am looking for a top-down parser solution. The solution should be easily extensible in case the input array changes slightly, e.g. by introducing new unique h2+p or h2+p+ul element groups.


Solution

  • The approach below features a generically implemented reducer function which can be custom configured with ...

    • the item key which decides that a new structured content type gets created.
    • an object-based lookup/map/index whose key-based methods either create or aggregate a (key-specific) content type.

    From one of the OP's above comments ...

    "... forgot to mention that algorithm should be extensible in case if input data will be changed slightly – sreginogemoh"

    Though I wouldn't go as far as calling the approach a "top-down parser", as others already did, the parser analogy helps.

    The advantage of the approach lies in the (generically implemented) reducer, which roughly takes the role of a main tokenizer by processing an array/list from top to bottom (or left to right).

    Whenever the custom-provided property name (or keyword) matches the key of the currently processed item's (token's) sole entry, the reducer creates a new structured (data) item. A non-matching item key signals an aggregation task.
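
    The create-or-aggregate decision described above can be sketched in isolation. This is only a minimal illustration; the sample `items` and the `tasks` labels are made up for the demo and are not part of the solution itself.

```javascript
// the key which signals the creation of a new structured item.
const newItemKey = 'h1';

// made-up sample tokens, each featuring exactly one entry.
const items = [{ h1: 'A' }, { p: 'x' }, { h1: 'B' }];

const tasks = items.map(item => {
  // every item carries a sole entry, hence the first key is the item key.
  const [itemKey] = Object.keys(item);

  return (itemKey === newItemKey) ? 'create' : 'aggregate';
});

console.log(tasks); // ['create', 'aggregate', 'create']
```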

    Both task types (creation and aggregation) have in common that they always have to be custom implemented and provided as methods of an object-based lookup/map/index.

    The aggregation tasks can be manifold, depending on whether a sub content type that is to be merged gets hinted at explicitly (e.g. by other specific entry keys) or not. What they share is that each one gets passed the same arguments signature of (predecessor, merger, key, value).

    These four parameters carry exactly the information (neither less nor more) needed to reliably aggregate any sub content type (based on key, value and, if necessary, predecessor) at the base/main content type which was passed as merger.
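
    To make the shared signature more tangible, here is a small sketch of a single aggregation task called by hand. The name `aggregateParagraph` and the sample values are hypothetical; the task simply stores a value under the (lowercased) heading of the preceding `h2` item.

```javascript
// hypothetical aggregation task following the
// `(predecessor, merger, key, value)` signature.
const aggregateParagraph = (predecessor, merger, key, value) => {
  // look behind: the preceding `h2` item names the target property.
  const heading = String(predecessor?.h2 ?? '').trim().toLowerCase();

  if (heading) {
    // only the `merger` (the target structure) gets written to.
    merger[heading] = String(value);
  }
  return merger;
};

const merger = { title: 'Demo' };

aggregateParagraph({ h2: 'Description' }, merger, 'p', 'Some text');

console.log(merger); // { title: 'Demo', description: 'Some text' }
```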

    FYI ... In terms of the top-down parser analogy, note that working with the predecessor item/token is actually a lookbehind approach (whereas top-down parsing, from the little theory I know, is supposed to come with lookahead).

    The reducer approach allows both adapting to other source items and creating other target structures by changing the properties of the initial value which gets passed to reduce ... newItemKey and aggregators ... accordingly.
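
    As a rough illustration of that adaptability, the sketch below feeds a compact copy of the reducer with a different configuration. The `chapter`/`note` source format and the `chapterAggregators` lookup are invented for this example only.

```javascript
// compact copy of the reducer, so that the sketch runs on its own.
function createAndAggregateStructuredContent(
  { aggregators = {}, miscsAggregationKey = 'miscs', newItemKey, result = [] },
  item, itemIdx, itemList,
) {
  const [itemKey, itemValue] = Object.entries(item)[0];
  const task = aggregators[itemKey] ?? aggregators[miscsAggregationKey];

  if (typeof task === 'function') {
    if (itemKey === newItemKey) {
      // create and collect a new content type.
      result.push(task(itemValue));
    } else {
      // aggregate the most recently created content type.
      task(itemList[itemIdx - 1], result[result.length - 1], itemKey, itemValue);
    }
  }
  return { aggregators, miscsAggregationKey, newItemKey, result };
}

// hypothetical configuration: `chapter` opens a new item,
// every other item gets collected as a note of that chapter.
const chapterAggregators = {
  chapter: value => ({ name: String(value), notes: [] }),
  miscs: (predecessor, merger, key, value) => {
    merger.notes.push(String(value));
  },
};

const chapters = [
  { chapter: 'Intro' }, { note: 'first' }, { note: 'second' },
  { chapter: 'Outro' }, { note: 'last' },
].reduce(createAndAggregateStructuredContent, {
  aggregators: chapterAggregators,
  newItemKey: 'chapter',
  result: [],
}).result;

console.log(chapters);
```

    Only the configuration object changes between use cases; the reducer itself stays untouched.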

    The two-fold solution of reducer and custom tasks is implemented in a way that does not mutate source items: more complex sub contents get (actively) copied via structuredClone, whereas a task's arguments signature (passively) prevents the mutation of source items.
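
    The effect of cloning can be checked with a short experiment. The `sourceItem` and `target` objects are sample data made up for this sketch; the clone helper mirrors the fallback idea used in the solution's code.

```javascript
// use `structuredClone` where available, else fall back to a JSON round trip.
const cloneDataStructure =
  (typeof structuredClone === 'function')
    ? structuredClone
    : (value => JSON.parse(JSON.stringify(value)));

const sourceItem = { ul: ['string value 1', 'string value 2'] };
const target = { content: [] };

// aggregate a copy of the nested array, not a reference to it.
target.content.push({ ul: cloneDataStructure(sourceItem.ul) });

// changing the aggregated data later does not leak back into the source.
target.content[0].ul.push('added later');

console.log(sourceItem.ul.length);        // 2
console.log(target.content[0].ul.length); // 3
```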

    // generically implemented and custom configurable reducer.
    function createAndAggregateStructuredContent(
      { aggregators = {}, miscsAggregationKey = 'miscs', newItemKey, result = [] },
      item, itemIdx, itemList,
    ) {
      const [itemKey, itemValue] = Object.entries(item)[0];
      const createOrAggregateContentType =
        aggregators[itemKey] ?? aggregators[miscsAggregationKey];
    
      if ('function' === typeof createOrAggregateContentType) {
        if (itemKey === newItemKey) {
          // create and collect a new content type.
          result
            .push(
              createOrAggregateContentType(itemValue)
            );
        } else {
          // aggregate an existing content type.
          createOrAggregateContentType(
            itemList[itemIdx - 1], // - predecessor item from provided list.
            result.slice(-1)[0],   // - currently aggregated content type.
            itemKey,
            itemValue,
          );
        }
      }
      return { aggregators, miscsAggregationKey, newItemKey, result };
    }
    
    // poor man's fallback for environments
    // which do not support `structuredClone`.
    const cloneDataStructure = (
      ('function' === typeof structuredClone) && structuredClone ||
      (value => JSON.parse(JSON.stringify(value)))
    );
    
    // interface Content {
    //   title: string;
    //   description: string;
    //   content: any[];
    //   cta: {
    //     message: string;
    //     button: string;
    //   }
    // }
    
    // object based lookup/map/index for both
    // content-type creation and aggregation
    // according to the OP's `Content` interface.
    const aggregators = {
      // creation.
      h1: value => ({ title: String(value) }),
    
      // aggregation.
      h2: (predecessor, merger, key, value) => {
        key = value.trim().toLowerCase();
        if ((key === 'description') || (key === 'content')) {
          merger[key] = null;
        } else if ((/^cta\s+(message|button)$/).test(key)) {
          merger.cta ??= {};
        }
      },
      // aggregation.
      miscs: (predecessor, merger, key, value) => {
        const contentType = String(predecessor.h2)
          .trim().toLowerCase();
        const ctaType = (/^cta\s+(message|button)$/)
          .exec(contentType)?.[1] ?? null;
    
        if ((contentType === 'description') && (merger.description === null)) {
    
          merger.description = String(value);
    
        } else if ((ctaType !== null) && ('cta' in merger)) {
    
          Object.assign(merger.cta, { [ ctaType ]: String(value) });
    
        } else if (value) {
          // fallback ...
          // ... default handling of various/varying non empty content.
          (merger.content ??= []).push({ [ key ]: cloneDataStructure(value) });
        }
      },
    };
    
    const content = 
      [ { h1 : 'This is title number 1'          } 
      , { h2 : 'Description'                     }
      , { p  : 'Description unique content text' }
      , { h2 : 'Content'                         }
      , { p  : 'Content unique text here 1'      }
      , { ul : [ 
               'string value 1',
               'string value 2'
           ]                       
      , } 
      , { p  : 'Content unique text here 2'      }
      , { h2 : 'CTA message'                     }
      , { p  : 'CTA message unique content here' }
      , { h2 : 'CTA button'                      }
      , { p  : 'CTA button unique content here'  }
      , { p  : ''                                }
      , { h1 : 'This is title number 2'          } 
      , { h2 : 'Description'                     }
      , { p  : 'Description unique content text' }
      , { h2 : 'Content'                         }
      , { p  : 'Content unique text here 1'      }
      , { h2 : 'CTA message'                     }
      , { p  : 'CTA message unique content here' }
      , { h2 : 'CTA button'                      }
      , { p  : 'CTA button unique content here'  }
      , { p  : ''                                }
      ];
    const structuredContent = content
      .reduce(
        createAndAggregateStructuredContent, {
          aggregators,
          newItemKey: 'h1',
          result: [],
        },
      ).result;
    
    console.log({ structuredContent, content });

    FYI ... questions and approaches similar to the very topic here ...