
Divide text by letter count, but take care of split words


I have this function that divides a given article into lines of a given letter count, but it also splits words at the ends of lines. I would like to add a hyphen at the end of a line if the word there is incomplete (split).

var text = `Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.`

function divideTextByLetterCount(text, letterCount) {
  let dividedText = "";
  let currentLetterCount = 0;
  for (let i = 0; i < text.length; i++) {
    dividedText += text[i];
    currentLetterCount++;
    if (currentLetterCount === letterCount) {
      if (dividedText.slice(-1) == ' ') {
        dividedText = dividedText.slice(0, -1)
      }
      dividedText += "\n";
      currentLetterCount = 0;
    }
  }
  let dividedTextArr = dividedText.split('\n');
  dividedTextArr.forEach((val, i) => {
    if (val.slice(0, 1) == ' ') {
      dividedTextArr[i] = val.slice(1);
    }
  });
  return dividedTextArr.join('\n');
}

console.log(divideTextByLetterCount(text, 20));

The output is:

Lorem Ipsum is simpl
y dummy text of the
printing and typeset
ting industry. Lorem
Ipsum has been the
industry's standard
dummy text ever sinc
e the 1500s, when an
unknown printer too
k a galley of type a
nd scrambled it to m
ake a type specimen
book. It has survive
d not only five cent
uries, but also the
leap into electronic
typesetting, remain
ing essentially unch
anged. It was popula
rised in the 1960s w
ith the release of L
etraset sheets conta
ining Lorem Ipsum pa
ssages, and more rec
ently with desktop p
ublishing software l
ike Aldus PageMaker
including versions o
f Lorem Ipsum.

But it should add a hyphen at the end of each line that ends with an incomplete word: for example, simpl should be simpl- and sinc should be sinc-. How do I create that logic? Thanks.


Solution

  • A quicker and more readable approach that also does what you're looking for is to use a regular expression. Match up to 20 characters, and put an optional capture group for a single non-space character in a lookahead after the last character. If the capture group captures anything and the match itself doesn't end in a space, you're in the middle of a word and can add a -.

    const text = `Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.`
    
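    function divideTextByLetterCount(text, letterCount) {
      // Match 1..letterCount characters (at least 1, so no empty match at the end
      // adds a stray blank line). The lookahead optionally captures the next
      // non-space character without consuming it.
      const pattern = new RegExp(`.{1,${letterCount}}(?=(\\S)?)`, 'g');
      return text
        .replace(
          pattern,
          // Add a hyphen only when the chunk stops in the middle of a word.
          (match, nextChar) => match.trim() + (!match.endsWith(' ') && nextChar ? '-' : '') + '\n'
        );
    }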
    
    console.log(divideTextByLetterCount(text, 20));

    Example with 50 instead of 20:

    console.log(divideTextByLetterCount(text, 50));
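
If you would rather keep the index-based style from the question, the same mid-word check can be done without a regular expression. The following is only a minimal sketch, not part of the answer above: `divideTextByIndex` and `sample` are names made up for illustration. It slices fixed-width chunks and adds a hyphen only when both the last character of the chunk and the first character after it are non-space. Unlike the regex version, it returns an array of lines and counts any newline characters in the input toward the width, which is an assumption about the behaviour you want.

```javascript
// Sketch (hypothetical helper): slice fixed-width chunks and hyphenate
// a chunk when the cut lands inside a word.
function divideTextByIndex(text, letterCount) {
  const lines = [];
  for (let start = 0; start < text.length; start += letterCount) {
    const chunk = text.slice(start, start + letterCount);
    const lastChar = chunk[chunk.length - 1];
    const nextChar = text[start + letterCount]; // undefined past the end of the text
    // Mid-word means: non-space on both sides of the cut.
    const midWord =
      nextChar !== undefined && /\S/.test(lastChar) && /\S/.test(nextChar);
    lines.push(chunk.trim() + (midWord ? '-' : ''));
  }
  return lines;
}

// Short sample taken from the start of the question's text.
const sample = 'Lorem Ipsum is simply dummy text of the printing';
console.log(divideTextByIndex(sample, 20).join('\n'));
// Lorem Ipsum is simpl-
// y dummy text of the
// printing
```

Joining the result with '\n' gives the same shape of output as the regex version. As in that version, the hyphen is appended after the chunk, so a hyphenated line is one character longer than letterCount.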