VBScript: split text file by file size without losing contents of last line


I have the following VBScript code, which currently splits a large text file into smaller files based on a file size argument in MB:

Sub Split
  Dim iFile, oFile, iStream, Data
  Dim Ext, e, offset, length, NewName
  
  Info (" ==> Splitting " & sFile & " by " & Size & "B files and saving to " & dPath)
  
  if not oFSO.FileExists(sFile) then Error _
  ("Cannot locate the file """ & sFile & """")
  Set iFile = oFSO.GetFile(sFile)
  
  On Error Resume Next
  Set iStream = iFile.OpenAsTextStream(1)

  if err.number > 0 then Error("Cannot open the file """ & sFile _
  & """" & LF & Err.Description)

  On Error Goto 0
  Data = iStream.Read(iFile.Size)
  iStream.close
  
  Ext = 0
  offset = 1
  
  Do
    Ext = Right("00" & Ext + 1, 3)
    if ext > "999" then Error ("Too many files - maximum is 999!")
    
    NewName = dPath & prefix & datetype & suffix & "_" & Ext & ".txt"
    Info "Writing """ & NewName & """"

    On Error Resume Next
  
    ' ---> Write file
    Set oFile = oFSO.CreateTextFile(NewName, 2)

    if err.number > 0 then Error("Cannot open the file """ _
    & NewName & """" & LF & Err.Description)
    On Error Goto 0
    
    length = Size
    If length > Len(data)+1 - offset Then length = Len(data) + 1 - offset
    
    oFile.Write Mid(Data, offset, length)
    offset = offset + length
    oFile.Close
  Loop Until offset > Len(data)

End Sub

The above code works. However, when the large file is split into smaller files, the final line of each smaller file can be cut in the middle. I would like to keep every line intact, so that each split happens at the end of a line while still honoring the file size limit I already have.

For example, suppose the last line of a smaller file should be as follows:

This is a test

The last line stored in each file may be cut and can look like the following:

This is a te

Is it possible to keep the whole line and split at the end of the line as well as by the current file size? If so, how can I achieve this? Thanks


Solution

  • To split the file by size while only splitting at line ends, shift the end point of each chunk forward to the next CRLF. Each chunk may therefore run slightly over the requested size. Here's your Split Sub with that enhancement.

    Sub Split
      Dim iFile, oFile, iStream, Data, TotalSize
      Dim Ext, e, offset, length, NewName
      
      Info (" ==> Splitting " & sFile & " by " & Size & "B files and saving to " & dPath)
      
      if not oFSO.FileExists(sFile) then Error _
      ("Cannot locate the file """ & sFile & """")
      Set iFile = oFSO.GetFile(sFile)
      
      On Error Resume Next
      Set iStream = iFile.OpenAsTextStream(1)
    
      ' Err.Number can be negative for COM errors, so test for non-zero
      If Err.Number <> 0 Then Error("Cannot open the file """ & sFile _
      & """" & LF & Err.Description)
    
      On Error Goto 0
      Data = iStream.Read(iFile.Size)
      iStream.close
      TotalSize = Len(Data)
      
      Ext = 0
      offset = 1
      
      Do
        ' Check before incrementing: 999 + 1 would wrap around to "000"
        If Ext = "999" Then Error ("Too many files - maximum is 999!")
        Ext = Right("00" & Ext + 1, 3)
        
        NewName = dPath & prefix & datetype & suffix & "_" & Ext & ".txt"
        Info "Writing """ & NewName & """"
    
        On Error Resume Next
      
        ' ---> Write file
        ' Second argument is the overwrite flag
        Set oFile = oFSO.CreateTextFile(NewName, True)
    
        If Err.Number <> 0 Then Error("Cannot open the file """ _
        & NewName & """" & LF & Err.Description)
        On Error Goto 0
        
        length = Size
        If length > TotalSize+1 - offset Then length = TotalSize + 1 - offset
    
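        ' Shift the end point to the next CRLF (or to the end of the data,
        ' so a final line without a trailing CRLF is not truncated)
        Do While Mid(Data, offset+length, 2)<>vbCrLf And offset+length<=TotalSize
          length = length + 1
        Loop
        ' Keep the line break with the chunk it terminates
        If Mid(Data, offset+length, 2)=vbCrLf Then length = length + 2
        
        oFile.Write Mid(Data, offset, length)
        offset = offset + length
        oFile.Close
      ' offset is 1-based, so all data is written only once it passes TotalSize
      Loop Until offset > TotalSize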
    
    End Sub
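
To see the end-point shift in isolation, here is a minimal sketch that applies the same idea to an in-memory string instead of a file. `ChunkLength` is a hypothetical helper name, not part of the script above, and the sample text and the chunk size of 10 characters are made up for illustration. It assumes the data uses CRLF line endings and is run with cscript.

```vbscript
Option Explicit

' Hypothetical helper (not in the original script): returns the length of the
' chunk that starts at the 1-based position offset. The chunk is at least Size
' characters long (or whatever is left) and is extended so that it ends just
' after the next CRLF, or at the end of the data if no CRLF follows.
Function ChunkLength(Data, offset, Size)
  Dim total, length
  total = Len(Data)
  length = Size
  If length > total + 1 - offset Then length = total + 1 - offset

  ' Advance until the next two characters are a CRLF or the data runs out
  Do While offset + length <= total
    If Mid(Data, offset + length, 2) = vbCrLf Then
      length = length + 2   ' keep the line break with this chunk
      Exit Do
    End If
    length = length + 1
  Loop
  ChunkLength = length
End Function

' Made-up sample data: three lines, the last one without a trailing CRLF
Dim sample, pos, n
sample = "This is a test" & vbCrLf & "Second line" & vbCrLf & "Last line"

pos = 1
Do While pos <= Len(sample)
  n = ChunkLength(sample, pos, 10)
  ' Show line breaks as \r\n so the chunk boundaries are visible
  WScript.Echo "[" & Replace(Mid(sample, pos, n), vbCrLf, "\r\n") & "]"
  pos = pos + n
Loop
```

With this sample, each echoed chunk should hold one complete line, even though 10 characters falls in the middle of the first two lines. The trade-off is the same as in the Sub above: chunks can exceed the requested size by up to one line.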