So, I'm working on a library for my internal applications that interacts with our PostgreSQL database (amongst many other things). One requirement at the moment is that this library can dump data from the database to a file. I have something working, but I've been trying to improve its performance as much as possible. This is what I'm currently looking at:
Using COPYReader As NpgsqlCopyTextReader = CType(CIADB.DBConnection.BeginTextExport(COPYSQL), NpgsqlCopyTextReader)
With COPYReader
Dim stopWatch As New Stopwatch
Dim ts As TimeSpan
Dim elapsedTime As String
' ** FIRST ATTEMPT
stopWatch.Start()
Dim BufferText As String = .ReadLine
Do While Not BufferText Is Nothing
CurrentPosition += 1
OutputFile.WriteLine(BufferText)
If Not UpdateForm Is Nothing Then
UpdateForm.UpdateProgress(Convert.ToInt32((CurrentPosition / MaxRecords) * 100))
End If
BufferText = .ReadLine
Loop
OutputFile.Flush()
OutputFile.Close()
stopWatch.Stop()
ts = stopWatch.Elapsed
elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds / 10)
' ** FIRST ATTEMPT RESULTS
' ** Records Retrieved: 65358
' ** Time To Complete: 2:12.07
' ** Lines Written: 65358
' ** File Size: 8,166 KB
' ** SECOND ATTEMPT
stopWatch.Start()
Using TestOutputFile As New IO.StreamWriter(DestinationFile.FullName.Replace(".TXT", "_TEST.TXT"), False)
TestOutputFile.Write(.ReadToEndAsync.Result)
End Using
stopWatch.Stop()
ts = stopWatch.Elapsed
elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds / 10)
' ** SECOND ATTEMPT
' ** Records Retrieved: 65358
' ** Time To Complete: 1:04.01
' ** Lines Written: 65358
' ** File Size: 8,102 KB
End With
End Using
I've run multiple tests of each method and come up with pretty much the same results. FIRST ATTEMPT takes about twice as long as SECOND ATTEMPT
Obviously, the UpdateForm.UpdateProgress
method used in FIRST ATTEMPT (used to keep the form responsive and show the current progress of the export) is going to cause the process to take longer because of the form updating and such involved, not to mention the line-by-line writing to the file. This is pretty much exactly why I was looking at reducing the number of extra calls by doing a full dump in one line of code. The problem is that if I use the "one-liner", the form is completely unresponsive until the process is complete.
I've tried moving the code for the "all-in-one-shot" dump from the SECOND ATTEMPT into a separate Async
method, but I'm extremely unfamiliar with async methods in general, so I'm (obviously) not doing it right:
Private Async Sub OutputToFile(ByVal COPYReader As NpgsqlCopyTextReader, ByVal DestinationFile As IO.FileInfo)
' ** METHOD 3
Using TestOutputFile As New IO.StreamWriter(DestinationFile.FullName.Replace(".TXT", "_TEST.TXT"), False)
Await TestOutputFile.WriteAsync(COPYReader.ReadToEndAsync.Result)
End Using
' ** METHOD 3 RESULTS
' ** Records Retrieved: 65358
' ** Time To Complete: 0:15.07
' ** Lines Written: 34
' ** File Size: 4 KB
End Sub
One other thing to mention: I tried moving all of this to a BackgroundWorker
, but I was getting some strange behavior when I tried to call my UpdateForm.UpdateProgress
method which caused the application to completely skip the actual dumping process. I've currently given up on trying to get this onto a separate thread, but I'm still open to other suggestions. This is actually one of the smaller tables that I'm dumping, so I'm not looking forward to what one of the bigger ones will do.
Just for the sake of completeness, here's the UpdateForm
class that I've implemented in my library for reusability across other applications:
Imports System.Windows.Forms
Namespace Common
Public Class FormHandler
Implements IDisposable
Public Property ApplicationForm As Form
Public Property ApplicationStatusLabel As Label
Public Property ApplicationToolStripLabel As ToolStripStatusLabel
Public Property ApplicationProgressBar As ProgressBar
Private LabelVisibleState As Boolean = True
Private ProgressBarVisibleState As Boolean = True
Private CurrentStatusText As String
Private CurrentProgress As Integer
Public Sub New(ByVal AppForm As Form)
ApplicationForm = AppForm
End Sub
Public Sub New(ByVal StatusLabel As Label, ByVal Progress As ProgressBar)
ApplicationStatusLabel = StatusLabel
ApplicationToolStripLabel = Nothing
ApplicationProgressBar = Progress
ApplicationForm = ApplicationProgressBar.Parent.FindForm
LabelVisibleState = StatusLabel.Visible
ProgressBarVisibleState = Progress.Visible
With ApplicationProgressBar
.Minimum = 0
.Maximum = 100
.Value = 0
.Visible = True
End With
With ApplicationStatusLabel
.Visible = True
.Text = ""
End With
End Sub
Public Sub New(ByVal StatusLabel As ToolStripStatusLabel, ByVal Progress As ProgressBar)
ApplicationToolStripLabel = StatusLabel
ApplicationStatusLabel = Nothing
ApplicationProgressBar = Progress
ApplicationForm = ApplicationProgressBar.Parent.FindForm
LabelVisibleState = StatusLabel.Visible
ProgressBarVisibleState = Progress.Visible
With ApplicationProgressBar
.Minimum = 0
.Maximum = 100
.Value = 0
.Visible = True
End With
With ApplicationToolStripLabel
.Visible = True
.Text = ""
End With
End Sub
Public Sub New(ByVal AppForm As Form, ByVal StatusLabel As Label, ByVal Progress As ProgressBar)
ApplicationForm = AppForm
ApplicationStatusLabel = StatusLabel
ApplicationToolStripLabel = Nothing
ApplicationProgressBar = Progress
LabelVisibleState = StatusLabel.Visible
ProgressBarVisibleState = Progress.Visible
With ApplicationProgressBar
.Minimum = 0
.Maximum = 100
.Value = 0
.Visible = True
End With
With ApplicationStatusLabel
.Visible = True
.Text = ""
End With
End Sub
Public Sub New(ByVal AppForm As Form, ByVal StatusLabel As ToolStripStatusLabel, ByVal Progress As ProgressBar)
ApplicationForm = AppForm
ApplicationToolStripLabel = StatusLabel
ApplicationStatusLabel = Nothing
ApplicationProgressBar = Progress
LabelVisibleState = StatusLabel.Visible
ProgressBarVisibleState = Progress.Visible
With ApplicationProgressBar
.Minimum = 0
.Maximum = 100
.Value = 0
.Visible = True
End With
With ApplicationToolStripLabel
.Visible = True
.Text = ""
End With
End Sub
Friend Sub UpdateProgress(ByVal StatusText As String, ByVal CurrentPosition As Integer, ByVal MaxValue As Integer)
CurrentStatusText = StatusText
CurrentProgress = Convert.ToInt32((CurrentPosition / MaxValue) * 100)
UpdateStatus()
End Sub
Friend Sub UpdateProgress(ByVal StatusText As String, ByVal PercentComplete As Decimal)
CurrentStatusText = StatusText
CurrentProgress = Convert.ToInt32(PercentComplete)
UpdateStatus()
End Sub
Friend Sub UpdateProgress(ByVal StatusText As String)
CurrentStatusText = StatusText
CurrentProgress = 0
UpdateStatus()
End Sub
Friend Sub UpdateProgress(ByVal PercentComplete As Decimal)
CurrentProgress = Convert.ToInt32(PercentComplete)
UpdateStatus()
End Sub
Friend Sub UpdateProgress(ByVal CurrentPosition As Integer, ByVal MaxValue As Integer)
CurrentProgress = Convert.ToInt32((CurrentPosition / MaxValue) * 100)
UpdateStatus()
End Sub
Friend Sub ResetProgressUpdate()
CurrentStatusText = ""
CurrentProgress = 0
UpdateStatus()
End Sub
Private Sub UpdateStatus()
If Not ApplicationForm Is Nothing Then
If ApplicationForm.InvokeRequired Then
Dim UpdateInvoker As New MethodInvoker(AddressOf UpdateStatus)
Try
ApplicationForm.Invoke(UpdateInvoker)
Catch ex As Exception
Dim InvokeError As New ErrorHandler(ex)
InvokeError.LogException()
End Try
Else
UpdateApplicationProgress(CurrentStatusText)
End If
End If
End Sub
Friend Sub UpdateApplicationProgress(ByVal ProgressText As String)
If Not ApplicationForm Is Nothing Then
With ApplicationForm
If Not ProgressText Is Nothing Then
If Not ApplicationStatusLabel Is Nothing Then
ApplicationStatusLabel.Text = ProgressText
End If
If Not ApplicationToolStripLabel Is Nothing Then
ApplicationToolStripLabel.Text = ProgressText
End If
End If
If Not ApplicationProgressBar Is Nothing Then
ApplicationProgressBar.Value = CurrentProgress
End If
End With
ApplicationForm.Refresh()
Application.DoEvents()
End If
End Sub
Public Sub Dispose() Implements IDisposable.Dispose
If Not ApplicationForm Is Nothing Then
ApplicationForm.Dispose()
End If
If Not ApplicationStatusLabel Is Nothing Then
ApplicationStatusLabel.Visible = LabelVisibleState
ApplicationStatusLabel.Dispose()
End If
If Not ApplicationToolStripLabel Is Nothing Then
ApplicationToolStripLabel.Visible = LabelVisibleState
ApplicationToolStripLabel.Dispose()
End If
If Not ApplicationProgressBar Is Nothing Then
ApplicationProgressBar.Visible = ProgressBarVisibleState
ApplicationProgressBar.Dispose()
End If
End Sub
End Class
End Namespace
Per the suggestion in the comments from @the_lotus, I modified my FIRST ATTEMPT slightly to check the value of the current progress (I declared a CurrentProgress
variable as an Integer
), and it dramatically improved the time taken:
' ** FOURTH ATTEMPT
Using COPYReader As NpgsqlCopyTextReader = CType(CIADB.DBConnection.BeginTextExport(COPYSQL), NpgsqlCopyTextReader)
With COPYReader
Dim stopWatch As New Stopwatch
Dim ts As TimeSpan
Dim elapsedTime As String
Dim CurrentProgress As Integer = 0
stopWatch.Start()
Dim BufferText As String = .ReadLine
Do While Not BufferText Is Nothing
CurrentPosition += 1
OutputFile.WriteLine(BufferText)
' ** Checks to see if the value of the ProgressBar will actually
' ** be changed by the CurrentPosition before making a call to
' ** UpdateProgress. If the value doesn't change, don't waste
' ** the call
If Convert.ToInt32((CurrentPosition / MaxRecords) * 100) <> CurrentProgress Then
CurrentProgress = Convert.ToInt32((CurrentPosition / MaxRecords) * 100)
If Not UpdateForm Is Nothing Then
UpdateForm.UpdateProgress(CurrentProgress)
End If
End If
BufferText = .ReadLine
Loop
OutputFile.Flush()
OutputFile.Close()
stopWatch.Stop()
ts = stopWatch.Elapsed
elapsedTime = String.Format("{0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds / 10)
End With
End Using
' ** FOURTH ATTEMPT RESULTS
' ** Records Retrieved: 65358
' ** Time To Complete: 0:47.45
' ** Lines Written: 65358
' ** File Size: 8,166 KB
Of course, the form is a "little" less responsive than if I were making the call on every record, but I'm thinking it's worth the trade-off.
So that I could minimize the amount of code I'd have to retype (read: "copy/paste") each time I use the UpdateProgress
method, I've moved the test for changed values over there and it seems to be operating with the same performance improvements. Again, for the sake of completeness, here's the code for the two private methods involved in doing the actual progress/status update:
Private Sub UpdateStatus()
If Not ApplicationForm Is Nothing Then
If ApplicationForm.InvokeRequired Then
Dim UpdateInvoker As New MethodInvoker(AddressOf UpdateStatus)
Try
ApplicationForm.Invoke(UpdateInvoker)
Catch ex As Exception
Dim InvokeError As New ErrorHandler(ex)
InvokeError.LogException()
End Try
Else
UpdateApplicationProgress()
End If
End If
End Sub
Private Sub UpdateApplicationProgress()
Dim Changed As Boolean = False
If Not ApplicationForm Is Nothing Then
With ApplicationForm
If Not CurrentStatusText Is Nothing Then
If Not ApplicationStatusLabel Is Nothing Then
If ApplicationStatusLabel.Text <> CurrentStatusText Then
Changed = True
ApplicationStatusLabel.Text = CurrentStatusText
End If
End If
If Not ApplicationToolStripLabel Is Nothing Then
If ApplicationToolStripLabel.Text <> CurrentStatusText Then
Changed = True
ApplicationToolStripLabel.Text = CurrentStatusText
End If
End If
End If
If Not ApplicationProgressBar Is Nothing Then
If ApplicationProgressBar.Value <> CurrentProgress Then
Changed = True
ApplicationProgressBar.Value = CurrentProgress
End If
End If
End With
If Changed Then
ApplicationForm.Refresh()
End If
Application.DoEvents()
End If
End Sub
Doing it this way also came with the added benefit of returning some of the responsiveness to the form that was previously lost. I hope at least some of this code and information is helpful to someone out there.
You don't need to call UpdateProgress on each call. It's not necessary when the percentage didn't even move. Try to do a small check and only update the percentage when needed.
It's also possible that the second attempt is faster because it doesn't go to the database. The data could be cached.