Performance patch and a few enhancements #568
Closed
I am using the OpenJPEG libraries to convert between large TIF and JP2 files. A while back I found Aaron Boxer's OpenMP patch (https://code.google.com/p/openjpeg/issues/detail?id=372). It had a memory leak that led to poor performance with large files. After some small changes to correct the memory leak, I am seeing a substantial performance boost. I am very interested in getting this patch accepted into the main branch so the changes can be maintained going forward.
The performance gains I am seeing are well worth incorporating this patch into the code base. This patch allows you to take advantage of all the CPUs in the system. With the current trunk I am seeing 20 to 30 percent CPU utilization; with this patch (using the same number of threads as CPU cores) I am seeing 80 to 90 percent CPU utilization. In a system with a fast CPU, a lot of cores, and a lot of memory, you could scale up the number of threads to really take advantage of the system resources. Storing large files and creating large mosaics are good use cases for JP2 files, and the performance numbers appear to be better for these larger files. A chart with some performance numbers comparing the main branch with the patch is listed at the bottom of this post.
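For readers unfamiliar with OpenMP, here is a minimal sketch of the general technique: spreading independent units of work across a chosen number of threads. This is not the code from the patch. `process_block` and `process_all_blocks` are hypothetical names, and the per-block work is a trivial stand-in for real encoding or decoding work.

```c
#include <stddef.h>

/* Hypothetical stand-in for per-block work; not an OpenJPEG function.
   Here it just sums the squares of the samples in one block. */
static long process_block(const int *data, size_t len)
{
    long sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum += (long)data[i] * data[i];
    return sum;
}

/* Process num_blocks independent blocks of block_len samples each.
   Each iteration writes only to its own results[b], so the loop is
   safe to run in parallel. Without OpenMP enabled at compile time the
   pragma is ignored and the loop runs serially with the same result. */
void process_all_blocks(const int *data, size_t num_blocks, size_t block_len,
                        long *results, int num_threads)
{
    long b;

    if (num_threads < 1)
        num_threads = 1;

    #pragma omp parallel for num_threads(num_threads) schedule(dynamic)
    for (b = 0; b < (long)num_blocks; ++b)
        results[b] = process_block(data + (size_t)b * block_len, block_len);
}
```

With GCC or Clang this would typically be built with `-fopenmp`. Passing the number of CPU cores as `num_threads` mirrors the configuration described above.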
I included some additional enhancements along with the performance patch:
As for the performance numbers, I am running on a virtual Windows 7 64-bit machine with 4 processors and 6 GB of memory. There are no OpenMP enhancements in the code that loads and converts the BMP, PNG, and TIF files into an in-memory image before it is processed into a .jp2 file, so those times are not included in the timing analysis. Likewise, for decompression the actual writing of the decompressed file is excluded from the timing analysis. As far as I can tell, there is no single agreed-upon way to calculate performance numbers. I am reporting the numbers here as % faster = (original time - new time) / new time, expressed as a percentage. This gives a good indication of the performance I am actually seeing. The original times are also included in the list for anyone who prefers an alternative formula.
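As a small worked example of the "% faster" formula above, a sketch in C follows. The function name `percent_faster` is made up for illustration, and the times used in the comments are invented, not measurements from this patch. The result is multiplied by 100 to express it as a percentage.

```c
/* Percent faster = (original time - new time) / new time * 100.
   Both times must use the same unit, and new_seconds must be non-zero.
   A negative result means the new code is slower than the original. */
double percent_faster(double original_seconds, double new_seconds)
{
    return (original_seconds - new_seconds) / new_seconds * 100.0;
}

/* Illustrative values only:
   percent_faster(30.0, 10.0) is 200.0 (the new code takes one third of the time)
   percent_faster(10.0, 10.0) is 0.0   (no change) */
```

Note that this formula divides by the new time, so a run that takes half as long is reported as 100% faster, not 50%.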
From what I've seen, the smaller files do not get as much benefit from the threading. My guess is that the times are so short that the overhead of managing the threads eats into the performance gains. There is also a lot of variance in the times for small files, likely due to normal system usage noise. The BMP files show the smallest gains compared to PNG and TIF files, and in some cases show negative gains. Generally speaking, when creating tiled JP2 files the performance gains are smaller for the smaller tile sizes. The best performance gains are seen with large TIF files, with compression giving the best results.