Optimization of image processing algorithm

The goal of this project was to drastically improve the performance of an image processing algorithm implemented in C++. The algorithm pre-processes medical images of three types: US (Ultrasound), CT (Computed Tomography) and MRI (Magnetic Resonance Imaging). It uses non-linear processing. Image sizes ranged from 256x256x200 to 512x512x200 with 16-bit gray levels.
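To give a feel for the data volume involved, the sketch below computes the raw size of one image at the stated dimensions. It assumes each 16-bit gray level is stored in exactly 2 bytes with no header or padding; the function name `volume_bytes` is hypothetical and not part of the delivered libraries.

```cpp
#include <cstdint>

// Hypothetical helper: raw size in bytes of a width x height x slices volume.
// Assumption: 16-bit gray levels stored as 2 bytes each, no header or padding.
constexpr std::uint64_t volume_bytes(std::uint64_t width,
                                     std::uint64_t height,
                                     std::uint64_t slices,
                                     std::uint64_t bytes_per_sample = 2) {
    return width * height * slices * bytes_per_sample;
}

// Smallest and largest sizes mentioned in the text.
static_assert(volume_bytes(256, 256, 200) == 26214400ULL,
              "256x256x200 at 16 bits is 25 MiB");
static_assert(volume_bytes(512, 512, 200) == 104857600ULL,
              "512x512x200 at 16 bits is 100 MiB");
```

Under these assumptions a single image occupies between 25 MiB and 100 MiB, so every per-pixel operation is repeated tens of millions of times.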

We first studied the algorithm and optimized it at the algorithmic level. The original implementation was written generically and therefore contained some inefficient calculations. Reworking those calculations gave a performance improvement of 1:4. We then learned that Intel Pentium processors offer advanced instruction sets that ordinary x86 compilers do not make use of, so we studied the MMX, SSE and SSE2 instruction sets. Our analysis showed SSE and SSE2 to be best suited to the algorithm we were optimizing.

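SSE (Streaming SIMD Extensions) and SSE2 use the SIMD (Single Instruction, Multiple Data) execution model. The SIMD model packs several data elements into one wide register: MMX packs bytes, words or double words into 64-bit registers, SSE packs four single-precision floats into 128-bit XMM registers, and SSE2 extends the 128-bit registers to packed integers and double-precision floats. A single instruction then operates on all the packed elements at once. This execution model suits image processing algorithms well, because the same operation is typically applied to every pixel.

[Figure: SIMD execution model]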
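As a minimal sketch of the SIMD model applied to 16-bit gray levels, the function below adds a constant offset to every pixel, eight pixels per instruction, using SSE2 intrinsics from `<emmintrin.h>`. The brightness offset is only an illustration chosen for this example; it is not the project's non-linear algorithm, which is not described here. The function name `add_offset_sse2` is hypothetical, and the code assumes an x86 target with SSE2 enabled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics (requires an x86 target with SSE2)
#include <vector>

// Hypothetical helper: add `offset` to each 16-bit gray level in place,
// saturating at 65535 instead of wrapping around.
void add_offset_sse2(std::uint16_t* pixels, std::size_t count,
                     std::uint16_t offset) {
    assert(pixels != nullptr || count == 0);

    // Broadcast the offset into all eight 16-bit lanes of a 128-bit register.
    const __m128i voffset = _mm_set1_epi16(static_cast<short>(offset));

    std::size_t i = 0;
    // Main loop: one load, one saturating add and one store per 8 pixels.
    for (; i + 8 <= count; i += 8) {
        __m128i v = _mm_loadu_si128(
            reinterpret_cast<const __m128i*>(pixels + i));
        v = _mm_adds_epu16(v, voffset);  // unsigned saturating add
        _mm_storeu_si128(reinterpret_cast<__m128i*>(pixels + i), v);
    }

    // Scalar tail for the remaining (count % 8) pixels.
    for (; i < count; ++i) {
        const std::uint32_t sum = std::uint32_t(pixels[i]) + offset;
        pixels[i] = static_cast<std::uint16_t>(sum > 65535u ? 65535u : sum);
    }
}
```

A caller would pass a contiguous pixel buffer, for example `add_offset_sse2(image.data(), image.size(), 10)` for a `std::vector<std::uint16_t> image`. Unaligned loads and stores are used so the sketch works on any buffer; aligned variants may be faster when the buffer is known to be 16-byte aligned.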

The MSVC tools we were using for development did not support SSE and SSE2, so we switched to the Intel C++ compiler to gain access to these instructions. Based on our study of the two instruction sets, we reimplemented the algorithm to make the best use of the SIMD model. This gave drastic improvements.

We delivered the improved implementation of the algorithm as libraries. The final performance improvement was 1:53 using SSE2 instructions and 1:43 using SSE instructions.
