Archive for the ‘Optimization’ Category

There comes a time in most complex programs where you want to ask a simple question like, ‘have I already processed a string with this id’? Linear searches through an array are easy to write and work well enough for small array sizes. Plus, the memory overhead of linear searches is fantastic, since it basically has none. But when your arrays can contain many elements, it is time to ditch those linear searches and go with an ordered map or unordered map. Continue reading ‘Ordered map vs. Unordered map – A Performance Study’ »

In memory image compression is essential for any type of visual computing application. Images alone can take a good amount of memory, and when creating an application which uses many images, you may soon find yourself using several gigabytes of memory. While this may be fine for your personal workstation, it may not be okay if you intend to release your application to the public. To work around this memory consumption problem, you’ll want to use in-memory image compression techniques to reduce your program’s memory footprint. Continue reading ‘In Memory Image Compression tutorial’ »

In a previous article about image processing with SSE, we used some basic SSE intrinsics to perform a very easy image manipulation routine, removing all blue from an image. This task was easy, since each pixel was 8 bits per component, with 4 components (ARGB). However, for more advanced image processing functions such as 2D convolution, it is preferable to work with each color component as a 32-bit floating point number rather than an 8-bit unsigned integer. Continue reading ‘Advanced Image Processing with SSE’ »

Using SSE to process images or video is essential to achieving good performance. Most popular multimedia applications use SSE to greatly accelerate application performance. Unfortunately, like everything in life, if SSE is used incorrectly it can actually perform worse than non-SSE code. This article will take you through some code and discuss the performance of each. Continue reading ‘Image Processing with SSE’ »

If you use Microsoft’s Visual Studio to develop your applications, chances are you either have the express or professional editions, which are free or $549 respectively. Unfortunately, neither of these editions comes with a code profiler! Instead, if you want to use a built-in code profiler for Visual Studio out of the box, you’ll need to have either the premium or ultimate edition for $5,469 or $11,899 respectively. No joke! Luckily, you don’t need to use Visual Studio’s built-in profiler to effectively and easily profile your code.

Continue reading ‘How to profile C++ code in Visual Studio for free’ »

The SSE instruction set can be a very useful tool in developing high performance applications. SSE, or Streaming SIMD Extensions, is particularly helpful when you need to perform the same instructions over and over again on different pieces of data. SSE vectors are 128-bits wide, and allow you to perform calculations for 4 different floating point numbers at the same time. SSE can also be configured to work on 2, 64-bit floating point numbers concurrently, 4, 32-bit integers, or even 16, 8-bit chars. Continue reading ‘Getting started with SSE programming’ »

High level languages such as C, C++, C#, FORTRAN, and Java all do a great job of abstracting the hardware away from the language. This means that programmers generally don’t have to worry about how the hardware goes about executing their program. However, in order to get the maximum amount of performance out of your programs, it is necessary to start thinking about how the hardware is actually going to execute your program.

Continue reading ‘Taking advantage of cache coherence in your programs’ »