admin | The Supercomputing Blog

Author Archive

Image twist and swirl algorithm

Image warps and other distortions are significantly more complicated than simple image processing techniques such as convolution. This tutorial covers how to twist an image around its center. The same code can be modified to produce swirls or other types of image warps.

Posted by admin on September 12, 2011 at 10:37 pm under C++, Graphics, OpenMP.
Tags: Algorithm, Double Precision, Image Algorithm, Image processing, Multisampling, OpenMP, Performance, Rotate, Scale, Swirl, Twist

CUDA Memory and Cache Architecture

Understanding the basic memory architecture of whatever system you’re programming for is necessary to create high performance applications. Most desktop systems consist of large amounts of system memory connected to a single CPU, which may have two or three levels of fully coherent cache. Before you get started with CUDA, you should read this to understand the basic memory hierarchy of modern CUDA capable compute devices.

Posted by admin on September 10, 2011 at 6:18 pm under CUDA.
Tags: Cache, Coalesce, Coherence, CUDA, Hierarchy, L1, L2, Memory, Tutorial

Image Processing with SSE

Using SSE to process images or video is essential to achieving good performance. Most popular multimedia applications use SSE to greatly accelerate application performance. Unfortunately, like everything in life, if SSE is used incorrectly it can actually perform worse than non-SSE code. This article will take you through some code and discuss the performance of each.

Posted by admin on August 11, 2011 at 12:11 am under C++, Graphics, Optimization, Windows.
Tags: Aligned, Cache Coherence, Graphics, Image, Integer, Memory bandwidth, Multimedia, Optimization, Performance, SSE, SSE2, Unaligned, Video

How to profile C++ code in Visual Studio for free

If you use Microsoft’s Visual Studio to develop your applications, chances are you either have the express or professional editions, which are free or $549 respectively. Unfortunately, neither of these editions comes with a code profiler! Instead, if you want to use a built-in code profiler for Visual Studio out of the box, you’ll need to have either the premium or ultimate edition for $5,469 or $11,899 respectively. No joke! Luckily, you don’t need to use Visual Studio’s built-in profiler to effectively and easily profile your code.


Posted by admin on November 28, 2010 at 2:38 pm under C++, Optimization, Windows.

Search algorithm with CUDA

Searching is a common task in computer science, and fortunately, it is also perfectly suited for CUDA. For this article, we’re talking about searching through an unsorted text file for a specific word or phrase. For example, if you have a 50 megabyte text file open in Microsoft Visual Studio, you’re sure to notice that searching for a word can take several seconds, which is more than any person wants to wait just to find a word in a document. This article will demonstrate a simple kernel which can perform simple string matches.


Posted by admin on July 28, 2010 at 6:41 pm under CUDA.
Tags: Algorithm, Atomic, CUDA, Search, Tutorial, unsorted, word

Oil Painting Algorithm

Taking an image and making it look like an oil painting is not only visually impressive, but also easy, from an algorithmic point of view. This page will show you how to write code to achieve the oil painting effect.


Posted by admin on April 24, 2010 at 5:23 pm under Graphics.
Tags: Algorithm, Graphics, Image processing, Oil Painting, Painting, Tutorial

Optimizing CUDA programs for GTX 400 series

Unlike most programming languages, CUDA is closely coupled to the hardware implementation. While x86 processors have not changed very much over the past 10 years, CUDA hardware has changed architecture significantly several times. CUDA was introduced with the 80 series, followed shortly by the 200 series, and now NVIDIA has begun selling cards in the 400 series, namely the GTX 480 and GTX 470.


Posted by admin on April 24, 2010 at 11:34 am under CUDA.
Tags: 400 series, CUDA, GTX 400, GTX 470, GTX 480, Optimization

Performance of sqrt in CUDA

Taking the square root of a floating point number is essential in many engineering applications. Whether you are doing N-body simulations, simulating molecules, or linear algebra, the ability to accurately and quickly perform thousands or even millions of square root operations is essential. Unfortunately, the square root functions on most CPUs are very time consuming, even with specialized SSE instructions. Fortunately, GPUs have specialized hardware to perform such square root operations extremely fast. CUDA, NVIDIA’s solution to extremely high performance parallel computing, puts the onboard specialized hardware to full use, and easily outperforms modern Intel or AMD CPUs by a factor of over a hundred.


Posted by admin on January 19, 2010 at 11:17 pm under CUDA.
Tags: CUDA, Experiment, Optimization, Performance, Sqrt

Image Convolution with GDI+

Image convolution is the most vital image processing algorithm available. Using simple 2-D convolution, you can blur, sharpen, emboss, and even detect edges in an image. Not only is convolution powerful, but it is also very easy to perform. Simply put, the value of a modified pixel is determined solely by its original value summed with the weighted values of its neighboring pixels. After the weighted sum is completed, a division takes place to normalize the value of the pixel, usually so that the brightness of the image remains the same. Sometimes, an offset can be added after the normalization for certain effects.
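The weighted sum, normalization, and offset described above can be sketched in a few lines of plain C++. This is only an illustration and not the GDI+ code from the full article: the function name convolve3x3, the 8-bit grayscale row-by-row layout, the fixed 3x3 kernel size, and the choice to leave border pixels unchanged are all assumptions made here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original article): applies a 3x3 kernel to
// an 8-bit grayscale image stored row by row. Border pixels are left unchanged.
// The caller must pass a nonzero divisor.
std::vector<std::uint8_t> convolve3x3(const std::vector<std::uint8_t>& src,
                                      int width, int height,
                                      const int (&kernel)[3][3],
                                      int divisor, int offset)
{
    std::vector<std::uint8_t> dst(src);  // start from a copy so borders keep their values
    for (int y = 1; y + 1 < height; ++y)
    {
        for (int x = 1; x + 1 < width; ++x)
        {
            // Weighted sum of the pixel and its eight neighbors.
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky)
            {
                for (int kx = -1; kx <= 1; ++kx)
                {
                    const std::size_t idx =
                        static_cast<std::size_t>((y + ky) * width + (x + kx));
                    sum += kernel[ky + 1][kx + 1] * src[idx];
                }
            }
            // Normalize, apply the optional offset, and clamp to the 0..255 range.
            const int value = sum / divisor + offset;
            dst[static_cast<std::size_t>(y * width + x)] =
                static_cast<std::uint8_t>(std::min(255, std::max(0, value)));
        }
    }
    return dst;
}
```

With a kernel of all ones and a divisor of 9, this behaves as a simple box blur: each interior pixel becomes the average of its 3x3 neighborhood, so overall brightness is roughly preserved.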

Posted by admin on January 18, 2010 at 10:47 pm under Graphics.
Tags: Convolution, GDI, Graphics, Image processing, Tutorial

How to download HTML with C++

This tutorial shows you how to download an HTML page, or any other type of web page, using C++ or C. It applies only to Windows programs, since the methods described here use a library written for Windows only. We will call a function that reads a web page and saves it to a file. After the file is created and saved, we can read it through standard methods. At first glance, this method may seem very inefficient, since hard drive accesses take a long time. In practice, the vast majority of the performance penalty comes from downloading the web page from the internet. Since we read the file directly after creating it, the file is very likely still in cache, so the extra read costs little.

Step 1: Include and link the appropriate library

#include <urlmon.h>

Aside from including the library header file, you will need to link urlmon.lib. To do this, right click on your project in the Solution Explorer window, and select Properties from the pop-up menu. Go to the Configuration Properties -> Linker -> Input window. In the “Additional Dependencies” field, type urlmon.lib and press enter. Apply your changes, and close the project properties window.
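If you would rather not edit the project settings, the Microsoft compiler also accepts a linker directive placed directly in a source file. The fragment below is a minimal sketch of that alternative, assuming you are building with Visual C++. The _MSC_VER guard is an addition here so that other compilers simply skip the directive.

```cpp
// Alternative to the "Additional Dependencies" project setting (Visual C++ only).
#ifdef _MSC_VER
#pragma comment(lib, "urlmon.lib")	// ask the linker to pull in urlmon.lib
#endif
```

Either approach should be enough on its own. The project setting keeps build configuration in one place, while the pragma keeps the dependency next to the code that needs it.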

Step 2: Choose Unicode or ASCII for your project

There are two character sets that can be used in an application. The first, ASCII, uses only 8 bits, or 1 byte, per character. ASCII is often considered outdated, but is much simpler to deal with. Unicode, as used on Windows, uses 16 bits per character, which facilitates multilingual programs. The urlmon library has two sets of functions: one for ASCII and one for Unicode. I have set the project in this tutorial to compile with the ASCII character set. You may choose to use Unicode, of course, but it is important that you know which character set your project is set to compile with. To find out, open the project properties window, and go to the Configuration Properties -> General window. Notice what the “Character Set” field is set to. “Not Set” corresponds to using the ASCII character set.
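If your project is set to Unicode instead, the urlmon functions expect wide (wchar_t) strings rather than char strings. The sketch below shows one simple way to widen a string, under the assumption that the text is plain 7-bit ASCII, as a typical web address is. The name widenAscii is made up for this example, and text in other encodings would need a real conversion routine such as the Windows MultiByteToWideChar function.

```cpp
#include <string>

// Hypothetical helper: widens a pure 7-bit ASCII string into a wide string.
// This is NOT a general encoding conversion; it only copies each byte into
// one wide character, which is correct for ASCII text only.
std::wstring widenAscii(const std::string& narrow)
{
    std::wstring wide;
    wide.reserve(narrow.size());
    for (unsigned char c : narrow)	// unsigned char avoids sign extension
    {
        wide.push_back(static_cast<wchar_t>(c));
    }
    return wide;
}
```

In a Unicode build, the result's c_str() could then be passed wherever a wide string is required.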

Step 3: Download the web page to a file

To download the web page, simply use the URLDownloadToFile function. This function returns an HRESULT error code, which is really just a long. When dealing with HRESULTs, keep in mind that zero means success, which is easy to get backwards if you test the value as though it were a boolean. Therefore, it is always best to explicitly use the error code definitions, such as S_OK for success.

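#include <windows.h>
#include <urlmon.h>	// also link urlmon.lib (see Step 1)
#include <fstream>
#include <iostream>
#include <string>
using namespace std;

int main()
{
	string webAddress;	// std::string avoids overflowing a fixed-size buffer
	const char szFileName[] = "result.html";

	cout << "Please enter web address: ";	// example: https://supercomputingblog.com
	cin >> webAddress;

	// The explicit ASCII variant matches the character set chosen in Step 2.
	HRESULT hr = URLDownloadToFileA(NULL, webAddress.c_str(), szFileName, 0, NULL);
	if (hr == S_OK)
	{
		cout << "Success!\n";
		// Open the file and print it to the console window
		// Since the file was just written, it should still be in cache somewhere.
		ifstream fin(szFileName);
		string line;
		while (getline(fin, line))	// handles lines of any length
		{
			cout << line << "\n";
		}
	}
	else
	{
		cout << "Operation failed with error code: 0x" << hex << hr << "\n";
	}
	return 0;
}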
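A note on the error check: HRESULT values are signed, and in general any non-negative value counts as success while negative values indicate failure. Windows provides the SUCCEEDED and FAILED macros for this test. The sketch below mirrors that rule in portable C++ purely for illustration. The names HResultSketch, kOk, kFalse, kFail, and succeeded are stand-ins invented here; in a real Windows program you would use HRESULT, S_OK, S_FALSE, E_FAIL, and SUCCEEDED directly.

```cpp
#include <cstdint>

// Portable stand-ins for illustration only; on Windows use HRESULT, S_OK,
// S_FALSE, E_FAIL and the SUCCEEDED/FAILED macros instead.
using HResultSketch = std::int32_t;
constexpr HResultSketch kOk    = 0;	// same value as S_OK
constexpr HResultSketch kFalse = 1;	// same value as S_FALSE (a success code)
constexpr HResultSketch kFail  = static_cast<HResultSketch>(0x80004005u);	// same value as E_FAIL

// Mirrors SUCCEEDED(hr): any non-negative code counts as success.
constexpr bool succeeded(HResultSketch hr)
{
    return hr >= 0;
}
```

Comparing against S_OK, as the tutorial code does, is the stricter check. A SUCCEEDED-style test is the more general habit when a function may return other success codes.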

Download the source code

You can download the source code for this tutorial here

Posted by admin on December 30, 2009 at 12:37 am under Windows.
Tags: C++, HTML, Internet, Tutorial, Web, Windows