MSc Project Ideas

These are areas where I would be interested in hearing your project ideas. I have included some of the obvious ones myself, but I would be happy to discuss your suggestions of related projects.

CUDA GPU Based Parallel Computing

 

CUDA is NVIDIA's environment for using graphics processors as general-purpose computing devices. Since NVIDIA launched CUDA, large performance gains have been reported for a variety of applications, often 5-20 times faster. There is consequently great scope to move existing algorithms designed for single processors, or their data-intensive bottlenecks, to the thousand identical threads offered by CUDA.

You will be given access to our new personal supercomputer for these projects.

Here are some ideas:

  • Benchmarking CUDA. We have an initial set of benchmarks for 32-bit CUDA and multicore PCs. Your project would be to implement, test, balance, and extend these benchmarks for our state-of-the-art hardware (using Ubuntu).
  • Accelerating WEKA. WEKA is a very well-known machine learning toolset implemented in Java. Your project would be to rewrite one of its training algorithms in CUDA C for parallel execution (using Windows 7). As there are several training algorithms, there are a number of projects here.
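
Both ideas above come down to measuring how much faster an accelerated version runs than a baseline. As a rough illustration only, here is a minimal Python timing sketch. The helper names `median_runtime` and `speedup` are hypothetical and are not part of the existing benchmark set. When timing real GPU code you would also need to wait for the device to finish before stopping the clock, which this sketch does not cover.

```python
import statistics
import time


def median_runtime(fn, *args, repeats=5):
    """Return the median wall-clock time in seconds over several runs of fn(*args)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline_seconds, accelerated_seconds):
    """Return how many times faster the accelerated version is than the baseline."""
    if accelerated_seconds <= 0:
        raise ValueError("accelerated_seconds must be positive")
    return baseline_seconds / accelerated_seconds
```

For example, a baseline taking 10 seconds and an accelerated version taking 2 seconds gives `speedup(10.0, 2.0) == 5.0`, the lower end of the range quoted above.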

 

Sentiment Analysis on Twitter

What are people saying about companies on Twitter? Is it positive ("Just had great coffee at Starbucks"), negative ("starbucks service sucks"), or irrelevant ("OK meet at starbucks")? Sentiment analysis is of interest both to marketing people and to those interested in Natural Language Processing. See http://en.wikipedia.org/wiki/Sentiment_analysis for more background.

We have a large data set of pre-recorded tweets from the Web Person Search competition to start this project (see http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=8213&copyownerid=9110 for details of several tasks).
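
To make the three-way labelling concrete, here is a deliberately naive Python sketch that counts words from two tiny hand-written lexicons. The word lists and the `classify` function are hypothetical illustrations, not an existing tool or a recommended method. A real project would more likely learn its features from the labelled tweets.

```python
import re

# Hypothetical toy lexicons, invented for this illustration only.
POSITIVE = {"great", "good", "love", "excellent", "best"}
NEGATIVE = {"sucks", "bad", "hate", "terrible", "worst"}


def classify(tweet: str) -> str:
    """Label a tweet as 'positive', 'negative', or 'irrelevant'."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    # No sentiment words were found, or positive and negative words cancel out.
    return "irrelevant"
```

On the three example tweets above this returns "positive", "negative", and "irrelevant" respectively. It fails on anything subtler, such as negation or sarcasm, which is where the project work starts.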

 

Web Person Search

 

In the Web Person Search task we try to find which web pages correspond to each of a number of people who share the same name. We have several data sets where the pages have been manually analysed, so it is possible to calculate how accurate your program is. We have had several successful projects in this area, so there are a number of avenues to explore.
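
As an illustration of scoring a program against the manually analysed pages, here is a short Python sketch of purity, one common clustering measure. The choice of purity and the function name are assumptions made for this example, since the text above does not say which measure is used. Other measures may suit the task better.

```python
from collections import Counter, defaultdict


def purity(predicted, gold):
    """Fraction of pages that carry the majority gold label of their predicted cluster."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not predicted:
        return 0.0
    # Group the gold labels by the cluster each page was assigned to.
    clusters = defaultdict(list)
    for cluster_id, person in zip(predicted, gold):
        clusters[cluster_id].append(person)
    # In each cluster, count the pages belonging to its most frequent person.
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(gold)
```

For instance, `purity([0, 0, 0, 1, 1], ["a", "a", "b", "b", "b"])` gives 0.8, because four of the five pages sit in a cluster dominated by their own person. Purity alone rewards putting every page in its own cluster, so it is normally paired with a second measure that penalises over-splitting.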

This is essentially a problem in data clustering, and there are several algorithms to look into. I would be especially interested in someone willing to tackle GPU-based clustering.
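
To give a flavour of the clustering problem, here is a minimal Python sketch that groups pages whose word overlap (Jaccard similarity) reaches a threshold, using single-link merging. The algorithm, the threshold value, and the function names are assumptions chosen for brevity, not the method used in earlier projects. The pairwise similarity loop is the kind of independent work that might map onto GPU threads.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pages(pages, threshold=0.5):
    """Single-link clustering: pages linked by similarity >= threshold share a label."""
    token_sets = [set(page.lower().split()) for page in pages]
    parent = list(range(len(pages)))  # union-find forest, one node per page

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(pages)):
        for j in range(i + 1, len(pages)):
            if jaccard(token_sets[i], token_sets[j]) >= threshold:
                parent[find(i)] = find(j)

    # Relabel the roots as consecutive cluster ids in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(pages))]
```

Given three made-up page texts, "john smith professor physics university", "john smith physics professor lecture", and "john smith guitar band tour", the first two share a cluster and the third stands alone.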