CSC 654 - Spring 2022 - Graduate/Honors Project

This material is for graduate students and undergraduate students doing “contract honors” for this class only.

For this project, you will explore what current research in algorithms looks like, both in terms of topics studied and standards for research writing. Note that “research” here means the focus is on scholarly/scientific research, and your project must be grounded in sound peer-reviewed research publications. Web sites, Wikipedia, or books are not sufficient! For an overview of computer science research (albeit with a focus on computer security at one point) read Research in Computer Science and Computer Security.

You will first select a project topic, and then you will locate research papers to read to learn about current research in that area. The final deliverable will be a paper that you write summarizing current research related to that topic. You can either take a broad look at work in an area (summarizing results at a high level from 5–8 papers) or you can take a “deep-dive” into specific results (a more in-depth look at 2–3 papers). Ideally your topic should draw on papers from the past five years, but there are certainly classic results that can form a good project as well. If at any point you are uncertain about what is expected, or if you would like some guidance, please contact me – don’t just make guesses about what you should do! The major “project milestones” are explained below.

Topic selection (due Tuesday, March 29): Submit a brief one paragraph description of your project topic in Canvas by March 30, along with some references (at least two) that you will be basing your project on. Note that you will need to have scanned through some references by this time to make sure the project you’re proposing is feasible. Don’t just pick a topic that sounds interesting, but make sure you can read, comprehend, and digest the level of presentation in your references.
Progress report (due Tuesday, April 19): By the time of the progress report your project should be pretty well investigated, meaning you’ve collected and read the main reference papers, and thought through what you’re going to write about in your report. You should turn in a progress report that contains a basic introduction section to your project report (this should describe the topic you’re studying at a high level and describe what you will be giving details on), as well as an outline of your report and a list of bibliographic references that you plan on using. You need at least two solid references (peer-reviewed conference or journal papers). Make sure you use appropriate citation styles (including full conference/journal names, dates, and page numbers – see here).

The progress report is graded, and counts for 20% of your overall project grade. However, the most important part of the progress report is that I will provide comments and suggestions in Canvas soon after you submit it. This is where I tell you if you are going in the right direction, and give suggestions for things you should include for a good final report. You can turn in your progress report at any point, and I will provide feedback promptly. You are encouraged to turn in your progress report before the deadline so you can get early feedback on the direction you are taking!
Final report (due Thursday, May 5, noon): Your final report is due at the scheduled final exam time.

Paper Format and Level

Your paper should follow the same general format and writing standards of published algorithms research – the papers you are reading. See the “Research in Computer Science” link above for some information about standard research paper structure. This is a graduate-level computer science class, so technical depth is important. Analysis, theorems, and proofs are certainly important and should be included as appropriate. Since this is a research topic, it’s also important to think about (and write about) what questions are left unanswered by the current research that should be investigated (“open problems”). As for the length of the paper, something around 12 pages (11 or 12 point font, single spaced) should be enough to cover the important parts. There’s no need to try to write about everything that’s out there related to your topic.

Remember that the writing should be entirely your own – it is not acceptable to copy text from a paper or the web. Theorem statements may be copied exactly if cited properly, but any proofs should be your own versions (based on the logical outline of the paper you are summarizing). My general advice to people is this: Investigate and read as much about the topic as you can until you really understand it, taking some light notes. Then you should know the topic well enough to put aside all your references, and do the writing without looking at the original material. That ensures that the writing is coming from you and not the reference material.

Finding a Topic

Finding a topic can be a challenge, and there are two general approaches. First, you can look through recent conferences and journals that publish algorithms research, and look for titles of papers that interest you. Alternatively, you can look into some of my pre-selected topic areas below. While the second approach may be easier, you may find something that is a better match for your interests using the first approach.

I’m open to pretty much any topic in algorithms research. The one basic requirement is that the algorithms must fit in a well-defined model that can be analyzed. Algorithms of the style that you see in AI or data science, that are often just “throw stuff at it and see what happens” (machine learning, genetic algorithms, etc.), are generally not acceptable, although see the machine learning topic suggestion below for an acceptable approach.

Approach 1: Scanning High-Quality Conferences and Journals

There are a few top conferences and journals that consistently have the “best in the world” theoretical computer science research, including algorithms work (and other topics you might see in the Theory of Computing class). Note that since this is very cutting-edge material, it can be hard for beginning students to understand easily. Here’s a list:

There are also some specifically algorithms-focused publication venues, which tend to publish more accessible algorithms research:

And finally, there are some that focus on algorithms from an implementation and experimentation approach:

Approach 2: Considering some pre-selected topic areas

I have listed below a few general areas that I find interesting, or that I think students might find interesting. If the general area interests you, I would suggest looking beyond the basic references I give – for example, follow references and use Google Scholar to find related work. You might find something that’s a perfect fit for your interest by expanding from these topics just a little bit!

Different Models of Algorithm Analysis: Standard “textbook” algorithm analysis focuses on worst-case analysis of algorithms. Is that always the best measure? What other ways might algorithms be analyzed? Tim Roughgarden wrote an introductory “Beyond the Worst-Case Analysis of Algorithms” survey which gives an overview of some of this work, and has pointers to more.

Different Models of Computing and Sorting: What if the input to your problem is too large to fit in memory? There are some classic papers on external memory algorithms, and some recent interesting results finding lower bounds for external memory integer sorting.

Streaming Algorithms for Graphs: This is another take on algorithms when the input is too large to fit in memory, with a lot of work on processing massive graphs (think of large social networks). A good starting point for this is the paper “Graph Stream Algorithms: A Survey” (or this PDF).

Dynamic Graph Algorithms: What if your input is changing dynamically over time, and you need to keep some result on the graph up to date? For example, a graph is slowly changing, and you need to maintain a shortest paths tree. There has been a lot of work on this lately, including work specifically on dynamic approximate shortest paths and a series of results from Monika Henzinger.

Graph sparsification and sketches: Yet another “what if you have too much data” questions, but in this case you compute a simplified version of your large input that reflects the properties you are analyzing. A good paper on graph sketches was published in 2012, but there is more recent work as well

Experimental Algorithmics: There are some interesting questions that can be asked and explored through experiments with algorithms. This does not just mean “we implemented an algorithm and ran it” though – research in this area must take a rigorous and careful approach, and aimed at answering research questions. A good starting point is “Experimental Algorithmics” by Catherine McGeoch, and you can dig further into any of the three examples of experiment-driven algorithms research in that paper.

Algorithmic Aspects of Machine Learning: Machine learning algorithms are generally not studied in the way that an algorithms researcher would look at algorithms. One group trying to bridge this gap is Sanjeev Arora’s machine learning with provable guarantees group at Princeton. I don’t know much about this area, so can’t say more, but if you are interested in machine learning then looking though these publications might lead to an interesting project.

Algorithmic Problems in Geometry and GIS: The broad area of Computational Geometry is an active area of algorithms research (with its own well-respected conference), but a certain subclass of this work deals with interesting applied problems in Geographic Information Systems, including some recent work in computing Flood-risk analysis on terrains.

For the truly ambitious: There are a few recent algorithms research results I have mentioned in class, which are truly cutting-edge. These deal with some pretty hairy analysis though, so take a look at the key paper and make sure you can process that level of math before taking one of these on. These results include a new result on integer multiplication, a new result on matrix multiplication, and a new result on solving sparse linear systems.