# PolyTopix

In the last couple of years, I developed some (research) interest in recommendation algorithms and speech synthesis. My interests in these areas are geared towards developing an automated personalized news radio.

Almost all of us are interesting in consuming news. In this internet age, there is no dearth of news sources. Often we have too many sources. We tend to “read” news from several sources / news aggregators, spending several hours per week. Most of the time we are simply interested in the top and relevant headlines.

PolyTopix is my way of simplifying the process of consuming top and relevant news. The initial prototype is here. The website “reads” several news tweets (collected from different sources) and ordered based on a machine learning algorithm. Users can login and specify their individual interests (and zip code) to narrow down the news.

Try PolyTopix let me know your feedback. Here are some upcoming features :

• Automatically collect weather news (and local news) based on your location.
• Reading more details of most important news.
• News will be classified as exciting/sad/happy etc., (based on a machine learning algorithm) and read with the corresponding emotional voice.

Essentially PolyTopix is aimed towards a completely automated and personalized news radio, that can “read” news from across the world anytime with one click.

————————————————————————————————————————

# Book Review of “Boosting : Foundations and Algorithms”

Following is my review of Boosting : Foundations and Algorithms (by Robert E. Schapire and Yoav Freund) to appear in the  SIGACT book review column soon.

—————————————————————————————————————-

Book : Boosting : Foundations and Algorithms (by Robert E. Schapire and Yoav Freund)
Reviewer : Shiva Kintali

Introduction

You have k friends, each one earning a small amount of money (say 100 dollars) every month by buying and selling stocks. One fine evening, at a dinner conversation, they told you their individual “strategies” (after all, they are your friends). Is it possible to “combine” these individual strategies and make million dollars in an year, assuming your initial capital is same as your average friend ?

You are managing a group of k “diverse” software engineers each one with only an “above-average” intelligence. Is it possible to build a world-class product using their skills ?

The above scenarios give rise to fundamental theoretical questions in machine learning and form the basis of Boosting. As you may know, the goal of machine learning is to build systems that can adapt to their environments and learn from their experience. In the last five decades, machine learning has impacted almost every aspect of our life, for example, computer vision, speech processing, web-search, information retrieval, biology and so on. In fact, it is very hard to name an area that cannot benefit from the theoretical and practical insights of machine learning.

The answer to the above mentioned questions is Boosting, an elegant method for driving down the error of the combined classifier by combining a number of weak classifiers. In the last two decades, several variants of Boosting are discovered. All these algorithms come with a set of theoretical guarantees and made a deep practical impact on the advances of machine learning, often providing new explanations for existing prediction algorithms.

Boosting : Foundations and Algorithms, written by the inventors of Boosting, deals with variants of AdaBoost, an adaptive boosting method. Here is a quick explanation of the basic version of AdaBoost.

AdaBoost makes iterative calls to the base learner. It maintains a distribution over training examples to choose the training sets provided to the base learner on each round. Each training example is assigned a weight, a measure of importance of correctly classifying an example on the current round. Initially, all weights are set equally. On each round, the weights of incorrectly classified examples are increased so that, “hard” examples get successively higher weight. This forces the base learner to focus its attention on the hard example and drive down the generalization errors.

AdaBoost is fast and easy to implement and the only parameter to tune is the number of rounds. The actual performance of boosting is dependent on the data.

Summary

Chapter 1 provides a quick introduction and overview of Boosting algorithms with practical examples. The rest of the book is divided into four major parts. Each part is divided into 3 to 4 chapters.

Part I studies the properties and effectiveness of AdaBoost and theoretical aspects of minimizing its training and generalization errors. It is proved that AdaBoost drives the training error down very fast (as a function of the error rates of the weak classifiers) and the generalization error arbitrarily close to zero. Basic theoretical bounds on the generalization error show that AdaBoost overfits, however empirical studies show that AdaBoost does not overfit. To explain this paradox, a margin-based analysis is presented to explain the absence of overfitting.
Part II explains several properties of AdaBoost using game-theoretic interpretations. It is shown that the principles of Boosting are very intimately related to the classic min-max theorem of von Neumann. A two-player (the boosting algorithm and the weak learning algorithm) game is considered and it is shown that AdaBoost is a special case of a more general algorithm for playing a repeated game. By reversing the roles of the players, a solution is obtained for the online prediction model thus establishing a connection between Boosting and online learning. Loss minimization is studied and AdaBoost is interpreted as an abstract geometric framework for optimizing a particular objective function. More interestingly, AdaBoost is viewed as a special case of more general methods for optimization of an objective function such as coordinate descent and functional gradient descent.

Part III explains several methods of extending AdaBoost to handle classifiers with more than two output classes. AdaBoost.M1, AdaBoost.MH and AdaBoost.MO are presented along with their theoretical analysis and practical applications. RankBoost, an extension of AdaBoost to study ranking problems is studied. Such an algorithm is very useful, for example, to rank webpages based on their relevance to a given query.

Part IV is dedicated to advanced theoretical topics. Under certain assumptions, it is proved that AdaBoost can handle noisy-data and converge to the best possible classifier. An optimal boost-by-majority algorithm is presented. This algorithm is then modified to be adaptive leading to an algorithm called BrownBoost.

Many examples are given throughout the book to illustrate the empirical performance of the algorithms presented. Every chapter ends with Summary and Bibliography mentioning the related publications. There are well-designed exercises at the end of every chapter. Appendix briefly outlines some required mathematical background.

Opinion

Boosting book is definitely a very good reference text for researchers in the area of machine learning. If you are new to machine learning, I encourage you to read an introductory machine learning book (for example, Machine Learning by Tom M. Mitchell) to better understand and appreciate the concepts. In terms of being used in a course, a graduate-level machine learning course can be designed from the topics covered in this book. The exercises in the book can be readily used for such a course.

Overall this book is a stimulating learning experience. It has provided me new perspectives on theory and practice of several variants of Boosting algorithms. Most of the algorithms in this book are new to me and I had no difficulties following the algorithms and the corresponding theorems. The exercises at the end of every chapter made these topics much more fun to learn.

The authors did a very good job compiling different variants of Boosting algorithms and achieved a nice balance between theoretical analysis and practical examples. I highly recommend this book for anyone interested in machine learning.

—————————————————————————————————————-

# TrueShelf 1.0

One year back (on 6/6/12) I announced a beta version of TrueShelf, a social-network for sharing exercises and puzzles especially in mathematics and computer science. After an year of testing and adding new features, now I can say that TrueShelf is out of beta.

TrueShelf turned out to be a very useful website. When students ask me for practice problems (or books) on a particular topic, I simply point them to trueshelf and tell them the tags related to that topic. When I am advising students on research projects, I first tell them to solve all related problems (in the first couple of weeks) to prepare them to read research papers.

Here are the features in TrueShelf 1.0.

• Post an exercise (or) multiple-choice question (or) video (or) notes.
• Solve any multiple-choice question directly on the website.
• Add topic and tags to any post
• Show text-books related to a post
• Show related posts for every post.
• View printable version (or) LaTex version of any post.
• Email / Tweet / share on facebook (or) Google+ any post directly from the post.
• Like (a.k.a upvote) any post.

Feel free to explore TrueShelf, contribute new exercises and let me know if you have any feedback (or) new features you want to see. You can also follow TrueShelf on facebooktwitter and google+. Here is a screenshot highlighting the important features.

# Recreational Math Books – Part I

Most of us encounter math puzzles during high-school. If you are really obsessed with puzzles, actively searching and solving them, you will very soon run out of puzzles !! One day you will simply realize that you are not encountering any new puzzles. No more new puzzles. Poof. They are all gone. You feel like screaming “Give me a new puzzle“. This happened to me around the end of my undergrad days. During this phase of searching for puzzles, I encountered Graceful Tree Conjecture and realized that there are lots of long-standing open “puzzles”. I don’t scream anymore. Well… sometimes I do scream when my proofs collapse. But that’s a different kind of screaming.

Sometimes, I do try to create new puzzles. Most of the puzzles I create are either very trivial to solve (or) very hard and related to long-standing conjectures. Often it takes lots of effort and ingenuity to create a puzzle with right level of difficulty.

In today’s post, I want to point you to some of the basic puzzle books that everybody should read. So, the next time you see a kid screaming ”Give me a new puzzle“, simply point him/her to these books. Hopefully they will stop screaming for sometime. If they comeback to you soon, point them to Graceful Tree Conjecture

I will mention more recreational math books in part 2 of this blog post.

# How to teach Algorithms ?

Algorithms are everywhere. They help us travel efficiently, retrieve information from huge data sets, secure money transactions, recommend movies, books, videos, predict stock market etc. It is very tough to think about a daily task that does not benefit from efficient algorithms. Often the algorithms behind most of these tasks are very simple, yet their impact is tremendous. When I say “simple”, they are simple to people who know them. Most common people consider algorithms too mathematical. They assume that it is beyond their capability to understand algorithms. What they do not realize is that algorithms are often simple extensions of our daily rational thinking process.

For example, almost everybody considers it stupid to buy an item for $24 and pay$6 shipping, if there is free shipping for orders of more than $25. If you add one more item of cost$1, you saved $5. Also, when we pack our bags for travel, most of us try to do it as “efficiently” as possible, trying to carry as many “valuable” things as possible while trying to avoid paying for extra luggage on airlines. We consider these rational choices. Algorithms are simply “step-by-step procedures to achieve these rational objectives“. If you are an instructor (like me), teaching Algorithms, you might have noticed that most students (around 70%) are intimidated when they take a basic algorithms course. Most of them DO end up doing well in the course, but they consider the process painful. If they reflect on the course, most often they say “that course was a nightmare, I don’t remember what I learnt in that course”. They do not seem to have enjoyed the course. Probably they might remember 30% of the material. This is definitely not acceptable for such a fundamental course. Often, when I comeback to my office after teaching, I say to myself ”I should have given them one more example, to help them get better intuition”. You can always do better job if you are given more time. Alas, we have time-bounded classes and almost infinite details to cover. We expect students to learn some concepts on their own and develop their own intuitions. We want to give “good” reading material. So, their understanding depends on how well these readings are written. Today’s post is about “How to teach Algorithms ?” Here is one of my experiences, while I was teaching an undergrad algorithms course at GeorgiaTech. I was teaching dynamic programming. I gave several examples to make sure that they understand the paradigm. At the end of the class, almost 50% of class had questions, because this is the first time they saw dynamic programming. I told them to see me in my office hours. I quickly implemented a java applet to show how the matrix entries are filled by the algorithm, step by step. When I showed them this applet and a pseudo-code side-by-side (highlighting every current line of code being executed), almost all of the students understood the main idea behind dynamic programming. Some of them also said “it is easy”. I was glad and wanted to add more algorithms in my code. The Kintali Language The goal is to have a very simple to understand “executable pseudo-code” along with an animation framework that “understands” this language. So I started designing a new language and called it Kintali language, for lack of a better word . I borrowed syntax from several pseudo-codes. It took me almost two years to implement all the necessary features keeping in mind a broad range of algorithms. I developed an interpreter to translate this language into an intermediate representation with callbacks to an animation library. This summer, I finally implemented the animation library and the front-end in Objective-C. The result is the Algorithms App for iPad, released on Sep 20, 2012. This is my attempt to teach as many algorithms as possible by intuitive visualization and friendly exercises. Features Current version has Sorting algorithms (Insertion Sort, Bubble Sort, Quick Sort and MergeSort). The main advantage of my framework will be demonstrated once I add graph algorithms. I will add some “adaptive” exercises and games too. For example, one of the games is to predict what is the next matrix entry that will be filled next by an algorithm. Also, I have the necessary framework to visually demonstrate recursion (by showing the recursion tree), dynamic programming (by showing the status (filled or waiting to be filled) of matrix entries), divide and conquer (by splitting the data) etc. Since the framework is ready, adding new algorithms will not take much time. Here is a screenshot of Quick Sort in action. Platforms After I developed the interpreter, I was wondering what platforms to support first. I went ahead with iPad because I developed the interpreter in C. Objective-C is a superset of C. The Mac desktop version should be available in couple of weeks. In the long run I will implement the Android, Linux and Windows 8 versions too. Goal The big goal here is to “almost” replace an algorithms textbook. I added a button to access relevant wikipedia articles (within the app) describing the corresponding algorithms. With simple pseudo-code, intuitive animations, adaptive exercises and easy access to online articles, I think this goal is definitely achievable. Questions I have some quick questions to all the instructors and students of Algorithms. • What algorithms do you want to see soon i.e., what algorithms did you have most difficulty learning/teaching ? • What are some current methods you use to overcome the limitations of “static” textbooks ? • Any more ideas to make algorithms more fun and cool to learn/teach ? I wanted to write this post after achieving at least 100 downloads. I assumed this will take a month. To my surprise, there were 100 downloads from 15 countries, in the first 40 hours. I guess I have to add new features faster than I planned. TrueShelf and Algorithms App are new additions to my hobbies. The others being Painting, BoardGames and Biking. Man’s got to have some hobbies. Follow Algorithms App on Facebook, Twitter and Google+. ## Download Algorithms App for iPad —————————————————————————————————————————————————— # Open Problems from Lovasz and Plummer’s Matching Theory Book I always have exactly one bed-time mathematical book to read (for an hour) before going to sleep. It helps me learn new concepts and hopefully stumble upon interesting open problems. Matching Theory by Laszlo Lovasz and Michael D. Plummer has been my bed-time book for the last six months. I bought this book 3 years back (during my PhD days) but never got a chance to read it. This book often disappears from Amazon’s stock. I guess they are printing it on-demand. If you are interested in learning the algorithmic and combinatorial foundations of Matching Theory (with a historic perspective), then this book is a must read. Today’s post is about the open problems mentioned in Matching Theory book. If you know the status (or progress) of these problems, please leave a comment. —————————————————————————- 1 . Consistent Labeling and Maximum Flow Conjecture (Fulkerson) : Any consistent labelling procedure results in a maximum flow in polynomial number of steps. —————————————————————————- 2. Toughness and Hamiltonicity The toughness of a graph $G$, $t(G)$ is defined to be $+\infty$, if $G = K_n$ and to be $min(|S|/c(G-S))$, if $G \neq K_n$. Here $c(G-S)$ is the number of components of $G-S$. Conjecture (Chvatal 1973) : There exists a positive real number $t_0$ such that for every graph $G$, $t(G) \geq t_0$ implies $G$ is Hamiltonian. —————————————————————————- 3. Perfect Matchings and Bipartite Graphs Theorem : Let $X$ be a set, $X_1, \dots, X_t \subseteq X$ and suppose that $|X_i| \leq r$ for $i = 1, \dots, t$. Let $G$ be a bipartite graph such that a) $X \subseteq V(G)$, b) $G - X_i$ has a perfect matching , and c) if any edge of $G$ is deleted, property (b) fails to hold in the resulting graph. Then, the number of vertices in $G$ with degree $\geq 3$ is at most $r^3 {t \choose 3}$. Conjecture : The conclusion of the above theorem holds for non-bipartite graphs as well. —————————————————————————- 4. Number of Perfect Matchings Conjecture (Schrijver and W.G.Valiant 1980) : Let $\Phi(n,k)$ denote the minimum number of perfect matchings a k-regular bipartite graph on 2n points can have. Then, $\lim_{n \to \infty} (\Phi(n,k))^{\frac{1}{n}} = \frac{(k-1)^{k-1}}{k^{k-2}}$. —————————————————————————- 5. Elementary Graphs Conjecture : For $k \geq 3$ there exist constants $c_1(k) > 1$ and $c_2(k) > 0$ such that every k-regular elementary graph on 2n vertices, without forbidden edges , contains at least $c_2(k){\cdot}c_1(k)^n$ perfect matchings. Furthermore $c_1(k) \to \infty$ as $k \to \infty$. —————————————————————————- 6. Number of colorations Conjecture (Schrijver’83) : Let G be a k-regular bipartite graph on 2n vertices. Then the number of colorings of the edges of G with k given colors is at least $(\frac{(k!)^2}{k^k})^n$. —————————————————————————- Theorem : A graph is perfect if and only if it does not contain, as an induced subgraph, an odd hole or an odd antihole. —————————————————————————- # TrueShelf I have been teaching (courses related to algorithms and complexity) for the past six years (five years as a PhD student at GeorgiaTech, and the past one year at Princeton). One of the most challenging and interesting part of teaching is creating new exercises to help teach the important concepts in an efficient way. We often need lots of problems to include in homeworks, midterms, final exams and also to create practice problem sets. We do not get enough time to teach all the concepts in class because the number of hours/week is bounded. I personally like to teach only the main concepts in class and design good problem sets so that students can learn the generalizations or extensions of the concepts by solving problems hands-on. This helps them develop their own intuitions about the concepts. Whenever I need a new exercise I hardly open a physical textbook. I usually search on internet and find exercises from a course website (or) “extract” an exercise from a research paper. There are hundreds of exercises “hidden” in pdf files across several course homepages. Instructors often spend lots of time designing them. If these exercises can reach all the instructors and students across the world in an efficiently-indexed form, that will help everybody. Instructors will be happy that the exercises they designed are not confined to just one course. Students will have an excellent supply of exercises to hone their problem-solving skills. During 2008, half-way through my PhD, I started collected the exercises I like in a private blog. At the same time I registered the domain trueshelf.com to make these exercises public. In 2011, towards the end of my PhD, I started using the trueshelf.com domain and made a public blog so that anybody can post an exercise. [ Notice that I did not use the trueshelf.com domain for three years. During these three years I got several offers ranging upto$5000 to sell the domain. So I knew I got the right name ] Soon, I realized that wordpress is somewhat “static” in nature and does not have enough “social” features I wanted. A screenshot of the old website is shown below.

The new version of TrueShelf is a social website enabling “crowd-sourcing” of exercises in any area. Here is the new logo, I am excited about

The goal of TrueShelf is to aid both the instructors and students by presenting quality exercises with tag-based indexing. Read the TrueShelf FAQ for more details. Note that we DO NOT allow users to post solutions. Each user may add his own “private” solution and notes to any exercise. I am planning to add more features soon.

In the long-run, I see TrueShelf becoming a ”Youtube for exercises”. Users will be able to create their own playlists of exercises (a.k.a problem sets) and will be recommended relevant exercises. Test-preparation agencies will be able to create their own channels to create sample tests.

Feel free to explore TrueShelf, contribute new exercises and let me know if you have any feedback (or) new features you want to see. You can also follow TrueShelf on facebook, twitter and google+.

Let’s see how TrueShelf evolves.

# Computing Bounded Path Decompositions in Logspace

Today’s post is a continuation of earlier posts (here, here, here, here) on graph isomorphism, treewidth and pathwidth. As mentioned earlier, the best known upper bound for Graph Isomorphism of partial k-trees is LogCFL.

Theorem ([Das, Toran and Wagner'10]) : Graph isomorphism of bounded treewidth graphs is in LogCFL.

One of the bottlenecks of the algorithm of [DTW'10] is computing bounded tree decompositions in logspace. This is recently resolved by an amazing result of Elberfeld, Jakoby and Tantau [EJT'10]. The results in this paper are very powerful. Unfortunately, it is still not clear how to improve the LogCFL upper bound.

Can we improve the upper bound for special cases of partial k-trees ? How about bounded pathwidth graphs ? Again, one bottleneck here is to compute bounded path decompositions in logspace. [EJT'10]‘s paper does not address this bottleneck and it is not clear how to extend their algorithm to compute path decompositions.

In joint work with Sinziana Munteanu, we resolved this bottleneck and proved the following theorem. Sinziana is a senior undergraduate student in our department. She is working with me on her senior thesis.

Theorem (Kintali, Munteanu’12) : For all constants $k, l \geq 1$, there exists a logspace algorithm that, when given a graph $G$ of treewidth $\leq l$, decides whether the pathwidth of $G$ is at most $k$, and if so, finds a path decomposition of $G$ of width $\leq k$ in logspace.

A draft of our results is available here. The above theorem is a logspace counterpart of the corresponding polynomial-time algorithm of [Bodlaender, Kloks'96]. Converting it into a logspace algorithm turned out to be a tedious task with some interesting tricks. Our work motivates the following open problem :

Open problem : What is the complexity of Graph Isomorphism of bounded pathwidth graphs ? Is there a logspace algorithm ?

Stay tuned for more papers related to graph isomorphism, treewidth and pathwidth. I am going through a phase of life, where I have more results than I can type. Is there an app that converts voice to latex ? Is there a journal that accepts hand-written proofs ?

# Complexity of Tessel

One of my hobbies (I developed during my PhD) is designing boardgames. I designed three boardgames so far, one of which is Tessel, a word-building game based on graph theory. I am glad that Tessel is getting good feedback especially from schools and families. One of the most time-consuming part of Tessel’s design is deciding what values to assign to the letters and deciding which pairs of letters to use in the tiles. The pairs of letters are carefully chosen based on computer simulations of frequency of letters in english words and their “relative” importance. The pairs are chosen so as to give fair share to both vocabulary skills and optimization skills.

Today’s post is about a nice theoretical problem arising from this game.

Before you read further, please read the rules of Tessel. Henceforth I will assume that you understood the rules and goal of this game.

I guess you observed that the tiles are being placed on the edges of a planar graph. Tessel uses a special planar graph that has cycles of length 3,4,5 and 6. In general, this game can be played on any planar graph. I am planning to design another board using Cairo tessellation. Anyways, here is a theoretical problem :

Let S be a set of finite alphabets. You are given two different words (using alphabets from S) of length l1 and l2. Construct a planar graph G and label each edge with two alphabets, such that there are two walks in G that correspond to the given two words. (Read the rules of tessel  and look at these examples to understand this correspondence). Your goal is to construct G with minimum number of vertices (or minimum number of edges).

In general you can ask the above question given k different words. What is the complexity of this problem ? I don’t know. I haven’t given it a deep thought. These days whatever I do for fun (to take my mind off open problems), ends up in another open problem

# Graph Isomorphism, Tree Width, Path Width and LogSpace

Every once in a while, I can’t help thinking about “the complexity of graph isomorphism for bounded treewidth graphs“. Today has been one of those days again. See my earlier post to get the context.

Theorem ([Das, Toran and Wagner'10]) : Graph isomorphism of bounded treewidth graphs is in LogCFL.

The proof of the above theorem is as follows

1. Graph isomorphism of bounded tree-distance width graphs is in L.
2. Given two graphs and their tree decompositions, computing the isomorphism respecting these tree decompositions is reducible to (1).
3. Given tree decomposition of only one graph, we can guess the tree decomposition of the other and guess the isomorphism (respecting the tree bags) and verify them using a non-deterministic auxiliary pushdown automata (a.k.a LogCFL).
4. Since tree decomposition of a graph can be computed in LogCFL, the above theorem follows.
One of the bottlenecks, finding a tree decomposition of bounded treewidth graphs in logspace, is resolved by [Elberfeld, Jakoby and Tantau'10]. The following seems to be another major bottleneck.
Given a graph $G$ and a decomposition $D$, how fast can we verify that $D$ is a valid tree decomposition of $G$ ? The upper bound of LogDCFL (the deterministic version of LogCFL) is clear from the above mentioned results. Can this verification be done in logspace ?
The answer is frustratingly unknown. An even more frustrating realization I had today is that “it is not clear how to beat the LogDCFL upper bound for the more restricted path decomposition“. Even though the underlying tree in a path decomposition is just a path, verifying the connectivity conditions of a path decomposition does seem to require recursion. It is not clear how to avoid recursion.
I thought that logspace upper bound is possible. Now I am much less confident about logspace upper bound. I cannot waste more time on this.
The truth is “this is a cute problem“. I need to do something to take my mind off this problem and move on. Easy enough, except I need an idea.
Update (Oct 12 2011) : Noticed that verification of path decompositions is easy.