How to learn Vocabulary efficiently?

Today’s post is very similar to an earlier post of mine titled “How to teach Algorithms?”, in which I announced the Algorithms App for iPad; I recently ported the Algorithms App to Mac. Today’s post is about “How to learn Vocabulary efficiently?”.

This summer, I went to a local bookstore to check out the vocabulary section. There were several expensive books with a limited number of practice tests. I also noticed a box of paper flashcards (with only 300 words) for around $25! After some more research, I realized that the existing solutions for learning English vocabulary are hard to use, expensive, old-fashioned, or some combination of the three.

So I started building an app with ‘adaptiveness’ and ‘usability’ as its primary goals. The result is the Vocabulary App (for iPhone and iPad). Here is a short description of the app.

The Vocabulary App uses a sophisticated algorithm (based on spaced repetition and the Leitner system) to generate adaptive multiple-choice vocabulary questions. It is built on a hypergraph of words constructed using lexical cohesion.

Learning tasks are divided into small sets of multiple-choice tests designed to help you master basic words before moving on to advanced ones. Words that you have the hardest time with are selected more frequently. For a fixed word, the correct and wrong answer choices are selected adaptively, giving rise to hundreds of combinations. After each wrong answer, you receive detailed feedback with the meaning and usage of the underlying word.
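
To make the scheduling idea concrete, here is a minimal sketch of a Leitner-style scheduler in Python. It illustrates the general technique only, not the app’s actual code; the box intervals and the hardest-first selection rule are my own assumptions.

```python
import random

# Assumed review intervals (in days) for each Leitner box: words in
# lower boxes come back sooner, well-known words wait longer.
BOX_INTERVALS = [0, 1, 2, 4, 8]

class LeitnerScheduler:
    def __init__(self, words):
        # Every word starts in box 0 ("hardest"), due immediately.
        self.box = {w: 0 for w in words}
        self.due_in = {w: 0 for w in words}  # days until next review

    def next_word(self):
        # Among due words, prefer the lowest box: the hardest words
        # are selected most frequently.
        due = [w for w, d in self.due_in.items() if d <= 0]
        if not due:
            return None
        lowest = min(self.box[w] for w in due)
        return random.choice([w for w in due if self.box[w] == lowest])

    def record_answer(self, word, correct):
        if correct:
            # Promote: the word is reviewed less and less often.
            self.box[word] = min(self.box[word] + 1, len(BOX_INTERVALS) - 1)
        else:
            # Demote to box 0: a missed word comes back soon.
            self.box[word] = 0
        self.due_in[word] = BOX_INTERVALS[self.box[word]]
```

A real implementation would also advance the clock daily and adapt the answer choices per word, as described above.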

The app works best when used every day: take a test whenever you have free time.

Go ahead and download the Vocabulary App, let me know your feedback, and suggest new features.

At any given waking moment, I spend my time either (1) math monkeying around or (2) code monkeying around. During the math-monkeying phase, I work on open problems in mathematics (currently related to directed minors). During the code-monkeying phase, I develop apps (currently the Algorithms App and the Vocabulary App) or add new features to my websites TrueShelf and PolyTopix. I try to maintain a balance between (1) and (2), subject to the nearest available equipment (a laptop or pen and paper). My next post will be about one of my papers (on directed minors) that is nearing completion. Stay tuned.

Directed Minors III. Directed Linked Decompositions

This is a short post about the following paper in my directed minor series:

  • Shiva Kintali. “Directed Minors III. Directed Linked Decompositions”. Preprint available on my publications page.

Thomas [Tho’90] proved that every undirected graph admits a linked tree decomposition of width equal to its treewidth. This theorem is a key technical tool for proving that graphs of bounded treewidth are well-quasi-ordered under the minor relation. An analogous theorem for branch-width was proved by Geelen, Gerards and Whittle [GGW’02]. They used this result to prove that all matroids representable over a fixed finite field and of bounded branch-width are well-quasi-ordered under minors. Kim and Seymour [KS’12] proved that every semi-complete digraph admits a linked directed path decomposition of width equal to its directed pathwidth. They used this result to show that all semi-complete digraphs are well-quasi-ordered under “strong” minors.

In this paper, we generalize Thomas’s theorem to all digraphs.

Theorem: Every digraph G admits a linked directed path decomposition and a linked DAG decomposition of width equal to its directed pathwidth and DAG-width, respectively.

The above theorem is a crucial ingredient for proving well-quasi-ordering of some interesting classes of digraphs. I will release Directed Minors IV soon. Stay tuned!!

Directed Width Parameters and Circumference of Digraphs

This is a short post about the following paper:

  • Shiva Kintali. “Directed Width Parameters and Circumference of Digraphs”. Preprint available on my publications page.

We prove that the directed treewidth, DAG-width and Kelly-width of a digraph are bounded above by its circumference plus one. This generalizes a theorem of Birmelé stating that the treewidth of an undirected graph is at most its circumference.

Theorem: Let G be a digraph of circumference l. Then the directed treewidth, DAG-width and Kelly-width of G are at most l + 1.

The above theorem can be seen as a mini mini mini directed grid minor theorem. I will be using this theorem in future papers to make progress towards a directed grid minor theorem. Stay tuned!!

Open problems for 2014

Wishing you all a very happy New Year. Here is a list of my 10 favorite open problems for 2014. They belong to several research areas within discrete mathematics and theoretical computer science. Some of them are baby steps towards resolving much bigger open problems. May this new year shed new light on these problems.

  • 2. Optimization: Improve the approximation factor for the undirected graphic TSP. The best known bound is 7/5, by Sebő and Vygen.
  • 3. Algorithms: Prove that the treewidth of a planar graph can be computed in polynomial time, or that the problem is NP-complete.
  • 4. Fixed-parameter tractability: Treewidth and pathwidth are known to be fixed-parameter tractable. Are directed treewidth, DAG-width and Kelly-width (generalizations of treewidth) and directed pathwidth (a generalization of pathwidth) fixed-parameter tractable? This problem is very important for understanding the algorithmic and structural differences between undirected and directed width parameters.
  • 5. Space complexity: Is planar ST-connectivity in logspace? This is perhaps the most natural special case of the NL vs. L problem. Planar ST-connectivity is known to be in UL \cap coUL. Recently, Imai, Nakagawa, Pavan, Vinodchandran and Watanabe proved that it can be solved simultaneously in polynomial time and approximately O(\sqrt{n}) space.
  • 6. Metric embedding: Is the minor-free embedding conjecture true for partial 3-trees (graphs of treewidth at most 3)? The minor-free conjecture states that “every minor-free graph can be embedded into l_1 with constant distortion.” The special case of planar graphs also seems very difficult. I think the special case of partial 3-trees is a very interesting baby step.
  • 7. Structural graph theory: Characterize Pfaffian graphs of treewidth at most 3 (i.e., partial 3-trees). It is a long-standing open problem to give a nice characterization of Pfaffian graphs and to design a polynomial-time algorithm deciding whether an input graph is Pfaffian. The special case of partial 3-trees is an interesting baby step.
  • 8. Structural graph theory: Prove that every minimal brick has at least four vertices of degree three. Bricks and braces were defined to better understand Pfaffian graphs. The characterization of Pfaffian braces is known (more generally, the characterization of bipartite Pfaffian graphs is known). To understand Pfaffian graphs, it is important to understand the structure of bricks. Norine and Thomas proved that every minimal brick has at least three vertices of degree three, and conjectured that every minimal brick has at least cn vertices of degree three (for some constant c > 0, where n is the number of vertices).
  • 9. Communication complexity: Improve the bounds for the log-rank conjecture. The best known bound is O(\sqrt{rank}).
  • 10. Approximation algorithms: Improve the approximation factor for the uniform sparsest cut problem. The best known factor is O(\sqrt{\log n}).

Here are my conjectures for 2014 :)

  • Weak Conjecture: At least one of the above 10 problems will be resolved in 2014.
  • Conjecture: At least five of the above 10 problems will be resolved in 2014.
  • Strong Conjecture: All of the above 10 problems will be resolved in 2014.

Have fun!!

PolyTopix

In the last couple of years, I have developed some (research) interest in recommendation algorithms and speech synthesis. My interests in these areas are geared towards developing an automated, personalized news radio.

Almost all of us are interested in consuming news. In this internet age, there is no dearth of news sources; often we have too many. We tend to “read” news from several sources and news aggregators, spending several hours per week. Most of the time, we are simply interested in the top, most relevant headlines.

PolyTopix is my way of simplifying the process of consuming top, relevant news. The initial prototype is here. The website “reads” news tweets collected from different sources and ordered by a machine learning algorithm. Users can log in and specify their individual interests (and zip code) to narrow down the news.
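
To give a flavor of what such an ordering step could look like, here is a toy scoring function in Python. It is purely a hypothetical stand-in for the actual machine learning ranker; the `Tweet` fields, features and weights are all assumptions made for illustration.

```python
from dataclasses import dataclass

# A toy stand-in for the actual ranker: every field and weight below
# is a made-up assumption, shown only to illustrate the ordering idea.
@dataclass
class Tweet:
    text: str
    source_weight: float   # hypothetical per-source trust score
    age_hours: float       # how old the tweet is

def score(tweet, interests):
    words = set(tweet.text.lower().split())
    overlap = len(words & interests)            # match with user interests
    freshness = 1.0 / (1.0 + tweet.age_hours)   # newer tweets rank higher
    return tweet.source_weight * (1 + overlap) * freshness

tweets = [Tweet("Storm warning issued for New Jersey", 0.9, 2.0),
          Tweet("Local team wins the championship", 0.7, 10.0)]
interests = {"storm", "weather", "jersey"}
for t in sorted(tweets, key=lambda t: score(t, interests), reverse=True):
    print(t.text)
```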


Try PolyTopix and let me know your feedback. Here are some upcoming features:

  • Automatically collecting weather news (and local news) based on your location.
  • Reading more details of the most important news items.
  • Classifying news as exciting, sad, happy, etc. (based on a machine learning algorithm) and reading it with a correspondingly emotional voice.

Essentially, PolyTopix aims to be a completely automated, personalized news radio that can “read” news from across the world anytime, with one click.

————————————————————————————————————————

Forbidden Directed Minors and Kelly-width

Today’s post is about the following paper, joint work with Qiuyi Zhang, one of my advisees; Qiuyi is an undergraduate (rising senior) in our mathematics department.

  • Shiva Kintali and Qiuyi Zhang. “Forbidden Directed Minors and Kelly-width”. Preprint available on my publications page.

It is well known that an undirected graph is a partial 1-tree (i.e., a forest) if and only if it has no K_3 minor. We generalized this characterization to partial 1-DAGs, proving that partial 1-DAGs are characterized by three forbidden directed minors, K_3, N_4 and M_5, shown in the following figure. We named the last two graphs N_4 and M_5 because their bidirected edges resemble the letters N and M.

[Figure: the three forbidden directed minors K_3, N_4 and M_5]
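
For the undirected base case, the characterization is easy to check directly: a graph has a K_3 minor exactly when it contains a cycle (any cycle contracts down to a triangle), so testing whether a graph is a partial 1-tree reduces to cycle detection. Here is a minimal sketch in Python (union-find is just one way to do this):

```python
def is_partial_1_tree(n, edges):
    """A graph is a partial 1-tree (a forest) iff it has no K_3 minor,
    i.e., iff it contains no cycle. Union-find cycle detection."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge closes a cycle => K_3 minor exists
        parent[ru] = rv
    return True

print(is_partial_1_tree(3, [(0, 1), (1, 2)]))          # True: a path
print(is_partial_1_tree(3, [(0, 1), (1, 2), (2, 0)]))  # False: K_3 itself
```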

Partial k-trees characterize graphs of bounded treewidth. Similarly, partial k-DAGs characterize digraphs of bounded Kelly-width. Kelly-width is the best-known generalization of treewidth to digraphs.

As mentioned in the paper, I have a series of upcoming papers (called Directed Minors) making progress towards a directed graph minor theorem (i.e., that all digraphs are well-quasi-ordered by the directed minor relation). For more details on the directed minor relation, see the current paper. I will post about the upcoming results as the preprints become ready.

Book Review of “Boosting: Foundations and Algorithms”

Following is my review of Boosting: Foundations and Algorithms (by Robert E. Schapire and Yoav Freund), to appear in the SIGACT book review column soon.


—————————————————————————————————————-

Book: Boosting: Foundations and Algorithms (by Robert E. Schapire and Yoav Freund)
Reviewer: Shiva Kintali

Introduction

You have k friends, each earning a small amount of money (say, 100 dollars) every month by buying and selling stocks. One fine evening, over dinner, they tell you their individual “strategies” (after all, they are your friends). Is it possible to “combine” these individual strategies and make a million dollars in a year, assuming your initial capital is the same as that of your average friend?

You are managing a group of k “diverse” software engineers, each of only “above-average” intelligence. Is it possible to build a world-class product using their combined skills?

The above scenarios give rise to fundamental theoretical questions in machine learning and form the basis of Boosting. As you may know, the goal of machine learning is to build systems that can adapt to their environments and learn from experience. In the last five decades, machine learning has impacted almost every aspect of our lives: for example, computer vision, speech processing, web search, information retrieval, biology, and so on. In fact, it is very hard to name an area that cannot benefit from the theoretical and practical insights of machine learning.

The answer to the above questions is Boosting, an elegant method for driving down the error of a combined classifier built from a number of weak classifiers. In the last two decades, several variants of Boosting have been discovered. All these algorithms come with theoretical guarantees and have made a deep practical impact on machine learning, often providing new explanations for existing prediction algorithms.

Boosting: Foundations and Algorithms, written by the inventors of Boosting, deals with variants of AdaBoost, an adaptive boosting method. Here is a quick explanation of the basic version of AdaBoost.

AdaBoost makes iterative calls to a base learner. It maintains a distribution over the training examples, which determines the training set provided to the base learner on each round. Each training example is assigned a weight measuring the importance of classifying it correctly on the current round. Initially, all weights are set equally. On each round, the weights of incorrectly classified examples are increased, so that “hard” examples get successively higher weight. This forces the base learner to focus its attention on the hard examples and drive down the generalization error.
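
To make the loop concrete, here is a bare-bones sketch in Python, using one-dimensional threshold stumps as the base learner. It follows the standard AdaBoost weight-update rule, but it is my own illustration rather than the book’s code; the brute-force stump search and the numerical clipping are simplifications.

```python
import numpy as np

def adaboost(X, y, rounds):
    """Bare-bones AdaBoost for labels y in {-1, +1}, with threshold
    stumps on a single 1-D feature as the weak learner."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # uniform initial distribution
    ensemble = []                        # (alpha, threshold, sign) triples
    for _ in range(rounds):
        # Base learner: pick the stump minimizing the weighted error.
        best = None
        for thr in X:
            for s in (+1, -1):
                pred = s * np.sign(X - thr + 1e-12)
                err = w[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, s)
        err, thr, s = best
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)   # this stump's vote weight
        pred = s * np.sign(X - thr + 1e-12)
        # Reweight: misclassified ("hard") examples gain weight.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, thr, s))
    # The combined classifier: a weighted majority vote of the stumps.
    return lambda x: np.sign(sum(a * s * np.sign(x - t + 1e-12)
                                 for a, t, s in ensemble))

# Tiny usage example: learn a threshold on four 1-D points.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([-1, -1, 1, 1])
h = adaboost(X, y, rounds=5)
print(h(X))   # expected: [-1. -1.  1.  1.]
```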

AdaBoost is fast and easy to implement, and the only parameter to tune is the number of rounds. The actual performance of boosting depends on the data.

Summary

Chapter 1 provides a quick introduction to and overview of Boosting algorithms, with practical examples. The rest of the book is divided into four major parts, each consisting of three to four chapters.

Part I studies the properties and effectiveness of AdaBoost and the theoretical aspects of minimizing its training and generalization errors. It is proved that AdaBoost drives the training error down very fast (as a function of the error rates of the weak classifiers) and the generalization error arbitrarily close to zero. Basic theoretical bounds suggest that AdaBoost should overfit; empirical studies, however, show that it does not. To resolve this paradox, a margin-based analysis is presented that explains the absence of overfitting.

Part II explains several properties of AdaBoost using game-theoretic interpretations. It is shown that the principles of Boosting are intimately related to the classic min-max theorem of von Neumann. A two-player game (between the boosting algorithm and the weak learning algorithm) is considered, and AdaBoost is shown to be a special case of a more general algorithm for playing a repeated game. By reversing the roles of the players, a solution is obtained for the online prediction model, establishing a connection between Boosting and online learning. Loss minimization is also studied, with AdaBoost interpreted in an abstract geometric framework as optimizing a particular objective function. More interestingly, AdaBoost is viewed as a special case of more general methods for optimizing an objective function, such as coordinate descent and functional gradient descent.
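
(For concreteness: the particular objective function referred to here is the exponential loss. Writing F = \sum_t \alpha_t h_t for the combined classifier, AdaBoost can be viewed as greedy coordinate descent on \sum_i \exp(-y_i F(x_i)), choosing one weak classifier h_t and its weight \alpha_t per round.)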

Part III explains several methods of extending AdaBoost to handle problems with more than two output classes. AdaBoost.M1, AdaBoost.MH and AdaBoost.MO are presented, along with their theoretical analysis and practical applications. RankBoost, an extension of AdaBoost to ranking problems, is also studied. Such an algorithm is very useful, for example, for ranking webpages by their relevance to a given query.

Part IV is dedicated to advanced theoretical topics. Under certain assumptions, it is proved that AdaBoost can handle noisy data and converges to the best possible classifier. An optimal boost-by-majority algorithm is presented; this algorithm is then made adaptive, leading to an algorithm called BrownBoost.

Many examples are given throughout the book to illustrate the empirical performance of the algorithms presented. Every chapter ends with a summary and a bibliography of related publications, followed by well-designed exercises. An appendix briefly outlines some required mathematical background.

Opinion

This book is definitely a very good reference text for researchers in the area of machine learning. If you are new to machine learning, I encourage you to first read an introductory book (for example, Machine Learning by Tom M. Mitchell) to better understand and appreciate the concepts. As for classroom use, a graduate-level machine learning course could be designed from the topics covered in this book, and its exercises can be readily used for such a course.

Overall, this book is a stimulating learning experience. It has provided me with new perspectives on the theory and practice of several variants of Boosting. Most of the algorithms in the book were new to me, and I had no difficulty following them and the corresponding theorems. The exercises at the end of every chapter made these topics much more fun to learn.

The authors did a very good job of compiling different variants of Boosting algorithms, achieving a nice balance between theoretical analysis and practical examples. I highly recommend this book to anyone interested in machine learning.

—————————————————————————————————————-