Center for Algorithms and Machine Learning (CAML) is housed at the School of Informatics and Computing at Indiana University, Bloomington. In the recent years machine learning has flourished with the availability of data and computational resources leading to unprecedented successes in prediction and control.
Our mission is to bring together researchers across different departments at the school in order to foster excellence in algorithm development for machine learning. Our center is at the forefront of theoretical foundations of algorithm development as well as large scale applications to computer vision and health. The core set of members of CAML also includes faculty from the Department of Statistics which plays a crucial role in applications of machine learning to data analysis.
How do infants learn their first words in a noisy environment? How do they progress from being slow incremental learners to rapid learners who appropriately generalize categories and concepts from minimal experience. In this talk, I will present evidence that the answer to these questions lies in the structure of the learning environment itself, which is not like that assumed by most theorists of early word learning and not like that used in language learning experiments. We have used head cameras to collect egocentric views (and parent talk) in the home from the perspective of infants and toddlers (8 month olds to 30 month olds, with no experimenters present, 500 hours of head camera video) and in a naturalistic toy room environment in the laboratory (about 200 hours of head-mounted eye tracking yielding both the ego-centric view and the gaze within that view). Our analyses of the everyday experiences indicate four principles we believe to be key to learning to becoming a rapid learner of object names and a robust learner across domains more generally. The four principles are: (1) Learn a massive amount about very few individual entities (and little bit about lots of other individual things); (2) Learn a massive amount about a very few categories (and a little bit about lots of other categories); (3) Learn about small selective sets at different points in time; (4) Self-generate the data for learning (with some help from mom and dad).
Major job search engines aggregate tens of millions of job postings online to enable job seekers to find valuable employment opportunities. Predicting the probability that a given user clicks on jobs is crucial to job search engines as the prediction can be used to provide personalized job recommendations for job seeker. In this paper, we proposed a probabilistic model to cluster users based on missing features and learn the corresponding prediction models for individual clusters. The parameters in this clustering-prediction process are jointly estimated by EM algorithm. We conduct experiments on a real-world testbed by comparing various models (especially a state-of-art learning to rank). The results demonstrate the effectiveness of our proposed personalized approach to user click prediction.
In high dimensional problems, the usual two groups problem of model selection is impossible due to the combinatorial complexity of the model space. In recent years, a set of one group models that approximates the two groups problem have been developed. Of these, the Horseshoe prior is is perhaps the most famous and places a Beta(1/2,1/2) prior on the local shrinkage parameters. There are many modifications and extensions of this framework, and we propose a new modification. Specifically, we model the local shrinkage parameter as a Beta(p,1-p) for each parameter under consideration in order to mimic the model selection problem. Placing priors on the p produces a prior distribution with extremely heavy tails that produces both very strong shrinkage of small signals and unbiased estimation of large signals, producing overall better risk behavior.
Andrew Womack is an Assistant Professor and Zikun Yang is a PhD candidate in the Department of Statistics. Their joint research efforts focus on theoretical and algorithmic aspects of model selection from a Bayesian perspective.
Scene understanding from images opens the doorway for solving various complex tasks such as visual navigation, object manipulation and reasoning about the world in 3D. These challenging tasks can be better tackled with the availability of depth information. This talk covers techniques that effectively exploit both the image data and the depth data for scene understanding. First, a semantic label propagation framework is presented that utilizes a few annotated keyframes and propagates the annotated labels to a large number of unlabeled video frames. The proposed technique alleviates the need for a labor-intensive manual annotation effort, and the propagated labels have been utilized to improve a deep neural network model for semantic segmentation - the task of simultaneously segmenting and labeling an image. This talk concludes with an approach to generate training data for distant objects in outdoor scenes and proposes two deep learning solutions for depth recovery of those objects.
Reza is a Postdoctoral Associate in the Department of Informatics, Computer Science, and Engineering (SICE) at Indiana University Bloomington. He received his Ph.D. in Computer Science from George Mason University in 2018 and M.S. in Computer Science from Drexel University in 2011. His research focuses on developing novel algorithms for perception tasks such as semantic segmentation, object detection exploiting the 3D information.
I will describe Badger Rampage, a simple general framework for designing practical algorithms for d-dimensional balanced partitioning of graphs with several billions of vertices and up to 10^12 edges. Badger Rampage is based on applying noisy projected gradient descent to a non-convex relaxation of the problem. For an appropriately chosen relaxation, the gradient descent step can be implemented simply as local averaging of the neighboring values. The technical core of our work is optimizing the computationally expensive projection step for which we give an algorithm with complexity O(|V| log^d |V|) for d <= 2. Experiments on Apache Giraph using subsets of the Facebook graph show comparable or superior performance against existing literature on one-dimensional graph partitioning with the added benefits of simplicity and generalizability to the multi-dimensional case. Analysis of convergence properties of Badger Rampage is left as an open problem.
Joint work with Dmitrii Avdiukhin (Indiana University) and Sergey Pupyrev (Facebook).
In this talk, sparse representation based and manifold optimization based image signal processing techniques are mainly discussed. These two techniques are ones of the dimension reduction methods which reduce the high dimensionality of system matrix to reduce the system complexity. Especially for deep learning process, the back propagation process for training is a time consuming process because of its iteration based optimization of nonlinear/nonconvex high dimensional cost function. While recent technologies such as deep learning make statistical decision more accurate, computational time to handle high dimensional cost function for big data is getting slower. For above reasons, dimension reduction is a key technology to achieve real-time process. Among lots of dimension reduction related techniques, sparse representation is one of the matrix factorization techniques to factor to low dimension matrices using sparse properties, and manifold optimization is a kind of piece-wise linear approximation of nonconvex/nonlinear space cost function which represents on non-Euclidean space. Additionally, random projection and tensor factorization techniques will be discussed for another possibilities to speed up the optimization processes.
Standard approaches to probabilistic reasoning require that one possesses an explicit model of the distribution in question. But, the empirical learning of models of probability distributions from partial observations is a problem for which efficient algorithms are generally not known. In this work we consider the use of bounded-degree fragments of the "sum-of-squares" logic as a probability logic. Prior work has shown that we can decide refutability for such fragments in polynomial-time. We propose to use such fragments to answer queries about whether a given probability distribution satisfies a given system of constraints and bounds on expected values. We show that in answering such queries, such constraints and bounds can be implicitly learned from partial observations in polynomial-time as well. It is known that this logic is capable of deriving many bounds that are useful in probabilistic analysis. We observe that it furthermore captures key polynomial-time fragments of resolution. Thus, these fragments are also quite expressive.
The multi-armed bandit model is a basic setting in reinforcement learning. It is one of the simplest model to capture the tradeoff between exploration and exploitation. It has many applications such as A/B testing, crowdsourcing, and personalized recommendation. In the pure exploration form, perhaps one of the most popular goals is to identify the best arm which is the arm with the largest mean. In this talk, I will give a tutorial on best arm identification in the multi-armed bandit model.
In the multi-arm bandit setting one considers a row of slot machines, each of which gives a reward based on some distribution, and asks questions such has how to identify the best machine, or how to maximize the sum of the expected rewards after a sequence of pulls. In this talk we consider the special setting where each of the rewards depends linearly on an unknown parameter $\theta$, and our goal is to identify the slot machine (arm) with the largest expected reward. We present an algorithm that is built upon a new estimator for $\theta$ that produces better theoretical guarantees and experimental results than previous algorithms. This is joint work with Chao Tao and his advisor Yuan Zhou.
From Coarse Attention to Fine-Grained Gaze: A Two-stage 3D Fully Convolutional Network for Predicting Eye Gaze in First Person Video
While predicting where people will look when viewing static scenes has been well- studied, a more challenging problem is to predict gaze within the first-person, ego-centric field of view as people go about daily life. This problem is difficult because where a person looks depends not just on their visual surroundings, but also on the task they have in mind, their own internal state, their past gaze patterns and actions, and non-visual cues (e.g., sounds) that might attract their attention. Using data from head-mounted cameras and eye trackers that record people’s egocentric fields of view and gaze, we propose and learn a two-stage 3D fully convolutional network to predict gaze in each egocentric frame. The model estimates a coarse attention region in the first stage, combining it with spatial and temporal features to predict a more precise gaze point in the second stage. We evaluate on a public dataset in which adults carry out specific tasks as well as on a new challenging dataset in which parents and toddlers freely interact with toys and each other, and demonstrate that our model outperforms state-of-the-art baselines.
This paper appeared at BMVC 2018, and is joint work with Sven Bambach, Chen Yu, David Crandall.---
Joint Person Segmentation and Identification in Synchronized First- and Third-person Videos
In a world of pervasive cameras, public spaces are often captured from multiple perspectives by cameras of different types, both fixed and mobile. An important problem is to organize these heterogeneous collections of videos by finding connections between them, such as identifying correspondences between the people appearing in the videos and the people holding or wearing the cameras. In this paper, we wish to solve two specific problems: (1) given two or more synchronized third-person videos of a scene, produce a pixel-level segmentation of each visible person and identify corresponding people across different views (i.e., determine who in camera A corresponds with whom in camera B), and (2) given one or more synchronized third-person videos as well as a first-person video taken by a mobile or wearable camera, segment and identify the camera wearer in the third-person videos. Unlike previous work which requires ground truth bounding boxes to estimate the correspondences, we perform person segmentation and identification jointly. We find that solving these two problems simultaneously is mutually beneficial, because better fine-grained segmentation allows us to better perform matching across views, and information from multiple views helps us perform more accurate segmentation. We evaluate our approach on two challenging datasets of interacting people captured from multiple wearable cameras, and show that our proposed method performs significantly better than the state-of-the-art on both person segmentation and identification.
This paper appeared at ECCV 2018, and is joint work with Mingze Xu, Chenyou Fan, Yuchen Wang, Michael Ryoo, David Crandall.
Many tasks in machine learning and data mining, such as data diversification, non-parametric learning, kernel machines, clustering etc., require extracting a small but representative summary from a massive dataset. Often, such problems can be posed as maximizing a submodular set function subject to a cardinality constraint. We consider this question in the streaming setting, where elements arrive over time at a fast pace and thus we need to design an efficient, low-memory algorithm. One such method, proposed by Badanidiyuru et al.(2014), always finds a 0.5-approximate solution. Can this approximation factor be improved? We answer this question affirmatively by designing a new algorithm SALSA for streaming submodular maximization.
SALSA is the first low-memory, single-pass algorithm that improves the factor 0.5, under the natural assumption that elements arrive in a random order. We also show that this assumption is necessary, i.e., that there is no such algorithm with better than 0.5-approximation when elements arrive in arbitrary order. Our experiments demonstrate that SALSA significantly outperforms the state of the art in applications related to exemplar-based clustering, social graph analysis, and recommender systems. In this talk I will outline our proof of the lower bound and present the main ideas behind SALSA.
This is based on a joint work with Ashkan Norouzi-Fard, Jakub Tarnawski, Amir Zandieh, Aida Mousavifar, and Ola Svensson.
Machine translation is one of the oldest and arguably most difficult problems in AI. So let's talk about MT, some of its history, and some current problems being worked on! Specifically, I'll talk about my own academic work on MT for under-resourced languages, and how to improve word choices; then I'll talk about issues that we're working on at Google Translate, specifically concerning translation and gender.
Deep learning has been used recently to address several problems, including those related to computer vision, medical imaging, and financial predictions. As such, it has also resulted in tremendous gains in the field of speech processing. During this talk, I will discuss how deep learning has been used to enhance the perceptual quality of noisy speech in various environments. I also will discuss how deep learning has been used to assess how much noise exists in an environment.
Hierarchical clustering is a popular form of data analysis that forms a hierarchy of clusters by recursively partitioning the data set into successively smaller clusters. In this talk, we discuss a simple evaluation function for a clustering on a set of points given pairwise similarities on these points (Dasgupta, STOC 2016). We show this function has properties desirable for a canonical hierarchical clustering and admits a reasonable approximation with a simple top-down algorithm, although it is NP-hard to compute the optimal clustering. We then introduce a maximization version of this objective function (Moseley, Wang, NIPS’17) and show under this function, we can obtain reasonably better approximations, even in expectation under simple random procedures (Charikar, et. al., to appear in SODA 2019). We introduce the average-linkage agglomerative procedure and show it cannot perform better than 1/3-approximation. Finally, we outline an SDP-based algorithm with provably better guarantees.
Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of (1+ε) under a projection onto a random -dimensional subspace. Further, the cost of every clustering is preserved within (1+ε). More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant p.
For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.
Joint work with Yury Makarychev and Ilya Razenshteyn.
David Crandall is an Associate Professor in the School of Informatics and Computing at Indiana University, Bloomington, where he is a member of the programs in Computer Science, Informatics, Cognitive Science, and Data Science. He received the Ph.D. in computer science from Cornell University in 2008 and the M.S. and B.S. degrees in computer science and engineering from the Pennsylvania State University in 2001. He was a Postdoctoral Research Associate at Cornell from 2008-2010, and a Senior Research Scientist with Eastman Kodak Company from 2001-2003. His research in computer vision and applied machine learning has been funded by the National Science Foundation (including a CAREER award in 2013), IARPA, U.S. Navy, the Air Force Office of Scientific Research, Google, the Lilly Endowment, and Eastman Kodak Company.
David Leake is Executive Associate Dean and Professor of Computer Science in the School of Informatics and Computing at Indiana University Bloomington, a member of the Indiana University Cognitive Science faculty and data science program, and affiliate member of the Data to Insight Center. He received his Ph.D. in Computer Science from Yale University in 1990. His research interests include case-based reasoning, context, explanation, human-centered computing, intelligent user interfaces, introspective reasoning, and knowledge capture and management. He has over 150 publications in these areas, with over 7000 Google Scholar citations. He chairs the AAAI Publications Committee and served as Editor in Chief of AI Magazine, the official magazine of AAAI, for 17 years.
Chris Raphael heads the Music Informatics program in the School of Informatics, Computing, and Engineering at Indiana University, as well as holding adjunct appointments in the Jacobs School of Music, Cognitive Science, and Statistics. After receiving his PhD in Applied Mathematics from Brown University in 1991, he worked on a wide range of problems in both industry and academia including Arabic character recognition, magnetic resonance spectroscopy and mine detection before coming to focus on music. His musical research includes accompaniment systems, computer generated musical analysis, musical signal processing, and modeling of musical interpretation.
Michael S. Ryoo is an Assistant Professor of the School of Informatics and Computing at Indiana University. His research interest is within the areas of Computer Vision and Robotics, with a particular emphasis on human activity recognition/learning, first-person computer vision, and human-robot interaction. Before joining IU, Dr. Ryoo was a staff researcher within the Robotics Section of the NASA's Jet Propulsion Laboratory (JPL) from 2011 to 2015. Dr. Ryoo received the Ph.D. degree from the University of Texas at Austin in 2008, and the B.S. degree from the Korea Advanced Institute of Science and Technology (KAIST) in 2004. He has been organizing a number of tutorials and workshops at major Computer Vision conferences including CVPR 2011/2014/2016, and is the winner of the best vision paper award at ICRA 2016.
Chung-chieh Shan is an assistant professor at the School of Informatics and Computing at Indiana University, Bloomington. He studies what things mean that matter. He works to tap into and enhance the amazing human ability to create concepts, combine concepts, and share concepts, by lining up formal representations and what they represent. To this end, in the short term, he develops programming languages that divide what to do and how to do it into modules that can be built and reused separately. In particular, he develops so-called probabilistic programming languages, which divide stochastic models and inference algorithms into modules that can be built and reused separately. In the long term, he hopes to supplant first-order logic by something that does not presuppose a fact of the matter what things there are, though there may be a fact of the matter what stuff there is.
Donald Williamson is an assistant professor in the School of Informatics and Computing at Indiana University. His research broadly addresses ways that enable computers to process, understand, and response to sound information. He has specific interests in the areas of speech separation, speech recognition, speaker identification, and music processing, to name a few, where he is interested in using these methods in real-world devices, such as cell phones, hearing aids, and robots. A combination of machine learning, signal processing, and statistical-based techniques are used. Donald completed his Ph.D. in the Computer Science and Engineering department at The Ohio State University, under the supervision of Prof. DeLiang Wang. Prior to that, he was fortunate to be a Member of the Engineering Staff at Lockheed Martin for a few years. He received a Masters of Science degree from the Department of Electrical and Computer Engineering at Drexel University and a Bachelor's degree from the Department of Electrical and Computer Engineering at the University of Delaware.
Grigory Yaroslavtsev is an assistant professor of Computer Science at Indiana University. Prior to that he was a postdoctoral fellow at the Warren Center for Network and Data Sciences at the University of Pennsylvania. He was previously a Postdoctoral Fellow in Mathematics at Brown University, ICERM. He received his Ph.D. in Theoretical Computer Science in 2014 from Pennsylvania State University and an M.Sc. in Applied Mathematics and Physics from the Academic University of the Russian Academy of Sciences in 2010. Grigory works on efficient algorithms for sparsification, summarization and testing properties of large data, including approximation, parallel and online algorithms, learning theory and property testing, communication and information complexity and private data release.
Yuan Zhou is an assistant professor of Computer Science at Indiana University. Prior to that he was an instructor in applied mathematics at MIT. Yuan received his Ph.D. from Carnegie Mellon University in 2014. His research interests span theoretical computer science and operations research with emphasis on linear programming and semidefinite programming relaxations, discrete optimization, approximation algorithms and hardness of approximation, harmonic analysis of discrete functions, process flexibility and decision under uncertainty with applications to crowdsourcing.
Minje Kim received his PhD degree in Computer Science from the University of Illinois at Urbana-Champaign (2016). Before joining UIUC, he worked as a researcher in ETRI, a national lab in Korea, from 2006 to 2011. He did his Bachelor’s and Master’s studies in the Division of Information and Computer Engineering at Ajou University and in the Department of Computer Science and Engineering at POSTECH in 2004 and 2006, respectively. His research focuses on designing machine learning models applied to signal processing (e.g. audio and the other time series data), stressing their computational efficiency in the resource-constrained environments or in the implementations involving large unorganized datasets. He received Richard T. Cheng Endowed Fellowship from UIUC in 2011. Google and Starkey grants also honored his ICASSP papers as the outstanding student papers in 2013 and 2014, respectively.
Daniel McDonald is an Assistant Professor of Statistics and Adjunct Assistant Professor of Computer Science at Indiana University, Bloomington. He was an undergraduate at IU, receiving a B.S.O.F. in cello performance from the IU Jacobs School of Music and a B.A. in economics and mathematics. After college, he worked at the Federal Reserve Bank of St. Louis before enrolling at Carnegie Mellon to pursue a doctorate in statistics. Daniel's main research interests involve the estimation and quantification of prediction risk. He is especially interested in developing methods for evaluating the predictions made using complex dependent data. This includes the application of statistical learning techniques to time-series prediction problems in the context of economic forecasting, as well as investigations of cross-validation and the bootstrap for risk estimation, and examining the trade-offs between prediction risk and computational costs.
Michael W. Trosset is Professor and Chair of IU’s Department of Statistics. As an undergraduate, he studied mathematics at Rice University; as a graduate student, he studied statistics at the University of California at Berkeley. Before joining IU in 2006, he taught at the University of Arizona and the College of William & Mary. His primary research interests lie in the general areas of high-dimensional multivariate data analysis and computational statistics. Much of his work is concerned with techniques for constructing low-dimensional Euclidean representations of data, as in multidimensional scaling, graph embedding, and manifold learning.
Andrew Womack is an assistant professor in the Department of Statistics and Indiana University. Before joining IU, he was a postdoctoral researcher at the University of Florida and received his doctorate from Washington University in St. Louis. Andrew's research focuses on "objective" Bayesian methodologies in complex problems and considerations of model space uncertainty in inference. This includes the development of model space priors for low dimensional representations of high dimensional objects and the efficient computation thereof.
Russell Lyons is James H. Rudy Professor of Mathematics and Adjunct Professor of Statistics. He obtained his Ph.D. at the University of Michigan in 1983, then enjoyed a postdoc in Paris and a job at Stanford University. He frequently visits Microsoft Research. His primary area of research is discrete probability and its connections to other areas of mathematics, including ergodic theory, geometric group theory, and combinatorics. He is also very interested in the teaching of statistics and has done some research in statistics. Lyons was a Sloan Foundation Fellow, a Visiting Miller Research Professor, an Institute of Mathematical Statistics Medallion Lecturer, an Invited Speaker at the International Congress of Mathematicians, and gave an Hour Address at the Joint Mathematics Meetings. He is a Fellow of the American Mathematical Society.
John Langford studied Physics and Computer Science at the California Institute of Technology, earning a double bachelor's degree in 1997, and received his Ph.D. from Carnegie Mellon University in 2002. Since then, he has worked at Yahoo!, Toyota Technological Institute, and IBM's Watson Research Center. He is also the primary author of the popular Machine Learning weblog, hunch.net and the principle developer of Vowpal Wabbit. Previous research projects include Isomap, Captcha, Learning Reductions, Cover Trees, and Contextual Bandit learning. For more information visit his homepage.
Edo Liberty is a Principal Scientist at Amazon's AWS Machine Learning group. Previously he was a Research Director and Yahoo and the head of Yahoo Research in New York. He received his BSc in CS and Physics from Tel Aviv University and his PhD in CS from Yale. After his postdoctoral position at Yale in the Applied Math department he co-founded a New York based startup. In 2009 he joined Yahoo Research. His research focuses on the theory and practice of large scale data algorithms.
Vahab Mirrokni is a Principal Research Scientist, heading the algorithms research group at Google Research, New York. He received his PhD from MIT in 2005 and his B.Sc. from Sharif University of Technology in 1999. He joined Google Research in New York in 2008, after spending a couple of years at Microsoft Research, MIT and Amazon.com. He is the co-winner of a SODA'05 best student paper award and ACM EC'08 best paper award. His research areas include algorithms, algorithmic game theory, combinatorial optimization, and social networks analysis. At Google, he is mainly working on algorithmic and economic problems related to search and online advertising. Recently he is working on online ad allocation problems, distributed algorithms for large-scale graph mining, and mechanism design for advertising exchanges.
Maxim Sviridenko is a Principal Research Scientist at Yahoo! Research, NYC where he manages the Scalable Algorithms and Machine Learning group. The group focuses on solving Machine Learning, Mathematical Optimization and Computational Economics problems on Web-scale data using latest techniques and algorithms from these fields. Previously Maxim was a staff member at the IBM T.J. Watson Research Center and a professor at the University of Warwick. His research interests span optimization, scheduling, distributed and approximation algorithms among other things.
Dirk van Gucht is a professor of Computer Science at Indiana University, Bloomington. His general areas of research are database theory and data mining.
Richard Shiffrin is a Distinguished Professor and Luther Dana Waterman Professor of Psychological and Brain Sciences. His research interest include short and long term memory, attention stages of information processing, retrieval and forgetting, mathematical computer simulation, neural network models.
Haixu Tang is a professor of Informatics and Computing at Indiana University, Bloomington. He is working on algorithmic and statistical problems in bioinformatics including computational mass spectrometry, mobile genetic elements, genome privacy, bacterial genomics and metagenomics.
Stanley Wasserman is the James H. Rudy Professor of Statistics, Psychology, and Sociology, and is a member of the Department of Psychological and Brain Sciences and the Department of Statistics. His research interests are network science and, specifically, statistical models for social networks.
Computer Science and Informatics