My current research interests include Large-Scale Optimization, Fairness in AI, Bayesian Optimization, and Causal Inference on Networks. During my PhD, I focused on Quasi-Monte Carlo Methods in Non-Cubical Spaces.
Large-Scale Optimization
Solving extremely large linear programs arising from several web-focused applications.
Key problems arising in web applications (with millions of users and thousands of items) can be formulated as Linear Programs (LP) involving billions to trillions of decision variables and constraints. Despite the appeal of LP formulations, solving problems at these scales is well beyond the capabilities of existing LP solvers. Through the years, I've worked in this space to develop large-scale optimization solvers:
- ECLIPSE: An extreme-scale LP solver for structured problems such as matching and multi-objective optimization (ICML 2020)
- QCQP via Low-Discrepancy Sequences: Solving quadratically constrained quadratic programs arising from modeling dependencies in various applications (NeurIPS 2017)
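The structured LPs above are far beyond toy scale, but the matching formulation itself is easy to illustrate. Below is a minimal sketch using SciPy's `linprog`; the scores, capacities, and problem size are invented for illustration, and production-scale solvers like ECLIPSE do not use an off-the-shelf solver.

```python
import numpy as np
from scipy.optimize import linprog

# Toy matching LP: assign 3 users to 3 items, maximizing total relevance,
# with each user receiving one unit of item mass (an LP relaxation).
scores = np.array([[0.9, 0.4, 0.2],
                   [0.3, 0.8, 0.5],
                   [0.1, 0.6, 0.7]])
n_users, n_items = scores.shape

# Decision variable x[u, i], flattened row-major; maximize => minimize -scores.
c = -scores.ravel()

# Each user is assigned exactly one unit of mass across items.
A_eq = np.zeros((n_users, n_users * n_items))
for u in range(n_users):
    A_eq[u, u * n_items:(u + 1) * n_items] = 1.0
b_eq = np.ones(n_users)

# Each item can be shown to at most 2 users (a capacity constraint).
A_ub = np.zeros((n_items, n_users * n_items))
for i in range(n_items):
    A_ub[i, i::n_items] = 1.0
b_ub = np.full(n_items, 2.0)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
assignment = res.x.reshape(n_users, n_items)
print(assignment.round(2))
```

At web scale, the same formulation has billions of such variables, which is what motivates specialized first-order methods instead of a generic simplex or interior-point solver.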
Fairness in AI
Reducing unfairness and bias across large-scale AI models and AI-driven products.
Biases can arise from several avenues in AI-driven products, whether it is intrinsic bias in the collected training data or bias introduced during model training. In many situations, these biases are further amplified through feedback loops. We focus on building methodologies and systems that tackle such bias in end-to-end AI systems to make them fair. Toward that goal, we have developed a few frameworks focusing on large-scale AI applications:
- Evaluating Fairness using Permutation Tests: We develop a flexible detection framework that allows practitioners to identify significant bias exhibited by machine learning models via rigorous statistical hypothesis tests. (KDD 2020)
- A Framework for Fairness in Two-Sided Marketplaces: We propose a definition and develop an end-to-end framework for achieving fairness while building marketplace products at scale. (Preprint)
- Scalable Assessment and Mitigation Strategies for Fairness in Rankings: We also focus on large-scale ranking problems and develop detection and mitigation strategies for them. (Preprint)
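The core idea behind the permutation-test approach can be sketched in a few lines: under the null hypothesis that a model treats two groups identically, group labels are exchangeable, so reshuffling them yields the null distribution of any bias metric. A minimal sketch (the metric, the synthetic group data, and the sample sizes are hypothetical choices, not the KDD 2020 framework itself):

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_test(scores_a, scores_b, n_perm=5_000, rng=rng):
    """Two-sample permutation test for a gap in mean model score.

    Returns the observed gap and a p-value for the null hypothesis
    that scores are exchangeable across the two groups.
    """
    observed = abs(scores_a.mean() - scores_b.mean())
    pooled = np.concatenate([scores_a, scores_b])
    n_a = len(scores_a)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        gap = abs(perm[:n_a].mean() - perm[n_a:].mean())
        count += gap >= observed
    # The +1 terms give a valid (slightly conservative) p-value.
    return observed, (count + 1) / (n_perm + 1)

# Hypothetical model scores for two demographic groups.
group_a = rng.normal(0.55, 0.1, size=500)
group_b = rng.normal(0.50, 0.1, size=500)
gap, p_value = permutation_test(group_a, group_b)
print(f"gap = {gap:.3f}, p = {p_value:.4f}")
```

Because the test is exact under exchangeability, it requires no distributional assumptions on the model's scores, which is what makes it attractive for auditing black-box models.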
Bayesian Optimization
Tuning parameters for optimizing offline and online machine learning systems.
In any machine learning system, there exist parameters which, when appropriately tuned, can drastically change the efficiency and predictive accuracy of the model. This is a growing area of research that focuses on doing such tuning automatically, without a human in the loop. We build on the existing literature to design systems that scale to online problems as well as large offline ML pipelines.
- Online Parameter Selection for Web-based Ranking: We develop a mechanism to automatically choose optimal parameters to balance multiple objectives in a large-scale ranking system. (KDD 2018)
- Adaptive Rate of Convergence of Thompson Sampling for Gaussian Process Optimization: We derive a theoretical rate of convergence for the Thompson Sampling algorithm. (Preprint)
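Thompson sampling for Gaussian-process optimization, the algorithm analyzed above, proceeds by fitting a GP posterior to the observations so far, drawing one function from that posterior, and evaluating the objective where the draw is maximized. A toy 1-D sketch (the kernel, length-scale, jitter, and objective are all illustrative choices, not tied to the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(a, b, length=0.2):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def objective(x):
    # The unknown black-box function we want to maximize.
    return np.sin(6 * x) + 0.5 * x

grid = np.linspace(0.0, 1.0, 200)   # candidate points
X = np.array([0.5])                 # initial design
y = objective(X)
jitter = 1e-4                       # numerical stabilizer

for _ in range(15):
    # GP posterior over the grid, conditioned on observations (X, y).
    K = rbf_kernel(X, X) + jitter * np.eye(len(X))
    K_s = rbf_kernel(grid, X)
    mu = K_s @ np.linalg.solve(K, y)
    cov = rbf_kernel(grid, grid) - K_s @ np.linalg.solve(K, K_s.T)

    # Thompson sampling: draw one function from the posterior and
    # query the objective at that sample's argmax.
    sample = rng.multivariate_normal(mu, cov + jitter * np.eye(len(grid)))
    x_next = grid[np.argmax(sample)]
    X = np.append(X, x_next)
    y = np.append(y, objective(x_next))

print(f"best x = {X[np.argmax(y)]:.3f}, best value = {y.max():.3f}")
```

Sampling the whole posterior trajectory (rather than optimizing a fixed acquisition function) is what makes the algorithm naturally randomized, and it is this randomness that the convergence-rate analysis has to control.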
Causal Inference on Networks
Accurate hypothesis testing in the presence of interference and heterogeneity.
Since we primarily work on network and graph data, it is of utmost importance to consider problems with strong network effects or interference. This is especially crucial when developing methodologies for statistical tests. The usual assumptions of A/B testing break down in such conditions, and we focus on developing techniques that can give accurate estimates of causal effects even in the presence of interference in large, dense networks.
- OASIS: We introduce OASIS, a mechanism for A/B testing in dense, large-scale networks. We design an approximate randomized controlled experiment by solving an optimization problem and then apply an importance sampling adjustment to correct the bias in order to estimate the causal effect. (NeurIPS 2020 Spotlight)
- Heterogeneous Causal Effects: We derive a framework to personalize and optimize decision parameters in web-based systems via heterogeneous causal effects. We are able to show that by capturing heterogeneous member behavior we can drastically improve the overall metrics for a large-scale AI system. (Preprint)
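The importance-sampling correction behind OASIS can be illustrated in a simplified, network-free setting: when assignment probabilities deviate from a uniform coin flip, reweighting each outcome by its inverse assignment probability recovers an unbiased estimate of the average treatment effect. A minimal sketch (the design probabilities and outcome model are synthetic; the actual OASIS design solves an optimization problem over the network):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Per-unit treatment probabilities under a hypothetical optimized design;
# in a network-aware design these deviate from an ideal 50/50 coin flip.
p_treat = rng.uniform(0.2, 0.8, size=n)
z = rng.binomial(1, p_treat)          # realized treatment assignments

# Potential outcomes: baseline correlates with p_treat; true effect is 1.0.
y0 = 2.0 * p_treat + rng.normal(0.0, 1.0, size=n)
y = y0 + 1.0 * z

# The naive difference in means is biased because units with large p_treat
# also tend to have larger baseline outcomes.
naive = y[z == 1].mean() - y[z == 0].mean()

# The importance-sampling (inverse-probability-weighted) estimator corrects
# the bias by weighting each observation by 1 / its assignment probability.
ate_ipw = np.mean(z * y / p_treat - (1 - z) * y / (1 - p_treat))
print(f"naive = {naive:.3f}, IPW = {ate_ipw:.3f}")
```

On a real network, interference makes the potential outcomes themselves depend on neighbors' assignments, which is why the design step (choosing which units to treat) matters as much as the reweighting step.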