NEC Laboratories Europe

Machine Learning
Publications

Shujian Yu, Francesco Alesiani, Xi Yu, Robert Jenssen, Jose C. Príncipe: “Measuring Dependence with Matrix-based Entropy Functional,” AAAI 2021

Paper Details

Abstract
Measuring the dependence of data plays a central role in statistics and machine learning. In this work, we summarize and generalize the main idea of existing information-theoretic dependence measures into a higher-level perspective via Shearer's inequality. Based on this generalization, we then propose two measures, namely the matrix-based normalized total correlation Tα* and the matrix-based normalized dual total correlation Dα*, to quantify the dependence of multiple variables in arbitrary dimensional spaces, without explicit estimation of the underlying data distributions. We show that our measures are differentiable and statistically more powerful than prevalent ones. We also apply our measures to four different machine learning problems, namely gene regulatory network inference, robust machine learning under covariate shift and non-Gaussian noise, subspace outlier detection, and understanding the learning dynamics of convolutional neural networks (CNNs), to demonstrate their utility, advantages, and implications for these problems. Code for our dependence measures is available at: https://bit.ly/AAAI-dependence.
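
For intuition, here is a minimal sketch of the matrix-based machinery such measures build on: a unit-trace Gram matrix per variable, a Rényi α-entropy from its eigenvalues, and a (non-normalized) total correlation via the Hadamard product. The Gaussian kernel, bandwidth, and function names are our assumptions; the paper's normalized Tα* and Dα* are not reproduced here.

```python
import numpy as np

def gram_matrix(x, sigma=1.0):
    """Unit-trace Gram matrix of a Gaussian kernel over one variable's samples."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    k = np.exp(-d2 / (2 * sigma ** 2))
    return k / np.trace(k)

def matrix_renyi_entropy(a, alpha=1.01):
    """Matrix-based Renyi alpha-entropy: S_alpha(A) = log2(sum_i lambda_i^alpha) / (1 - alpha)."""
    lam = np.linalg.eigvalsh(a)
    lam = lam[lam > 1e-12]  # drop numerical zeros
    return np.log2(np.sum(lam ** alpha)) / (1.0 - alpha)

def total_correlation(variables, alpha=1.01, sigma=1.0):
    """T_alpha = sum_i S(A_i) - S(A_1 o ... o A_k), where 'o' is the Hadamard
    product renormalized to unit trace."""
    grams = [gram_matrix(v, sigma) for v in variables]
    joint = grams[0].copy()
    for g in grams[1:]:
        joint *= g
    joint /= np.trace(joint)
    return sum(matrix_renyi_entropy(g, alpha) for g in grams) \
           - matrix_renyi_entropy(joint, alpha)

# Dependent variables yield a clearly larger value than independent ones:
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 2))
print(total_correlation([x, x]))
print(total_correlation([x, rng.normal(size=(100, 2))]))
```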

Full author details: Shujian Yu, NEC Laboratories Europe; Francesco Alesiani, NEC Laboratories Europe; Xi Yu, University of Florida; Robert Jenssen, UiT - The Arctic University of Norway; Jose C. Príncipe, University of Florida

Presented at: 35th AAAI Conference on Artificial Intelligence (AAAI-21)

Carolin Lawrence, Timo Sztyler and Mathias Niepert: “Explaining Neural Matrix Factorization with Gradient Rollback”, AAAI 2021

Paper Details

Abstract

Explaining the predictions of neural black-box models is an important problem, especially when such models are used in applications where user trust is crucial. Estimating the influence of training examples on a learned neural model's behavior allows us to identify training examples most responsible for a given prediction and, therefore, to faithfully explain the output of a black-box model. The most generally applicable existing method is based on influence functions, which scale poorly for larger sample sizes and models.

We propose gradient rollback, a general approach for influence estimation, applicable to neural models where each parameter update step during gradient descent touches only a small number of parameters, even if the overall number of parameters is large. Neural matrix factorization models trained with gradient descent are part of this model class. These models are popular and have found a wide range of applications in industry. In particular, knowledge graph embedding methods, which belong to this class, are used extensively. We show that gradient rollback is highly efficient at both training and test time. Moreover, we show theoretically that the difference between gradient rollback's influence approximation and the true influence on a model's behavior is smaller than known bounds on the stability of stochastic gradient descent. This establishes that gradient rollback robustly estimates example influence. We also conduct experiments which show that gradient rollback provides faithful explanations for knowledge base completion and recommender datasets.
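
To make the rollback idea concrete, here is a toy sketch (our construction, not the authors' released code): during SGD on a small matrix factorization model, each training triple's cumulative update to the few embedding rows it touches is recorded, and its influence on a query score is approximated by adding those updates back and re-scoring.

```python
import numpy as np
from collections import defaultdict

# Toy matrix factorization trained with SGD; each step touches only the two
# embedding rows of the triple, so per-example updates are cheap to record.
rng = np.random.default_rng(0)
n_users, n_items, dim, lr = 20, 15, 8, 0.05
U = rng.normal(scale=0.1, size=(n_users, dim))
V = rng.normal(scale=0.1, size=(n_items, dim))
data = [(rng.integers(n_users), rng.integers(n_items), rng.normal())
        for _ in range(200)]

# rollback[(t, matrix, row)] = cumulative update that training triple t
# applied to that embedding row over all epochs.
rollback = defaultdict(lambda: np.zeros(dim))

for epoch in range(5):
    for t, (u, i, y) in enumerate(data):
        err = U[u] @ V[i] - y
        du, dv = lr * err * V[i], lr * err * U[u]
        U[u] -= du
        V[i] -= dv
        rollback[(t, "U", u)] += du
        rollback[(t, "V", i)] += dv

def influence(t, u_q, i_q):
    """Approximate change in score(u_q, i_q) if triple t had not been seen:
    add t's recorded updates back ("roll back") and re-score."""
    u, i, _ = data[t]
    U2, V2 = U.copy(), V.copy()
    U2[u] += rollback[(t, "U", u)]
    V2[i] += rollback[(t, "V", i)]
    return U2[u_q] @ V2[i_q] - U[u_q] @ V[i_q]
```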

Presented at: 35th AAAI Conference on Artificial Intelligence (AAAI-21)

Bhushan Kotnis, Carolin Lawrence and Mathias Niepert: “Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders”, AAAI 2021

Paper Details

Abstract
Representation learning for knowledge graphs (KGs) has focused on the problem of answering simple link prediction queries. In this work, we address the more ambitious challenge of predicting the answers of conjunctive queries with multiple missing entities. We propose Bidirectional Query Embedding (BIQE), a method that embeds conjunctive queries with models based on bidirectional attention mechanisms. Contrary to prior work, bidirectional self-attention can capture interactions among all the elements of a query graph. We introduce two new challenging datasets for studying conjunctive query inference and conduct experiments on several benchmark datasets that demonstrate BIQE significantly outperforms state-of-the-art baselines.
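
A minimal sketch of the general approach, under our own assumptions: the query graph is linearized into a token sequence with [MASK] tokens at the missing entities, and an unmasked (bidirectional) transformer encoder predicts entities at those positions. The linearization, class names, and the omission of positional/structural encodings are illustrative simplifications, not BIQE's actual architecture.

```python
import torch
import torch.nn as nn

VOCAB, DIM, MASK = 100, 64, 0  # toy vocabulary of entities/relations; id 0 = [MASK]

class BidirectionalQueryEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM)
        layer = nn.TransformerEncoderLayer(d_model=DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):
        # No attention mask: every element of the linearized query graph
        # attends to every other one (the "bidirectional" part). A real model
        # also needs positional/structural encodings, omitted here.
        return self.out(self.encoder(self.embed(tokens)))

# A linearized conjunctive query such as (e5, r40, ?x) AND (?x, r41, ?y),
# with the missing entities replaced by MASK tokens:
query = torch.tensor([[5, 40, MASK, MASK, 41, MASK]])
logits = BidirectionalQueryEncoder()(query)  # shape (1, 6, VOCAB)
predicted_ids = logits[0].argmax(-1)         # entity guesses at masked slots
```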

Presented at: 35th AAAI Conference on Artificial Intelligence (AAAI-21)

Shujian Yu, Ammar Shaker, Francesco Alesiani and Jose C. Príncipe: “Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications”, IJCAI 2020

Paper Details

Abstract
We propose a simple yet powerful test statistic to quantify the discrepancy between two conditional distributions. The new statistic avoids the explicit estimation of the underlying distributions in high-dimensional space and operates on the cone of symmetric positive semidefinite (SPS) matrices using the Bregman matrix divergence. Moreover, it inherits the merits of the correntropy function to explicitly incorporate high-order statistics in the data. We present the properties of our new statistic and illustrate its connections to prior art. We finally show the applications of our new statistic on three different machine learning problems, namely multi-task learning over graphs, concept drift detection, and information-theoretic feature selection, to demonstrate its utility and advantage. Code for our statistic is available at https://bit.ly/BregmanCorrentropy.
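
The two ingredients named above are easy to sketch in isolation: a correntropy-style SPS matrix of Gaussian-kernel similarities, and the von Neumann Bregman matrix divergence between two such matrices. The conditioning and normalization that turn this into the paper's actual statistic are not reproduced; function names and kernel choice are our assumptions.

```python
import numpy as np

def correntropy_matrix(x, sigma=1.0):
    """SPS matrix of pairwise Gaussian-kernel similarities between samples."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def von_neumann_divergence(a, b, eps=1e-10):
    """Bregman matrix divergence generated by the von Neumann entropy:
    D(A || B) = tr(A log A - A log B - A + B)."""
    def logm(m):
        lam, q = np.linalg.eigh(m)
        return q @ np.diag(np.log(np.clip(lam, eps, None))) @ q.T
    return np.trace(a @ logm(a) - a @ logm(b) - a + b)

# Two batches of responses of equal size; a large divergence between their
# correntropy matrices signals a distributional discrepancy.
rng = np.random.default_rng(0)
y1 = rng.normal(size=(50, 3))
y2 = rng.normal(loc=1.0, size=(50, 3))
print(von_neumann_divergence(correntropy_matrix(y1), correntropy_matrix(y2)))
```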

Presented at: International Joint Conference on Artificial Intelligence – Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI 2020)

Full paper download: Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications (pdf)

Francesco Alesiani, Shujian Yu, Ammar Shaker: “Towards Interpretable Multi-Task Learning”, ECML PKDD 2020

Paper Details

Abstract
Interpretable Multi-Task Learning can be expressed as learning a sparse graph of the task relationships based on the prediction performance of the learned models. Since many natural phenomena exhibit sparse structures, enforcing sparsity on learned models reveals the underlying task relationships. Moreover, different sparsification degrees from a fully connected graph uncover various types of structures, like cliques, trees, lines, clusters or fully disconnected graphs. In this paper, we propose a bilevel formulation of multi-task learning that induces sparse graphs, thus revealing the underlying task relationships, and an efficient method for its computation. We show empirically how the induced sparse graph improves the interpretability of the learned models and of their relationships on synthetic and real data, without sacrificing generalization performance. Code at https://bit.ly/GraphGuidedMTL
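
As a rough illustration (our construction, not the paper's algorithm), the sketch below approximates a bilevel program with a one-step-unrolled hypergradient: per-task linear models are fit on training splits under graph-guided coupling, while the task graph is updated for validation performance plus an L1 penalty that drives edges to zero.

```python
import torch

torch.manual_seed(0)
T, d, lr = 4, 10, 0.05
w_true = torch.randn(d)  # toy setup: all tasks share structure
Xtr = [torch.randn(40, d) for _ in range(T)]
Xva = [torch.randn(20, d) for _ in range(T)]
Ytr = [x @ w_true + 0.1 * torch.randn(40) for x in Xtr]
Yva = [x @ w_true + 0.1 * torch.randn(20) for x in Xva]

W = torch.zeros(T, d, requires_grad=True)        # per-task weights
A = (0.1 * torch.randn(T, T)).requires_grad_()   # task-relationship graph

def coupling(W, A):
    # Strongly related tasks (large A_ij^2) are pulled toward similar weights.
    return sum(A[i, j] ** 2 * (W[i] - W[j]).pow(2).sum()
               for i in range(T) for j in range(i + 1, T))

for step in range(200):
    # Lower level: one differentiable training step, so A receives a
    # hypergradient through the adapted weights W1.
    tr = sum(((Xtr[t] @ W[t] - Ytr[t]) ** 2).mean() for t in range(T)) \
         + 0.1 * coupling(W, A)
    gW, = torch.autograd.grad(tr, W, create_graph=True)
    W1 = W - lr * gW
    # Upper level: validation loss of the adapted models + L1 sparsity on A.
    va = sum(((Xva[t] @ W1[t] - Yva[t]) ** 2).mean() for t in range(T)) \
         + 0.05 * A.abs().sum()
    gA, = torch.autograd.grad(va, A)
    with torch.no_grad():
        A -= lr * gA
        W.copy_(W1)
```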

Ammar Shaker, Shujian Yu, Xiao He, Christoph Gärtner: “Online Meta-Forest for Regression Data Streams”, IJCNN 2020 (part of WCCI 2020)

Paper Details

Abstract
Stream learning is essential when there is limited memory, time and computational power. However, existing streaming methods are mostly designed for classification, with only a few exceptions for regression problems. Although fast, the performance of these online regression methods is inadequate due to their dependence on merely linear models. Besides, only a few stream methods are based on meta-learning, which aims at facilitating the dynamic choice of the right model. Nevertheless, these approaches are restricted to recommending learners at the window level and not at the instance level. In this paper, we present a novel approach, named Online Meta-Forest, that incrementally induces an ensemble of meta-learners that selects the best set of predictors for each test example. Each meta-learner has the ability to find a non-linear mapping of the input space to the set of induced models. We conduct a series of experiments demonstrating that Online Meta-Forest outperforms related methods on 16 out of 25 evaluated benchmark and domain datasets in transportation.

Index terms: Learning from Data Streams, Adaptive Learning, Meta-Learning, Regression Streams, Data Streams, Online Bagging, Ensemble Learning
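
The full meta-level of Online Meta-Forest is beyond a short snippet, but the online-bagging substrate listed in the index terms is easy to sketch: in Oza and Russell's scheme, each arriving example is shown to each ensemble member k ~ Poisson(1) times, approximating a bootstrap sample online. The base regressor and class names below are toy stand-ins of our own.

```python
import numpy as np

rng = np.random.default_rng(0)

class OnlineLinearRegressor:
    """Tiny SGD-trained linear model; stands in for a real base learner."""
    def __init__(self, dim, lr=0.01):
        self.w, self.lr = np.zeros(dim), lr
    def learn_one(self, x, y):
        self.w -= self.lr * (self.w @ x - y) * x
    def predict_one(self, x):
        return self.w @ x

class OnlineBaggedEnsemble:
    """Online bagging (Oza & Russell): each example is shown to each member
    k ~ Poisson(1) times, simulating a bootstrap over the stream."""
    def __init__(self, dim, n_members=10):
        self.members = [OnlineLinearRegressor(dim) for _ in range(n_members)]
    def learn_one(self, x, y):
        for m in self.members:
            for _ in range(rng.poisson(1.0)):
                m.learn_one(x, y)
    def predict_one(self, x):
        return float(np.mean([m.predict_one(x) for m in self.members]))

# Stream loop: predict first (prequential evaluation), then learn.
model, w_true = OnlineBaggedEnsemble(dim=5), np.array([1., -2., 0.5, 0., 3.])
for _ in range(1000):
    x = rng.normal(size=5)
    y = w_true @ x + 0.1 * rng.normal()
    y_hat = model.predict_one(x)
    model.learn_one(x, y)
```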

A. García-Durán, R. Gonzalez, D. Onoro-Rubio, M. Niepert, H. Li: "TransRev: Modeling Reviews as Translations from Users to Items", 42nd European Conference on Information Retrieval (ECIR 2020), April 2020

C. Lawrence, B. Kotnis, M. Niepert: “Attending to Future Tokens for Bidirectional Sequence Generation”, EMNLP 2019

K. Akimoto, T. Hiraoka, K. Sadamasa and M. Niepert: “Cross-Sentence N-ary Relation Extraction using Lower-Arity Universal Schemas”, EMNLP 2019

Luca Franceschi, Xiao He, Mathias Niepert, Massimiliano Pontil, “Graph structure learning for GCNs”, ICLR, July 2019

C. Wang, M. Niepert: “State-Regularized Recurrent Neural Networks”, ICML 2019 (Thirty-sixth International Conference on Machine Learning), May 2019

L. Franceschi, X. He, M. Niepert, M. Pontil: “Learning Discrete Structures for Graph Neural Networks”, ICML 2019 (Thirty-sixth International Conference on Machine Learning), 2019

C. Wang, M. Niepert, H. Li, "RecSys-DAN: Discriminative Adversarial Networks for Cross-Domain Recommender Systems" in IEEE Transactions on Neural Networks and Learning Systems. March 2019

A. G. Duran, D. Rubio, M. Niepert, Y. Liu, H. Li, D. Rosenblum, "MMKG: Multi-Modal Knowledge Graphs" in ESWC 2019, the 16th Extended Semantic Web Conference. March 2019

D. Rubio, A. G. Duran, M. Niepert, R. Gonzalez, R. Lopez-Sastre, "Answering Visual-Relational Queries in Web-Extracted Knowledge Graphs" in AKBC 2019, Automated Knowledge Base Construction Conference. March 2019

B. Kotnis, A. G. Duran, "Learning Numerical Attributes in Knowledge Bases" in AKBC 2019, Automated Knowledge Base Construction Conference. March 2019
