Software
This page is outdated. For more recent software on reliable and robust machine learning, please see here, which is maintained by the Imperfect Information Learning Team, Center for Advanced Intelligence Project (AIP), RIKEN. The software available below is free of charge for research and education purposes. However, you must obtain a license from the author(s) to use it for commercial purposes. The software must not be distributed without prior permission of the author(s).
The software is supplied "as is" without warranty of any kind, and the author(s) disclaim any and all warranties, including but not limited to any implied warranties of merchantability and fitness for a particular purpose, and any warranties of non-infringement. The user assumes all liability and responsibility for use of the software, and in no event shall the author(s) be liable for damages of any kind resulting from its use.
Fundamentals
 Density ratio estimation
 KLIEP (Kullback-Leibler importance estimation procedure): MATLAB
 GMKLIEP (Gaussian-mixture KLIEP): MATLAB (by Makoto Yamada)
 LSIF (least-squares importance fitting): R (by Takafumi Kanamori)
 uLSIF (unconstrained LSIF): MATLAB, R (by Takafumi Kanamori), C++ (by Issei Sato)
 RuLSIF (relative uLSIF): MATLAB (by Makoto Yamada), R (by Max Wornowizki), Python (by Song Liu)
 Density difference estimation
 LSDD (least-squares density difference): MATLAB, Python (by Marthinus Christoffel du Plessis)
 Density derivative estimation
 LSLDG (least-squares log-density gradient): MATLAB (by Hiroaki Sasaki)
 Mutual information estimation
 MLMI (maximum-likelihood mutual information): MATLAB (with Taiji Suzuki)
 LSMI (least-squares mutual information): MATLAB (with Taiji Suzuki)
 LSMI (multiplicative kernel model): MATLAB (by Tomoya Sakai)
 LSQMI (least-squares quadratic mutual information): MATLAB
 Hetero-distributional subspace search
 LHSS (least-squares hetero-distributional subspace search): MATLAB (with Makoto Yamada)
Applications
 Covariate shift adaptation
 IWLS+IWCV+uLSIF (importance-weighted least-squares + importance-weighted cross-validation + unconstrained least-squares importance fitting): MATLAB
 IWLR+KLIEP (importance-weighted logistic regression + Kullback-Leibler importance estimation procedure): MATLAB (by Makoto Yamada)
 IWLSPC+IWCV+KLIEP (importance-weighted least-squares probabilistic classifier + importance-weighted cross-validation + Kullback-Leibler importance estimation procedure): MATLAB (by Hirotaka Hachiya)
 Class prior change adaptation
 uLSIF-based method: MATLAB (by Marthinus Christoffel du Plessis)
 LSDD-based method: MATLAB (by Marthinus Christoffel du Plessis)
 Inlier-based outlier detection
 MLOD (maximum-likelihood outlier detection): MATLAB
 LSOD (least-squares outlier detection): MATLAB
 LSAD (least-squares anomaly detection): Python (by John Quinn)
 Feature selection
 MLFS (maximum-likelihood feature selection in supervised regression/classification): MATLAB (with Taiji Suzuki)
 LSFS (least-squares feature selection in supervised regression/classification): MATLAB (with Taiji Suzuki)
 L1-LSMI (L1-LSMI-based feature selection for supervised regression/classification): MATLAB (by Wittawat Jitkrittum)
 HSIC Lasso (Hilbert-Schmidt independence criterion + least absolute shrinkage and selection operator for high-dimensional feature selection in supervised regression/classification): MATLAB (by Makoto Yamada)
 Dimensionality reduction/feature extraction/metric learning
 NGCA (non-Gaussian component analysis, unsupervised linear dimensionality reduction): MATLAB (by Gilles Blanchard)
 LSDR (least-squares dimensionality reduction, supervised linear dimensionality reduction for regression/classification): MATLAB (with Taiji Suzuki)
 SCA (sufficient component analysis, supervised linear dimensionality reduction for regression/classification): MATLAB (by Makoto Yamada)
 LSQMID (least-squares quadratic mutual information derivative, supervised linear dimensionality reduction for regression/classification): MATLAB (by Voot Tangkaratt)
 LFDA (local Fisher discriminant analysis, supervised linear dimensionality reduction for classification): MATLAB
 SELF (semi-supervised LFDA, semi-supervised linear dimensionality reduction for classification): MATLAB
 LSCDA (least-squares canonical dependency analysis, linear dimensionality reduction for paired data): MATLAB (by Masayuki Karasuyama)
 SERAPH (semi-supervised metric learning paradigm with hyper-sparsity, semi-supervised metric learning for classification): MATLAB (by Gang Niu)
 Classification
 Conditional probability estimation
 LSCDE (least-squares conditional density estimation): MATLAB
 LSPC (least-squares probabilistic classifier): MATLAB, Python (by John Quinn)
 SMIR (squared-loss mutual information regularization, semi-supervised probabilistic classification): MATLAB (by Gang Niu and by Wittawat Jitkrittum)
 Independence test
 LSIT (least-squares independence test): MATLAB
 Two-sample test
 LSTT (least-squares two-sample test): MATLAB
 Change detection
 CDRuLSIF (distributional change detection by RuLSIF): MATLAB (by Song Liu)
 CDKLIEP (structural change detection by sparse KLIEP): MATLAB (by Song Liu)
 Clustering
 Independent component analysis
 LICA (least-squares independent component analysis): MATLAB (by Taiji Suzuki)
 Causal direction inference
 LSIR (least-squares independence regression): MATLAB (by Makoto Yamada)
 Cross-domain object matching
 LSOM (least-squares object matching): MATLAB (by Makoto Yamada)
 Hidden Markov Model
 DRHMM (density-ratio hidden Markov model): MATLAB and Python (by John Quinn)
 Sparse learning
 DAL (l1/grouped-l1/trace-norm regularization solver): MATLAB (by Ryota Tomioka)
 Matrix/tensor factorization
 VBMF (variational Bayesian matrix factorization): MATLAB
 Multitask learning with tensor factorization: MATLAB (by Kishan Wimalawarne)
 Reinforcement learning
 IWPGPEOB (model-free policy gradient method with sample reuse): MATLAB
 Crowdsourcing
 BBTA (bandit-based task assignment): Python (by Hao Zhang)
Kullback-Leibler Importance Estimation Procedure (KLIEP)
 Kullback-Leibler Importance Estimation Procedure (KLIEP) is an algorithm to directly estimate the ratio of two density functions without going through density estimation. The optimization problem involved in KLIEP is convex, so the unique global optimal solution can be obtained. Furthermore, the KLIEP solution tends to be sparse, which helps reduce the computation time.
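To illustrate the idea, here is a minimal Python sketch of a KLIEP-type estimator (fixed Gaussian width, a subset of test points as kernel centers, plain projected gradient ascent); the distributed MATLAB package additionally selects the kernel width by likelihood cross-validation, and the function name `kliep` here is illustrative:

```python
import numpy as np

def kliep(x_tr, x_te, sigma=0.7, n_iter=300, lr=0.05):
    """Bare-bones KLIEP sketch: estimate w(x) = p_te(x) / p_tr(x)
    with a Gaussian kernel model centered on test samples."""
    centers = x_te[:100]                       # kernel centers (on test points)
    def K(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    Phi_te, Phi_tr = K(x_te), K(x_tr)
    b = Phi_tr.mean(axis=0)                    # constraint vector: alpha @ b = 1
    alpha = np.ones(len(centers))
    alpha /= alpha @ b
    for _ in range(n_iter):
        # gradient of the test-sample log-likelihood w.r.t. alpha
        grad = (Phi_te / (Phi_te @ alpha)[:, None]).mean(axis=0)
        alpha += lr * grad
        alpha = np.maximum(alpha, 0)           # ratio must be non-negative
        alpha /= alpha @ b                     # mean of w over training samples = 1
    return lambda x: K(x) @ alpha

rng = np.random.default_rng(0)
x_tr = rng.normal(0.0, 1.0, size=(500, 1))     # denominator samples: N(0, 1)
x_te = rng.normal(0.5, 1.0, size=(500, 1))     # numerator samples:   N(0.5, 1)
w = kliep(x_tr, x_te)
```

The projection step keeps the estimated ratio averaging to one over the training samples, which is what makes it a proper importance weight; the true ratio here is increasing in x, so the estimate should be larger at x = 1 than at x = -1.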

MATLAB implementation of KLIEP:
KLIEP.zip
 "KLIEP.m" is the main function.
 "demo_KLIEP.m" is a demo script.

Examples:

References:

Sugiyama, M., Suzuki, T., Nakajima, S., Kashima, H., von Bünau, P. & Kawanabe, M.
Direct importance estimation for covariate shift adaptation.
Annals of the Institute of Statistical Mathematics, vol.60, no.4, pp.699-746, 2008.
[ paper ] 
Sugiyama, M., Nakajima, S., Kashima, H., von Bünau, P. & Kawanabe, M.
Direct importance estimation with model selection and its application to covariate shift adaptation.
In J. C. Platt, D. Koller, Y. Singer, and S. Roweis (Eds.), Advances in Neural Information Processing Systems 20, pp.1433-1440, Cambridge, MA, MIT Press, 2008.
(Presented at Neural Information Processing Systems (NIPS2007), Vancouver, B.C., Canada, Dec. 3-8, 2007.)
[ paper, poster ]

Unconstrained Least-Squares Importance Fitting (uLSIF)
 Unconstrained Least-Squares Importance Fitting (uLSIF) is an algorithm to directly estimate the ratio of two density functions without going through density estimation. The uLSIF solution, as well as its leave-one-out cross-validation score, can be computed analytically, so uLSIF is computationally very efficient and stable. Furthermore, the uLSIF solution tends to be sparse, which helps reduce the computation time.
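The analytic solution can be sketched in a few lines of Python (fixed Gaussian width and regularization parameter; the distributed package chooses both by the analytic leave-one-out score, and the function name `ulsif` here is illustrative):

```python
import numpy as np

def ulsif(x_nu, x_de, sigma=0.7, lam=0.1):
    """Minimal uLSIF sketch: analytic least-squares fit of the
    density ratio w(x) = p_nu(x) / p_de(x) with Gaussian kernels."""
    centers = x_nu[:100]                       # kernel centers on numerator samples
    def K(x):
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2 * sigma ** 2))
    Phi_de, Phi_nu = K(x_de), K(x_nu)
    H = Phi_de.T @ Phi_de / len(x_de)          # second moment under the denominator
    h = Phi_nu.mean(axis=0)                    # first moment under the numerator
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    alpha = np.maximum(alpha, 0)               # round negative coefficients up to zero
    return lambda x: K(x) @ alpha

rng = np.random.default_rng(1)
x_de = rng.normal(0.0, 1.0, size=(1000, 1))    # denominator samples: N(0, 1)
x_nu = rng.normal(0.5, 1.0, size=(1000, 1))    # numerator samples:   N(0.5, 1)
w = ulsif(x_nu, x_de)
```

Because the objective is an unconstrained regularized least-squares problem, a single linear solve gives the coefficients; the clipping at zero afterwards is what tends to make the solution sparse.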

MATLAB implementation of uLSIF:
uLSIF.zip
 "uLSIFP.m" is the main function.
 "demo_uLSIF.m" is a demo script.

Examples:

References:

Kanamori, T., Hido, S., & Sugiyama, M.
A least-squares approach to direct importance estimation.
Journal of Machine Learning Research, vol.10 (Jul.), pp.1391-1445, 2009.
[ paper ] 
Kanamori, T., Hido, S., & Sugiyama, M.
Efficient direct density ratio estimation for non-stationarity adaptation and outlier detection.
In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou (Eds.), Advances in Neural Information Processing Systems 21, pp.809-816, Cambridge, MA, MIT Press, 2009.
(Presented at Neural Information Processing Systems (NIPS2008), Vancouver, B.C., Canada, Dec. 8-13, 2008.)
[ paper, poster ]

Least-Squares Density-Difference (LSDD)
 Least-Squares Density-Difference (LSDD) is an estimator of the difference between two probability densities, which can be used, e.g., for approximating the L2-distance between them.
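A minimal 1-D Python sketch of the analytic LSDD solution follows (fixed kernel width and regularization parameter; the distributed packages choose both by cross-validation). The Gram matrix H below uses the closed-form Gaussian integral of products of kernels; the function name `lsdd` is illustrative:

```python
import numpy as np

def lsdd(x_p, x_q, sigma=0.8, lam=0.05):
    """Minimal 1-D LSDD sketch: fit f(x) ~ p(x) - q(x) with a
    Gaussian kernel model; returns the fitted difference and an
    estimate of the squared L2-distance between p and q."""
    c = np.concatenate([x_p[:100], x_q[:100]])            # kernel centers
    K = lambda x: np.exp(-(x[:, None] - c[None, :]) ** 2 / (2 * sigma ** 2))
    # H_ll' = int k(x, c_l) k(x, c_l') dx  (closed form for Gaussian kernels)
    H = (np.pi * sigma ** 2) ** 0.5 * np.exp(-(c[:, None] - c[None, :]) ** 2
                                             / (4 * sigma ** 2))
    h = K(x_p).mean(axis=0) - K(x_q).mean(axis=0)         # empirical mean difference
    theta = np.linalg.solve(H + lam * np.eye(len(c)), h)
    f = lambda x: K(x) @ theta                            # estimated density difference
    l2 = 2 * h @ theta - theta @ H @ theta                # L2-distance estimate
    return f, l2

rng = np.random.default_rng(0)
x_p = rng.normal(1.0, 1.0, size=1000)                     # samples from p = N(1, 1)
x_q = rng.normal(0.0, 1.0, size=1000)                     # samples from q = N(0, 1)
f, l2 = lsdd(x_p, x_q)
```

Since p puts more mass to the right of q, the fitted difference should be positive near x = 1 and negative near x = -0.5, and the L2-distance estimate should be positive.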

MATLAB implementation of LSDD:
LSDD.zip
 "LSDD.m" is the main function.
 "demo_LSDD.m" is a demo script.
 "lsdd.py" is the main function.
 "demo_lsdd.py" is a demo script.

References:

Sugiyama, M., Suzuki, T., Kanamori, T., du Plessis, M. C., Liu, S., & Takeuchi, I.
Densitydifference estimation.
Neural Computation, vol.25, no.10, pp.2734-2775, 2013.
[ paper ] 
Sugiyama, M., Suzuki, T., Kanamori, T., du Plessis, M. C., Liu, S., & Takeuchi, I.
Densitydifference estimation.
In P. Bartlett, F. C. N. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25, pp.692-700, 2012.
(Presented at Neural Information Processing Systems (NIPS2012), Lake Tahoe, Nevada, USA, Dec. 3-6, 2012)
[ paper, poster ]
Least-Squares Log-Density Gradient (LSLDG)
 Least-Squares Log-Density Gradient (LSLDG) is an algorithm that directly estimates the gradient of a log-density without going through density estimation. The solution can be computed analytically.

An application of LSLDG is clustering based on mode seeking. The clustering method has the following advantages:
 We do not need to set the number of clusters in advance.
 All the tuning parameters (e.g., the kernel bandwidth) can be optimized by cross-validation.
 It works significantly better than mean-shift clustering on high-dimensional data.
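The estimator itself can be sketched in 1-D as follows (clustering omitted; fixed bandwidth and regularization, whereas the package tunes both by cross-validation). Integration by parts turns the squared-error criterion into a quadratic in the coefficients, giving the analytic solution; the function name `lsldg_1d` is illustrative:

```python
import numpy as np

def lsldg_1d(x, sigma=0.6, lam=0.1):
    """Minimal 1-D LSLDG sketch: directly fit g(x) ~ d/dx log p(x).
    The fitting criterion E[(g - dlogp)^2 p] equals E[g^2] + 2 E[g']
    up to a constant (integration by parts), so the linear-in-parameter
    model g = theta^T psi has the closed-form minimizer below."""
    c = x[:100]                                            # Gaussian basis centers
    psi = lambda t: np.exp(-(t[:, None] - c[None, :]) ** 2 / (2 * sigma ** 2))
    dpsi = lambda t: -(t[:, None] - c[None, :]) / sigma ** 2 * psi(t)
    G = psi(x).T @ psi(x) / len(x)                         # E[psi psi^T]
    g = dpsi(x).mean(axis=0)                               # E[psi']
    theta = -np.linalg.solve(G + lam * np.eye(len(c)), g)
    return lambda t: psi(t) @ theta                        # estimated log-density gradient

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=1000)                        # p = N(0, 1)
grad_est = lsldg_1d(x)
```

For N(0, 1) the true log-density gradient is -x, so the estimate should be negative at x = 1.5 and positive at x = -1.5; mode-seeking clustering repeatedly moves points along this estimated gradient.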

MATLAB implementation of LSLDG:
LSLDG.zip
 "demo_LSLDG.m" is a demo script for logdensity gradient estimation.
 "demo_LSLDGClust.m" is a demo script for clustering.

Examples:
 Log-density gradient estimation.
 Clustering by seeking modes.

References:

Sasaki, H., Hyvärinen, A., & Sugiyama, M.
Clustering via mode seeking by direct estimation of the gradient of a logdensity.
In T. Calders, F. Esposito, E. Hüllermeier, and R. Meo (Eds.), Machine Learning and Knowledge Discovery in Databases, Part III, Lecture Notes in Computer Science, vol.8725, pp.19-34, Berlin, Springer, 2014.
(Presented at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD 2014), Nancy, France, Sep. 15-19, 2014)
[ paper ]

Maximum Likelihood Mutual Information (MLMI)
 Maximum Likelihood Mutual Information (MLMI) is an estimator of mutual information based on the density-ratio estimation method KLIEP. A mutual information estimator can be used as a measure of statistical independence between random variables (smaller values indicate more independence).

MATLAB implementation of MLMI:
MLMI.zip
 "MLMI.m" is the main function.
 "demo_MLMI.m" is a demo script.

Examples:

References:

Suzuki, T., Sugiyama, M., & Tanaka, T.
Mutual information approximation via maximum likelihood estimation of density ratio.
In Proceedings of 2009 IEEE International Symposium on Information Theory (ISIT2009), pp.463-467, Seoul, Korea, Jun. 28-Jul. 3, 2009.
[ paper ] 
Suzuki, T., Sugiyama, M., Sese, J. & Kanamori, T.
Approximating mutual information by maximum likelihood density ratio estimation.
In Y. Saeys, H. Liu, I. Inza, L. Wehenkel, and Y. Van de Peer (Eds.), Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery 2008 (FSDM2008), JMLR Workshop and Conference Proceedings, vol.4, pp.5-20, 2008.
[ paper ]

Least-Squares Mutual Information (LSMI)
 Least-Squares Mutual Information (LSMI) is an estimator of a squared-loss variant of mutual information based on the density-ratio estimation method uLSIF. A mutual information estimator can be used as a measure of statistical independence between random variables (smaller values indicate more independence).
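As a rough Python sketch, squared-loss mutual information (SMI) can be estimated analytically by a uLSIF-style fit of w(x, y) = p(x, y) / (p(x)p(y)) with product Gaussian kernels (fixed width and regularization here; the package tunes both by cross-validation, and `lsmi` is an illustrative name):

```python
import numpy as np

def lsmi(x, y, sigma=1.0, lam=0.01):
    """Minimal LSMI sketch: SMI = (E_{p(x,y)}[w] - 1) / 2, with the
    ratio w fitted by regularized least squares under p(x)p(y)."""
    n = len(x)
    ux, vy = x[:100], y[:100]                           # paired kernel centers
    Kx = np.exp(-(x[:, None] - ux[None, :]) ** 2 / (2 * sigma ** 2))
    Ky = np.exp(-(y[:, None] - vy[None, :]) ** 2 / (2 * sigma ** 2))
    # expectation over p(x)p(y) factorizes over all n^2 sample pairs
    H = (Kx.T @ Kx) * (Ky.T @ Ky) / n ** 2
    h = (Kx * Ky).mean(axis=0)                          # expectation over the joint
    theta = np.linalg.solve(H + lam * np.eye(len(ux)), h)
    return (h @ theta - 1) / 2

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y_dep = x + 0.3 * rng.normal(size=500)                  # strongly dependent on x
y_ind = rng.normal(size=500)                            # independent of x
smi_dep = lsmi(x, y_dep)
smi_ind = lsmi(x, y_ind)
```

The dependent pair should score clearly higher than the independent pair, whose estimate stays near zero.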

MATLAB implementation of LSMI for plain kernel models:
LSMI.zip
 "LSMIregression.m" and "LSMIclassification.m" are the main functions.
 "demo_LSMI.m" is a demo script.

MATLAB implementation of LSMI for multiplicative kernel models:
mLSMI.zip
 "mLSMI.m" is the main function.
 "demo_mLSMI.m" is a demo script.

Examples:

References:

Suzuki, T., Sugiyama, M., Kanamori, T., & Sese, J.
Mutual information estimation reveals global associations between stimuli and biological processes.
BMC Bioinformatics, vol.10, no.1, pp.S52, 2009.
[ paper ] 
Sakai, T. & Sugiyama, M.
Computationally efficient estimation of squaredloss mutual information with multiplicative kernel models.
IEICE Transactions on Information and Systems, vol.E97-D, no.4, pp.968-971, 2014.
[ paper ]

Least-Squares Quadratic Mutual Information (LSQMI)
 Least-Squares Quadratic Mutual Information (LSQMI) is an estimator of an L2-loss variant of mutual information called quadratic mutual information (QMI), based on the density-difference estimation method LSDD. A QMI estimator can be used as a measure of statistical independence between random variables (smaller values indicate more independence).

MATLAB implementation of LSQMI:
LSQMI.zip
 "LSQMIregression.m" and "LSQMIclassification.m" are the main functions.
 "demo_LSQMI.m" is a demo script.

Examples:
 References:
Least-Squares Hetero-Distributional Subspace Search (LHSS)
 Least-Squares Hetero-Distributional Subspace Search (LHSS) is an algorithm to find a subspace in which two probability distributions are similar (called the hetero-distributional subspace). LHSS can be used for improving the accuracy of direct density-ratio estimation in high dimensions: first identify the hetero-distributional subspace by LHSS, and then perform density-ratio estimation only within that subspace. This procedure is called direct density-ratio estimation with dimensionality reduction (D3).

MATLAB implementation of LHSS:
LHSS.zip
 "demo_LHSS.m" is a demo script.
 "LHSS_train.m" is the function to find the heterodistributional subspace.
 "LHSS_test.m" is the function to estimate the density ratio based on LHSS.

Examples:

References:

Sugiyama, M., Yamada, M., von Bünau, P., Suzuki, T., Kanamori, T., & Kawanabe, M.
Direct density-ratio estimation with dimensionality reduction via least-squares hetero-distributional subspace search.
Neural Networks, vol.24, no.2, pp.183-198, 2011.
[ paper ]

Importance-Weighted Least-Squares (IWLS)
 Importance-Weighted Least-Squares (IWLS) is an importance-weighted version of regularized kernel least-squares for covariate shift adaptation, where the training and test input distributions differ but the conditional distribution of outputs given inputs is unchanged between the training and test phases. uLSIF is used for importance estimation, and Importance-Weighted Cross-Validation (IWCV) is used for model selection.
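A minimal Python sketch of the weighted fit follows. For simplicity it uses the *true* importance weights of two known Gaussians and a linear basis (in practice uLSIF estimates the weights and IWCV selects the hyperparameters); all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Covariate shift: training inputs ~ N(0, 1), test inputs ~ N(1.5, 0.5^2);
# the conditional y|x (here y = x^2 + noise) is the same in both phases.
x_tr = rng.normal(0.0, 1.0, size=1000)
y_tr = x_tr ** 2 + 0.1 * rng.normal(size=1000)

def gauss(x, mu, s):
    return np.exp(-(x - mu) ** 2 / (2 * s ** 2)) / (s * np.sqrt(2 * np.pi))

w = gauss(x_tr, 1.5, 0.5) / gauss(x_tr, 0.0, 1.0)   # true importance p_te / p_tr

def wls(x, y, w, lam=1e-3):
    """(Importance-)weighted regularized least squares, linear model."""
    Phi = np.stack([x, np.ones_like(x)], axis=1)
    A = Phi.T @ (w[:, None] * Phi) + lam * np.eye(2)
    return np.linalg.solve(A, Phi.T @ (w * y))

theta_plain = wls(x_tr, y_tr, np.ones_like(x_tr))   # ordinary least squares
theta_iw = wls(x_tr, y_tr, w)                       # importance-weighted
```

The linear model is misspecified for y = x^2: the plain fit concentrates on the training mass around x = 0 (slope near zero), while the importance-weighted fit tracks the test region around x = 1.5, yielding a clearly larger slope.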

MATLAB implementation of IWLS:
IWLS.zip
 "demo_IWLS.m" is a demo script.

Examples:

References:

Kanamori, T., Hido, S., & Sugiyama, M.
A least-squares approach to direct importance estimation.
Journal of Machine Learning Research, vol.10 (Jul.), pp.1391-1445, 2009.
[ paper ] 
Sugiyama, M., Krauledat, M., & Müller, K.-R.
Covariate shift adaptation by importance weighted cross validation.
Journal of Machine Learning Research, vol.8 (May), pp.985-1005, 2007.
[ paper ]

Importance-Weighted Least-Squares Probabilistic Classifier (IWLSPC)
 The Importance-Weighted Least-Squares Probabilistic Classifier (IWLSPC) is an importance-weighted version of LSPC for covariate shift adaptation, where the training and test input distributions differ but the conditional distribution of outputs given inputs is unchanged between the training and test phases. uLSIF is used for importance estimation, and Importance-Weighted Cross-Validation (IWCV) is used for model selection.

MATLAB implementation of IWLSPC:
IWLSPC.zip
 "demo_IWLSPC.m" is a demo script.

Examples:
Training and test samples
Training and test labels predicted by plain LSPC
Training and test labels predicted by IWLSPC

References:

Hachiya, H., Sugiyama, M., & Ueda, N.
Importance-weighted least-squares probabilistic classifier for covariate shift adaptation with application to human activity recognition.
Neurocomputing, vol.80, pp.93-101, 2012.
[ paper ] 
Kanamori, T., Hido, S., & Sugiyama, M.
A least-squares approach to direct importance estimation.
Journal of Machine Learning Research, vol.10 (Jul.), pp.1391-1445, 2009.
[ paper ] 
Sugiyama, M., Krauledat, M., & Müller, K.-R.
Covariate shift adaptation by importance weighted cross validation.
Journal of Machine Learning Research, vol.8 (May), pp.985-1005, 2007.
[ paper ]

Maximum Likelihood Outlier Detection (MLOD)
 Maximum Likelihood Outlier Detection (MLOD) is an inlier-based outlier detection algorithm. The problem of inlier-based outlier detection is to find outliers in a set of samples (called the evaluation set) using another set of samples consisting only of inliers (called the model set). MLOD orders the samples in the evaluation set according to their degree of outlyingness, measured by the ratio of the probability densities of the evaluation and model samples. The ratio is estimated by the density-ratio estimation method KLIEP.

MATLAB implementation of MLOD:
MLOD.zip
 "MLOD.m" is the main function.
 "demo_MLOD.m" is a demo script.

Examples:

References:

Hido, S., Tsuboi, Y., Kashima, H., Sugiyama, M., & Kanamori, T.
Statistical outlier detection using direct density ratio estimation.
Knowledge and Information Systems, vol.26, no.2, pp.309-336, 2011.
[ paper ] 
Sugiyama, M., Suzuki, T., Nakajima, S., Kashima, H.,
von Bünau, P. & Kawanabe, M.
Direct importance estimation for covariate shift adaptation.
Annals of the Institute of Statistical Mathematics, vol.60, no.4, pp.699-746, 2008.
[ paper ]

LeastSquares Outlier Detection (LSOD)
 Least-Squares Outlier Detection (LSOD) is an inlier-based outlier detection algorithm. The problem of inlier-based outlier detection is to find outliers in a set of samples (called the evaluation set) using another set of samples consisting only of inliers (called the model set). LSOD orders the samples in the evaluation set according to their degree of outlyingness, measured by the ratio of the probability densities of the evaluation and model samples. The ratio is estimated by the density-ratio estimation method uLSIF.
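The scheme can be sketched in Python with an analytic uLSIF fit of w(x) = p_model(x) / p_eval(x): evaluation samples with small estimated w are unlikely under the inlier model and are flagged as outliers (fixed width and regularization; the function name `lsod_scores` is illustrative):

```python
import numpy as np

def lsod_scores(x_model, x_eval, sigma=1.0, lam=0.1):
    """Minimal LSOD-style sketch: score each evaluation sample by an
    analytic uLSIF estimate of w(x) = p_model(x) / p_eval(x);
    small scores indicate outliers."""
    c = x_model[:100]                               # centers on model (inlier) samples
    K = lambda t: np.exp(-(t[:, None] - c[None, :]) ** 2 / (2 * sigma ** 2))
    H = K(x_eval).T @ K(x_eval) / len(x_eval)       # denominator = evaluation set
    h = K(x_model).mean(axis=0)                     # numerator = model set
    alpha = np.maximum(np.linalg.solve(H + lam * np.eye(len(c)), h), 0)
    return K(x_eval) @ alpha

rng = np.random.default_rng(0)
x_model = rng.normal(0.0, 1.0, size=300)            # inliers only
x_eval = np.append(rng.normal(0.0, 1.0, size=99), 6.0)   # planted outlier at x = 6
scores = lsod_scores(x_model, x_eval)
```

The planted point at x = 6 should receive the smallest score of the evaluation set.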

MATLAB implementation of LSOD:
LSOD.zip
 "LSOD.m" is the main function.
 "demo_LSOD.m" is a demo script.

Examples:

References:

Hido, S., Tsuboi, Y., Kashima, H., Sugiyama, M., & Kanamori, T.
Statistical outlier detection using direct density ratio estimation.
Knowledge and Information Systems, vol.26, no.2, pp.309-336, 2011.
[ paper ] 
Kanamori, T., Hido, S., & Sugiyama, M.
A least-squares approach to direct importance estimation.
Journal of Machine Learning Research, vol.10 (Jul.), pp.1391-1445, 2009.
[ paper ]

Maximum Likelihood Feature Selection (MLFS)
 Maximum Likelihood Feature Selection (MLFS) is a feature selection method for supervised regression and classification. MLFS orders input features according to their dependence on output values. The dependency between inputs and outputs is evaluated based on an estimator of mutual information called MLMI.

MATLAB implementation of MLFS:
MLFS.zip
 "MLFSP.m" is the main function.
 "demo_MLFS.m" is a demo script.

Examples:

References:

Suzuki, T., Sugiyama, M., Sese, J. & Kanamori, T.
Approximating mutual information by maximum likelihood density ratio estimation.
In Y. Saeys, H. Liu, I. Inza, L. Wehenkel, and Y. Van de Peer (Eds.), Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery 2008 (FSDM2008), JMLR Workshop and Conference Proceedings, vol.4, pp.5-20, 2008.
[ paper ] 
Sugiyama, M., Suzuki, T., Nakajima, S., Kashima, H., von Bünau, P. & Kawanabe, M.
Direct importance estimation for covariate shift adaptation.
Annals of the Institute of Statistical Mathematics, vol.60, no.4, pp.699-746, 2008.
[ paper ]

Least-Squares Feature Selection (LSFS)
 Least-Squares Feature Selection (LSFS) is a feature selection method for supervised regression and classification. LSFS orders input features according to their dependence on output values. The dependency between inputs and outputs is evaluated based on an estimator of squared-loss mutual information called LSMI.

MATLAB implementation of LSFS:
LSFS.zip
 "LSFSP.m" is the main function.
 "demo_LSFS.m" is a demo script.

Examples:

References:

Suzuki, T., Sugiyama, M., Kanamori, T., & Sese, J.
Mutual information estimation reveals global associations between stimuli and biological processes.
BMC Bioinformatics, vol.10, no.1, pp.S52, 2009.
[ paper ] 
Kanamori, T., Hido, S., & Sugiyama, M.
A least-squares approach to direct importance estimation.
Journal of Machine Learning Research, vol.10 (Jul.), pp.1391-1445, 2009.
[ paper ]

Least-Squares Dimensionality Reduction (LSDR)
 Least-Squares Dimensionality Reduction (LSDR) is a supervised dimensionality reduction method. LSDR adopts a squared-loss variant of mutual information as an independence measure and estimates it using the density-ratio estimation method uLSIF. Thanks to this formulation, all tuning parameters such as the Gaussian width and the regularization parameter can be chosen automatically by cross-validation. LSDR then maximizes this independence measure (making the discarded features conditionally independent of the outputs) by a natural gradient algorithm.

MATLAB implementation of LSDR:
LSDR.zip
 "demo_LSDR.m" is a demo script.

Examples:

Reference:

Suzuki, T. & Sugiyama, M.
Sufficient dimension reduction via squaredloss mutual information estimation.
Neural Computation, vol.25, no.3, pp.725-758, 2013.
[ paper ]

Least-Squares Quadratic Mutual Information Derivative (LSQMID)
 The Least-Squares Quadratic Mutual Information Derivative (LSQMID) is a supervised dimensionality reduction method. LSQMID aims to find a linear projection of the input such that the quadratic mutual information (QMI) between the projected input and the output is maximized. LSQMID directly estimates the derivative of QMI without estimating QMI itself; a QMI maximizer is then obtained by fixed-point iteration. An important property of LSQMID is its robustness against outliers. Moreover, all tuning parameters such as the Gaussian width and the regularization parameter can be chosen automatically by cross-validation.

MATLAB implementation of LSQMID:
LSQMID.zip
 "demo_LSQMID_SDR.m" is a demo script.

Examples:

Reference:

Tangkaratt, V., Sasaki, H., & Sugiyama, M.
Direct estimation of the derivative of quadratic mutual information with application in supervised dimension reduction.
arXiv:1508.01019, 2015.

Local Fisher Discriminant Analysis (LFDA)
 Local Fisher Discriminant Analysis (LFDA) is a linear supervised dimensionality reduction method that is particularly useful when some classes consist of several separate clusters. LFDA has an analytic form of the embedding matrix, and the solution can be computed simply by solving a generalized eigenvalue problem. Therefore, LFDA is scalable to large datasets and computationally reliable. A kernelized variant of LFDA, called Kernel LFDA (KLFDA), is also available.
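A compact Python sketch of the computation follows (local-scaling affinities within each class, locally weighted within/between-class scatter matrices, then the generalized eigenproblem; `lfda` and the fixed k-nearest-neighbor setting are illustrative simplifications of the packaged code):

```python
import numpy as np

def lfda(X, y, dim=1, knn=7):
    """Compact LFDA sketch: returns the top `dim` generalized
    eigenvectors of the local between/within-class scatter pair."""
    n, d = X.shape
    Ww = np.zeros((n, n))
    Wb = np.full((n, n), 1.0 / n)              # pairs from different classes
    for cls in np.unique(y):
        idx = np.where(y == cls)[0]
        nc = len(idx)
        D = np.linalg.norm(X[idx, None, :] - X[None, idx, :], axis=2)
        s = np.sort(D, axis=1)[:, min(knn, nc - 1)]        # local scaling sigma_i
        A = np.exp(-D ** 2 / (s[:, None] * s[None, :] + 1e-12))
        Ww[np.ix_(idx, idx)] = A / nc
        Wb[np.ix_(idx, idx)] = A * (1.0 / n - 1.0 / nc)
    Sw = X.T @ (np.diag(Ww.sum(axis=1)) - Ww) @ X          # local within-class scatter
    Sb = X.T @ (np.diag(Wb.sum(axis=1)) - Wb) @ X          # local between-class scatter
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(d), Sb))
    order = np.argsort(-evals.real)
    return evecs[:, order[:dim]].real

rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 1.0, size=(100, 2))   # class 0 around (0, 0)
X1 = rng.normal([4.0, 0.0], 1.0, size=(100, 2))   # class 1 around (4, 0)
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)
T = lfda(X, y, dim=1)
```

Here the classes separate along the first coordinate, so the top embedding direction should be dominated by that axis.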

MATLAB implementation of LFDA:
LFDA.zip
 "LFDA.m" is the main function.
 "demo_LFDA.m" is a demo script.

Examples:

MATLAB implementation of KLFDA:
KLFDA.zip
 "KLFDA.m" is the main function.
 "demo_KLFDA.m" is a demo script.

Examples:

References:

Sugiyama, M.
Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis.
Journal of Machine Learning Research, vol.8 (May), pp.1027-1061, 2007.
[ paper ] 
Sugiyama, M.
Local Fisher discriminant analysis for supervised dimensionality reduction.
In W. W. Cohen and A. Moore (Eds.), Proceedings of 23rd International Conference on Machine Learning (ICML2006), pp.905-912, Pittsburgh, Pennsylvania, USA, Jun. 25-29, 2006.
[ paper, slides ]

Semi-supervised Local Fisher Discriminant Analysis (SELF)
 Semi-supervised Local Fisher Discriminant Analysis (SELF) is a linear semi-supervised dimensionality reduction method. SELF smoothly bridges supervised Local Fisher Discriminant Analysis (LFDA) and unsupervised Principal Component Analysis (PCA), by which a natural regularization effect can be obtained when only a small number of labeled samples are available. SELF has an analytic form of the embedding matrix, and the solution can be computed simply by solving a generalized eigenvalue problem. Therefore, SELF is scalable to large datasets and computationally reliable. Applying the standard kernel trick yields a nonlinear extension of SELF called Kernel SELF (KSELF).
 When SELF is operated in the fully supervised mode, it reduces to LFDA. However, its solution is generally slightly different from the one obtained by LFDA, since nearest-neighbor search (used for computing local data scaling in the affinity matrix) is carried out differently: LFDA searches for nearest neighbors within each class, while SELF performs the search over all samples (including unlabeled ones). This is because SELF presumes that only a small number of labeled samples are available, and searching for nearest neighbors within each class is not effective for capturing local data scaling in small-sample cases. When SELF is operated in the fully unsupervised mode, it reduces to PCA.

MATLAB implementation of SELF:
SELF.zip
 "SELF.m" is the main function.
 "demo_SELF.m" is a demo script.

Examples:

Reference:

Sugiyama, M., Idé, T., Nakajima, S., & Sese, J.
Semi-supervised local Fisher discriminant analysis for dimensionality reduction.
Machine Learning, vol.78, no.1-2, pp.35-61, 2010.
[ paper ]

PU Classification
 Classification from positive and unlabeled samples, called PU classification, aims to learn the decision boundary between the positive and negative classes using only positive and unlabeled samples.
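As a minimal illustration, here is an unbiased-risk PU classifier with the squared loss and a known class prior pi (a simplified sketch in the spirit of the convex formulations below, not the packaged implementation; all names are illustrative). With a linear-in-parameter model the PU risk is quadratic in the parameters, so a single linear solve suffices:

```python
import numpy as np

rng = np.random.default_rng(0)
pi = 0.5                                   # class prior of the positive class (assumed known)
x_p = rng.normal(+2.0, 1.0, size=500)      # positive samples
x_u = np.concatenate([rng.normal(+2.0, 1.0, 500),
                      rng.normal(-2.0, 1.0, 500)])   # unlabeled mixture

# Model f(x) = theta^T phi(x), squared loss l(z, t) = (z - t)^2.
# The PU risk  pi*E_p[l(f,+1) - l(f,-1)] + E_u[l(f,-1)]  is quadratic in theta,
# giving the analytic minimizer  theta = H_u^{-1} (2*pi*h_p - h_u).
phi = lambda x: np.stack([x, np.ones_like(x)], axis=1)
H_u = phi(x_u).T @ phi(x_u) / len(x_u)     # second moment over unlabeled data
h_p = phi(x_p).mean(axis=0)                # first moment over positives
h_u = phi(x_u).mean(axis=0)                # first moment over unlabeled data
theta = np.linalg.solve(H_u + 1e-6 * np.eye(2), 2 * pi * h_p - h_u)
f = lambda x: phi(x) @ theta               # classify by sign(f)
```

Note that negative samples never enter the computation; the identity pi*E_p[l(f,+1) - l(f,-1)] + E_u[l(f,-1)] = pi*E_p[l(f,+1)] + (1-pi)*E_n[l(f,-1)] is what makes the risk estimable from positive and unlabeled data alone.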

MATLAB implementation of PU classification with the squared loss:
PU.zip
 "PU_SL.m" is a function for training a classifier.
 "demo.m" is a demo function.

Example:

References:

du Plessis, M. C., Niu, G., & Sugiyama, M.
Convex formulation for learning from positive and unlabeled data.
In F. Bach and D. Blei (Eds.), Proceedings of 32nd International Conference on Machine Learning (ICML2015), JMLR Workshop and Conference Proceedings, vol.37, pp.1386-1394, Lille, France, Jul. 6-11, 2015.
[ paper ] 
du Plessis, M. C., Niu, G., & Sugiyama, M.
Analysis of learning from positive and unlabeled data.
In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27, pp.703-711, 2014.
(Presented at Neural Information Processing Systems (NIPS2014), Montreal, Quebec, Canada, Dec. 8-11, 2014)
[ paper ]

PNU Classification
 PNU classification is a semi-supervised classification method that combines PN classification (ordinary supervised classification from positive and negative samples) with PU classification (classification from positive and unlabeled samples) or NU classification (classification from negative and unlabeled samples). Unlike existing semi-supervised classification methods, PNU classification does not require distributional assumptions such as the cluster assumption or the manifold assumption.

MATLAB implementation of PNU classification with the squared loss:
PNU.zip
 "PNU_SL.m" is a function for training a classifier.
 "demo.m" is a demo function.

Examples:

Reference:

Sakai, T., du Plessis, M. C., Niu, G., & Sugiyama, M.
Semisupervised classification based on classification from positive and unlabeled data.
arXiv:1605.06955 [cs.LG]
[ paper ]

Least-Squares Conditional Density Estimation (LSCDE)
 Least-Squares Conditional Density Estimation (LSCDE) is an algorithm to estimate the conditional density function of multi-dimensional continuous variables. The solution of LSCDE can be computed analytically, and all the tuning parameters such as the kernel width and the regularization parameter can be chosen automatically by cross-validation.
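The analytic solution can be sketched in 1-D Python as follows (fixed kernel width and regularization, whereas the package tunes both by cross-validation; `lscde` is an illustrative name). The integral over y in the least-squares criterion has a closed Gaussian form:

```python
import numpy as np

def lscde(x, y, sigma=0.5, lam=0.05):
    """Minimal 1-D LSCDE sketch: fit r(x, y) ~ p(y|x) with product
    Gaussian kernels centered on paired samples, then normalize over y
    in closed form."""
    u, v = x[:100], y[:100]                                 # paired kernel centers
    kx = lambda t: np.exp(-(t[:, None] - u[None, :]) ** 2 / (2 * sigma ** 2))
    ky = lambda t: np.exp(-(t[:, None] - v[None, :]) ** 2 / (2 * sigma ** 2))
    # H_ll' = E_x[kx_l kx_l'] * int ky_l(y) ky_l'(y) dy   (closed form)
    Hy = (np.pi * sigma ** 2) ** 0.5 * np.exp(-(v[:, None] - v[None, :]) ** 2
                                              / (4 * sigma ** 2))
    H = (kx(x).T @ kx(x) / len(x)) * Hy
    h = (kx(x) * ky(y)).mean(axis=0)                        # expectation over the joint
    theta = np.maximum(np.linalg.solve(H + lam * np.eye(len(u)), h), 0)
    def p_hat(y_q, x_q):
        kxq = kx(np.array([x_q]))[0]
        num = (kxq * ky(np.array([y_q]))[0]) @ theta
        den = (kxq @ theta) * np.sqrt(2 * np.pi) * sigma    # int over y, closed form
        return num / den
    return p_hat

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=1000)
y = x + 0.2 * rng.normal(size=1000)                         # p(y|x) peaks around y = x
p_hat = lscde(x, y)
```

Since the data concentrate around y = x, the estimated conditional density at x = 1 should be much larger near y = 1 than near y = -1.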

MATLAB implementation of LSCDE:
LSCDE.zip
 "LSCDE.m" is the main function.
 "demo_LSCDE.m" is a demo script.

Examples:
 References:
Least-Squares Probabilistic Classifier (LSPC)
 Least-Squares Probabilistic Classifier (LSPC) is a multi-class probabilistic classification algorithm. Its solution can be computed analytically in a class-wise manner, so it is computationally very efficient.

MATLAB implementation of LSPC:
LSPC.zip
 "demo_LSPC.m" is a demo script.

Examples:

References:

Sugiyama, M.
Superfast-trainable multi-class probabilistic classifier by least-squares posterior fitting.
IEICE Transactions on Information and Systems, vol.E93-D, no.10, pp.2690-2701, 2010.
[ paper (revised version) ]

Yamada, M., Sugiyama, M., Wichern, G., & Simm, J.
Improving the accuracy of least-squares probabilistic classifiers.
IEICE Transactions on Information and Systems, vol.E94-D, no.6, pp.1337-1340, 2011.
[ paper ]

Least-Squares Independence Test (LSIT)
 Least-Squares Independence Test (LSIT) is a method for testing the null hypothesis that paired (input-output) samples are statistically independent. LSIT adopts a squared-loss variant of mutual information as an independence measure and estimates it using the density-ratio estimation method uLSIF. Thanks to this formulation, all tuning parameters such as the Gaussian width and the regularization parameter can be chosen automatically by cross-validation.

MATLAB implementation of LSIT:
LSIT.zip
 "demo_LSIT.m" is a demo script.

Examples:
 References:
Least-Squares Two-Sample Test (LSTT)
 Least-Squares Two-Sample Test (LSTT) is a method for testing the null hypothesis that two sets of samples are drawn from the same probability distribution. LSTT adopts a squared-loss divergence between the two distributions as a discrepancy measure and estimates it using the density-ratio estimation method uLSIF. Thanks to this formulation, all tuning parameters such as the Gaussian width and the regularization parameter can be chosen automatically by cross-validation.
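The test can be sketched in Python as a permutation test on a uLSIF-based divergence estimate (fixed kernel width and regularization, a modest number of permutations; `pe_divergence` and `lstt` are illustrative names):

```python
import numpy as np

def pe_divergence(x1, x2, sigma=1.0, lam=0.1):
    """uLSIF-based estimate of the Pearson divergence between the
    distributions of x1 and x2 (used as the LSTT test statistic)."""
    c = x1[:100]
    K = lambda t: np.exp(-(t[:, None] - c[None, :]) ** 2 / (2 * sigma ** 2))
    H = K(x2).T @ K(x2) / len(x2)
    h = K(x1).mean(axis=0)
    alpha = np.linalg.solve(H + lam * np.eye(len(c)), h)
    return (h @ alpha - 1) / 2

def lstt(x1, x2, n_perm=100, seed=0):
    """Permutation two-sample test: p-value of the observed statistic
    under random re-splits of the pooled sample."""
    rng = np.random.default_rng(seed)
    stat = pe_divergence(x1, x2)
    pooled = np.concatenate([x1, x2])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if pe_divergence(perm[:len(x1)], perm[len(x1):]) >= stat:
            count += 1
    return (count + 1) / (n_perm + 1)

rng = np.random.default_rng(0)
xa = rng.normal(0.0, 1.0, size=200)
xb = rng.normal(1.5, 1.0, size=200)   # clearly shifted distribution
xc = rng.normal(0.0, 1.0, size=200)   # same distribution as xa
p_diff = lstt(xa, xb)
p_same = lstt(xa, xc)
```

Under the null hypothesis the permuted statistics are exchangeable with the observed one, so the p-value is valid without any asymptotic approximation.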

MATLAB implementation of LSTT:
LSTT.zip
 "demo_LSTT.m" is a demo script.

Examples:

References:

Sugiyama, M., Suzuki, T., Itoh, Y., Kanamori, T., & Kimura, M.
Least-squares two-sample test.
Neural Networks, vol.24, no.7, pp.735-751, 2011.
[ paper ]

SMI-based Clustering (SMIC)
 SMI-based Clustering (SMIC) is an information-maximization clustering algorithm based on the squared-loss mutual information (SMI). SMIC is equipped with automatic tuning parameter selection based on an SMI estimator called least-squares mutual information (LSMI).

MATLAB implementation of SMIC:
SMIC.zip
 "SMIC.m" is the main function.
 "demo_SMIC.m" is a demo script.

Examples:

References:

Sugiyama, M., Niu, G., Yamada, M., Kimura, M., & Hachiya, H.
Informationmaximization clustering based on squaredloss mutual information.
Neural Computation, to appear.
[ paper ] 
Sugiyama, M., Yamada, M., Kimura, M., & Hachiya, H.
On informationmaximization clustering: tuning parameter selection and analytic solution.
In L. Getoor and T. Scheffer (Eds.), Proceedings of 28th International Conference on Machine Learning (ICML2011), pp.65-72, Bellevue, Washington, USA, Jun. 28-Jul. 2, 2011.
[ paper, slides ]

Variational Bayesian Matrix Factorization (VBMF)
 Given a fully-observed noisy matrix V, Variational Bayesian Matrix Factorization (VBMF) denoises V under a low-rank assumption. Based on the empirical Bayesian method, VBMF automatically determines all the tuning parameters, such as the rank of the denoised matrix, the noise variance, and the prior variances.

MATLAB implementation of VBMF:
VBMF.zip
 "VBMF.m" is the main function.
 "demo_VBMF.m" is a demo script.

Examples:

References:

Nakajima, S., Sugiyama, M., & Babacan, D.
Global solution of fully-observed variational Bayesian matrix factorization is column-wise independent.
In J. Shawe-Taylor, R. S. Zemel, P. Bartlett, F. C. N. Pereira, and K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 24, pp.208-216, 2011.
(Presented at Neural Information Processing Systems (NIPS2011), Granada, Spain, Dec. 13-15, 2011)
[ paper ] 
Nakajima, S., Sugiyama, M., & Tomioka, R.
Global analytic solution for variational Bayesian matrix factorization.
In J. Lafferty, C. K. I. Williams, R. Zemel, J. Shawe-Taylor, and A. Culotta (Eds.), Advances in Neural Information Processing Systems 23, pp.1759-1767, 2010.
(Presented at Neural Information Processing Systems (NIPS2010), Vancouver, British Columbia, Canada, Dec. 6-11, 2010)
[ paper, poster ]

Masashi Sugiyama (sugi [at] k.u-tokyo.ac.jp)