`procrustes.permutation`

Permutation Procrustes Module.

procrustes.permutation.permutation(a: numpy.ndarray, b: numpy.ndarray, pad: bool = True, translate: bool = False, scale: bool = False, unpad_col: bool = False, unpad_row: bool = False, check_finite: bool = True, weight: Optional[numpy.ndarray] = None) → procrustes.utils.ProcrustesResult[source]

Perform one-sided permutation Procrustes.

Given matrix \(\mathbf{A}_{m \times n}\) and a reference matrix \(\mathbf{B}_{m \times n}\), find the permutation transformation matrix \(\mathbf{P}_{n \times n}\) that makes \(\mathbf{AP}\) as close as possible to \(\mathbf{B}\). In other words,

\[\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \|\mathbf{A} \mathbf{P} - \mathbf{B}\|_{F}^2\]

This Procrustes method requires the \(\mathbf{A}\) and \(\mathbf{B}\) matrices to have the same shape, which is guaranteed with the default pad=True argument for any given \(\mathbf{A}\) and \(\mathbf{B}\) matrices. In preparing the \(\mathbf{A}\) and \(\mathbf{B}\) matrices, the (optional) order of operations is: 1) unpad zero rows/columns, 2) translate the matrices to the origin, 3) weight entries of \(\mathbf{A}\), 4) scale the matrices to have unit norm, 5) pad matrices with zero rows/columns so they have the same shape.

Parameters

a (ndarray) -- The 2D-array \(\mathbf{A}\) which is going to be transformed.
b (ndarray) -- The 2D-array \(\mathbf{B}\) representing the reference matrix.
pad (bool, optional) -- Add zero rows (at the bottom) and/or columns (to the right-hand side) of matrices \(\mathbf{A}\) and \(\mathbf{B}\) so that they have the same shape.
translate (bool, optional) -- If True, both arrays are centered at origin (columns of the arrays will have mean zero).
scale (bool, optional) -- If True, both arrays are normalized with respect to the Frobenius norm, i.e., \(\text{Tr}\left[\mathbf{A}^\dagger\mathbf{A}\right] = 1\) and \(\text{Tr}\left[\mathbf{B}^\dagger\mathbf{B}\right] = 1\).
unpad_col (bool, optional) -- If True, zero columns (with values less than 1.0e-8) on the right-hand side are removed.
unpad_row (bool, optional) -- If True, zero rows (with values less than 1.0e-8) at the bottom are removed.
check_finite (bool, optional) -- If True, convert the input to an array, checking for NaNs or Infs.
weight (ndarray, optional) -- The 1D-array representing the weights of each row of \(\mathbf{A}\). This defines the elements of the diagonal matrix \(\mathbf{W}\) that is multiplied by \(\mathbf{A}\) matrix, i.e., \(\mathbf{A} \rightarrow \mathbf{WA}\).

Returns

res -- The Procrustes result represented as a class:utils.ProcrustesResult object.

Return type

ProcrustesResult

Notes

The optimal \(n \times n\) permutation matrix is obtained by,

\[\mathbf{P}^{\text{opt}} = \arg \underbrace{\text{min}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \|\mathbf{A} \mathbf{P} - \mathbf{B}\|_{F}^2 = \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{A}^\dagger\mathbf{B} \right]\]

The solution is found by relaxing the problem into a linear programming problem. The solution to a linear programming problem is always at the boundary of the allowed region. So,

\[\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{A}^\dagger\mathbf{B} \right] = \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \geq 0 \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\left(\mathbf{A}^\dagger\mathbf{B}\right) \right]\]

This is a matching problem and can be solved by the Hungarian algorithm. The cost matrix is defined as \(\mathbf{A}^\dagger\mathbf{B}\) and the scipy.optimize.linear_sum_assignment is used to solve for the permutation that maximizes the linear sum assignment problem.

procrustes.permutation.permutation_2sided(a: numpy.ndarray, b: numpy.ndarray, single: bool = True, method: str = 'kopt', guess_p1: Optional[numpy.ndarray] = None, guess_p2: Optional[numpy.ndarray] = None, pad: bool = False, unpad_col: bool = False, unpad_row: bool = False, translate: bool = False, scale: bool = False, check_finite: bool = True, options: Optional[dict] = None, weight: Optional[numpy.ndarray] = None, lapack_driver: str = 'gesvd') → procrustes.utils.ProcrustesResult[source]

Perform two-sided permutation Procrustes.

Parameters

a (ndarray) -- The 2D-array \(\mathbf{A}\) which is going to be transformed.
b (ndarray) -- The 2D-array \(\mathbf{B}\) representing the reference matrix.
single (bool, optional) -- If True, the single-transformation Procrustes is performed to obtain \(\mathbf{P}\). If False, the two-transformations Procrustes is performed to obtain \(\mathbf{P}_1\) and \(\mathbf{P}_2\).
method (str, optional) -- The method to solve for permutation matrices. For single=False, these include "flip-flop" and "k-opt" methods. For single=True, these include "approx-normal1", "approx-normal2", "approx-umeyama", "approx-umeyama-svd", "k-opt", "soft-assign", and "nmf".
guess_p1 (np.ndarray, optional) -- Guess for \(\mathbf{P}_1\) matrix given as a 2D-array. This is only required for the two-transformations case specified by setting single=False.
guess_p2 (np.ndarray, optional) -- Guess for \(\mathbf{P}_2\) matrix given as a 2D-array.
pad (bool, optional) -- Add zero rows (at the bottom) and/or columns (to the right-hand side) of matrices \(\mathbf{A}\) and \(\mathbf{B}\) so that they have the same shape.
unpad_col (bool, optional) -- If True, zero columns (with values less than 1.0e-8) on the right-hand side are removed.
unpad_row (bool, optional) -- If True, zero rows (with values less than 1.0e-8) at the bottom are removed.
translate (bool, optional) -- If True, both arrays are centered at origin (columns of the arrays will have mean zero).
scale (bool, optional) -- If True, both arrays are normalized with respect to the Frobenius norm, i.e., \(\text{Tr}\left[\mathbf{A}^\dagger\mathbf{A}\right] = 1\) and \(\text{Tr}\left[\mathbf{B}^\dagger\mathbf{B}\right] = 1\).
check_finite (bool, optional) -- If True, convert the input to an array, checking for NaNs or Infs.
options (dict, optional) -- A dictionary of method options.
weight (ndarray, optional) -- The 1D-array representing the weights of each row of \(\mathbf{A}\). This defines the elements of the diagonal matrix \(\mathbf{W}\) that is multiplied by \(\mathbf{A}\) matrix, i.e., \(\mathbf{A} \rightarrow \mathbf{WA}\).
lapack_driver ({'gesvd', 'gesdd'}, optional) -- Whether to use the more efficient divide-and-conquer approach ('gesdd') or the more robust general rectangular approach ('gesvd') to compute the singular-value decomposition with scipy.linalg.svd.

Returns

res -- The Procrustes result represented as a class:utils.ProcrustesResult object.

Return type

ProcrustesResult

Notes

Given matrix \(\mathbf{A}_{n \times n}\) and a reference \(\mathbf{B}_{n \times n}\), find a permutation of rows/columns of \(\mathbf{A}_{n \times n}\) that makes it as close as possible to \(\mathbf{B}_{n \times n}\). I.e.,

\[\begin{split}&\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \|\mathbf{P}^\dagger \mathbf{A} \mathbf{P} - \mathbf{B}\|_{F}^2\\ = &\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\left(\mathbf{P}^\dagger\mathbf{A}\mathbf{P} - \mathbf{B} \right)^\dagger \left(\mathbf{P}^\dagger\mathbf{A}\mathbf{P} - \mathbf{B} \right)\right] \\ = &\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{A}^\dagger\mathbf{P}\mathbf{B} \right]\\\end{split}\]

Here, \(\mathbf{P}_{n \times n}\) is the permutation matrix. Given an intial guess, the best local minimum can be obtained by the iterative procedure,

\[p_{ij}^{(n + 1)} = p_{ij}^{(n)} \sqrt{ \frac{2\left[\mathbf{T}^{(n)}\right]_{ij}}{\left[ \mathbf{P}^{(n)} \left( \left(\mathbf{P}^{(n)}\right)^T \mathbf{T} + \left( \left(\mathbf{P}^{(n)}\right)^T \mathbf{T} \right)^T \right) \right]_{ij}} }\]

where,

\[\mathbf{T}^{(n)} = \mathbf{A} \mathbf{P}^{(n)} \mathbf{B}\]

Using an initial guess, the iteration can stops when the change in \(d\) is below the specified threshold,

\[d = \text{Tr} \left[\left(\mathbf{P}^{(n+1)} -\mathbf{P}^{(n)} \right)^T \left(\mathbf{P}^{(n+1)} -\mathbf{P}^{(n)} \right)\right]\]

The outcome of the iterative procedure \(\mathbf{P}^{(\infty)}\) is not a permutation matrix. So, the closest permutation can be found by setting refinement=True. This uses procrustes.permutation.PermutationProcrustes to find the closest permutation; that is,

\[\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \|\mathbf{P} - \mathbf{P}^{(\infty)}\|_{F}^2 = \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{P}^{(\infty)} \right]\]

The answer to this problem is a heuristic solution for the matrix-matching problem that seems to be relatively accurate.

Initial Guess:

Two possible initial guesses are inferred from the Umeyama procedure. One can find either the closest permutation matrix to \(\mathbf{U}_\text{Umeyama}\) or to \(\mathbf{U}_\text{Umeyama}^\text{approx.}\).

Considering the procrustes.permutation.PermutationProcrustes, the resulting permutation matrix can be specified as initial guess through guess=umeyama and guess=umeyama_approx, which solves:

\[\begin{split}\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{U}_\text{Umeyama} \right] \\ \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{U}_\text{Umeyama}^\text{approx.} \right]\end{split}\]

Another choice is to start by solving a normal permutation Procrustes problem. In other words, write new matrices, \(\mathbf{A}^0\) and \(\mathbf{B}^0\), with columns like,

\[\begin{split}\begin{bmatrix} a_{ii} \\ p \cdot \text{sgn}\left( a_{ij_\text{max}} \right) \underbrace{\text{max}}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ p^2 \cdot \text{sgn}\left( a_{ij_{\text{max}-1}} \right) \underbrace{\text{max}-1}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ \vdots \end{bmatrix}\end{split}\]

Here, \(\text{max}-1\) denotes the second-largest absolute value of elements, \(\text{max}-2\) is the third-largest abosule value of elements, etc.

The matrices \(\mathbf{A}^0\) and \(\mathbf{B}^0\) have the diagonal elements of \(\mathbf{A}\) and \(\mathbf{B}\) in the first row, and below the first row has the largest off-diagonal element in row \(i\), the second-largest off-diagonal element, etc. The elements are weighted by a factor \(0 < p < 1\), so that the smaller elements are considered less important for matching. The matrices can be truncated after a few terms; for example, after the size of elements falls below some threshold. A reasonable choice would be to stop after \(\lfloor \frac{-2\ln 10}{\ln p} +1\rfloor\) rows; this ensures that the size of the elements in the last row is less than 1% of those in the first off-diagonal row.

There are obviously many different ways to construct the matrices \(\mathbf{A}^0\) and \(\mathbf{B}^0\). Another, even better, method would be to try to encode not only what the off-diagonal elements are, but which element in the matrix they correspond to. One could do that by not only listing the diagonal elements, but also listing the associated off-diagonal element. I.e., the columns of \(\mathbf{A}^0\) and \(\mathbf{B}^0\) would be,

\[\begin{split}\begin{bmatrix} a_{ii} \\ p \cdot a_{j_\text{max} j_\text{max}} \\ p \cdot \text{sgn}\left( a_{ij_\text{max}} \right) \underbrace{\text{max}}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ p^2 \cdot a_{j_{\text{max}-1} j_{\text{max}-1}} \\ p^2 \cdot \text{sgn}\left( a_{ij_{\text{max}-1}} \right) \underbrace{\text{max}-1}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ \vdots \end{bmatrix}\end{split}\]

In this case, you would stop the procedure after \(m = \left\lfloor {\frac{{ - 4\ln 10}}{{\ln p}} + 1} \right \rfloor\) rows.

Then one uses the procrustes.permutation.PermutationProcrustes to match the constructed matrices \(\mathbf{A}^0\) and \(\mathbf{B}^0\) instead of \(\mathbf{A}\) and \(\mathbf{B}\). I.e.,

\[\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger \left(\mathbf{A^0}^\dagger\mathbf{B^0}\right)\right]\]

Please note that the "umeyama_approx" might give inaccurate permutation matrix. More specificity, this is a approximated Umeyama method. One example we can give is that when we compute the permutation matrix that transforms \(A\) to \(B\), the "umeyama_approx" method can not give the exact permutation transformation matrix while "umeyama", "normal1" and "normal2" do.

\[\begin{split}A = \begin{bmatrix} 4 & 5 & -3 & 3 \\ 5 & 7 & 3 & -5 \\ -3 & 3 & 2 & 2 \\ 3 & -5 & 2 & 5 \\ \end{bmatrix} \\ B = \begin{bmatrix} 73 & 100 & 73 & -62 \\ 100 & 208 & -116 & 154 \\ 73 & -116 & 154 & 100 \\ -62 & 154 & 100 & 127 \\ \end{bmatrix} \\\end{split}\]

procrustes.permutation

`procrustes.permutation`