procrustes.permutation
Permutation Procrustes Module.
- procrustes.permutation.permutation(a: ndarray, b: ndarray, pad: bool = True, translate: bool = False, scale: bool = False, unpad_col: bool = False, unpad_row: bool = False, check_finite: bool = True, weight: ndarray | None = None) ProcrustesResult [source]
Perform one-sided permutation Procrustes.
Given matrix \(\mathbf{A}_{m \times n}\) and a reference matrix \(\mathbf{B}_{m \times n}\), find the permutation transformation matrix \(\mathbf{P}_{n \times n}\) that makes \(\mathbf{AP}\) as close as possible to \(\mathbf{B}\). In other words,
\[\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \|\mathbf{A} \mathbf{P} - \mathbf{B}\|_{F}^2\]This Procrustes method requires the \(\mathbf{A}\) and \(\mathbf{B}\) matrices to have the same shape, which is guaranteed with the default
pad=True
argument for any given \(\mathbf{A}\) and \(\mathbf{B}\) matrices. In preparing the \(\mathbf{A}\) and \(\mathbf{B}\) matrices, the (optional) order of operations is: 1) unpad zero rows/columns, 2) translate the matrices to the origin, 3) weight entries of \(\mathbf{A}\), 4) scale the matrices to have unit norm, 5) pad matrices with zero rows/columns so they have the same shape.- Parameters:
a (ndarray) -- The 2D-array \(\mathbf{A}\) which is going to be transformed.
b (ndarray) -- The 2D-array \(\mathbf{B}\) representing the reference matrix.
pad (bool, optional) -- Add zero rows (at the bottom) and/or columns (to the right-hand side) of matrices \(\mathbf{A}\) and \(\mathbf{B}\) so that they have the same shape.
translate (bool, optional) -- If True, both arrays are centered at origin (columns of the arrays will have mean zero).
scale (bool, optional) -- If True, both arrays are normalized with respect to the Frobenius norm, i.e., \(\text{Tr}\left[\mathbf{A}^\dagger\mathbf{A}\right] = 1\) and \(\text{Tr}\left[\mathbf{B}^\dagger\mathbf{B}\right] = 1\).
unpad_col (bool, optional) -- If True, zero columns (with values less than 1.0e-8) on the right-hand side are removed.
unpad_row (bool, optional) -- If True, zero rows (with values less than 1.0e-8) at the bottom are removed.
check_finite (bool, optional) -- If True, convert the input to an array, checking for NaNs or Infs.
weight (ndarray, optional) -- The 1D-array representing the weights of each row of \(\mathbf{A}\). This defines the elements of the diagonal matrix \(\mathbf{W}\) that is multiplied by \(\mathbf{A}\) matrix, i.e., \(\mathbf{A} \rightarrow \mathbf{WA}\).
- Returns:
res -- The Procrustes result represented as a class:utils.ProcrustesResult object.
- Return type:
Notes
The optimal \(n \times n\) permutation matrix is obtained by,
\[\mathbf{P}^{\text{opt}} = \arg \underbrace{\text{min}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \|\mathbf{A} \mathbf{P} - \mathbf{B}\|_{F}^2 = \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{A}^\dagger\mathbf{B} \right]\]The solution is found by relaxing the problem into a linear programming problem. The solution to a linear programming problem is always at the boundary of the allowed region. So,
\[\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \in \{0, 1\} \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{A}^\dagger\mathbf{B} \right] = \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {[\mathbf{P}]_{ij} \geq 0 \atop \sum_{i=1}^n [\mathbf{P}]_{ij} = \sum_{j=1}^n [\mathbf{P}]_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\left(\mathbf{A}^\dagger\mathbf{B}\right) \right]\]This is a matching problem and can be solved by the Hungarian algorithm. The cost matrix is defined as \(\mathbf{A}^\dagger\mathbf{B}\) and the scipy.optimize.linear_sum_assignment is used to solve for the permutation that maximizes the linear sum assignment problem.
- procrustes.permutation.permutation_2sided(a: ndarray, b: ndarray, single: bool = True, method: str = 'kopt', guess_p1: ndarray | None = None, guess_p2: ndarray | None = None, pad: bool = False, unpad_col: bool = False, unpad_row: bool = False, translate: bool = False, scale: bool = False, check_finite: bool = True, options: dict | None = None, weight: ndarray | None = None, lapack_driver: str = 'gesvd') ProcrustesResult [source]
Perform two-sided permutation Procrustes.
- Parameters:
a (ndarray) -- The 2D-array \(\mathbf{A}\) which is going to be transformed.
b (ndarray) -- The 2D-array \(\mathbf{B}\) representing the reference matrix.
single (bool, optional) -- If True, the single-transformation Procrustes is performed to obtain \(\mathbf{P}\). If False, the two-transformations Procrustes is performed to obtain \(\mathbf{P}_1\) and \(\mathbf{P}_2\).
method (str, optional) -- The method to solve for permutation matrices. For single=False, these include "flip-flop" and "k-opt" methods. For single=True, these include "approx-normal1", "approx-normal2", "approx-umeyama", "approx-umeyama-svd", "k-opt", "soft-assign", and "nmf".
guess_p1 (np.ndarray, optional) -- Guess for \(\mathbf{P}_1\) matrix given as a 2D-array. This is only required for the two-transformations case specified by setting single=False.
guess_p2 (np.ndarray, optional) -- Guess for \(\mathbf{P}_2\) matrix given as a 2D-array.
pad (bool, optional) -- Add zero rows (at the bottom) and/or columns (to the right-hand side) of matrices \(\mathbf{A}\) and \(\mathbf{B}\) so that they have the same shape.
unpad_col (bool, optional) -- If True, zero columns (with values less than 1.0e-8) on the right-hand side are removed.
unpad_row (bool, optional) -- If True, zero rows (with values less than 1.0e-8) at the bottom are removed.
translate (bool, optional) -- If True, both arrays are centered at origin (columns of the arrays will have mean zero).
scale (bool, optional) -- If True, both arrays are normalized with respect to the Frobenius norm, i.e., \(\text{Tr}\left[\mathbf{A}^\dagger\mathbf{A}\right] = 1\) and \(\text{Tr}\left[\mathbf{B}^\dagger\mathbf{B}\right] = 1\).
check_finite (bool, optional) -- If True, convert the input to an array, checking for NaNs or Infs.
options (dict, optional) -- A dictionary of method options.
weight (ndarray, optional) -- The 1D-array representing the weights of each row of \(\mathbf{A}\). This defines the elements of the diagonal matrix \(\mathbf{W}\) that is multiplied by \(\mathbf{A}\) matrix, i.e., \(\mathbf{A} \rightarrow \mathbf{WA}\).
lapack_driver ({'gesvd', 'gesdd'}, optional) -- Whether to use the more efficient divide-and-conquer approach ('gesdd') or the more robust general rectangular approach ('gesvd') to compute the singular-value decomposition with scipy.linalg.svd.
- Returns:
res -- The Procrustes result represented as a class:utils.ProcrustesResult object.
- Return type:
Notes
Given matrix \(\mathbf{A}_{n \times n}\) and a reference \(\mathbf{B}_{n \times n}\), find a permutation of rows/columns of \(\mathbf{A}_{n \times n}\) that makes it as close as possible to \(\mathbf{B}_{n \times n}\). I.e.,
\[\begin{split}&\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \|\mathbf{P}^\dagger \mathbf{A} \mathbf{P} - \mathbf{B}\|_{F}^2\\ = &\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\left(\mathbf{P}^\dagger\mathbf{A}\mathbf{P} - \mathbf{B} \right)^\dagger \left(\mathbf{P}^\dagger\mathbf{A}\mathbf{P} - \mathbf{B} \right)\right] \\ = &\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{A}^\dagger\mathbf{P}\mathbf{B} \right]\\\end{split}\]Here, \(\mathbf{P}_{n \times n}\) is the permutation matrix. Given an intial guess, the best local minimum can be obtained by the iterative procedure,
\[p_{ij}^{(n + 1)} = p_{ij}^{(n)} \sqrt{ \frac{2\left[\mathbf{T}^{(n)}\right]_{ij}}{\left[ \mathbf{P}^{(n)} \left( \left(\mathbf{P}^{(n)}\right)^T \mathbf{T} + \left( \left(\mathbf{P}^{(n)}\right)^T \mathbf{T} \right)^T \right) \right]_{ij}} }\]where,
\[\mathbf{T}^{(n)} = \mathbf{A} \mathbf{P}^{(n)} \mathbf{B}\]Using an initial guess, the iteration can stops when the change in \(d\) is below the specified threshold,
\[d = \text{Tr} \left[\left(\mathbf{P}^{(n+1)} -\mathbf{P}^{(n)} \right)^T \left(\mathbf{P}^{(n+1)} -\mathbf{P}^{(n)} \right)\right]\]The outcome of the iterative procedure \(\mathbf{P}^{(\infty)}\) is not a permutation matrix. So, the closest permutation can be found by setting
refinement=True
. This usesprocrustes.permutation.PermutationProcrustes
to find the closest permutation; that is,\[\underbrace{\text{min}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \|\mathbf{P} - \mathbf{P}^{(\infty)}\|_{F}^2 = \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{P}^{(\infty)} \right]\]The answer to this problem is a heuristic solution for the matrix-matching problem that seems to be relatively accurate.
Initial Guess:
Two possible initial guesses are inferred from the Umeyama procedure. One can find either the closest permutation matrix to \(\mathbf{U}_\text{Umeyama}\) or to \(\mathbf{U}_\text{Umeyama}^\text{approx.}\).
Considering the
procrustes.permutation.PermutationProcrustes
, the resulting permutation matrix can be specified as initial guess throughguess=umeyama
andguess=umeyama_approx
, which solves:\[\begin{split}\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{U}_\text{Umeyama} \right] \\ \underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger\mathbf{U}_\text{Umeyama}^\text{approx.} \right]\end{split}\]Another choice is to start by solving a normal permutation Procrustes problem. In other words, write new matrices, \(\mathbf{A}^0\) and \(\mathbf{B}^0\), with columns like,
\[\begin{split}\begin{bmatrix} a_{ii} \\ p \cdot \text{sgn}\left( a_{ij_\text{max}} \right) \underbrace{\text{max}}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ p^2 \cdot \text{sgn}\left( a_{ij_{\text{max}-1}} \right) \underbrace{\text{max}-1}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ \vdots \end{bmatrix}\end{split}\]Here, \(\text{max}-1\) denotes the second-largest absolute value of elements, \(\text{max}-2\) is the third-largest abosule value of elements, etc.
The matrices \(\mathbf{A}^0\) and \(\mathbf{B}^0\) have the diagonal elements of \(\mathbf{A}\) and \(\mathbf{B}\) in the first row, and below the first row has the largest off-diagonal element in row \(i\), the second-largest off-diagonal element, etc. The elements are weighted by a factor \(0 < p < 1\), so that the smaller elements are considered less important for matching. The matrices can be truncated after a few terms; for example, after the size of elements falls below some threshold. A reasonable choice would be to stop after \(\lfloor \frac{-2\ln 10}{\ln p} +1\rfloor\) rows; this ensures that the size of the elements in the last row is less than 1% of those in the first off-diagonal row.
There are obviously many different ways to construct the matrices \(\mathbf{A}^0\) and \(\mathbf{B}^0\). Another, even better, method would be to try to encode not only what the off-diagonal elements are, but which element in the matrix they correspond to. One could do that by not only listing the diagonal elements, but also listing the associated off-diagonal element. I.e., the columns of \(\mathbf{A}^0\) and \(\mathbf{B}^0\) would be,
\[\begin{split}\begin{bmatrix} a_{ii} \\ p \cdot a_{j_\text{max} j_\text{max}} \\ p \cdot \text{sgn}\left( a_{ij_\text{max}} \right) \underbrace{\text{max}}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ p^2 \cdot a_{j_{\text{max}-1} j_{\text{max}-1}} \\ p^2 \cdot \text{sgn}\left( a_{ij_{\text{max}-1}} \right) \underbrace{\text{max}-1}_{1 \le j \le n} \left(\left|a_{ij}\right|\right)\\ \vdots \end{bmatrix}\end{split}\]In this case, you would stop the procedure after \(m = \left\lfloor {\frac{{ - 4\ln 10}}{{\ln p}} + 1} \right \rfloor\) rows.
Then one uses the
procrustes.permutation.PermutationProcrustes
to match the constructed matrices \(\mathbf{A}^0\) and \(\mathbf{B}^0\) instead of \(\mathbf{A}\) and \(\mathbf{B}\). I.e.,\[\underbrace{\text{max}}_{\left\{\mathbf{P} \left| {p_{ij} \in \{0, 1\} \atop \sum_{i=1}^n p_{ij} = \sum_{j=1}^n p_{ij} = 1} \right. \right\}} \text{Tr}\left[\mathbf{P}^\dagger \left(\mathbf{A^0}^\dagger\mathbf{B^0}\right)\right]\]Please note that the "umeyama_approx" might give inaccurate permutation matrix. More specificity, this is a approximated Umeyama method. One example we can give is that when we compute the permutation matrix that transforms \(A\) to \(B\), the "umeyama_approx" method can not give the exact permutation transformation matrix while "umeyama", "normal1" and "normal2" do.
\[\begin{split}A = \begin{bmatrix} 4 & 5 & -3 & 3 \\ 5 & 7 & 3 & -5 \\ -3 & 3 & 2 & 2 \\ 3 & -5 & 2 & 5 \\ \end{bmatrix} \\ B = \begin{bmatrix} 73 & 100 & 73 & -62 \\ 100 & 208 & -116 & 154 \\ 73 & -116 & 154 & 100 \\ -62 & 154 & 100 & 127 \\ \end{bmatrix} \\\end{split}\]