Expressing a polynomial as a sum of squares

Question

I encountered a degree 4 polynomial in 8 variables f(a1,a2,a3,a4,b1,b2,b3,b4) that I suspect can be written as a sum of squares. While sostools in MATLAB would find such a sum of squares decomposition, I am wondering whether a similar package exists for Mathematica.

f= 2 a1^2 b1^2 + a2^2 b1^2 + a3^2 b1^2 + 2 a1 a2 b1 b2 + 
 2 a3 a4 b1 b2 + a1^2 b2^2 + 2 a2^2 b2^2 + a4^2 b2^2 + 
 2 a1 a3 b1 b3 + 2 a2 a4 b1 b3 - 4 a2 a3 b2 b3 + 4 a1 a4 b2 b3 + 
 a1^2 b3^2 + 2 a3^2 b3^2 + a4^2 b3^2 + 4 a2 a3 b1 b4 - 
 4 a1 a4 b1 b4 + 2 a1 a3 b2 b4 + 2 a2 a4 b2 b4 + 2 a1 a2 b3 b4 + 
 2 a3 a4 b3 b4 + a2^2 b4^2 + a3^2 b4^2 + 2 a4^2 b4^2;

Daniel Lichtblau · Answer

Here is a somewhat heuristic approach, that relies in this case on the SOS having integer coefficients. I will indicate an alteration that gives a numerical approximation in the general case.

Start with the polynomial under scrutiny.

ff = 2 a1^2 b1^2 + a2^2 b1^2 + a3^2 b1^2 + 2 a1 a2 b1 b2 + 
   2 a3 a4 b1 b2 + a1^2 b2^2 + 2 a2^2 b2^2 + a4^2 b2^2 + 
   2 a1 a3 b1 b3 + 2 a2 a4 b1 b3 - 4 a2 a3 b2 b3 + 4 a1 a4 b2 b3 + 
   a1^2 b3^2 + 2 a3^2 b3^2 + a4^2 b3^2 + 4 a2 a3 b1 b4 - 
   4 a1 a4 b1 b4 + 2 a1 a3 b2 b4 + 2 a2 a4 b2 b4 + 2 a1 a2 b3 b4 + 
   2 a3 a4 b3 b4 + a2^2 b4^2 + a3^2 b4^2 + 2 a4^2 b4^2;

Extract variables. It is homogeneous of degree 4 so we know we seek squares involving products of two (not necessarily distinct) variables.

allvars = Variables[ff]
quads = Union[Flatten[Outer[Times, allvars, allvars]]];
n = Length[quads];

Create a symbolic matrix for the bilinear form. Use it to make a symbolic such form. Extract the terms, create a new variable for each one.

mat = Array[x, {n, n}] /. x[i_, j_] /; i < j :> x[j, i];
mvars = Flatten[mat];
qpoly = Expand[quads.mat.quads];
terms = Apply[List, ff] /. n_Integer*ab_ :> ab
tvars = Array[t, Length[terms]];

Now form the difference between this symbolic polynomial and the one of interest. Replace the terms by their new variables. We'll set all of those to zero to solve for the matrix variables.

reps = Thread[terms -> tvars];
diffpoly = qpoly - ff /. reps;

Now we separate out terms to get a linear system. The "constant" part is independent of terms of interest and we'll just set variables therein to zero. In general this might be a problem but it works fine for this example. We have that constant part, and the matrix variables within it, only because we included more quadratic monomials than we really needed.

linpolys = Normal[CoefficientArrays[diffpoly, tvars]];
unneeded = Cases[Variables[linpolys[[1]]], _x];
zrules = Thread[unneeded -> 0];
qpoly2 = Expand[quads.mat.quads] /. zrules;

Now use the zeroed variables to form a smaller set of equations.

diffpoly2 = qpoly2 - ff /. reps;
linpolys2 = Normal[CoefficientArrays[diffpoly2, tvars]];

(* {0, {-2 + x[11, 11] + 2 x[15, 1], -1 + x[12, 12] + 2 x[15, 3], -1 + 
   x[13, 13] + 2 x[15, 6], -2 + 2 x[16, 12] + 2 x[17, 11] + 
   2 x[20, 2], -2 + 2 x[18, 14] + 2 x[19, 13] + 2 x[20, 9], -1 + 
   x[16, 16] + 2 x[21, 1], -2 + x[17, 17] + 2 x[21, 3], -1 + 
   x[19, 19] + 2 x[21, 10], -2 + 2 x[22, 13] + 2 x[24, 11] + 
   2 x[26, 4], -2 + 2 x[23, 14] + 2 x[25, 12] + 2 x[26, 8], 
  4 + 2 x[23, 18] + 2 x[24, 17] + 2 x[27, 5], -4 + 2 x[22, 19] + 
   2 x[25, 16] + 2 x[27, 7], -1 + x[22, 22] + 2 x[28, 1], -2 + 
   x[24, 24] + 2 x[28, 6], -1 + x[25, 25] + 2 x[28, 10], -4 + 
   2 x[30, 13] + 2 x[31, 12] + 2 x[33, 5], 
  4 + 2 x[29, 14] + 2 x[32, 11] + 2 x[33, 7], -2 + 2 x[29, 18] + 
   2 x[31, 16] + 2 x[34, 4], -2 + 2 x[30, 19] + 2 x[32, 17] + 
   2 x[34, 8], -2 + 2 x[29, 23] + 2 x[30, 22] + 2 x[35, 2], -2 + 
   2 x[31, 25] + 2 x[32, 24] + 2 x[35, 9], -1 + x[30, 30] + 
   2 x[36, 3], -1 + x[31, 31] + 2 x[36, 6], -2 + x[32, 32] + 
   2 x[36, 10]}} *)

Solve the system that makes bmat into a correct bilinear form for this polynomial. Plug in the solutions. We do not get values for all variables so some work remains.

solns = Solve[linpolys2[[2]] == 0];
bmat = mat /. zrules /. solns[[1]];
bvars = Variables[bmat];

We'll adjust the "free parameters" so as to make the matrix positive definite. To that end, create a numeric function that finds the smallest eigenvalue, given numeric values for these variables.

obj[vals : {_?NumberQ ..}] := 
 Min[Eigenvalues[bmat /. Thread[bvars -> vals]]]

Now we want to maximize that smallest eigenvalue. This is fairly slow with NMaximize, faster with FindMaximum. The latter gives a weaker result and in fact does not quite deliver a viable matrix.

AbsoluteTiming[{max, vals} = NMaximize[obj[bvars], bvars]]

(* Out[255]= {111.376033, {-0.00907241340679, {x[11, 11] -> 
    2.01682304768, x[12, 12] -> 0.998652993047, 
   x[13, 13] -> 1.00192521005, x[16, 12] -> 0.995046130758, 
   x[16, 16] -> 0.99609862964, x[17, 11] -> 0.00336991638564, 
   x[17, 17] -> 2.00227052115, x[18, 14] -> 0.00906737456166, 
   x[19, 13] -> 0.986910720933, x[19, 19] -> 1.01376718577, 
   x[22, 13] -> 0.890037586566, x[22, 19] -> 0.985669306087, 
   x[22, 22] -> 1.00413101831, x[23, 14] -> -0.00906106269306, 
   x[23, 18] -> -0.00904096742103, x[24, 11] -> 0.110559251075, 
   x[24, 17] -> -1.98448194582, x[24, 24] -> 2.00494690946, 
   x[25, 12] -> 1.00921451048, x[25, 16] -> 1.01108280433, 
   x[25, 25] -> 1.01123628949, x[29, 14] -> -0.00906007908607, 
   x[29, 18] -> -0.00903941520969, x[29, 23] -> 0.00907237266086, 
   x[30, 13] -> 0.987345323683, x[30, 19] -> 1.02376193674, 
   x[30, 22] -> 0.987031961614, x[30, 30] -> 1.01561654825, 
   x[31, 12] -> 1.00883984845, x[31, 16] -> 1.00918739894, 
   x[31, 25] -> 1.01897692633, x[31, 31] -> 1.00867831993, 
   x[32, 11] -> -1.98669409665, x[32, 17] -> -0.032831980588, 
   x[32, 24] -> -0.0259559271915, x[32, 32] -> 1.99690556219}}} *)

One will notice that the smallest eigenvalue is somewhat below zero, which is bad since it means the matrix will not be quite positive semidefinite (so no positive sum-of-squares). But we also see that every value is "near" an integer. We'll round them off and use that instead.

vals2 = vals /. Rule[a_, b_] :> Rule[a, Round[b]];

Check that this gives our bilinear form.

matsolved = (bmat /. vals2);
Expand[quads.matsolved.quads - ff]

(* 0 *)

Check that the matrix is positive semidefinite (all eigenvalues must be >= 0).

Eigenvalues[matsolved]

(* Out[446]= {4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} *)

Form the eigensystem. Normalize the eigenvectors to that the transpose and inverse matrices are the same.

{evals, evecs} = Eigensystem[matsolved];
evecs2 = Map[#/Norm[#] &, evecs];

Now use square roots of the eigenvalues to form the Cholesky square root matrix.

cholmat = Chop[Sqrt[DiagonalMatrix[evals]].evecs2];

Check the result.

Expand[quads.Transpose[cholmat].cholmat.quads - ff]

(* Out[499]= 0 *)

Here is the explicit SoS.

(quads.Transpose[cholmat]).(cholmat.quads)

(* Out[501]= (-Sqrt[2] a2 b2 + Sqrt[2] a3 b3)^2 + (a3 b1 + a4 b2 + 
   a1 b3 + a2 b4)^2 + (a2 b1 + a1 b2 + a4 b3 + 
   a3 b4)^2 + (-Sqrt[2] a1 b1 + Sqrt[2] a4 b4)^2 *)

Had this rounding trick not worked we could still have found an SoS that approximates the input polynomial. Simply work with the approximate values for that matrix, alter negative eigenvalues to be zero, and use that to obtain the Cholesky matrix. For this purpose the lines below could be used.

matsolved = (bmat /. vals);
{evals, evecs} = Eigensystem[matsolved];
evals2 = evals /. aa_ /; aa < 0 :> 0;

When the dust settles one gets an SoS correct to two or so decimal places.

An alternative I have not tried is to assume that NMinimize is failing to deliver a high precision result, and take what it gives as an initialization for FindMinimum. It is possible that a local optimization step with a good starting value might improve on matters. In principle NMinimize ought to be doing this already, but it might be using heuristics for the method or other settings that are not good for this particular problem.

Cesareo · Answer

The positive definiteness will be handled with the NCAlgebra Suite 5,0 from here

With the codes for convexHull and ExtractElements given here you can proceed as follows

<< NC`
<< SDP`

PosChar[p_, c_] := ToExpression[StringJoin[ToString[p], ToString[c]]]
SymmetricalMatrix[name_, dim_] := Module[{dummy, vars = {}, i, j, k, c}, dummy = Table[0, {dim}, {dim}]; 
  For [i = 1; k = 1, i <= dim, i++, For[j = i, j <= dim, j++,
  c = PosChar[name, k];
  dummy[[i, j]] = c; 
  dummy[[j, i]] = c; 
  vars = Append[vars, c]; 
  k = k + 1]]; 
{dummy, vars}
]

p = 2 a1^2 b1^2 + a2^2 b1^2 + a3^2 b1^2 + 2 a1 a2 b1 b2 + 2 a3 a4 b1 b2 + a1^2 b2^2 + 2 a2^2 b2^2 + a4^2 b2^2 +  2 a1 a3 b1 b3 + 2 a2 a4 b1 b3 - 4 a2 a3 b2 b3 + 4 a1 a4 b2 b3 + a1^2 b3^2 + 2 a3^2 b3^2 + a4^2 b3^2 + 4 a2 a3 b1 b4 - 4 a1 a4 b1 b4 + 2 a1 a3 b2 b4 + 2 a2 a4 b2 b4 + 2 a1 a2 b3 b4 + 2 a3 a4 b3 b4 + a2^2 b4^2 + a3^2 b4^2 + 2 a4^2 b4^2;
vars = Variables[p]


CCV = convexHull[p]
(* {{2, 0, 0, 0, 2, 0, 0, 0}, {2, 0, 0, 0, 0, 2, 0, 0}, {2, 0, 0, 0, 0, 0, 2, 0}, {0, 2, 0, 0, 2, 0, 0, 0}, {0, 2, 0, 0, 0, 2, 0, 0}, {0, 2, 0, 0, 0, 0, 0, 2}, {0, 0, 2, 0, 2, 0, 0, 0}, {0, 0, 2, 0, 0, 0, 2, 0}, {0, 0, 2, 0, 0, 0, 0, 2}, {0, 0, 0, 2, 0, 2, 0, 0}, {0, 0, 0, 2, 0, 0, 2, 0}, {0, 0, 0, 2, 0, 0, 0, 2}} *)

BV = createBasis[CCV]
(* {{0, 0, 0, 1, 0, 0, 0, 1}, {0, 0, 0, 1, 0, 0, 1, 0}, {0, 0, 0, 1, 0,  1, 0, 0}, {0, 0, 1, 0, 0, 0, 0, 1}, {0, 0, 1, 0, 0, 0, 1, 0}, {0, 0, 1, 0, 1, 0, 0, 0}, {0, 1, 0, 0, 0, 0, 0, 1}, {0, 1, 0, 0, 0, 1, 0, 0}, {0, 1, 0, 0, 1, 0, 0, 0}, {1, 0, 0, 0, 0, 0, 1, 0}, {1, 0, 0, 0, 0, 1, 0, 0}, {1, 0, 0, 0, 1, 0, 0, 0}} *)

Z = FormMonomials[vars, BV]
(* {a4 b4, a4 b3, a4 b2, a3 b4, a3 b3, a3 b1, a2 b4, a2 b2, a2 b1, a1 b3, a1 b2, a1 b1} *)

{matM, varsM} = SymmetricalMatrix[m, Length[Z]];
V = Z.matM.Z
{eV, cV} = ExtractElements[V, vars]
{ep, cp} = ExtractElements[p, vars];
comp = Complement[ep, eV]

If[comp == {},
  {eVp, cVp} = ExtractElements[V - p, vars];
  vars2 = Variables[cVp];
  sol = Quiet@Solve[cVp == 0, vars2][[1]];
  matB0 = matB /. sol;
  Print[MatrixForm[Chop[matB0]]];
  y = Variables[matB0];
  Print[y];
  G = matB0;
  f = Total[y];
  abc = SDPMatrices[f, G, y];
  SetPrecision[abc, 50];
  {Y, X, S, flags} = SDPSolve[abc];
  Print[Flatten[Y]];
  Print[PositiveDefiniteMatrixQ[X[[1]]]];
  Print[PositiveDefiniteMatrixQ[S[[1]]]]
];

NOTE

The module createBasis was omitted because it is quite involved. It's purpose is to find the monomials which squared produce the powers found in $f$. As can be verified the polynomial is SOS.

LAST NOTE

As requested, the needed module createBasis

ConvexDepenentQ[corners_, cand_] := Module[{w, ws}, w = Array[ws, Length@corners];
 1 == Length@
 FindInstance[
  w.corners == cand && Total[w] == 1 && 
   And @@ Table[w[[i]] >= 0, {i, Length@w}], w]];

ConvexReduce[data_] := Module[{corners, ncorners, test}, corners = data;
Do[ncorners = Delete[corners, Position[corners, data[[i]]]];
test = ConvexDepenentQ[ncorners, data[[i]]];
If[test, corners = ncorners];, {i, Length@data}];
corners];

convexHull[data_] := Module[{corners, rd}, corners = {};
Do[corners = 
  Join[corners, 
   Select[data, 
    Min[data[[;; , i]]] == #[[i]] || 
      Max[data[[;; , i]]] == #[[i]] &]];, {i, Length@data[[1]]}];
corners = DeleteDuplicates@corners;
rd = Delete[data, First@Position[data, #] & /@ corners];
Do[If[ConvexDepenentQ[corners, rd[[i]]], , 
 AppendTo[corners, rd[[i]]]], {i, Length@rd}];
ConvexReduce@DeleteDuplicates@corners];

CierreConvexo[data_List, n_] := Module[{}, Return[{True, convexHull[data]}]]
MaxExps[coefs_] := 
 Module[{i, n, r}, n = Dimensions[coefs]; 
  If[Length[n] > 0, 
   For[i = 1; r = {}, i <= n[[2]], i++, r = Append[r, Max[Take[coefs, {1, n[[1]]}, {i}]]]]; Return[r], 
 Return[coefs]]]

GeneraBase[data_] := Module[{n, m, i, expmax, expbase, j, k, nn, m0, p},
  expmax = Ceiling[MaxExps[data]/2];
  expmax = Floor[MaxExps[data]/2];
  n = Length[expmax];
  For[i = 1; m = 1, i <= n, i++, m = m*(expmax[[i]] + 1)]; 
  expbase = Table[0, {n}, {m}];
  For[i = 1; m0 = m, i <= n, i++, p = 1; nn = m0/(expmax[[i]] + 1);
   While[p <= m, For[j = 0, j <= expmax[[i]], j++,
     For[k = 1, k <= nn, k++, expbase[[i, p]] = j; p = p + 1]]]; 
   m0 = nn]; Return[Transpose[expbase]]
  ]

Clear[createBasis]
createBasis[P_List] := Module[{S, nlp, ncp, nls, ncs, sos, ones, zeros, II, CC, BB, P0, m1, l1, l2, limits, si, m2, MM, X, f, s = {}, i},
  S = GeneraBase[P];
  {nlp, ncp} = Dimensions[P];
  If[nlp <= 10, {sos, P0} = CierreConvexo[P, 1], {sos, P0} = 
    CierreConvexo[P, 2]];
  {nlp, ncp} = Dimensions[P0];
   If[Length[Dimensions[S]] == 1, nls = 1; 
   ncs = ncp, {nls, ncs} = Dimensions[S]];
  ones = Table[1, {nlp}, {1}];
  zeros = Table[0, {nlp}, {1}];
  II = IdentityMatrix[nlp];
  CC = Join[
  Join[Join[{Table[0, {ncp}]}, {{0}}, 2], 1/nlp Transpose[ones], 
  2], {{1}}, 2][[1]];
  BB = Join[Transpose[ones], {{1}}, 2][[1]];
  m1 = Join[Join[Join[1/2 P0, -ones, 2], II, 2], zeros, 2];
  l1 = Table[{-1, 1}, {ncp + 1}];
  l2 = Table[{0, 1}, {nlp + 1}];
  limits = 10^20 Join[l1, l2];
  For[i = 1, i <= nls, i++,
   If[nls > 1, si = S[[i]], si = S];
   m2 = Join[Join[Join[{-si}, {{1}}, 2], Transpose[zeros], 2], {{1}}, 2];
   MM = Join[m1, m2];
   X = Quiet[LinearProgramming[CC, MM, BB, limits, Reals]];
   f = CC.X;
   If[f > 0, AppendTo[s, si]]]; Return[s]
 ]

Expressing a polynomial as a sum of squares

2 Answers

Add your own answers!

Ask a Question