Best approximation of sum of unit vectors by a smaller subset

Mathematics Asked by g g on November 2, 2021

Let $v_1,\ldots,v_N$ be linearly independent unit vectors in $\mathbb{R}^N$ and denote their scaled sum by $s_N = \frac{1}{N}\sum_{k=1}^N v_k.$ I would like to find a small subset of size $n$ among those vectors such that their scaled sum approximates $s_N$ well. In other words, find

$$ J = \underset{J\in\mathscr{J}}{\operatorname{argmin}} \biggl\lVert s_N - \frac{1}{n}\sum_{k=1}^n v_{J_k}\biggr\rVert $$

where $J$ runs over the set $\mathscr{J}$ of all subsets of $\{1,\ldots,N\}$ of size $n$ and $\lVert \cdot \rVert$ is the Euclidean norm.

The set of vectors can be considered an i.i.d. sample drawn uniformly from the sphere. And, of course, in my case $N$ and $n$ are too large ($N$ will be of the order of 10,000 or 100,000 and $n$ one or two orders of magnitude smaller) to just try all subsets. So I am looking for something more clever.

My approach so far

I tried

  • Repeated random subsampling, i.e. drawing many subsets of size $n$ in an i.i.d. fashion, computing the approximation error for each and retaining the best.
  • A greedy approach, starting with a single vector and enlarging the set by one vector at a time; at each step the added vector is the one that gives the best approximation for the enlarged set (see the sketch after this list).
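A minimal numpy sketch of the greedy step, assuming the vectors are stored as the rows of an array `V` (the array and function names are mine, not fixed notation):

```python
import numpy as np

def greedy_subset(V, n):
    """Greedy heuristic: V is an (N, d) array whose rows are the unit vectors.
    At each step, add the vector whose inclusion gives the best approximation
    of the scaled full sum s_N; return the list of chosen indices."""
    N = V.shape[0]
    s_N = V.mean(axis=0)                 # target: (1/N) * sum of all vectors
    chosen = []
    partial_sum = np.zeros(V.shape[1])
    for step in range(1, n + 1):
        best_idx, best_err = None, np.inf
        for j in range(N):
            if j in chosen:
                continue
            # approximation error if vector j were added at this step
            err = np.linalg.norm(s_N - (partial_sum + V[j]) / step)
            if err < best_err:
                best_idx, best_err = j, err
        chosen.append(best_idx)
        partial_sum += V[best_idx]
    return chosen
```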

Questions

  • Is this a known problem with a proper name?
  • Is it hard (as in NP-hard for example) or are clever solutions known?
  • Are there better heuristic approaches?
  • Are there theoretical results/performance guarantees for the two heuristics I used?

Note: I edited the question to include scaling. Some of the answers/comments refer to the older version where vectors were not scaled.

2 Answers

As suggested by @BenGrossmann, you can use integer linear programming to minimize the 1-norm. Explicitly, let binary decision variable $x_j$ indicate whether $j \in J$. The problem is to minimize $\sum_{i=1}^N (z_i^+ + z_i^-)$ subject to the linear constraints
\begin{align}
(s_N)_i - \frac{1}{n}\sum_{j=1}^N (v_j)_i x_j &= z_i^+ - z_i^- &&\text{for $i \in \{1,\dots,N\}$} \\
\sum_{j=1}^N x_j &= n \\
z_i^+ &\ge 0 &&\text{for $i \in \{1,\dots,N\}$} \\
z_i^- &\ge 0 &&\text{for $i \in \{1,\dots,N\}$}
\end{align}

This might provide a good approximation for your 2-norm objective or a good starting solution for an improvement heuristic.
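For concreteness, here is a sketch of this 1-norm model in Python using PuLP as the (assumed) MILP modelling layer; any ILP solver would do, the names are mine, and a model of this size is only practical for modest $N$:

```python
import numpy as np
import pulp  # assumed MILP modelling library; any ILP solver could be substituted

def solve_l1_milp(V, n):
    """Build and solve the 1-norm MILP above.
    V is an (N, d) array whose rows are the unit vectors (here d = N);
    returns the chosen index set J."""
    N, d = V.shape
    s = V.mean(axis=0)
    prob = pulp.LpProblem("subset_l1", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x_{j}", cat="Binary") for j in range(N)]
    zp = [pulp.LpVariable(f"zp_{i}", lowBound=0) for i in range(d)]
    zm = [pulp.LpVariable(f"zm_{i}", lowBound=0) for i in range(d)]
    prob += pulp.lpSum(zp) + pulp.lpSum(zm)           # minimize sum of |residuals|
    for i in range(d):                                # residual-splitting constraints
        prob += (float(s[i])
                 - (1.0 / n) * pulp.lpSum(float(V[j, i]) * x[j] for j in range(N))
                 == zp[i] - zm[i])
    prob += pulp.lpSum(x) == n                        # cardinality constraint
    prob.solve()
    return [j for j in range(N) if x[j].value() > 0.5]
```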


For the 2-norm, the problem is to minimize $\sum_{i=1}^N \left((s_N)_i - \frac{1}{n}\sum_{j=1}^N (v_j)_i x_j\right)^2$ subject to the linear constraint $$ \sum_{j=1}^N x_j = n \tag1 $$

Because $x_i$ is binary, we have $x_i^2 = x_i$. For $i < j$, you can linearize each product $x_i x_j$ as described here.
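For reference (the linked description is not reproduced in this copy), a standard linearization introduces a continuous variable $y_{i,j}$ to stand for $x_i x_j$ and enforces $$ y_{i,j} \le x_i, \qquad y_{i,j} \le x_j, \qquad y_{i,j} \ge x_i + x_j - 1, \qquad y_{i,j} \ge 0, $$ which force $y_{i,j} = x_i x_j$ whenever $x_i$ and $x_j$ are binary.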

You can also strengthen the formulation by multiplying both sides of the cardinality constraint $(1)$ by $x_i$, yielding $$\sum_{j=1}^{i-1} x_j x_i + x_i^2 + \sum_{j=i+1}^N x_i x_j = n x_i,$$ and then linearizing this quadratic constraint by using the products from the objective linearization: $$\sum_{j=1}^{i-1} y_{j,i} + \sum_{j=i+1}^N y_{i,j} = (n - 1) x_i$$


Given a feasible solution, a simple improvement heuristic for either norm is to replace one vector in $J$ with one vector not in $J$ if it improves the objective value.
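A small sketch of that swap heuristic for the 2-norm (again with numpy and my own naming; it accepts the first improving swap and stops when no swap improves):

```python
import numpy as np

def swap_improve(V, J):
    """One-swap local search for the 2-norm objective.
    V: (N, d) array of unit vectors (rows); J: iterable of n chosen indices."""
    N = V.shape[0]
    J = set(J)
    n = len(J)
    s_N = V.mean(axis=0)
    cur = V[list(J)].sum(axis=0)                  # running (unscaled) sum over J
    best = np.linalg.norm(s_N - cur / n)
    improved = True
    while improved:
        improved = False
        for out in list(J):
            for inn in range(N):
                if inn in J:
                    continue
                trial = cur - V[out] + V[inn]     # swap `out` for `inn`
                err = np.linalg.norm(s_N - trial / n)
                if err < best:                    # accept the first improving swap
                    J.remove(out)
                    J.add(inn)
                    cur, best, improved = trial, err, True
                    break
            if improved:
                break
    return sorted(J), best
```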

Answered by RobPratt on November 2, 2021

Let $A$ denote the matrix whose columns are $v_1,\dots,v_N$. Then your problem is that of minimizing $\|s_N - Ax\|$ subject to the constraint that $x$ has $0,1$ entries and $\|x\| \leq \sqrt{n}$.

Removing the constraint that $x$ has $0,1$ entries leaves us with a much easier problem to deal with. I suspect that its solution will yield a useful heuristic.

If $A = U \Sigma V^T$ is an SVD and we make the substitutions $b = U^T s_N$ and $y = V^T x$, we are left with the simplified problem $$ \min \|\Sigma y - b\| \quad \text{s.t.} \quad \|y\| \leq \sqrt{n}. $$ This can be handled with Lagrange multipliers. The squared objective and constraint functions have the forms $$ f(y) = \|\Sigma y - b\|^2 \implies \nabla f = 2 [\Sigma^2 y - \Sigma b], \qquad g(y) = \|y\|^2 \implies \nabla g = 2y. $$ So, we have $$ \nabla f = \lambda \nabla g \implies \Sigma^2 y - \Sigma b = \lambda y \implies (\Sigma^2 - \lambda I)y = \Sigma b \implies y = (\Sigma^2 - \lambda I)^{-1}\Sigma b. $$ Note: at a minimizer the multiplier satisfies $\lambda \le 0$, and the singular values of $A$ are nonzero because the $v_k$ are linearly independent, so $\Sigma^2 - \lambda I$ is invertible. If the unconstrained minimizer $\Sigma^{-1} b$ already satisfies the constraint, we can take $\lambda = 0$; otherwise the constraint is active and plugging in yields $$ \|(\Sigma^2 - \lambda I)^{-1}\Sigma b\|^2 = n, $$ a one-dimensional equation in $\lambda$ that can be solved numerically, e.g. by bisection.

I'm not sure if this can be written in terms that remove the SVD. For what it's worth, though, we have $$ (\Sigma^2 - \lambda I)^{-1}\Sigma = V^T[(A^TA - \lambda I)^{-1}\sqrt{A^TA}]V. $$
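A numpy sketch of this relaxation (writing $\mu = -\lambda \ge 0$ so that $\Sigma^2 + \mu I$ is positive definite); the final step of taking the $n$ largest coordinates of $x$ as the subset is my own guess at a rounding heuristic, not something the answer specifies:

```python
import numpy as np

def relaxed_subset(A, n):
    """Solve min ||s_N - A x|| s.t. ||x|| <= sqrt(n) (binary constraint dropped),
    where the columns of A are the unit vectors, then rank the coordinates of x.
    Uses mu = -lambda >= 0, so Sigma^2 + mu*I is always invertible."""
    N = A.shape[1]
    s_N = A.mean(axis=1)
    U, sig, Vt = np.linalg.svd(A)
    b = U.T @ s_N
    y = b / sig                            # unconstrained minimizer of ||Sigma y - b||
    if np.linalg.norm(y) > np.sqrt(n):
        # constraint active: find mu with ||(Sigma^2 + mu I)^{-1} Sigma b|| = sqrt(n);
        # the norm decreases monotonically in mu, so bisection works
        lo, hi = 0.0, 1.0
        while np.linalg.norm(sig * b / (sig**2 + hi)) > np.sqrt(n):
            hi *= 2.0
        for _ in range(100):
            mu = 0.5 * (lo + hi)
            if np.linalg.norm(sig * b / (sig**2 + mu)) > np.sqrt(n):
                lo = mu
            else:
                hi = mu
        y = sig * b / (sig**2 + hi)
    x = Vt.T @ y                           # back-substitute x = V y
    return np.argsort(-x)[:n]              # heuristic: n largest coordinates
```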

Answered by Ben Grossmann on November 2, 2021
