Error With Least Squares Approximation

16 vues (au cours des 30 derniers jours)
William Stewart
William Stewart le 23 Mar 2024
Modifié(e) : Bruno Luong le 25 Mar 2024
I'm trying to perform a least squares approximation on some data using the Vandermonde Matrix and QR decomposition.
I am generating the coeficients for the degree 20 approximation as follows.
% x = [Test Values, Test Values]'
v = GenVan(x,20);
[q,r] = qr(v'*v);
coef = fixed.backwardSubstitute(r,q'*v'*y);
This is... not very accurate.I'm testing the model using some randomly generated x-values within the range of the data and the results are nothing like the data.
Is there an error in my coeficient generation? I don't know if I did something wrong or if this is just the best it gets.

Réponses (3)

Torsten
Torsten le 23 Mar 2024
Modifié(e) : Torsten le 23 Mar 2024
Besides the fact that it's numerical nonsense to fit a polynomial of degree 20 to data:
What do you get if you simply use
coef = v\y
?
Or
p = polyfit(x,y,20)
?

John D'Errico
John D'Errico le 23 Mar 2024
Modifié(e) : John D'Errico le 24 Mar 2024
There are two serious issues in what you are doing, making this apprach, sadly, truly numerical nonsense. You think you did something good!
First, yes, fitting anything with a polynomial of order 20 is going to result in garbage. You are working in double precision arithmetic. Sorry. Garbage is all you will get. You cannot raise numbers to powers are large as 20, then perform linear algebra, and have anything meaningful happen here.
Next, you think you are being respectful of the linear algebra, by use of a QR. But what did you do? You first form the product v'*v. Again, this is something just blatantly bad. The matrix v will have an extemely large condition number as it is. But when you form v'*v, you SQUARE the condition number, so now it is bad squared. And the thing is, there was no reason to do it in the first place!
Compute the qr factorization of v instead. At least, this is what you need to do, IF you want to use better linear algebraic schemes to solve what was still an obscenely bad problem (with order 20 polynomials.)
The essential problem in what you are doing is you are trying to use what I always heard called the normal equations. That is, someone told you to solve a linear least squares:of the form
A*x = y
by multiplying both sides by the transpose of A. Then you use a matrix inverse. So we see this expression:
x = inv(A'*A)*A'*y
and that is effectively what you did, but you stuck a QR in there, thinking it makes things better. WRONG. Still complete crapola in terms of numerical methods.
Instead, USE BACKSLASH!!!!!!!!
x = A\y
backslash uses good linear algebra, to the extent possible. Even using backslash is still going to create complete crap on an order 20 polynomial. But it is less bad crap. Sorry. But it is.
Anyway, what would yo do to solve the linear least squares in a more efficient way? At least, if you absolutely did not want to use backslash? Sigh, now you probably want to use a column pivoted QR. You will apply that to the matrix A (or what you called v.)
For example,
A = linspace(1,10,50)'.^(0:10);
y = randn(50,1); % yes, completely random crap for a problem. So what? :)
I've formulated a matrix of size 50x11, corresponding to a degree 10 polynomial. 50 random y values. What is the condition number of A?
cond(A)
ans = 8.8900e+12
It is already pretty bad. Again, this is an obscenity of a problem. But consider
cond(A'*A)
ans = 1.8351e+24
The matrix A was already almost numerically singular. But now A'*A is indeed numerically singular. That QR factorization would be complete garbage. Avoid it. If we have a QR in the form
A(:,P) = Q*R
Now we solve the problem as:
[Q,R,P] = qr(A,0); % the economy QR. Q will have the same shape as A.
x = zeros(size(A,2),1);
x(P) = R\(Q'*y);
Is this the solution? Consider
[x,A\y]
ans = 11x2
-145.1281 -145.1281 420.9263 420.9263 -513.7454 -513.7454 348.8147 348.8147 -146.4349 -146.4349 39.8694 39.8694 -7.1593 -7.1593 0.8408 0.8408 -0.0621 -0.0621 0.0026 0.0026
So we formed the solution. Even there though I used backslash. Since R is upper triangular, a backward substitution would suffice instead, but backslash is smart enough to do the right thing.
Finally, why does what I did work? Go back to those horrid normal equations. Now, substitute
A = Q*R
in for A. See what happens. Think about the properties of Q and R. The one extra step I put in there was to use the column pivoting option for the QR. That just makes the whole thing more numerically stable.

Bruno Luong
Bruno Luong le 24 Mar 2024
Modifié(e) : Bruno Luong le 25 Mar 2024
Few other recommendations beside what Torsen and John altrady told you:
  • Lower the polynomial order, do not go above 7, But you can relax and go higher order if:
  • Working with normalized x vector (make it centered about 0 and scale about unity), not sure how it is possble with fixed point context
  • Make QR decomposition on V, not V'*V
  • Use QR decomposition with permutation (3 output syntax)
  • For s=safeguard, check diagonal of R and do the backsubstitution only on the columns with large enough main diagonal (projection on orthogonal of numerical nullspace)
% Generate test data
x = 1:100;
T = 3;
y = sin(x/T + 2*pi*rand);
noisestd = 0.1;
y = y + noisestd*randn(size(y));
degree = 20; % jusr to show the following code can deal reasonably but it is recommended not to go beyond 7
% Normalize data
x0 = (max(x)+min(x))/2;
xs = (max(x)-min(x))/2;
xn = (x - x0) / xs;
% Vandermond matrix, wrt xn
k = (degree:-1:0);
V = xn(:).^k;
% QR with permutation
[Q,R,P] = qr(V);
% Estimate the rank
d = abs(diag(R));
condthres = 1e8; % threshold to estimate the set of independent columns of V
r = find(d >= d(1)/condthres, 1, 'last');
% Solve system V*c = y, discard NULL space
zt = Q(:,1:r)'*y(:);
Rt = R(1:r,1:r);
c = Rt \ zt; % rather use backsubstitution for this
c(end+1:size(V,2)) = 0; % set the coefficients on numm space to 0
c = P*c;
yfit = polyval(c,xn); %close to V*c
close all
plot(x,y,'or',x,yfit,'-b')
legend('data', sprintf('polynomial (degree %d) fit', degree),"location",'best')
  1 commentaire
Bruno Luong
Bruno Luong le 24 Mar 2024
Modifié(e) : Bruno Luong le 24 Mar 2024
If you prefer to get the polynomial coefficients of y = p(x), instead of y=c(xn) do this
p = c(1);
xnc = [1 -x0]/xs;
for k=2:length(c)
p = conv(p, xnc);
p(end) = p(end) + c(k);
end
yfit = polyval(p,x); %close to polyval(c,xn); or to V*c

Connectez-vous pour commenter.

Catégories

En savoir plus sur Polynomials dans Help Center et File Exchange

Produits

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by