# Numerical integration of discrete data using higher precision

13 views (last 30 days)
Msimon q on 29 Dec 2021
Commented: Msimon q on 30 Dec 2021
I have discrete data of a function. I attach the data for cases with 10 points and 100k points and also the plotted function.
The "Y" values of the function near "X=1.57" are very close to each other and zero, like 9.25558265263186E-11 and 5.92357284527227E-11.
Using Trapz function directly, or using Integral function indirectly something like
f1 = griddedInterpolant(x,y); %I used also spline and other methods
f2 = @(t) f1(t);
Result = Integral(f2,0,pi/2)
fails with more than 30% relative error.
I assume it happens because of the roundoff problem and the 16digits precision of matlab.
Is there any way to force Trapz to do the integration using higher precision?
I tried VPA for the given input data to Trapz. But didnt work. Appreciate any suggestion.
Msimon q on 30 Dec 2021
your explanations sounds logical, so i try to find where else i might have made the mistake

John D'Errico on 29 Dec 2021
Edited: John D'Errico on 29 Dec 2021
I explained in your last question. NO. You cannot force trapz to give you higher precision. As for trapz giving 30% error here? FLAT OUT WRONG. PERIOD.
x = xy(:,1);
y = xy(:,2);
plot(x,y) If your function is really that smooth and well-behaved, then trapz will not be in that much error.
format long g
max(diff(x))
ans =
1.57081203500109e-05
min(diff(x))
ans =
1.57081203397968e-05
So your data is pretty uniformly spaced in x, but not perfectly so.
If you think the integral is something significantly different from this:
format long g
trapz(x,y)
ans =
0.299617122167918
then however else you have computed the integral is simply wrong.
spl = spline(x,y);
diff(fnval(fnint(spl),[min(x),max(x)]))
ans =
0.299617122167912
I'm sorry, but the area under that curve is quite close to 0.2996 (and stuff). It is not 30% different from that.
Of course, if your fuction has a spike in it somewhere, essentially a delta function between two of the data points, that is so narrow that your data does not show the spike, then this is simply fact.
Is this a question of double precision accuracy? Since your data is entirely positive, there is no issue of subtractive cancellation.
min(y)
ans =
5.62570579352345e-35
Can we do the integration using high precision arithmetic? Well, yes, we could. But the fact is,
y(1)
ans =
1.02293003278743
So anything you add to y(1) that is less than eps(y(1)) is meaningless.
eps(y(1))
ans =
2.22044604925031e-16
Is this a problem of the integration does not use those really small values of y? Um, not really. For example, we could do it in 100 decimal digits of precision. I could do that using syms and vpa, but since I wrote HPF, I'll use that facility, as I know exactly what it is doing. You can download HPF from the file exchange for free.
DefaultNumberOfDigits 100
hy = hpf(y);
hx = hpf(x);
Of course, since we started with double precision numbers, the result will really be little better than trapz.
dx = hx(2:end) - hx(1:end-1);
Now compute a trapezoidal rule. This is trivial to do, as...
sum((hy(1:end-1) + hy(2:end)) .* dx)/2
ans =
0.2996171221679184598357910258813612376824763020165488844962532387299484829118318852066613967171717782
This computation really is a trapzeoidal rule, done fully in 100 decimal digits of precision. So the high precision tapezoidal rule, done in 100 decimal digits of precision agrees with trapz down to the last few digits reported. Remember, trapz gave us...
0.299617122167918
One problem is, the values in the vectors x and y have only been stored to essentially 16 significant digits. We cannot get higher precision than that. Yes I suppose I could use a higher order Newton-Cotes rule. Do I really need to do so here? We won't gain that much at all, because your numbers were stored only to roughly 16 digits of precsion.
But you are telling us this result is in error by 30%? I'm sorry, but I'll claim that is flat out wrong. How do you assert that as fact?
I will say that when you do this:
f1 = griddedInterpolant(x,y);
this is a PURELY linear interpolant. As such, any integral based on that can be no better than trapz. And if you just use integral, that forgets you need to set the tolerance. So part of your problem lies in a lack of nderstanding of numerical methods. (I'm sorry, but this is clear.)
We could try this:
f = griddedInterpolant(x,y,'spline');
integral(@(x) f(x),min(x),max(x),'abstol',1e-15)
ans =
0.299617122167919
Here I have forced griddedInterpolant to be a spline. Then I cranked down on the tolerance. The result? A minor difference in the last decimal digit reported from that of a direct integral of a spline interpolant.
So I'm sorry, but to claim that the true integral is 30% different from this value is pure silliness. (I really want to call it worse than that, but I won't.) If you want to claim that the last significant digit or two may be in error, go ahead. But to do better, you need to have a better way to store these numbers than double precision. It still won't help a lot.
If you show HOW the number were computed, then I might be able to suggest using a tool like HPF to compute the integral in slightly better ways, by computing y directly in a high precision form.
But now please stop posting this same question repeatedly.
Jan on 30 Dec 2021
This is a useful answer. Obviously the actual problem is the value of the reference solution.
Some sentences of this answer sound more emotional than usual in this forum. The question concerns Matlab and it contains enough information for a useful answer and therefore it is welcome in this forum. A wrong expectation or mistakes are typical reasons for questions and not silly.

Jan on 29 Dec 2021
You explain, that you have discrete data, but the code you show uses a function. If the data are given in double precision, using a higher precision for the intration has a very limited use only: The result cannot be more accurate than the input.
But the sum is numerically instable. So you can try a summation with error compensation: FEX: XSum
Take a look into the code of trapz. You find this (for equidistant x values):
x * sum((y(1:end-1,:) + y(2:end,:)), 1)/2
Replace this by:
x * XSum((y(1:end-1,:) + y(2:end,:)), 1)/2
Msimon q on 30 Dec 2021
Thanks @Jan
My data for the function and the reference solution is coming from a physical problem(very long to explain). Since my code and the the computations have already been verified for a *better function, and since the same problem was reported in a paper (that singularity begins for this setting and the numerical integration becomes harder), I was pretty sure that there is nothing wrong in othe part of my code. However, after what you and others explained to me here, I try to find if i made any mistake there.
* in my physical problem, I can change some parameters, which leads to a better function (not near-zero value around x=1.57). The more my function becomes like this, the more my error increases.