How to find the initial formula behind engineering test data?

Status
Not open for further replies.

FreddyMusic

Mechanical
Dec 10, 2005
36
Recently my job has turned into a mathematics problem.

We put a lot of effort into engineering tests and got a clean curve from discrete data (100 points in Excel).
But my customer is also interested in the initial formula behind the data. Perhaps it is a function, or it involves calculus; that is not clear.

Engineers and physicists distinguish between continuous data and discrete data.

Can anyone point us to a first step for crossing from one to the other?
Books, software, methods?



 
I must admit, running around waving my arms in the air didn't strike me as a natural response either. If you truly do not know what the underlying physical mechanism is then you start with a straight line, then a log-log graph, and then just suck it and see.

If you think it is a polynomial, then very few processes are above eighth order.

Oh, and of course a dimensional analysis might well give you a clue for the likely relationship.
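
As a minimal sketch of the "straight line, then log-log" sequence: the data below are made up for illustration (they follow y = 2*x^1.5 and are not the data from this thread), and `linear_fit` is a hypothetical helper written here, not a library routine. If the log-log plot is a straight line, its slope is the exponent of a power law.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope*x + intercept, plus R^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Made-up illustration data following y = 2 * x**1.5 (not the thread's data).
xs = [0.5, 1.0, 2.0, 4.0, 8.0]
ys = [2.0 * x ** 1.5 for x in xs]

# Step 1: straight line through the raw values.
raw_slope, raw_intercept, raw_r2 = linear_fit(xs, ys)

# Step 2: straight line through log(y) versus log(x).
# A power law y = a * x**n becomes log(y) = log(a) + n*log(x).
log_slope, log_intercept, log_r2 = linear_fit(
    [math.log(x) for x in xs], [math.log(y) for y in ys]
)

print(f"raw fit:     R^2 = {raw_r2:.4f}")
print(f"log-log fit: n = {log_slope:.3f}, a = {math.exp(log_intercept):.3f}, "
      f"R^2 = {log_r2:.4f}")
```

A log-log plot needs strictly positive x and y, so data that start at x = 0 would have to be shifted or have that point dropped first.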







Cheers

Greg Locock

Please see FAQ731-376 for tips on how to make the best use of Eng-Tips.
 
Before anyone else says it, I will point out a serious weakness of the model I mentioned above (yhat_i = k0 + k11*x1_i + k12*x1_i^2 + k13*x1_i^3 + k21*x2_i + k22*x2_i^2 + k23*x2_i^3 + k31*x3_i + k32*x3_i^2 + k33*x3_i^3)

It does not allow for interaction among terms. For example a dependence upon x1*x2 would never be identified. So clearly it will not work for all cases.

The more inputs you have, the tougher the problem becomes. Any knowledge about the system that can be modeled into the estimate y_hat will help the problem solving process. On that point I agree with IRStuff and the others.
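
The interaction weakness can be shown with a small sketch. It is a simplified stand-in for the model above: two inputs instead of three and linear terms only, fitted to made-up data that contain a real x1*x2 dependence. `solve` and `fit_sse` are hypothetical helpers written for this example.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_sse(rows, y):
    """Least-squares fit via the normal equations; returns (coefficients, SSE)."""
    m = len(rows[0])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(m)]
    coef = solve(AtA, Aty)
    sse = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    return coef, sse

# Made-up data on a 4 x 4 grid with a genuine x1*x2 dependence.
grid = [(x1, x2) for x1 in range(4) for x2 in range(4)]
y = [1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2 for x1, x2 in grid]

additive = [[1.0, x1, x2] for x1, x2 in grid]             # no interaction term
with_cross = [[1.0, x1, x2, x1 * x2] for x1, x2 in grid]  # adds an x1*x2 column

_, sse_additive = fit_sse(additive, y)
coef_cross, sse_cross = fit_sse(with_cross, y)

print(f"SSE without x1*x2 term: {sse_additive:.4f}")
print(f"SSE with x1*x2 term:    {sse_cross:.2e}")
print("coefficients with cross term:", [round(c, 6) for c in coef_cross])
```

The additive fit is left with a residual it cannot remove however the per-variable terms are tuned, while one extra product column recovers the data. The cost is that the number of candidate columns grows quickly with the number of inputs.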

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Thanks, all, for the comments and suggestions; I appreciate them.
Thanks for IFRs, that could be pretty close to what I wanted….
Perhaps I am looking for statistical analysis software.

I have a little experience with Mathematica from a few years ago...
If your input is math, the output is also logical math, with nice graphics.
Does the latest math software also include such statistical features?

electricpete
I did try some basic functions; they don't fit exactly.

IRstuff
No more solid, proven physical knowledge is available to me,
so we call it a coefficient curve.
Coefficient means unknown.
 
You have 100 points on x-y plot i.e. single input/single output?

I would think it should not be too tough to fit in excel using solver

Can you paste the x/y pairs here?

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Here is X

0.0000
0.0500
0.1000
0.1500
0.2000
0.2500
0.3000
0.3500
0.4000
0.4500
0.5000
0.5500
0.6000
0.6500
0.7000
0.7500
0.8000
0.8100
0.8200
0.8300
0.8400
0.8500
0.8550
0.8600
0.8650
0.8700
0.8750
0.8800
0.8850
0.8900
0.8950
0.9000
0.9020
0.9040
0.9060
0.9080
0.9100
0.9120
0.9140
0.9160
0.9180
0.9200
0.9220
0.9240
0.9260
0.9280
0.9300
0.9320
0.9340
0.9360
0.9380
0.9400
0.9420
0.9440
0.9460
0.9480
0.9500
0.9510
0.9520
0.9530
0.9540
0.9550
0.9560
0.9570
0.9580
0.9590
0.9600
0.9610
0.9620
0.9630
0.9640
0.9650
0.9660
0.9670
0.9680
0.9690
0.9700
0.9705
0.9710
0.9715
0.9720
0.9725
0.9730
0.9735
0.9740
0.9745
0.9750
0.9755
0.9760
0.9765
0.9770
0.9775
0.9780
0.9785
0.9790
0.9795
0.9800
0.9805
0.9810
0.9815
0.9820
0.9825
0.9830
0.9835
0.9840
0.9845
0.9850
0.9855
0.9860
0.9865
0.9870
0.9875
0.9880
0.9885
0.9890
0.9895
0.9900
0.9905
0.9910
0.9915
0.9920
0.9925
0.9930
0.9935
0.9940
0.9945
0.9950
0.9955
0.9960
0.9965
0.9970
0.9975
0.9980
0.9985
0.9990
0.9995
1.0000

Here is Y

0.02363
0.02434
0.02528
0.02623
0.02717
0.02812
0.02930
0.03048
0.03190
0.03332
0.03497
0.03710
0.03923
0.04183
0.04513
0.04891
0.05435
0.05553
0.05671
0.05813
0.05978
0.06144
0.06238
0.06333
0.06427
0.06546
0.06664
0.06758
0.06900
0.07018
0.07160
0.07302
0.07373
0.07443
0.07491
0.07562
0.07632
0.07727
0.07798
0.07869
0.07940
0.08034
0.08105
0.08200
0.08294
0.08389
0.08483
0.08578
0.08696
0.08814
0.08932
0.09050
0.09168
0.09310
0.09428
0.09570
0.09736
0.09806
0.09901
0.09972
0.10066
0.10161
0.10255
0.10350
0.10444
0.10563
0.10657
0.10775
0.10893
0.10988
0.11106
0.11248
0.11366
0.11484
0.11626
0.11768
0.11933
0.12028
0.12122
0.12193
0.12288
0.12382
0.12477
0.12571
0.12642
0.12737
0.12855
0.12973
0.13067
0.13186
0.13304
0.13398
0.13516
0.13658
0.13776
0.13918
0.14036
0.14178
0.14320
0.14462
0.14627
0.14792
0.14958
0.15123
0.15289
0.15478
0.15690
0.15927
0.16163
0.16376
0.16588
0.16848
0.17132
0.17415
0.17699
0.18006
0.18337
0.18739
0.19140
0.19542
0.20015
0.20511
0.21078
0.21669
0.22354
0.23087
0.23984
0.24882
0.26040
0.27364
0.28970
0.31073
0.33673
0.37265
0.43786
0.56594
?

First Point : ( X= 0.0000, Y= 0.02363 )
Last point : ( X= 1.0000, Y= ? )
 
Our thought is like this.

We don't know A, we don't know B
We assume C ( specification )
We calculate to have D ( Physical + Math )
.....
.....
Then we test to have G ( Engineering Test data )
we analyse to have F ( Co-efficient curve)

Can we go from F to E ? How ?
 
A scatter diagram of your data presents a graph that can be easily fitted by a rational function R(x)=P(x)/Q(x) where P(x) is a polynomial of degree n and Q(x) is a polynomial of degree n+1.
Of course this fit (or any other) does not tell us anything about the physical background of your system.
m777182
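
For illustration only, here is the smallest case of such a rational fit (n = 0, so R(x) = p0 / (1 + q1*x)), fitted by a linearisation trick. The trick is an assumption of this sketch and not necessarily the method meant above: multiplying through by Q(x) gives y = p0 - q1*(x*y), which is a straight line in the variable x*y. The data are synthetic, and `linear_fit`, `p0` and `q1` are names chosen for this example.

```python
def linear_fit(us, vs):
    """Ordinary least-squares line v = slope*u + intercept."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    slope = suv / suu
    return slope, mv - slope * mu

# Synthetic data from R(x) = 2 / (1 - 0.9*x); not the thread's data.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [2.0 / (1.0 - 0.9 * x) for x in xs]

# y * (1 + q1*x) = p0   =>   y = p0 - q1 * (x*y)
zs = [x * y for x, y in zip(xs, ys)]
slope, intercept = linear_fit(zs, ys)
p0, q1 = intercept, -slope

print(f"p0 = {p0:.6f}, q1 = {q1:.6f}")
```

Higher degrees work the same way with more columns, but the linearised problem weights each point by Q(x), so on noisy data the result is usually treated as a starting guess for a proper nonlinear fit.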
 
m777182, I'm sure you are right; what is that method called?

Meanwhile

Y=1/($E$1*SQRT($E$2-x)+$E$3+$E$4*SQRT(SQRT($E$5-x)))

where e1 thru e5 are

39.26003141
1
1.240802562
-0.121279377
1.142702314

seems to fit pretty well. I got that using inspection and a variation on electricpete's method.

Looks like a stiffness function.
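
A quick way to check a fit like this is to evaluate it at a few of the posted points. The sketch below is a Python transcription of the Excel formula with e1 to e5 as listed. The eight (x, y) pairs are a hand-picked subset of the data posted earlier in the thread, and `y_greg` and `points` are just labels for this illustration.

```python
import math

# Constants e1..e5 as posted.
E1, E2, E3, E4, E5 = 39.26003141, 1.0, 1.240802562, -0.121279377, 1.142702314

def y_greg(x):
    """Y = 1 / (e1*sqrt(e2 - x) + e3 + e4*(e5 - x)**0.25)."""
    return 1.0 / (E1 * math.sqrt(E2 - x) + E3 + E4 * math.sqrt(math.sqrt(E5 - x)))

# Subset of the posted (x, y) pairs.
points = [
    (0.0, 0.02363), (0.5, 0.03497), (0.8, 0.05435), (0.9, 0.07302),
    (0.95, 0.09736), (0.99, 0.18337), (0.999, 0.43786), (0.9995, 0.56594),
]

# Residual = measured minus fitted.
residuals = [(x, y - y_greg(x)) for x, y in points]
for (x, y), (_, r) in zip(points, residuals):
    print(f"x = {x:<7} y = {y:.5f}  fit = {y_greg(x):.5f}  residual = {r:+.5f}")
```

On this subset the fit tracks the data closely through the middle of the range and drifts at the steep end near x = 1, where the function changes fastest.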







Cheers

Greg Locock

Please see FAQ731-376 for tips on how to make the best use of Eng-Tips.
 

First Point : ( X= 0.0000, Y= 0.02363 )
Last point : ( X= 1.0000, Y= 1.00000 )?

Are you saying that it cannot be measured?
 
Are these clearances or deflection values in a ball bearing, similar to your other questions in the bearing forum?
 
A plot of the residuals shows that GregLocock's formula has some issues. Residuals should be randomly distributed about 0. These residuals show a clear trend. HOWEVER, it appears that GregLocock is on the right track with the equation.
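
One simple numeric companion to a residual plot is a count of sign runs: residuals scattered randomly about 0 change sign often, while trending residuals stay on one side for long stretches. This is a sketch of that idea (the mean of the Wald-Wolfowitz runs test), offered as one possible check rather than the one used above; both residual lists are made up for illustration.

```python
def count_sign_runs(residuals):
    """Number of consecutive same-sign stretches (zeros are ignored)."""
    signs = [r > 0 for r in residuals if r != 0]
    return 1 + sum(a != b for a, b in zip(signs, signs[1:]))

def expected_runs(residuals):
    """Mean number of runs if the signs were in random order."""
    pos = sum(r > 0 for r in residuals)
    neg = sum(r < 0 for r in residuals)
    return 2.0 * pos * neg / (pos + neg) + 1.0

# Made-up residuals: one set with a clear trend, one scattered.
trending = [-0.4, -0.3, -0.1, -0.05, 0.05, 0.1, 0.3, 0.4]
scattered = [0.2, -0.1, 0.3, -0.3, -0.1, 0.2, -0.2, 0.1]

for name, res in (("trending", trending), ("scattered", scattered)):
    print(f"{name}: {count_sign_runs(res)} runs, "
          f"about {expected_runs(res):.1f} expected for random signs")
```

Far fewer runs than expected suggests the model is missing a term. With only a handful of points the count is a rough indicator, not a formal test.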
 
Well done, GregLocock.
Now I assume we have the first equation.

If I give this to my boss, he will accept only e2. OK.
Then he will ask me, “Freddy, what do you call e1, e3, e4 and e5?”
I say, "Of course, these are factors."
He says, “Factors? A factor means you don't know."
.........
(I did think about killing him in my first year.)

My dear friends,
My suggestion is this:
Is it possible to have fewer factors,
or to have a real e = 2.718282 or pi = 3.1415926... or SQRT(2) = 1.414...?
Is that a way to get from the engineering world to the physical world?

To Diamondjim:
This curve has nothing to do with ball bearings.
I am new to bearings, so I started with ball bearings only.
I see you have a lot of knowledge and experience in that field.
We had a good discussion. (I am new, but not too new.)
I would be glad to consult you later, if possible.

 
Y=1/($E$1*SQRT($E$2-x)+$E$3+$E$4*SQRT(SQRT($E$5-x)))

You could say these constants came out of a least-squares fit program with an assumed solution including a term varying as x^(1/2), a term varying as x^(1/4), and a bias term.
E1 is the magnitude scaling factor for the ½ power term (centered on E2).
E4 is the scaling factor for the ¼ power term (centered on E5).
E3 is a constant (bias) term.
Maybe Greg has a better way. To get more help on physical interpretation from the folks on the forum I would say you have to give more details of the physical problem (maybe that's something that you don't want to do to protect business secrets?).

Unfortunately I went the opposite direction. I spent (wasted?) some time messing around to add more terms to Greg’s model to reduce the residuals. I did get good reduction in residuals at a cost of substantial increase in complexity. This is probably not what you’re looking for but since I’ve done it, here it is:

The following columns are in my spreadsheet
X, Y = Original data
Ygreg = Greg’s fit * Note - Greg’s fit constants changed when I reoptimized with new terms
Rgreg = Greg’s residual = Y – Ygreg *Same Note
Fpete = my fit to greg’s residual
Ypete = Ygreg +Fpete = estimate of Y
Rpete = Y - Ypete
Rpete/Y = Fractional residuals

I had some difficulties minimizing errors on both the top and bottom end. What finally resolved the problem was to optimize sum of squares of FRACTIONAL residuals , rather than residuals themselves. Probably if I tried this approach from the beginning I could have reduced the complexity. I didn’t go back and try to clean it up. What is left is a complex formula which I’m sure could be simplified or done better. But this is what I have for now.
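
The effect of switching from absolute to fractional residuals can be seen on a toy problem. This is a sketch with a one-parameter model y ≈ k*g and made-up numbers, not the spreadsheet model; `fit_absolute`, `fit_fractional` and `max_fractional_error` are hypothetical helper names. For this model both minimisations have closed forms, so no solver is needed.

```python
def fit_absolute(g, y):
    """k minimising sum((y - k*g)**2)."""
    return sum(gi * yi for gi, yi in zip(g, y)) / sum(gi * gi for gi in g)

def fit_fractional(g, y):
    """k minimising sum(((y - k*g) / y)**2), i.e. sum((1 - k*g/y)**2)."""
    r = [gi / yi for gi, yi in zip(g, y)]
    return sum(r) / sum(ri * ri for ri in r)

def max_fractional_error(k, g, y):
    """Largest |y - k*g| / |y| over all points."""
    return max(abs(yi - k * gi) / abs(yi) for gi, yi in zip(g, y))

# Made-up values spanning two orders of magnitude; the smallest one is
# slightly off the model, like the low end of a steep curve.
g = [1.0, 10.0, 100.0]
y = [1.2, 10.0, 100.0]

k_abs = fit_absolute(g, y)
k_frac = fit_fractional(g, y)

print(f"absolute fit:   k = {k_abs:.5f}, "
      f"worst fractional error = {max_fractional_error(k_abs, g, y):.1%}")
print(f"fractional fit: k = {k_frac:.5f}, "
      f"worst fractional error = {max_fractional_error(k_frac, g, y):.1%}")
```

The absolute fit is dominated by the largest values and barely notices the small one, whereas the fractional fit spreads the relative error across the range. That is the trade-off when the data run from about 0.02 to above 0.5.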

** The largest residual magnitude (Rpete) is 0.0075
** The largest fractional residual magnitude (Rpete/Y) is 1.7% over the entire range up to 0.9995. That worst performance occurs at 0.999. Below 0.9965 all fractional residuals are less than 1% of Y

If one wanted, one could start with my solution and look at the residuals and attempt to add terms to minimize those residuals (that’s what I did to Greg’s solution).

The spreadsheet is here:
(scroll to bottom of sheet 1 to see summary statistics)

The formulas are
Ygreg=1/($B$3*SQRT($B$4-C3)+$B$5+$B$6*SQRT(SQRT($B$7-C3)))

Ypete = Ygreg+$B$8/($B$11+(C3-0.995)^2) + $B$9*(1-C3)*EXP($B$10*(1-C3)) + $B$12*(0.5-C3)^2*(ATAN(1000*(C3-0.5))+PI()/2)+$B$13*EXP($B$15*C3)

where cells are as follows:
C3 is x (copy down as relative cell reference)
B3 50.63746184 (remaining items used as absolute reference)
B4 1.000239763
B5 0.823257421
B6 -0.006540995
B7 3.654165646
B8 0.015665365
B9 772.441016
B10 -2859.484503
B11 1.23573751
B12 0.008220401
B13 -0.002908095
B14 -10

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Why should he accept e2 any more than any other number?

Your suggestion, unfortunately, is naive. There should be no expectation of seeing e, or pi, or whatever, in real data fits.

You have yet to even describe what the data comes from. Without that critical information, there can be no meaningful discussion about the meaning and validity of the fitting coefficients.

TTFN



 
"Centered" on E5 and E2 was not good terminology. Those functions (x^(1/2) and x^(1/4)) are shifted along the x coordinate by E2 and E5 respectively.

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Looking at the original data, I think I could come up with a much simpler solution than I gave above starting again from scratch with simple functions:
x^n (n>=1)
(x+p)^n
(1-x)^m (m<=0)
a^(b*x)
Maybe log-log plot would give some clues. I'll give it a try when I have some time.

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
Nope, I was wrong. Starting from scratch I couldn't find any simple way to fit the data.

=====================================
Eng-tips forums: The best place on the web for engineering discussions.
 
I took the inverse of the y values to give me a less alarming function. It looked like a parabola. It didn't quite fit so I chucked in another term. Then I used solver to 'minimise' the errors in the coefficients. Brutal, but quick. As melone pointed out, not a mathematically 'good' result.
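
One hedged reading of the "take the inverse" step, sketched in Python: if 1/y plotted against x looks like a parabola lying on its side, then 1/y should be close to a straight line in sqrt(1 - x). That interpretation is an assumption of this sketch, not something stated in the post. The (x, y) pairs are a subset of the data posted earlier in the thread, and `linear_fit` is a helper written for the example.

```python
import math

def linear_fit(us, vs):
    """Ordinary least-squares line v = slope*u + intercept."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    slope = suv / suu
    return slope, mv - slope * mu

# Subset of the posted (x, y) pairs.
points = [
    (0.0, 0.02363), (0.5, 0.03497), (0.8, 0.05435), (0.9, 0.07302),
    (0.95, 0.09736), (0.99, 0.18337), (0.999, 0.43786), (0.9995, 0.56594),
]

# Transform: s = sqrt(1 - x), z = 1/y, then fit z = slope*s + intercept.
s = [math.sqrt(1.0 - x) for x, _ in points]
z = [1.0 / y for _, y in points]
slope, intercept = linear_fit(s, z)

print(f"1/y ~ {slope:.2f} * sqrt(1 - x) + {intercept:.2f}")
for (x, y), si in zip(points, s):
    print(f"x = {x:<7} y = {y:.5f}  from line = {1.0 / (slope * si + intercept):.5f}")
```

The slope and intercept of that line play the roles of the first and third constants in the posted formula. The extra fourth-root term is what was "chucked in" to take up the remaining misfit.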

Now, /if/ this is a Hertzian contact stress problem (which it intuitively looked like to me) then there are a wealth of papers on this stuff, including a good derivation in Timoshenko. I suspect that one problem is that we are looking at (say) force vs deflection, the mathematical derivation is more likely to be based on rate (stiffness) versus axial displacement. So, the data we are fitting is the derivative of a rapidly changing function. Anyway, if you understand what the problem is then you will be able to find some meaning for the terms, and maybe even justify my admittedly arbitrary choice of exponents.

Cheers

Greg Locock

Please see FAQ731-376 for tips on how to make the best use of Eng-Tips.
 
OK. You've got me intrigued.

Data appears to have two dominating characteristics: for small x it is close to a straight line with a positive intercept and a shallow slope; as x approaches 1 we have a vertical asymptote.

A model that satisfies these two is
Y = a + b*X +c*[sec(pi*X/2)]^d
for which Excel's Solver, when fed all the points, homes in on
a = -0.017917
b = 0.0083520
c = 0.042542
d = 0.36422
For these values, the maximum "error" over the full set of given points is 0.0032 in absolute terms or 4% in relative terms (at different points).
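
The quoted error figures can be spot-checked with a short Python sketch of the model. Two things here are assumptions of the sketch: the (x, y) pairs are only a subset of the posted data, and the exponent is entered as +0.36422, because a positive power of sec is what grows without bound as X approaches 1 and reproduces the posted points.

```python
import math

# Solver coefficients as posted; d is taken as positive (see the note above).
A, B, C, D = -0.017917, 0.0083520, 0.042542, 0.36422

def y_sec(x):
    """Y = a + b*X + c*sec(pi*X/2)**d."""
    sec = 1.0 / math.cos(math.pi * x / 2.0)
    return A + B * x + C * sec ** D

# Subset of the posted (x, y) pairs.
points = [
    (0.0, 0.02363), (0.5, 0.03497), (0.8, 0.05435), (0.9, 0.07302),
    (0.95, 0.09736), (0.99, 0.18337), (0.999, 0.43786), (0.9995, 0.56594),
]

abs_errors = [abs(y - y_sec(x)) for x, y in points]
rel_errors = [abs(y - y_sec(x)) / y for x, y in points]

for (x, y), ea, er in zip(points, abs_errors, rel_errors):
    print(f"x = {x:<7} y = {y:.5f}  model = {y_sec(x):.5f}  "
          f"abs = {ea:.5f}  rel = {er:.1%}")
```

The model cannot be evaluated at X = 1 itself, where cos(pi*X/2) is zero; that is the vertical asymptote the form was chosen to have.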

The maximum errors are larger than ElectricPete is citing above, but an argument can be mounted that my approach honours the asymptote (and the asymptote is a given, based on FreddyMusic's data).

And, I've even got a pi in there.
 