Regular expression help for capturing tokens from a c++ if_else function block

I am trying to convert some C++ code to MATLAB code and need help capturing the conditions of the if/else statements. The sample code is posted below; the real code is much longer and contains many sets of the same repeating piecewise constraints with different functions to evaluate.
T[4][0] = if (x <= 2.0E2) {
t4 = x*-7.43939368315E2;
} else {
if (2.0E4 < x) {
t4 = x*-3.15202357052E2;
} else {
if ((2.0E2 < x) && (x <= 1.0E3)) {
t4 = -x*(x*-3.3581574807E-2+log(x)*4.7023733515E1+(log(x)*2.400503067842E3)/x+1.09445880410878E5/x+1.0/(x*x)*6.1158591370349E4+(x*x)*1.9946376054E-5-(x*x*x)*7.445595608E-9+(x*x*x*x)*1.243670758E-12-1.11582896139E2);
} else {
if ((1.0E3 < x) && (x <= 6.0E3)) {
t4 = -x*(x*-2.3265021934E-3+log(x)*4.8602554649E1+(log(x)*1.5974720059903E4)/x+3.623374787802E4/x+1.0/(x*x)*1.897212861710375E6+(x*x)*1.9151215358E-7-(x*x*x)*1.2237095959E-11+(x*x*x*x)*3.95116007E-16-1.62570932876E2);
} else {
if ((6.0E3 < x) && (x <= 2.0E4)) {
t4 = -x*(x*-1.6249864554E-1+log(x)*2.049900452325E3+(log(x)*6.16116287553448E6)/x-4.067298995421662E7/x+1.0/(x*x)*(5.126459124735035E23/1.40737488355328E14)+(x*x)*4.514672712E-6-(x*x*x)*9.025238189E-11+(x*x*x*x)*8.209541318E-16-1.8977498210781E4);
} else {
t4 = NAN;
}
}
}
}
};
T[5][0] = if (x <= 2.0E2) {
t5 = x*-6.99993596709E2;
} else {
if (6.0E3 < x) {
t5 = x*-2.95957496736E2;
} else {
if ((2.0E2 < x) && (x <= 1.0E3)) {
t5 = x*(x*-5.9732340052E-2+log(x)*1.344708739E1+(log(x)*6.623440520686E3)/x-1.30277758259044E5/x+1.0/(x*x)*1.93975305124744E5+(x*x)*2.359195784E-5-(x*x*x)*7.135910089E-9+(x*x*x*x)*1.0512057259E-12-2.8910827165E2);
} else {
if ((1.0E3 < x) && (x <= 6.0E3)) {
t5 = -x*(x*1.8859601673E-5+log(x)*3.6779467978E1+(log(x)*2.62795681346E2)/x+1.11227336090838E5/x-1.0/(x*x)*7.24982080986207E5+(x*x)*4.872002157E-8-(x*x*x)*9.084382373E-12+(x*x*x*x)*6.625791502E-16-4.3668943967E1);
} else {
t5 = NAN;
}
}
}
}
I have tried the following (and other variations):
tokens = regexp(funcode,'if\s\((.+)\)\s\{','tokens')
but the token captures everything from just after the first 'if (' up to the last ') {'.
I would also like to eventually capture tokens for the expressions 't4 = ...' etc. that go with each condition.
Any help would be greatly appreciated. P.S. MATLAB needs to make matlabFunction() work for piecewise symbolic functions.

Accepted Answer

Walter Roberson on 12 Jul 2011
Your most immediate problem is that .+ is greedy: it captures as many characters as possible and then backtracks only as far as is necessary to match the rest of the expression. If you use .+? instead, it captures only as many characters as are necessary to match the rest of the expression.
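To illustrate the difference, here is a small sketch on a shortened, single-line stand-in for the ccode() output. The string in funcode is made up for the example and is not the poster's data.

```matlab
% Made-up, shortened stand-in for the generated code (assumption).
funcode = 'if (x <= 2.0E2) { t4 = x*2.0; } else { if ((2.0E2 < x) && (x <= 1.0E3)) { t4 = x*3.0; } }';

% Greedy .+ runs to the LAST ') {' in the string, so there is one huge token.
greedy = regexp(funcode, 'if\s\((.+)\)\s\{', 'tokens');

% Lazy .+? stops at the FIRST ') {' after each 'if (', so each if gets a token.
lazy = regexp(funcode, 'if\s\((.+?)\)\s\{', 'tokens');

for k = 1:numel(lazy)
    disp(lazy{k}{1});   % one condition per line
end
```

Note that the lazy version works here only because every condition is immediately followed by ') {' and no ') {' appears inside a condition. It does not actually balance the parentheses.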
However, you have a deeper problem: you really only want to stop when you encounter the balancing ')'. Determining whether a delimiter is balanced is known to be impossible in theory with pure regular expressions. MATLAB's "regular expressions" are, though, extensions of the standard regular expressions. They have much in common with Perl's "regular expressions", and in Perl it is possible to find the balancing delimiter. It has been a number of years since I looked at the relevant (tricky) Perl code. I think it is possible with the regular expressions that MATLAB provides, but I would not want to try to reinvent the technique: too ugly and hard to debug.
The easiest thing to do might be to use MATLAB's perl() command to call a Perl routine to do the parsing for you, having looked in the Perl FAQ to find the mechanism.
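If calling out to Perl is not appealing, one alternative (not the approach suggested above, only a sketch) is to use regexp just to locate each 'if (' and then find the balancing ')' with a plain depth counter. The sample string and the variable names starts, conds and depth are assumptions made for the illustration.

```matlab
% Made-up, shortened stand-in for the generated code (assumption).
funcode = 'if (x <= 2.0E2) { t4 = a; } else { if ((2.0E2 < x) && (x <= 1.0E3)) { t4 = log(x); } }';

% The 'end' option returns the index of the last matched character,
% i.e. the position of the '(' that opens each condition.
starts = regexp(funcode, 'if\s*\(', 'end');

conds = cell(1, numel(starts));
for k = 1:numel(starts)
    depth = 0;
    for pos = starts(k):numel(funcode)
        if funcode(pos) == '('
            depth = depth + 1;          % entering a nested group
        elseif funcode(pos) == ')'
            depth = depth - 1;          % leaving a group
            if depth == 0               % balancing ')' of the if-condition
                conds{k} = funcode(starts(k)+1:pos-1);
                break
            end
        end
    end
end
```

The simple pattern 'if\s*\(' would also match identifiers that happen to end in "if", so it may need tightening for other inputs.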
  1 Comment
Joseph on 12 Jul 2011
Haha, that might work, though I have no prior experience with Perl. It is funny, because I have been trying to develop this code to turn arrays of piecewise symbolic functions into MATLAB code, and to do it I have to create strings of function blocks from ccode(). ccode() is able to generate code from piecewise symbolic functions, for example when defining specific heats or free energies over various temperature ranges, which I need for numeric problems such as constrained minimization. I have some code that works fine for making the MATLAB functions, but it is wasteful: it re-evaluates the interval test for every expression, when all the expressions that share an interval could be grouped under a single test.


More Answers (1)

Oleg Komarov on 12 Jul 2011
tokens = regexp(s,'if\ ([\(\)\w\ \.><=&]+)\s+{','tokens')
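This pattern appears to sidestep the greedy problem by restricting the token to a character class (parentheses, word characters, spaces, '.', and the comparison and '&' characters). The class contains no '{' or ';', so a match cannot run past the opening brace of its own if. A small sketch on a made-up string follows. The brace is escaped as \{ here for safety, and the second regexp for the 't4 = ...' assignments is an assumption added for illustration, not part of the answer.

```matlab
% Made-up, shortened stand-in for the generated code (assumption).
s = 'if (x <= 2.0E2) { t4 = x*2.0; } else { if ((2.0E2 < x) && (x <= 1.0E3)) { t4 = x*3.0; } }';

% Character-class pattern from the answer; the token keeps the outer parentheses.
conds = regexp(s, 'if\ ([\(\)\w\ \.><=&]+)\s+\{', 'tokens');

% Hypothetical companion pattern: variable name and right-hand side of each
% assignment of the form tN = ... ; (assumes no ';' inside the expression).
assigns = regexp(s, '(t\d+)\s*=\s*([^;]+);', 'tokens');

for k = 1:numel(conds)
    fprintf('%s  ->  %s = %s\n', conds{k}{1}, assigns{k}{1}, assigns{k}{2});
end
```

Pairing conds{k} with assigns{k} by position relies on every if branch containing exactly one assignment that comes before any trailing else assignment such as t4 = NAN; longer inputs may need a more careful pairing.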
