python - statsmodels: specifying non-linear regression models using patsy


I am trying to calculate non-linear regression models using statsmodels. In particular, I have problems learning the patsy syntax.

Is there any tutorial or example of how to formulate non-linear models using the patsy syntax?

In particular, how would the non-linear model in this example (http://statsmodels.sourceforge.net/devel/examples/generated/example_ols.html) be specified using patsy?

Thanks in advance

Andy

Patsy isn't really useful for fitting general non-linear models, but the models on the page you link to are a special sort of non-linear model: they're using a linear model fitting method (OLS), and applying it to non-linear transformations of the basic variables. A standard and useful trick is to combine multiple non-linear transformations of the same variable in order to fit more general curves. For this, patsy is very useful.

What you want to know is how to express variable transformations in patsy. This is pretty easy. The way patsy works is, given a formula string like "x1 + x2:x3", it scans through and interprets the special patsy operators like + and :, and then the stuff that's left over (x1, x2, x3) is interpreted as arbitrary Python code. So you can write "np.sin(x1) + np.log(x2):x3" or whatever.
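As a rough sketch of that formula in action, the snippet below builds a design matrix with patsy's dmatrix. The data values are made up purely for illustration; only the formula string comes from the text above.

```python
import numpy as np
from patsy import dmatrix

# Made-up data, only here so the formula has something to evaluate.
data = {
    "x1": np.array([0.0, 1.0, 2.0, 3.0]),
    "x2": np.array([1.0, 2.0, 3.0, 4.0]),
    "x3": np.array([2.0, 4.0, 6.0, 8.0]),
}

# patsy handles "+" and ":" itself; np.sin(x1) and np.log(x2) are passed
# to Python and evaluated as ordinary code.
design = dmatrix("np.sin(x1) + np.log(x2):x3", data)
column_names = design.design_info.column_names
print(column_names)
print(np.asarray(design))
```

The interaction column is just the elementwise product of np.log(x2) and x3, and patsy adds an intercept column by default.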

The one thing to watch out for is if you want to write a transformation that uses Python operators that clash with patsy operators. Like, if you want to use + or ** in your transformation, then you have to be careful to make sure patsy doesn't interpret them itself, and leaves them to Python. The trick here is that patsy will ignore any operators that appear inside a function call (or other complex Python expression that patsy doesn't understand, but mostly that means function calls). So if you write "x1 + np.log(x2 + x3)", then patsy will treat this as two predictors, x1 and np.log(x2 + x3). You can see that it interpreted the first +, but left the second one alone for Python to interpret.
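A quick way to check how patsy split a formula is to look at the column names of the resulting design matrix. This is a small sketch with made-up data values:

```python
import numpy as np
from patsy import dmatrix

# Made-up data for illustration.
data = {
    "x1": np.array([1.0, 2.0, 3.0, 4.0]),
    "x2": np.array([0.5, 1.5, 2.5, 3.5]),
    "x3": np.array([2.0, 1.0, 4.0, 3.0]),
}

# The first "+" separates two terms; the "+" inside np.log(...) is left
# for Python, so x2 and x3 are summed before the log is taken.
design = dmatrix("x1 + np.log(x2 + x3)", data)
column_names = design.design_info.column_names
print(column_names)
```

Besides the intercept, there are two predictor columns here, not three.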

But what if you wanted to, say, add two variables and use them as a predictor, without taking the log? Well, from what you know already, you can come up with a simple hack: define a function that just returns its input (the identity function), and then call it, like: "x1 + I(x2 + x3)". The function call I(...) will prevent patsy from seeing the second +, but when we evaluate the term I(x2 + x3), it is just the same as x2 plus x3.

And helpfully, patsy automatically provides a function called I() that works like this, always available for you to use.
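To make the identity trick concrete, here is a small sketch comparing patsy's built-in I() (capital I) with a hand-written identity function. The name `identity` and the data values are made up for illustration.

```python
import numpy as np
from patsy import dmatrix

# Made-up data for illustration.
data = {
    "x1": np.array([1.0, 2.0, 3.0, 4.0]),
    "x2": np.array([0.5, 1.5, 2.5, 3.5]),
    "x3": np.array([2.0, 1.0, 4.0, 3.0]),
}


def identity(value):
    """Hand-rolled identity function (hypothetical name): returns its input."""
    return value


# Built-in I(): the inner "+" is plain Python addition.
with_builtin = dmatrix("x1 + I(x2 + x3)", data)
# Same idea with our own function, which patsy looks up in the calling scope.
with_custom = dmatrix("x1 + identity(x2 + x3)", data)
# Without a wrapper, patsy treats every "+" as a term separator.
without_wrapper = dmatrix("x1 + x2 + x3", data)

print(with_builtin.design_info.column_names)
print(without_wrapper.design_info.column_names)
```

The wrapped versions give one combined predictor column, while the unwrapped formula gives x2 and x3 a column each.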

So now you know everything you need to know to reproduce the examples on that page. For the first one, the formula is "x + I(x**2)". For the second, the formula is "x + np.sin(x) + I((x - 5)**2)".
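For fitting, these right-hand sides go into a full formula with a response on the left. The sketch below uses the statsmodels formula API with synthetic, noise-free data; the column names x and y and the data itself are assumptions for illustration, not the data from the linked page.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (an assumption): y is an exact quadratic in x,
# so OLS should recover the coefficients 1, 2 and 3.
x = np.linspace(0, 10, 50)
df = pd.DataFrame({"x": x, "y": 1.0 + 2.0 * x + 3.0 * x**2})

# First formula: linear plus quadratic term.
quadratic_fit = smf.ols("y ~ x + I(x**2)", data=df).fit()
print(quadratic_fit.params)

# Second formula: same mechanics with different transformations.
# y is still the quadratic above, so the sin term should come out near zero.
mixed_fit = smf.ols("y ~ x + np.sin(x) + I((x - 5)**2)", data=df).fit()
print(mixed_fit.params)
```

Both fits are ordinary linear regressions; the non-linearity lives entirely in the transformed columns that patsy builds from the formula.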

And for the last example, it's easiest to use patsy's built-in categorical coding support: "x + C(groups)". (Here C is another special built-in function that lets you adjust how categorical data is coded. Here we're just using it to tell patsy that even though groups looks like a numerical vector, with values 0, 1, 2, in fact we should treat it as being categorical, with each value representing a different group. Patsy then applies its default categorical coding.)
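The effect of C() is easiest to see by comparing design matrices with and without it. A small sketch, with made-up values for x and groups:

```python
import numpy as np
from patsy import dmatrix

# Made-up data: three groups labelled 0, 1, 2.
data = {
    "x": np.array([0.1, 0.4, 0.9, 1.6, 2.5, 3.6]),
    "groups": np.array([0, 0, 1, 1, 2, 2]),
}

# Without C(), groups is treated as one numeric predictor.
numeric_design = dmatrix("x + groups", data)
# With C(), groups is categorical; patsy's default (treatment) coding turns
# it into indicator columns, with the first level as the reference.
categorical_design = dmatrix("x + C(groups)", data)

print(numeric_design.design_info.column_names)
print(categorical_design.design_info.column_names)
```

With three groups, the categorical version has one more column than the numeric one: two indicator columns replace the single groups column.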

