ROSALI-SIM/Modules/ado/plus/p/prgen.hlp

{smcl}
{* 06Feb2010}{...}
{hline}
help for {hi:prgen}{right:06Feb2010}
{hline}

{title:Generate predicted values and confidence intervals for regression models}

{p 4 4 2}
To compute the predicted values with all variables but varname
held at values specified by x() and rest(). The program extends
{cmd:prgen} by allowing you to generate variables containing upper
and lower bounds for confidence intervals and marginal effects.

{p 8 15 2}{cmd:prgen} varname,
{cmdab:g:enerate(}{it:newvar}{cmd:)}
[{cmdab:f:rom(}{it:#}{cmd:)}
{cmdab:t:o(}{it:#}{cmd:)}]
[{cmd:x(}{it:variables_and_values}{cmd:)}
{cmdab:r:est(}{it:stat}{cmd:)}
{cmd:all}]
[{cmdab:b:rief}
{cmdab:max:cnt(}{it:#}{cmd:)}
{cmdab:noba:se}
{cmdab:nola:bel}
{cmdab:n:cases(}{it:#}{cmd:)}
{cmd:gap(}{it:#}{cmd:)}
{cmdab:noi:sily}
{cmdab:mar:ginal}
{cmdab:con:ditional}]
[{cmd:ci}
{it:prvalue_options}]

{p 4 4 2}
where {it:variables_and_values} is an alternating list of variables
and either numeric values or mean, median, min, max, upper, lower,
previous.

{p 4 4 2}
{it:stat} is either mean, median, min, max, upper, lower, previous,
grmean (group mean), grmedian, grmin, grmax.

{p 4 4 2}
See {help prvalue} for options that can be specified for computing
confidence intervals.

{title: Description}

{p 4 4 2}
{cmd:prgen} computes predicted values and confidence intervals
for regression with continuous, categorical, and count outcomes
in a way that is useful for making plots. Predicted values are computed
for the case in which one independent variable varies over a specified
range while the others are held constant. You can request variables
containing upper and lower bounds for these variables. You can also
create a variable containing the marginal change in the outcome with
respect to the specified variable, holding other variabels constant. New
variables are added to the existing dataset that contain these
predicted values that can be plotted.

{p 4 4 2}
Note: The new variables will contain data for the first k observations
in the dataset, where k is 11 if not specified with the {cmd: ncases()}
option or if not determined by the {cmd:gap} option.

{title: Options}

{p 4 8 2}
{cmd:from()} and {cmd:to()} specify the values over which varname
should vary when calculating predicted values.  The defaults are the
observed minimum and maximum values.

{p 4 8 2}
{cmd:generate()} is up to five letters to name the created variables.
By changing the name you can run -prgen- repeatedly to compute predictions
with variables held at various values. It is best to chose a name that is
different from the beginning letters of variables in your data set.
This is required.

{p 4 8 2}
{cmd:ci} indicates that you want to generate confidence intervals
corresponding to the predictions made by {cmd:prgen}.

{p 4 8 2}
{cmd:marginal} indicates that you want to generate a variable
containing the marginal change in the outcome relative to varname,
holding all other variables constant.

{p 4 8 2}
{cmd:conditional} indicates that you want to generate conditional
predictions rather than unconditional predictions for {cmd:ztp} and
{cmd:ztnb} models.

{p 4 8 2}
{cmd:ncases} is the number of predicted values computed as varname
varies from the start value to the end value. If {cmd:Ncases} is not
specified, 11 points are generated.

{p 4 8 2}
{cmd:gap} is an alternative to {cmd:ncases}. You specify the gap or
size between tic marks and {cmd:prgen} determines if the specified
value divides evenly into the from-to range. If it does, {cmd:prgen}
determines the appropriate value for {cmd:ncases}.

{p 4 8 2}
{cmd:x()} sets the values of independent variables for calculating
predicted values.  The list must alternate variable names and values.
The values may be either numeric values or can be mean, median, min, max,
previous, upper, or lower.  The latter cannot be used if rest()
specifies a group summary statistic (e.g., grmean).

{p 4 8 2}
{cmd:rest()} sets the independent variables not specified in x() to
their {cmd:mean} (default), {cmd:minimum}, {cmd:maximum},
{cmd:median} when calculating predicted values.{cmd:grmean} sets these
independent variables to the mean conditional on the variables and
values specified in x(); {cmd:grmedian}, {cmd:grmax}, and {cmd:grmin}
can also be used. If {cmd:prvalue} has already been run after the last
estimate, {cmd:previous} will set unspecified variables to their prior values.
For models other than mlogit, {cmd:upper}and {cmd:lower} can be used to set
independent variables to their minimum or maximum depending on which will yield
the upper or lower extreme predicted value.

{p 4 8 2}
{cmd:maxcnt()} sets the maximum count for which variables are generated
for count models. The value must be an integer between 0 and 30;
the default is 9.

{p 4 8 2}
{cmd:all} specifies that any calculations of means, medians, etc.,
should use the entire sample instead of the sample used to estimate
the model.

{p 4 8 2}
{cmd:brief} and {cmd:nobase} suppress the base values of x in the output.

{p 4 8 2}
{cmd:nolabel} uses values rather than value labels in output.

{p 4 8 2}
{cmd:noisily} indicates that you want to see the output from
{cmd:prvalue}that was was used to generate the predicted values.

{p 4 8 2}
{it:prvalue_options} control the calculation of confidence intervals;
see {help prvalue} for details about these options.

{title:Models and Predictions - * is the prefix}

all models:

    *x: value of x

    logit & probit:
      Predicted probability of each outcome: *p0, *p1

    ologit, oprobit
        Predicted probabilities: *p#1,*p#2,... where #1,#2,... are values of
            the outcome variable.
        Cumulative probabilities: *s#1,*s#2,... where #1,#2,... are values
            of the outcome variable. *s#k is the probability of all
            categories up to or equal to #k.

    mlogit:
        Predicted probabilities: *p#1,*p#2,... where #1,#2,... are values of
            the outcome variable.

    poisson & nbreg:
        Predicted rate: *mu;
        Predicted probabilities: *p0, *p1... where 0, 1, are counts
        Cumulative probabilities: *s0, *s1... where 0, 1 are counts.
            The cumulative probability of a given count is probability of
            observing count less than or equal to that count.

    regress, tobit, cnreg, intreg
        Predicted xb: *xb

{title: Examples}

{p 4 4 2}
To compute predicted values and confidencen intervals from an ordered probit
where warm has four categories SD, D, A and SA:

{p 4 8 2}{cmd:.oprobit warm yr89 male white age ed prst}

{p 4 8 2}{cmd:.prgen age, f(20) t(80) gen(mn) ci delta}

{p 4 8 2}{cmd:.prgen age, x(male=0) f(20) t(80) gen(fem)}

{p 4 8 2}{cmd:.prgen age, x(male=1) f(20) t(80) gen(mal)}

{p 4 8 2}
To plot the predicted probabilites for average males:

{p 4 8 2}{cmd:.twoway connected malp1 malp2 malp3 malp4 malx}

{hline}

{p 2 4 2}Authors: J. Scott Long, Jeremy Freese & Jun Xu{p_end}
{p 11 4 2}{browse www.indiana.edu/~jslsoc/spost.htm}{p_end}
{p 11 4 2}spostsup@indiana.edu{p_end}