How to tell the difference between linear and nonlinear regression models?


27

I was reading the following link on nonlinear regression: SAS Non Linear. My understanding from reading the first section, "Nonlinear Regression vs. Linear Regression," was that the equation below is actually a linear regression. Is that correct? If so, why?

$$y = b_1 x^3 + b_2 x^2 + b_3 x + c$$

Am I also to understand that in nonlinear regression multicollinearity isn't an issue? I know that multicollinearity can be an issue in linear regression, so surely if the model above is in fact a linear regression there would be multicollinearity?


Closely related: stats.stackexchange.com/questions/33876
whuber

Also related: What does "curvilinear" mean?
gung - Reinstate Monica

Answers:


35

There are at least three senses in which a regression can be considered "linear." To distinguish them, let's start with an extremely general regression model

$$Y = f(X, \theta, \varepsilon).$$

To keep the discussion simple, take the independent variables X to be fixed and accurately measured (rather than random variables). They model n observations of p attributes each, giving rise to the n-vector of responses Y. Conventionally, X is represented as an n × p matrix and Y as a column n-vector. The (finite q-vector) θ comprises the parameters. ε is a vector-valued random variable. It usually has n components, but sometimes has fewer. The function f is vector-valued (with n components to match Y) and is usually assumed continuous in its last two arguments (θ and ε).

The archetypal example, of fitting a line to (x, y) data, is the case where X is a vector of numbers ($x_i$, $i = 1, 2, \ldots, n$), the x-values; Y is a parallel vector of n numbers ($y_i$); θ = (α, β) gives the intercept α and slope β; and $\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)$ is a vector of "random errors" whose components are independent (and usually assumed to have identical but unknown distributions of mean zero). In the preceding notation,

$$y_i = \alpha + \beta x_i + \varepsilon_i = f(X, \theta, \varepsilon)_i$$

with θ=(α,β).

The regression function may be linear in any (or all) of its three arguments:

  • "Linear regression," or a "linear model," ordinarily means that f is linear as a function of the parameters θ. The SAS meaning of "nonlinear regression" is in this sense, with the added assumption that f is differentiable in its second argument (the parameters). This assumption makes it easier to find solutions.

  • A "linear relationship between X and Y" means f is linear as a function of X.

  • A model has additive errors when f is linear in ε. In such cases it is always assumed that E(ε)=0. (Otherwise, it wouldn't be right to think of ε as "errors" or "deviations" from "correct" values.)

Every possible combination of these characteristics can happen and is useful. Let's survey the possibilities.

  1. A linear model of a linear relationship with additive errors. This is ordinary (multiple) regression, already exhibited above and more generally written as

    Y=Xθ+ε.

    X has been augmented, if necessary, by adjoining a column of constants, and θ is a p-vector.

  2. A linear model of a nonlinear relationship with additive errors. This can be couched as a multiple regression by augmenting the columns of X with nonlinear functions of X itself. For instance,

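    $$y_i = \alpha + \beta x_i^2 + \varepsilon_i$$

    is of this form. It is linear in θ = (α, β), it has additive errors, and it is linear in the values $(1, x_i^2)$ even though $x_i^2$ is a nonlinear function of $x_i$.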

  3. A linear model of a linear relationship with nonadditive errors. An example is multiplicative error,

    $$y_i = (\alpha + \beta x_i)\,\varepsilon_i.$$

    (In such cases the $\varepsilon_i$ can be interpreted as "multiplicative errors" when the location of $\varepsilon_i$ is 1. However, the proper sense of location is not necessarily the expectation $E(\varepsilon_i)$ anymore: it might be the median or the geometric mean, for instance. A similar comment about location assumptions applies, mutatis mutandis, in all other non-additive-error contexts too.)

  4. A linear model of a nonlinear relationship with nonadditive errors. E.g.,

    $$y_i = (\alpha + \beta x_i^2)\,\varepsilon_i.$$
  5. A nonlinear model of a linear relationship with additive errors. A nonlinear model involves combinations of its parameters that not only are nonlinear, but cannot even be linearized by re-expressing the parameters.

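    • As a non-example, consider

      $$y_i = \alpha\beta + \beta^2 x_i + \varepsilon_i.$$

      By defining $\alpha' = \alpha\beta$ and $\beta' = \beta^2$, and restricting $\beta' \ge 0$, this model can be rewritten

      $$y_i = \alpha' + \beta' x_i + \varepsilon_i,$$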

      exhibiting it as a linear model (of a linear relationship with additive errors).

    • As an example, consider

      $$y_i = \alpha + \alpha^2 x_i + \varepsilon_i.$$

      It is impossible to find a new parameter $\alpha'$, depending on α, that will linearize this as a function of $\alpha'$ (while keeping it linear in $x_i$ as well).

  6. A nonlinear model of a nonlinear relationship with additive errors.

    $$y_i = \alpha + \alpha^2 x_i^2 + \varepsilon_i.$$
  7. A nonlinear model of a linear relationship with nonadditive errors.

    $$y_i = (\alpha + \alpha^2 x_i)\,\varepsilon_i.$$
  8. A nonlinear model of a nonlinear relationship with nonadditive errors.

    $$y_i = (\alpha + \alpha^2 x_i^2)\,\varepsilon_i.$$
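Form 2 is the case raised in the question. As a minimal sketch, assuming NumPy is available and using made-up coefficient values, the cubic can be fitted by ordinary least squares simply by adjoining powers of x as extra columns of the design matrix. The function name `fit_cubic_ols` and the test coefficients are illustrative assumptions, not part of the original discussion.

```python
import numpy as np

def fit_cubic_ols(x, y):
    """Fit y = b1*x^3 + b2*x^2 + b3*x + c by ordinary least squares."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # The columns are nonlinear functions of x, but y is a linear
    # combination of them, so the model is linear in (b1, b2, b3, c).
    design = np.column_stack([x**3, x**2, x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical, noise-free data generated from known coefficients.
x = np.linspace(-2.0, 2.0, 25)
true_coef = np.array([0.5, -1.0, 2.0, 3.0])  # b1, b2, b3, c (assumed values)
y = true_coef[0] * x**3 + true_coef[1] * x**2 + true_coef[2] * x + true_coef[3]

estimated = fit_cubic_ols(x, y)
print(np.round(estimated, 6))
```

With noise-free data the estimates should reproduce the assumed coefficients up to floating-point error, because the fit is an ordinary linear least-squares problem in the four parameters.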

Although these exhibit eight distinct forms of regression, they do not constitute a classification system because some forms can be converted into others. A standard example is the conversion of a linear model with nonadditive errors (assumed to have positive support)

$$y_i = (\alpha + \beta x_i)\,\varepsilon_i$$

into a linear model of a nonlinear relationship with additive errors via the logarithm,

$$\log(y_i) = \mu_i + \log(\alpha + \beta x_i) + (\log(\varepsilon_i) - \mu_i).$$

Here, the log geometric mean $\mu_i = E(\log(\varepsilon_i))$ has been removed from the error terms (to ensure they have zero means, as required) and incorporated into the other terms (where its value will need to be estimated). Indeed, one major reason to re-express the dependent variable Y is to create a model with additive errors. Re-expression can also linearize Y as a function of either (or both) of the parameters and explanatory variables.
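The effect of the logarithm can be checked numerically. The sketch below assumes log-normally distributed multiplicative errors with a common log-mean μ (so every $\mu_i$ equals μ), and the values of α, β, μ, and the error scale are hypothetical choices for illustration only. After taking logs and subtracting μ and log(α + βx), what remains is an additive error with mean close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

alpha, beta = 1.0, 2.0   # hypothetical parameter values
mu, sigma = 0.3, 0.2     # assumed mean and sd of log(eps)

x = np.linspace(1.0, 5.0, 20000)
log_eps = rng.normal(loc=mu, scale=sigma, size=x.size)

# Multiplicative-error model: y_i = (alpha + beta * x_i) * eps_i
y = (alpha + beta * x) * np.exp(log_eps)

# After the log transform, the remaining error term is log(eps_i) - mu,
# which enters additively and has expectation zero.
additive_error = np.log(y) - mu - np.log(alpha + beta * x)
print(additive_error.mean())
```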


Collinearity

Collinearity (of the column vectors in X) can be an issue in any form of regression. The key to understanding this is to recognize that collinearity leads to difficulties in estimating the parameters. Abstractly and quite generally, compare two models $Y = f(X, \theta, \varepsilon)$ and $Y = f(X', \theta, \varepsilon')$ where $X'$ is X with one column slightly changed. If this induces enormous changes in the estimates $\hat\theta$ and $\hat\theta'$, then obviously we have a problem. One way in which this problem can arise is in a linear model, linear in X (that is, types (1) or (3) above), where the components of θ are in one-to-one correspondence with the columns of X. When one column is a non-trivial linear combination of the others, the estimate of its corresponding parameter can be any real number at all. That is an extreme example of such sensitivity.

From this point of view it should be clear that collinearity is a potential problem for linear models of nonlinear relationships (regardless of the additivity of the errors) and that this generalized concept of collinearity is potentially a problem in any regression model. When you have redundant variables, you will have problems identifying some parameters.
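The extreme case of exact collinearity can be illustrated with a small sketch, assuming NumPy and a made-up design matrix whose third column is an exact linear combination of the first two. Two very different parameter vectors then produce identical fitted values, so the data cannot identify the parameters.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10)

# Third column equals 2 * (constant column) + 3 * x: exact collinearity.
X = np.column_stack([np.ones_like(x), x, 2.0 + 3.0 * x])
rank = np.linalg.matrix_rank(X)  # fewer than 3 independent columns

theta_a = np.array([1.0, 1.0, 1.0])
# (2, 3, -1) lies in the null space of X, so adding any multiple of it
# changes the parameters without changing the fitted values X @ theta.
theta_b = theta_a + 5.0 * np.array([2.0, 3.0, -1.0])

same_fit = np.allclose(X @ theta_a, X @ theta_b)
print(rank, same_fit)
```

Near-collinearity behaves like a softened version of this: the fitted values barely change along some direction in parameter space, so small changes in the data can move the estimates a long way.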


Can you recommend a concise, introductory reading that will help me get a better sense of the linearization you mention, which is the heart of the difference between your example and non-example in point 5? Thank you.
ColorStatistics

@Color I'm not familiar with any. Under mild assumptions about the differentiability of possible transformations, this is addressed by the theory of Partial Differential Equations (PDEs).
whuber

0

You should start right now by distinguishing between reality and the model you're using to describe it.

The equation you just mentioned is a polynomial equation (x^power), i.e. non-linear in x, but you can still model it using a generalized linear model (using a link function) or polynomial regression, since it is linear in the parameters (b1, b2, b3, c).

Hope that helps. The reality/model distinction is admittedly a bit sketchy.


3
This can be estimated via ordinary least squares since the model is linear in the parameters.
Analyst

So it's all to do with the parameters? If we had b3^2 * x, would it still be linear?
mHelpMe

0

A model is linear if it is linear in parameters or can be transformed to be linear in parameters (linearizable). Linear models can model linear or non-linear relationships. Let's expand on each of these.

A model is linear in parameters if it can be written as the sum of terms, where each term is either a constant or a parameter multiplying a predictor ($X_i$):

[equation image not preserved in source]

Note that this definition is very narrow. Only the models meeting this definition are linear. Every other model is non-linear.

There are two types of linear models that are mistaken for non-linear models:

1. Linear models of non-linear relationships

For example, the model below models a non-linear relationship (because the derivative of Y with respect to $X_1$ is a function of $X_1$). By creating a new variable $W_1 = X_1^2$, and re-writing the equation with $W_1$ replacing $X_1^2$, we have an equation that satisfies the definition of a linear model.

[equation image not preserved in source]

2. Models that aren't immediately linear but can become linear after a transformation (linearizable). Below are 2 examples of linearizable models:

Example 1:

[equation image not preserved in source]

This model may appear to be non-linear because it does not meet the definition of a model that is linear in parameters, however it can be transformed into a linear model hence it is linearizable/transformably linear, and is thus considered to be a linear model. The following transformations would linearize it. Start by taking the natural logarithm of both sides to obtain:

[equation image not preserved in source]

then make the following substitutions:

[equation image not preserved in source]

to obtain the linear model below:

[equation image not preserved in source]

Example 2:

[equation image not preserved in source]

This model may appear to be non-linear because it does not meet the definition of a model that is linear in parameters, however it can be transformed into a linear model hence it is linearizable/transformably linear, and is thus considered to be a linear model. The following transformations would linearize it. Start by taking the reciprocal of both sides to obtain:

[equation image not preserved in source]

then make the following substitutions:

[equation image not preserved in source]

to obtain the linear model below:

[equation image not preserved in source]

Any model that is not linear (not even through linearization) is non-linear. Think of it this way: If a model does not meet the definition of a linear model then it is a non-linear model, unless it can be proven to be linearizable, at which point it earns the right to be called a linear model.
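To make the idea of a linearizable model concrete, here is a minimal sketch using a hypothetical model, y = a·exp(b·x), which is not necessarily one of the examples in this answer. Taking the natural logarithm gives ln(y) = ln(a) + b·x, which is linear in the new parameters (ln(a), b) and can be fitted by ordinary least squares. The function name and the values of a and b are assumptions chosen for illustration.

```python
import numpy as np

def fit_exponential_via_log(x, y):
    """Estimate (a, b) in y = a * exp(b * x) by OLS on ln(y) = ln(a) + b * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # After the log transform the model is linear in (ln(a), b).
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    log_a, b = coef
    return np.exp(log_a), b

# Hypothetical, noise-free data from assumed parameter values.
x = np.linspace(0.0, 4.0, 30)
a_true, b_true = 2.0, 0.5
y = a_true * np.exp(b_true * x)

a_hat, b_hat = fit_exponential_via_log(x, y)
print(a_hat, b_hat)
```

Note that fitting on the log scale implicitly assumes the errors are multiplicative on the original scale, so with noisy data the estimates can differ from a direct nonlinear least-squares fit.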

Whuber's answer above, as well as Glen_b's answer at this link, will add more color to my answer: Nonlinear vs. generalized linear model: How do you refer to logistic, Poisson, etc. regression?

Licensed under cc by-sa 3.0 with attribution required.