Linear Relationship vs Correlation












2












$begingroup$


I am new to machine learning, and I'm trying to cover some of the basics. One of the assumptions of linear regression is a linear relationship.



However, on Reddit I was told today that no machine learning model requires a correlation between any of the predictors and the output.
My question is: is there a difference between correlation and a linear relationship?





















$endgroup$












  • $begingroup$
    Regarding the connection between simple linear regression and correlation, there are some useful answers on SE. See e.g. the answers to this Q and this answer.
    $endgroup$
    – statmerkur
    Dec 28 '18 at 10:43
















regression correlation














edited Dec 27 '18 at 23:23









kjetil b halvorsen











asked Dec 27 '18 at 23:17









Jweir136























1 Answer


















3












$begingroup$


One of the assumptions of linear regression is a linear relationship.




There is a fairly common confusion on this matter that makes the scope of linear regression look narrower than it actually is. In regression analysis, we model the expected value of a response variable $Y_i$ conditional on some regressors $\mathbf{x}_i$. In general, we write the response variable as:



$$Y_i = \mathbb{E}(Y_i|\mathbf{x}_i) + \varepsilon_i \quad \quad \quad \varepsilon_i \equiv Y_i - \mathbb{E}(Y_i|\mathbf{x}_i),$$



where the first part is the true regression function and the second part is the error term. (This model form implies that $\mathbb{E}(\varepsilon_i | \mathbf{x}_i) = 0$.) In a linear regression we assume that the true regression function is a linear function of the parameter vector $\boldsymbol{\beta} = (\beta_0,...,\beta_m)$. This gives us the model form:



$$Y_i = \sum_{k=0}^m \beta_k x_{i,k}^* + \varepsilon_i \quad \quad \quad x_{i,k}^* \equiv f_k(\mathbf{x}_i).$$



You can see from this model form that we can transform the original regressors $\mathbf{x}_i$ via any transform we want (including a non-linear transform). Hence, the important thing to notice is that linear regression does not necessarily assume linearity with respect to the regressor variables. The "linear" in linear regression comes from the fact that the model is linear with respect to the parameters in the regression function. Nonlinear regression occurs when the regression function has one or more parameters that cannot be linearised.
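As a rough illustration of this point, the sketch below uses made-up, noise-free data in which the response depends on $x$ only through $x^2$. The helper names (`pearson_r`, `fit_simple_ols`) and the data are assumptions chosen for the example, not part of the original answer. The sample correlation between $x$ and $y$ is essentially zero, yet a model that is linear in the parameters, using the transformed regressor $x^* = x^2$, recovers the relationship.

```python
# Hypothetical data: y = 1 + 2*x^2 on a grid symmetric about zero.

def mean(values):
    return sum(values) / len(values)

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_simple_ols(x, y):
    """OLS estimates (b0, b1) for the model y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

x = [i / 10 for i in range(-20, 21)]      # symmetric grid on [-2, 2]
y = [1.0 + 2.0 * xi ** 2 for xi in x]     # nonlinear in x, linear in (b0, b1)

r_raw = pearson_r(x, y)                   # correlation with the raw regressor
x_star = [xi ** 2 for xi in x]            # transformed regressor x* = f(x)
b0, b1 = fit_simple_ols(x_star, y)        # linear regression on x*

print("correlation of x and y:", r_raw)
print("fitted b0, b1 on x^2:", b0, b1)
```

The transform $f(x) = x^2$ is fixed in advance and has no unknown parameters, which is why the fit is still an ordinary linear regression.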




...is there a difference between correlation, and a linear relationship?




Correlation is a measure of the strength of a linear relationship between two variables. It occurs as a special case of linear regression. If we use a simple linear regression model under standard assumptions then we have a single regressor $x_i$, with no transformation of this variable. The simple linear regression model is:



$$Y_i = \beta_0 + \beta_1 x_{i} + \varepsilon_i \quad \quad \quad \varepsilon_i \sim \text{N}(0, \sigma^2).$$



If we fit the simple linear regression model using ordinary least squares (OLS) estimation (the standard estimation method) then we get a coefficient of determination that is equal to the square of the sample correlation between the $y_i$ and $x_i$ values. This gives a close connection between simple linear regression and sample correlation analysis.
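The identity between the coefficient of determination and the squared sample correlation can be checked numerically. The following is a minimal sketch on a small made-up data set (the values of `x` and `y` are assumptions for illustration only): it fits the simple linear regression by OLS, computes $R^2 = 1 - SS_{res}/SS_{tot}$, and compares it with the square of the sample correlation.

```python
# Hypothetical data, roughly linear with some scatter.

def mean(values):
    return sum(values) / len(values)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

mx, my = mean(x), mean(y)
sxx = sum((a - mx) ** 2 for a in x)                     # sum of squares of x
syy = sum((b - my) ** 2 for b in y)                     # total sum of squares of y
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))    # cross-product sum

# OLS estimates for y = b0 + b1 * x
b1 = sxy / sxx
b0 = my - b1 * mx

# Coefficient of determination from the residuals
fitted = [b0 + b1 * a for a in x]
ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))
r_squared = 1.0 - ss_res / syy

# Sample (Pearson) correlation between x and y
r = sxy / (sxx * syy) ** 0.5

print("R^2 from regression:", r_squared)
print("squared correlation:", r ** 2)
```

The two printed values should agree up to floating-point rounding. The identity is specific to simple linear regression with an intercept; with several regressors, $R^2$ instead equals the squared correlation between the responses and the fitted values.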















$endgroup$














        answered Dec 28 '18 at 1:37









        Ben






























