SOCP constraint vs Quadratic constraint - trust region methods


























I'm implementing code for sequential convex optimization based on trust regions, but I have some doubts because I haven't found a single, standard approach in the literature.



Suppose we want to minimize a convex function



$$J(x)$$
subject to



$$f(x) = 0$$
$$g(x) \leq 0$$



and that we replace these constraints with



$$f(x_k) + \nabla f(x_k)\,(x - x_k) = 0$$
$$g(x_k) + \nabla g(x_k)\,(x - x_k) \leq 0.$$



To get meaningful results we impose

$$\left\| x - x_k \right\| < s$$



where $x \in \mathbb{R}^n$ is the solution I get at each iteration, and $s$ is a slack variable that we can either include in the problem or fix in advance to bound the trust region. This ensures that the algorithm moves towards better solutions while keeping the steps small, so that the linearized forms remain representative of the original nonlinear constraints.
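To make the setup concrete, here is a minimal sketch of one linearized subproblem with an infinity-norm (box) trust region. The toy functions `J`, `f`, `g` are my own illustrative choices, and SciPy's SLSQP is just a stand-in for whatever convex solver one actually uses:

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem, purely illustrative: convex cost, one nonlinear equality,
# one nonlinear inequality (these functions are made up for the sketch).
J      = lambda x: x[0]**2 + 2.0 * x[1]**2
f      = lambda x: x[0] * x[1] - 1.0          # f(x) = 0
grad_f = lambda x: np.array([x[1], x[0]])
g      = lambda x: x[0]**2 + x[1]**2 - 4.0    # g(x) <= 0
grad_g = lambda x: 2.0 * x

def scp_step(x_k, s):
    """Solve one linearized subproblem with an infinity-norm (box) trust region."""
    cons = [
        # f(x_k) + grad f(x_k)^T (x - x_k) = 0
        {"type": "eq",   "fun": lambda x: f(x_k) + grad_f(x_k) @ (x - x_k)},
        # g(x_k) + grad g(x_k)^T (x - x_k) <= 0   (SciPy's "ineq" means fun >= 0)
        {"type": "ineq", "fun": lambda x: -(g(x_k) + grad_g(x_k) @ (x - x_k))},
    ]
    bounds = [(xi - s, xi + s) for xi in x_k]  # |x_i - x_{k,i}| <= s
    res = minimize(J, x_k, method="SLSQP", bounds=bounds, constraints=cons)
    return res.x

x_k = np.array([2.0, 1.0])
x_next = scp_step(x_k, s=0.5)
```

A full SCP loop would repeat `scp_step`, evaluate the true $f$ and $g$ at `x_next`, and shrink or grow $s$ accordingly.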



My question is the following: what's the difference between implementing



$$\left| x - x_k \right| \leq s$$ (i.e., component-wise)



or



$$\left\| x - x_k \right\| \leq s$$



or



$$(x - x_k)^T (x - x_k) \leq s$$



Since the square root is a monotonic function, I would guess there is no difference in the final solution between the last two? Or am I mistaken?
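As a quick numerical check of the geometry (random sampling, just to see how the three regions compare; note in particular that $d^T d \leq s$ describes a ball of radius $\sqrt{s}$, whereas $d^T d \leq s^2$ matches $\|d\|_2 \leq s$ exactly):

```python
import numpy as np

# Sample random candidate steps d = x - x_k and compare the regions.
rng = np.random.default_rng(0)
s = 0.7
d = rng.normal(size=(10000, 3))

sq     = np.einsum("ij,ij->i", d, d)      # d^T d for each sample
box    = np.all(np.abs(d) <= s, axis=1)   # |d_i| <= s component-wise (a box)
ball   = np.sqrt(sq) <= s                 # ||d||_2 <= s (the SOCP form)
quad   = sq <= s**2                       # d^T d <= s^2: same set as the ball
quad_s = sq <= s                          # d^T d <= s: a ball of radius sqrt(s)!
```

So the 2-norm and the squared form coincide only if the right-hand side is squared as well.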



Moreover, I have seen in the literature (starting from Nocedal's book) that some researchers use an update rule for the trust-region radius based on the ratio between the change of the augmented cost function in the actual nonlinear model and the change of the augmented cost function predicted by the linearized model:



$$\rho = \frac{\delta \tilde{J}_{\text{nonlin}}}{\delta \tilde{J}_{\text{lin}}}$$



with $\tilde{J}$ the augmented cost function, which accounts for constraint violations in both the linearized and the nonlinear form.
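For reference, this is the textbook-style update I have in mind (the thresholds 1/4 and 3/4 and the shrink/grow factors are the usual choices in Nocedal & Wright's Algorithm 4.1; the exact constants are tunable):

```python
def update_radius(rho, s, step_norm, s_max=2.0):
    """Classic trust-region radius update driven by the ratio rho."""
    if rho < 0.25:
        s = 0.25 * s                               # poor model agreement: shrink
    elif rho > 0.75 and abs(step_norm - s) < 1e-12:
        s = min(2.0 * s, s_max)                    # good agreement, step on the boundary: grow
    accept = rho > 0.0                             # accept only if the actual cost decreased
    return s, accept
```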



Others include the radius as an additional variable in the subproblem, which means the slack $s$ appears in the cost function so that the trust-region radius is kept limited directly while solving the subproblem.



Anyhow, in this last approach the problem is that there is no clear choice of how to scale the slack relative to the original cost function. That is, if we define the new cost as



$$J_{\text{new}} = J + k\, s$$



what's a proper choice for $k$? Is there any rule I can adopt from the literature (e.g., maybe one based on sequential quadratic programming)?
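I don't know of a universal rule, but the $\ell_1$ exact-penalty theory behind SQP (Nocedal & Wright, Ch. 18) says the weight must exceed the largest Lagrange-multiplier magnitude, $k > \|\lambda^*\|_\infty$, for the penalized problem to share minimizers with the original. Since $\lambda^*$ is unknown beforehand, one practical heuristic (this particular rule is my own simplification, not taken from a specific paper) is to grow $k$ whenever the constraint violation stagnates:

```python
def update_penalty(k, viol_prev, viol_new, growth=10.0, tol=1e-8):
    """Grow the penalty weight k when the constraint violation
    fails to shrink sufficiently between iterations."""
    if viol_new > tol and viol_new > 0.5 * viol_prev:
        k = growth * k
    return k
```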



Thanks a million!






















  • The conditions $\|x-x_k\| \leq s$ and $(x-x_k)^T(x-x_k) \leq s^2$ are obviously the same. As for parameter values (such as $k$), sometimes you can find a value in the literature, but you may have to tune it yourself to trade off stability against performance. – LinAlg, Jan 9 at 14:16
















optimization convex-optimization nonlinear-optimization numerical-optimization non-convex-optimization






asked Jan 9 at 6:00









– venom (296)











