A Dai-Liao hybrid conjugate gradient method for unconstrained optimization

One of today's best-performing CG methods is the Dai-Liao (DL) method, which depends on a non-negative parameter $t$ and on conjugacy conditions for its computation. Although numerous optimal selections for the parameter have been suggested, the best choice of $t$ remains a subject of consideration. The pure conjugacy condition relies on an exact line search for numerical experiments and convergence analysis, whereas practical computation uses an inexact line search to find the step size. To avoid this drawback, Dai and Liao replaced the earlier conjugacy condition with an extended conjugacy condition. This paper therefore suggests a new hybrid CG method that combines the strengths of the Liu and Storey and Conjugate Descent CG methods while retaining an optimal choice of the Dai-Liao parameter. The theoretical analysis indicates that the search direction of the new CG scheme is a descent direction and satisfies the sufficient descent condition when the iterates jam under the strong Wolfe line search. The algorithm is shown to converge globally under standard assumptions. Numerical experiments on 250 unconstrained problems, assessed with the performance profile of Dolan and Moré, demonstrate that the proposed method is robust and more promising than some known methods. Numerical assessments of the tested CG algorithms on sparse signal reconstruction and image restoration in compressive sensing, file restoration, image video coding and other applications show that these CG schemes are comparable and can be applied in fields such as temperature, fire, seismic and humidity sensing in forests using wireless sensor network techniques.

Introduction
The conjugate gradient (CG) method was initially suggested for solving linear systems of equations. Because solving a linear system is equivalent to minimizing a positive definite quadratic function (Babaie-Kafaki & Ghanbari, 2014a), the method was later modified to solve unconstrained minimization problems (Rao, 2009). It has since become an excellent choice for scientists, engineers and mathematicians solving optimization problems (Babaie-Kafaki, 2011). The method is characterized by the absence of matrix storage and by powerful theoretical properties (Djordjevic, 2017). The problem has the form
$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is twice continuously differentiable, and the CG scheme that iteratively solves the problem is given by
$$x_{k+1} = x_k + \alpha_k d_k, \qquad (2)$$
where $\alpha_k > 0$ is a step-size obtained by a suitable line search and $d_k$ is a search direction (Gilbert & Nocedal, 1992). Generally, the distance to move along the search direction can be obtained by solving a one-dimensional minimization, called an exact line search, in which the objective function is minimized to find $\alpha_k$, that is,
$$f(x_k + \alpha_k d_k) = \min_{\alpha > 0} f(x_k + \alpha d_k). \qquad (3)$$
However, for large-scale problems an exact line search is usually not possible, so any value of $\alpha_k$ that satisfies the Wolfe conditions is accepted (Nocedal & Wright, 2006):
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \qquad (4)$$
$$g_{k+1}^T d_k \ge \sigma g_k^T d_k, \qquad (5)$$
where $0 < \delta < \sigma < 1$, $g_k = \nabla f(x_k)$, and $d_k$, the path towards the minimum, needs to be a descent direction (Babaie-Kafaki & Ghanbari, 2014c). A value of $\alpha_k$ that satisfies (4) together with
$$|g_{k+1}^T d_k| \le -\sigma g_k^T d_k, \qquad (6)$$
known as the strong Wolfe conditions, is also accepted (Nocedal & Wright, 2006). The direction towards the minimum is then obtained by the formula
$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k, \qquad (7)$$
where $\beta_k$ is a scalar CG (update) parameter determined by some inner products (Ding et al., 2010). CG schemes differ mainly in the selection of the coefficient $\beta_k$. Well-known CG schemes can be divided into two categories (Babaie-Kafaki et al., 2010).
The schemes in the first category may perform poorly theoretically but behave well numerically, owing to a built-in restart feature that helps them circumvent jamming automatically (Babaie-Kafaki & Ghanbari, 2014c). These CG parameters were initially suggested by Hestenes and Stiefel (1952), Polak, Ribière and Polyak (1967), and Liu and Storey (1991), with the following coefficients, respectively:
$$\beta_k^{HS} = \frac{g_{k+1}^T y_k}{d_k^T y_k}, \qquad \beta_k^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2}, \qquad \beta_k^{LS} = -\frac{g_{k+1}^T y_k}{g_k^T d_k}, \qquad (8)$$
where $\|\cdot\|$ denotes the Euclidean norm, $s_k = x_{k+1} - x_k$ and $y_k = g_{k+1} - g_k$ (Andrei, 2008a).
The other category is prone to poor numerical performance as a result of jamming, but its schemes have powerful theoretical properties (Andrei, 2008b). These schemes were earlier proposed by Fletcher and Reeves (1964), Dai and Yuan (1991) and Fletcher (1987), with the following CG parameters, respectively:
$$\beta_k^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}, \qquad \beta_k^{DY} = \frac{\|g_{k+1}\|^2}{d_k^T y_k}, \qquad \beta_k^{CD} = -\frac{\|g_{k+1}\|^2}{g_k^T d_k}. \qquad (9)$$
The schemes in (9) differ from other selections in theory because their convergence properties require only the Lipschitz assumption, not the boundedness assumption (Hager & Zhang, 2006). The poor practical performance of the FR method is associated with taking tiny steps without meaningful progress towards the minimum (Powell, 1984). Specifically, if a bad direction is taken and a tiny step from $x_{k-1}$ to $x_k$ is generated, the next direction and step are also likely to be poor unless a restart along the gradient direction is made (Babaie-Kafaki, 2013). Babaie-Kafaki et al. (2011) pointed out that, despite this deficiency, the FR method was proved to be theoretically powerful with an exact line search on general functions; this result was later extended to an inexact line search to improve the efficiency of the scheme (Hager & Zhang, 2006). In general, the methods in the first category perform efficiently but their convergence is uncertain (Hager & Zhang, 2006). The behavior of these schemes needs to be improved to avoid jamming (Djordjevic, 2017). Researchers have therefore been interested in combining CG schemes from the two categories (Babaie-Kafaki & Mahdavi-Amiri, 2013). Although the CD scheme is closely related to the FR scheme under an exact line search, the restriction $\sigma < 1/2$ required for FR is not needed for CD to attain sufficient descent under the strong Wolfe conditions. Moreover, the CD scheme is theoretically powerful under the generalized Wolfe conditions with the restrictions $\sigma_1 < 1$ and $\sigma_2 = 0$ (Hager & Zhang, 2006).
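For concreteness, the six classical update parameters of the two categories (HS, PRP, LS and FR, DY, CD) can all be computed from the same three vectors. The sketch below is a minimal pure-Python illustration using the standard textbook formulas; the function name `cg_betas` and the sample vectors are hypothetical and are not taken from the paper's MATLAB code.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cg_betas(g_old, g_new, d_old):
    """Return the classical CG parameters for g_k, g_{k+1} and d_k."""
    y = [a - b for a, b in zip(g_new, g_old)]  # y_k = g_{k+1} - g_k
    gd = dot(g_old, d_old)                     # g_k^T d_k (negative for descent)
    return {
        # first category: numerator g_{k+1}^T y_k
        "HS": dot(g_new, y) / dot(d_old, y),
        "PRP": dot(g_new, y) / dot(g_old, g_old),
        "LS": -dot(g_new, y) / gd,
        # second category: numerator ||g_{k+1}||^2
        "FR": dot(g_new, g_new) / dot(g_old, g_old),
        "DY": dot(g_new, g_new) / dot(d_old, y),
        "CD": -dot(g_new, g_new) / gd,
    }


# Illustrative vectors (not from the paper); d_k = -g_k as on a first or restart step.
betas = cg_betas(g_old=[1.0, 0.0], g_new=[0.5, 2.0], d_old=[-1.0, 0.0])
```

Because $d_k = -g_k$ in this example, $g_k^T d_k = -\|g_k\|^2$, so LS coincides with PRP and CD coincides with FR, while HS and DY take different values because $g_{k+1}^T d_k \neq 0$ (the line search is not exact).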
Meanwhile, Djordjevic (2017) pointed out that not much research has been done on the choice of $\beta_k^{LS}$ beyond the original work of Liu & Storey (1991), but that the analysis techniques developed for PRP should apply to the LS method (Hager & Zhang, 2006), since the LS and PRP schemes are identical when an exact line search is used (Dai, 2001).
Babaie-Kafaki & Ghanbari (2014c) suggested two hybrid conjugate gradient (HCG) methods in which the CG coefficients were obtained from the standard and modified secant equations, respectively. Their parameters are calculated as an affine combination of two CG parameters. Djordjevic (2017), in turn, suggested a hybrid parameter that combines the Liu & Storey (1991) and Conjugate Descent CG parameters as a convex combination (LSCDCC), derived from the conjugacy condition as an affine combination of $\beta_k^{LS}$ and $\beta_k^{CD}$. To achieve global convergence for general functions, HCG adopted the restriction $\beta_k \ge 0$, while the other coefficient is hypothetically superior for uniformly convex functions. The selection of the CG parameter $\beta_k^{LS}$ has since received little attention from researchers, except for the recent work of Djordjevic (2017) and Salihu et al. (2020), which motivated this work. A large number of hybrid conjugate gradient techniques have been proposed (Andrei, 2008a) that modify different coefficients to maximize their strengths and minimize their weaknesses (Babaie-Kafaki et al., 2010); see, e.g., Yuan (1991), Gilbert & Nocedal (1992), Andrei (2008c) and Dai & Yuan (2001). The excellent contributions of Andrei and Babaie-Kafaki on hybridization using convex combinations, and that of Djordjevic, motivated us to extend their approaches to combine the strengths of the LS and CD CG update parameters.

Extended Conjugacy Condition of Dai and Liao CG Method (ECCDL)
In the earlier CG methods, the conjugacy condition $d_{k+1}^T y_k = 0$, which rests on the exact line search, plays a significant part in numerical experiments and convergence analysis (Sun & Yuan, 2006). Practical computation, however, uses an inexact line search to find the step-size $\alpha_k$. In that situation $g_{k+1}^T s_k$ is in general not equal to zero and the condition may take another form; to avoid this defect, Dai & Liao (2001) replaced the pure conjugacy condition with the extended conjugacy condition $d_{k+1}^T y_k = -t\, g_{k+1}^T s_k$ with $t \ge 0$. Owing to the simple structure and low memory requirements of Dai-Liao conjugate gradient methods, Yao et al. (2019) proposed some three-term Dai-Liao CG algorithms that possess efficient conjugate gradient structures. Esmaeili et al. (2018) suggested a new CG scheme to solve problems arising in astronomical imaging, file restoration, image video coding and other applications. Similarly, Guo & Wan (2019) developed a CG algorithm for sparse engineering signal problems; numerical tests indicated that the algorithm is an alternative for recovering sparse signals and outperforms earlier methods. Recently, Liu et al. (2020) reformulated nonlinear unconstrained optimization problems as a tensor equation to solve real-life problems originating from engineering and economics; numerical results revealed that the proposed CG scheme is more efficient than some known methods. One of today's best-performing CG methods is the Dai & Liao (DL) method, which depends on a non-negative parameter $t$ for its computation. Although numerous optimal selections for the parameter have been suggested (Babaie-Kafaki & Ghanbari, 2014b; Babaie-Kafaki, 2015; Babaie-Kafaki & Ghanbari, 2017; Waziri et al., 2019; Salihu et al., 2021), the best choice of $t$ remains a subject of consideration (Babaie-Kafaki & Ghanbari, 2015).
Motivated by the above, and using an approach similar to that of Andrei (2008b), Andrei (2009), Babaie-Kafaki et al. (2010) and Djordjevic (2017), this section combines the attractive features of the CG update parameter proposed by Liu and Storey (1991) with the conjugate descent parameter proposed by Fletcher (1987), using the Extended Conjugacy Condition of the Dai and Liao (2001) CG method (ECCDL), as follows. From relations (8) and (9), we can write the hybrid parameter as the convex combination
$$\beta_k^{hyb} = (1 - \theta_k)\,\beta_k^{LS} + \theta_k\,\beta_k^{CD}. \qquad (10)$$
Using the vector $y_k$ in relations (7) and (10), we obtain the direction in (12)-(13). Applying the pure conjugacy condition $d_{k+1}^T y_k = 0$ to (13) leads to the hybridization parameter (14) of Djordjevic (2017). Similarly, applying the Dai-Liao extended conjugacy condition (15), $d_{k+1}^T y_k = -t\, g_{k+1}^T s_k$, to (13) gives, after some algebra, the new hybridization parameter (16). The choice of method in this work is justified as follows: for solving large-scale problems, an algorithm whose update parameter does not require the calculation of the Hessian matrix is preferred. For this reason, we assume that $d_{k+1}$ does not satisfy the pure conjugacy condition $d_{k+1}^T y_k = 0$. The CG update parameter proposed by Liu and Storey (1991) and the conjugate descent parameter suggested by Fletcher are therefore combined through the Extended Conjugacy Condition of Dai and Liao (ECCDL) in such a way that, if the modulating parameter $t = 0$, then (16) reduces to the method of Djordjevic (2017). For the optimal choice of the method we thus assume $t \neq 0$ in this term. The algorithm of the proposed method is presented next as Algorithm 1 (ECCDL).
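The algebra behind a hybridization parameter of this kind can be checked numerically. The sketch below is not a transcription of the paper's expression (16) or of Algorithm 1; it assumes the direction update $d_{k+1} = -g_{k+1} + \beta_k d_k$ with $\beta_k = (1-\theta_k)\beta_k^{LS} + \theta_k\beta_k^{CD}$ and solves the Dai-Liao condition $d_{k+1}^T y_k = -t\, g_{k+1}^T s_k$ for $\theta_k$. The function name `dl_hybrid_theta` and the sample vectors are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dl_hybrid_theta(g_old, g_new, d_old, s, t):
    """Solve the Dai-Liao condition for theta under the assumed update rule."""
    y = [a - b for a, b in zip(g_new, g_old)]   # y_k = g_{k+1} - g_k
    gd = dot(g_old, d_old)                      # g_k^T d_k
    beta_ls = -dot(g_new, y) / gd
    beta_cd = -dot(g_new, g_new) / gd
    # beta for which d_{k+1}^T y_k = -t g_{k+1}^T s_k holds exactly
    beta_dl = (dot(g_new, y) - t * dot(g_new, s)) / dot(d_old, y)
    # beta_dl = (1 - theta) * beta_ls + theta * beta_cd, solved for theta;
    # this division is undefined when beta_cd == beta_ls (g_{k+1}^T g_k = 0)
    theta = (beta_dl - beta_ls) / (beta_cd - beta_ls)
    return theta, beta_ls, beta_cd


# Illustrative data (not from the paper): s_k = alpha_k d_k with alpha_k = 0.5.
g_old, g_new, d_old = [1.0, 0.0], [0.5, 2.0], [-1.0, 0.0]
s, t = [-0.5, 0.0], 0.5
theta, beta_ls, beta_cd = dl_hybrid_theta(g_old, g_new, d_old, s, t)

beta = (1.0 - theta) * beta_ls + theta * beta_cd
d_new = [-gn + beta * d for gn, d in zip(g_new, d_old)]
y = [a - b for a, b in zip(g_new, g_old)]
residual = dot(d_new, y) + t * dot(g_new, s)    # zero when the condition holds
```

In this example the unclipped value is $\theta_k = 8$, which lies outside $[0, 1]$; values like this are why a safeguard on $\theta_k$ is needed. Calling the same function with `t = 0` enforces the pure conjugacy condition instead.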

Remark:
The parameter $\theta_k$ computed by (16) may lie outside the interval $[0, 1]$. To have a proper convex combination in (10)-(11), $\theta_k$ is therefore restricted to $[0, 1]$ by the selection rule in Step 5 of Algorithm 1. Under this selection for $\theta_k$, the direction $d_{k+1}$ in (12)-(13) is a proper convex combination of the LS and CD directions.
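A common safeguard in convex-combination hybrids is to clip the parameter to the unit interval, so that the LS parameter is used when $\theta_k \le 0$ and the CD parameter when $\theta_k \ge 1$. The sketch below assumes that rule; it is an illustration of the idea rather than a transcription of Algorithm 1, and the function names are hypothetical.

```python
def clip_theta(theta):
    """Project theta onto [0, 1] (assumed safeguard)."""
    return min(1.0, max(0.0, theta))


def hybrid_beta(theta, beta_ls, beta_cd):
    """Convex combination (1 - theta) * beta_LS + theta * beta_CD after clipping."""
    th = clip_theta(theta)
    return (1.0 - th) * beta_ls + th * beta_cd
```

With illustrative values `beta_ls = 3.75` and `beta_cd = 4.25`, a raw `theta = 8.0` yields the pure CD value, a negative `theta` yields the pure LS value, and `theta = 0.5` yields their midpoint.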

Theoretical Analysis
To demonstrate the sufficient descent condition of the ECCDL method, we apply the following theorem.

Convergence Analysis
The following assumptions are required to establish the global convergence of the ECCDL method. Boundedness Assumption: Assumption 3.1. The level set $S = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded, where $x_0$ is the starting point of the CG method in (2) and (7). That is, there exists a positive constant $B$ such that $\|x\| \le B$ for all $x \in S$.

Lipschitz Assumption: Assumption 3.2. In a neighborhood $N$ of $S$, the objective function $f$ is continuously differentiable and its gradient $g(x)$ is Lipschitz continuous on $N$, that is, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L \|x - y\|$ for all $x, y \in N$.

Lemma 3.3 Let Assumption 3.1 hold, let $d_k$ be a descent direction and let the step-size $\alpha_k$ satisfy
$$g_{k+1}^T d_k \ge \sigma g_k^T d_k,$$
then
$$\alpha_k \ge \frac{(1 - \sigma)\,|g_k^T d_k|}{L\,\|d_k\|^2}. \qquad (30)$$
Proof: Using (26) and (29), it holds that $-(1 - \sigma)\, g_k^T d_k \le (g_{k+1} - g_k)^T d_k \le L \alpha_k \|d_k\|^2$. Since $\sigma < 1$ and the search direction satisfies $g_k^T d_k < 0$, it is not difficult to see that (30) holds. Clearly, from relations (6) and (18), the step-size $\alpha_k$ in the ECCDL algorithm fulfils (30). Consequently, according to (18) and since $\alpha_k = 0$ does not fulfil (6), we can easily conclude that $\alpha_k \neq 0$ for all $k \ge 0$, which means ≠ 0 and as such there exist > 0 so that Generally, any CG method with a strong Wolfe line search converges. However, only a weak form of the Zoutendijk (1970) condition is required for general functions (Dai & Liao, 2001). The theorem below establishes this useful theoretical property of ECCDL under the strong Wolfe line search.
The proof is by contradiction, assuming that Theorem 3.2 is not true.
Proof: Let ≠ 0 and assume that (32) does not hold. Then there is a constant > 0 such that It follows from the first inequality that $\theta_k \in (0,1)$, and the second inequality then holds by the Cauchy-Schwarz inequality. When $\theta_k \notin (0,1)$, the above inequality follows easily from the selection in Step 5 of Algorithm 1. Thus, from (7) and (34) we get and this indicates that On the other hand, from (18), (28) and (33), we obtain Also, applying Lemma 3.1, we conclude that Obviously, this contradicts (36); hence (33) is not satisfied, which proves (32). ∎

Results and Discussion
Digital image processing plays an important role in engineering, medical sciences, biology and other areas of science. Ibrahim et al. (2020) therefore used the Hybrid Liu and Storey and Fletcher and Reeves (HLSFR) algorithm of Djordjevic (2019), built from the LS and FR CG algorithms, to suggest a hybrid algorithm for unconstrained minimization problems, extended the result to convex monotone equations, and applied it to restoring one-dimensional sparse signals, with the mean squared error (MSE) as the quality measure. Numerical comparisons with some CG algorithms for image restoration in compressive sensing show that the proposed scheme is efficient and more promising than the other schemes, with a smaller number of iterations, computing time and MSE on different noise sample problems. The HLSFR algorithm is the foundation of the work of Ibrahim et al. (2020) and is similar to the Hybrid Hestenes and Stiefel and Fletcher and Reeves (HHSFR) algorithm of Djordjevic (2018), the LSCDCC algorithm of Djordjevic (2017) and the ECCDL algorithm, all of which can be applied to compressive sensing problems. As shown in Ibrahim et al. (2020), such problems have a wide variety of applications; for example, wireless sensor networks placed in the field can serve as temperature, fire, seismic and humidity detectors in forests.
In this section, we therefore present the performance of ECCDL and compare it with that of the LSCDCC method of Djordjevic (2017) and the HCG method of Babaie-Kafaki & Ghanbari (2014c). To implement the hybrid CG parameters, the codes were run in MATLAB 8.3 (R2014a) on a computer with a 2.20 GHz processor and 3.0 GB of memory, on 250 unconstrained optimization problems. The test problems are unconstrained problems obtained from Andrei (2008c) and Gould et al. (2003).
Since CG schemes are used to solve large-scale unconstrained optimization problems, we chose 25 problems, each tested at 10 dimensions: $n$ = 100; 200; 500; 1,000; 2,000; 5,000; 10,000; 20,000; 50,000 and 100,000. The summary of the numerical results and the list of test functions are shown in Tables 1-2, respectively. All the algorithms were implemented using (4) and (6) with $\delta = 0.0001$ and $\sigma = 0.001$; the step length is computed with the initial trial value $\alpha = 1$, and the modulating parameter is $t = 0.5$. The same stopping criterion $\|g_k\|_\infty \le 10^{-5}$ is used throughout. All the test functions were minimized from standard starting points.
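As an illustration of the acceptance test used in the line search, the sketch below checks the sufficient decrease condition and the strong curvature condition for a trial step, with the default values $\delta = 0.0001$ and $\sigma = 0.001$ quoted above. The function name `satisfies_strong_wolfe` and the quadratic test objective are hypothetical, and the sketch only tests a given step; it does not search for one.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def satisfies_strong_wolfe(f, grad, x, d, alpha, delta=1e-4, sigma=1e-3):
    """Return True if the step alpha along d passes both strong Wolfe tests."""
    gd = dot(grad(x), d)                        # g_k^T d_k, negative for descent
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * gd
    strong_curvature = abs(dot(grad(x_new), d)) <= -sigma * gd
    return sufficient_decrease and strong_curvature


# Hypothetical objective f(x) = 0.5 * ||x||^2 with gradient g(x) = x.
f = lambda x: 0.5 * dot(x, x)
grad = lambda x: list(x)
x0 = [2.0, -1.0]
d0 = [-gi for gi in grad(x0)]                   # steepest descent direction

exact_step_ok = satisfies_strong_wolfe(f, grad, x0, d0, alpha=1.0)
short_step_ok = satisfies_strong_wolfe(f, grad, x0, d0, alpha=0.5)
long_step_ok = satisfies_strong_wolfe(f, grad, x0, d0, alpha=2.5)
```

With such a small $\sigma$, the curvature test is close to an exact line search: on this quadratic the exact minimizer `alpha = 1.0` is accepted, while the shorter step `alpha = 0.5` fails the curvature test and the longer step `alpha = 2.5` fails the sufficient decrease test.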
Test results are obtained by running each solver on the set of problems and recording the number of iterations and the computing time. Figures 1-2 show the performance of these methods using the performance profile of Dolan & Moré (2002). Here $\rho(\tau)$ is the fraction of problems with performance ratio at most $\tau$; thus, a solver with high values of $\rho(\tau)$, that is, one whose curve lies at the top right of the figures, is preferable. For each method, we plot the percentage $\rho(\tau)$ of problems on which the method's time is within a factor $\tau$ of the best time. The left side of the plot gives the percentage of test problems on which the method is the fastest; the right side gives the percentage of test functions successfully solved by each method. Figure 1 shows that the probability that ECCDL is the winner on a given problem is 61%, while LSCDCC and HCG win on 39% and 15% of the problems, respectively, when the factor $\tau$ is chosen in the interval $0 < \tau < 0.5$. Clearly, ECCDL has the most wins, since it has the highest probability of being the best solver on a given problem. However, if we extend our $\tau$ of interest to $\tau \ge 0.5$, ECCDL and HCG solve the test functions within the given time and reach 88% and 87%, respectively, while LSCDCC reaches 85%. The ECCDL and HCG algorithms are therefore computationally more efficient than the LSCDCC scheme.
Since the computing time is also affected by the computing environment, such as the operating system and how busy the machine is, we additionally compare the number of iterations of the algorithms. Figure 2 shows that the fraction of problems on which ECCDL is the winner is 82%, while LSCDCC and HCG win on 80% and 73% of the problems, respectively, when the factor $\tau$ is chosen in the interval $0 < \tau < 0.5$. Clearly, ECCDL wins, since it has the highest probability of being the best solver on a given problem. However, if we extend our $\tau$ of interest to $\tau \ge 0.5$, ECCDL and HCG solve the test functions within the given number of iterations and reach 88% and 87%, respectively, while LSCDCC reaches 85%. The ECCDL and HCG algorithms are therefore computationally more efficient than the LSCDCC scheme.
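The performance profiles discussed above can be computed directly from a table of per-problem costs (time or iterations). The sketch below is a minimal pure-Python version of the usual definition, in which $\rho_s(\tau)$ is the fraction of problems whose cost ratio to the best solver is at most $\tau$; the function name `performance_profile`, the solver labels and the cost values are made up for illustration and are not the paper's data. Failures are encoded as `math.inf`, and all finite costs are assumed positive.

```python
import math


def performance_profile(costs, taus):
    """Map each solver to its profile values rho(tau) for the given taus.

    costs: dict solver -> list of positive costs, one per problem
           (math.inf marks a failed run).
    """
    solvers = list(costs)
    n_prob = len(costs[solvers[0]])
    # best cost achieved by any solver on each problem
    best = [min(costs[s][p] for s in solvers) for p in range(n_prob)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_prob)]
        profile[s] = [sum(r <= tau for r in ratios) / n_prob for tau in taus]
    return profile


# Made-up costs for two hypothetical solvers on four problems.
costs = {
    "solver_a": [1.0, 2.0, 4.0, math.inf],
    "solver_b": [2.0, 1.0, 4.0, 8.0],
}
profile = performance_profile(costs, taus=[1.0, 2.0])
```

The value at $\tau = 1$ is the share of problems on which a solver is best or tied for best, so when ties occur these shares can add up to more than 100% across solvers. The value for large $\tau$ approaches the share of problems the solver solves at all.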

Conclusion
One of today's best-performing CG methods is the DL method, which depends on a non-negative parameter $t$ for its computation. Although numerous optimal selections for the parameter have been suggested, the best choice of $t$ remains a subject of consideration. In this paper, we have presented a new hybrid Dai-Liao conjugate gradient algorithm in which the parameter $\beta_k$ is computed from $\beta_k^{LS}$ and $\beta_k^{CD}$ in such a way that, if the modulating parameter $t = 0$, it reduces to the method that uses the pure conjugacy condition. Theoretical analysis and numerical computations, which adopt an inexact line search under the strong Wolfe conditions, show that the algorithm is robust, efficient and globally convergent when compared with the LSCDCC and HCG methods on 250 unconstrained optimization problems. Numerical assessments of these CG algorithms show that the schemes are comparable, with small numbers of iterations and computing times, and can be applied to compressive sensing problems with a wide variety of applications, as shown in Ibrahim et al. (2020).