Particle swarm optimization algorithms with selective differential evolution for AUV path planning

Particle swarm optimization (PSO)-based algorithms are suitable for path planning of the Autonomous Underwater Vehicle (AUV) due to their high computational efficiency. However, such algorithms may produce sub-optimal paths or require higher computational load to produce an optimal path. This paper proposed a new approach that improves the ability of PSO-based algorithms to search for the optimal path while maintaining a low computational requirement. By hybridizing with differential evolution (DE), the proposed algorithms carry out the DE operator selectively to improve the search ability. The algorithms were applied in an offline AUV path planner to generate a near-optimal path that safely guides the AUV through an environment with a priori known obstacles and time-invariant non-uniform currents. The algorithm performances were benchmarked against other algorithms in an offline path planner because if the proposed algorithms can provide better computational efficiency to demonstrate the minimum capability of a path planner, then they will outperform the tested algorithms in a realistic scenario. Through Monte Carlo simulations and Kruskal-Wallis test, SDEAPSO (selective DE-hybridized PSO with adaptive factor) and SDEQPSO (selective DE-hybridized Quantum-behaved PSO) were found to be capable of generating feasible AUV path with higher efficiency than other algorithms tested, as indicated by their lower computational requirement and excellent path quality.


INTRODUCTION
AUVs are unmanned underwater vehicles that can be remotely programmed to conduct various missions, ranging from seabed survey, coastal mapping and environmental monitoring for scientific research purposes, to anti-submarine warfare for defence purposes. To date, numerous efforts have been made in the attempt to enable the operation of AUVs in more dynamic and constrained environments, such as shallow coastal areas, deep ocean regions and regions underneath ice shelves. The operation of AUVs in highly dynamic regions is challenging and it poses several technical issues, particularly for the path planning of the AUVs.
Planning the path for an AUV is essentially a multimodal optimization problem; numerous optimization techniques have been proposed to solve this problem effectively and efficiently. Nonetheless, developing the algorithms for AUV path planning still faces several intrinsic difficulties, particularly in an AUV simulator of REMUS 100 in Section 6. Finally, Section 7 concludes the paper along with the future research directions.

REVIEW ON PSO AND ITS VARIANTS
This section presents the overview of various particle swarm intelligence based algorithms used for developing the novel algorithms, which include the basic PSO, basic QPSO and their variants.

PSO algorithm
Introduced by Eberhart and Kennedy [20], PSO algorithm is a heuristic population-based optimization algorithm inspired by the analogues of cognitive abilities and social interaction in animals. The algorithm consists of particles that move within a multidimensional search space to find the potential solutions, which are represented by the particles" positions. The particles" velocities are iteratively updated by the particle"s own experience (cognitive behaviour) and the entire swarm"s experience (social behaviour) to vary the particles" positions. In a standard PSO algorithm that consists of N particles with D number of dimensions for solving a cost evaluation function f, the position vector of the i th particle at t th iteration can be denoted as: Based on its previous best position pbest and global best position in the swarm gbest, the velocity V and the position X of the i th particle at (t+1) th iteration are updated by (2) and (3) respectively. pbest and gbest are determined based on the particle"s fitness f(X) and its previous best fitness f(pbest) as shown in (4) and (5). (3) , ( )- In (2), r 1 and r 2 are uniform distributed random positive numbers that are less than 1.0. C 1 and C 2 denote the acceleration coefficients for cognitive and social components respectively; they are both set to 2.0 for most applications [21]. Parameter w is the inertia weight introduced by [22] for balancing the global exploration and local exploitation of the particles. A common strategy is to set the inertia weight at an initial w max value of 0.9, and linearly decreasing to a w min value of 0.4 according to (6) as the iteration progresses [23,24].
where t max is the maximum number of iterations defined for the algorithm.
To confine the particles within the search space, the particle velocity denoted by V is usually bound to an interval of [-V max , V max ], where the maximum velocity V max is recommended to be 10% to 20% of the dynamic range of the variables [24,25].

QPSO algorithm
Inspired by the mechanics of quantum system and dynamical analysis of the PSO algorithm, Sun, Feng [26] proposed the QPSO algorithm. In QPSO, the position of the i th particle can be updated using the following stochastic equation: ∑ ⁄ where u is a uniform distributed random positive number that is less than 1.0,  is a uniform distributed random positive number that is less than 1.0 , and mbest is the mean best position which is defined as the average of personal best positions of all particles in the swarm as shown in (8).  is known as the contraction-expansion (CE) coefficient, which is the most critical parameter for tuning the convergence behaviour of QPSO. As suggested by the empirical study of parameter selection in [11], a linearly decreasing  from a maximum value  max of 1.0 to a minimum value  min of 0.5 according to (9) is suitable for most applications.

DEPSO and DEQPSO algorithms
One of the most effective method used for improving the PSO-based algorithm is by hybridization, in which the beneficial features of other optimization techniques is combined with PSO or QPSO algorithm. In [27], the basic PSO was combined with Differential Evolution (DE), resulting in a hybrid algorithm known as DEPSO. Based on the inspiration from DEPSO, [18] applied a similar hybridization concept in QPSO to propose DEQPSO. In both DEPSO and DEQPSO, the particles undergo the usual position update operations, followed by a successive three-step DE operation, which is the mutation, crossover and selection as described below.
-Mutation: A mutated donor vector U is first generated using (10): where r 1 , r 2 , r 3 and r 4 are randomly selected particle indices that are mutually different, and different from the current index i and the particle index of global best position, i.e. r 1  r 2  r 3  r 4  i  gbest. -Crossover: A trial vector T is generated to increase the diversity, by conducting crossover between the donor vector and personal best position as shown in (11).
where CR is the crossover probability which is suggested to be 0.85, r j is a uniform distributed random number ranging from 0 to 1.0, and r is a random positive integer ranging from 1 to the number of particle dimensions D. -Selection: A greedy selection is used to decide whether the trial vector T should replace the current position X in the (t+1) th iteration. The fitness of T will be evaluated and compared with X. X will only be replaced if T has better fitness value; otherwise X will be retained. This means the hybridization of the DE operation will never deteriorate the solution, but only make it better or remain unchanged. DEPSO and DEQPSO algorithms were applied to solve the path planning problem of Unmanned Aerial Vehicle (UAV) in [18], and has proven to be capable of generating significantly higher solution quality than basic PSO and QPSO algorithms.

APSO algorithm
In basic PSO, the acceleration coefficients C 1 and C 2 , and inertia weight w in the update equation are important for maintaining the balance between the global exploration and local exploitation of the particles. Zhan, Zhang [19] proposed an adaptive PSO (APSO), in which an evolutionary factor is used as an indicator representing the evolutionary state of the particles to control the equation coefficients adaptively. To determine the evolutionary factor, the mean particle distance d i of the i th particle to other particles has to be calculated using (12). The evolutionary factor f is then computed according to (13). where d g is the mean particle distance of the global best particle, d min and d max are the minimum and maximum of the mean particle distances respectively. The inertia weight w can be calculated from evolutionary factor f using (14). The adaptation of the acceleration coefficients C 1 and C 2 can also be achieved using the evolutionary factor as shown in (15).

METHODOLOGY: SELECTIVE DE HYBRIDIZATION
Although DEPSO and DEQPSO algorithms are able to generate excellent solution qualities for AUV path planning, the hybridization of DE significantly increases the computational requirements of the algorithm due to the greedy selection operator used in the DE operation [17]. The greedy selection operator requires the fitness of the particles to be evaluated twice for comparison purposes, meaning an additional fitness evaluation for every particle in every iteration. As the fitness evaluation process usually contributes to the majority of the computational time [11], the greedy operator drastically increases the computational requirements of the algorithms. The increase in computational requirements due to the addition of greedy selection operator will be even more obvious when the complexity and the dimensionality of the problem increase [17]. In order to minimize the downside of DE operator in PSO-based algorithm, a selective hybridization scheme is proposed in this paper to present the following algorithms: -SDEPSO (PSO with selective DE hybridization) -SDEAPSO (PSO with adaptive factor and selective DE hybridization) -SDEQPSO (QPSO with selective DE hybridization) Using the selective scheme, these proposed algorithms apply the DE operation to a selected number of particles only, instead of the entire swarm. The number of particles selected for DE operation, N S , is controlled by a selective factor S as shown in (16).
, - The DE operation in the proposed algorithms was modified by replacing the greedy selection operator with a natural selection operator. The DE operation proposed in this paper initiates by sorting all the particles in the entire swarm according to their personal best positions. Next, a number of selected particles with the best fitness undergo the mutation and crossover operators, similar to those in DEPSO and DEQPSO, to generate the same number of trial vectors. The trial vectors are then subjected to a natural selection operator, in which the same number of particles with the worst fitness is replaced by the trial vectors.
As only the worst particles are replaced in this process, all potentially best solutions will never deteriorate. Furthermore, the computational requirements of the algorithms will not be significantly affected because the natural selection operator does not involve fitness comparison between the particles, which requires additional particle fitness evaluation in every iteration. The DE operation with natural selection increases the diversity and the evolutionary rate of the entire swarm by eliminating the least desirable solutions, hence leading to a faster and better global convergence theoretically.
The selective DE hybridization was applied to PSO and QPSO algorithms to develop the SDEPSO and SDEQPSO algorithms in this paper. In addition, another algorithm, namely SDEAPSO, was developed by adding an adaptive mechanism to the control of inertia weight and acceleration coefficients in PSO algorithm, similarly to the APSO algorithm. The implementation of SDEPSO, SDEAPSO and SDEQPSO algorithms in AUV path planning can be conducted as described in the following pseudocode.
Step 1. Input the algorithm parameters and environmental information of the ocean field.
Step 2. Initialize particles with random positions in (1) to represent an initial group of candidate paths. Set pbest to be the current particle positions.
Evaluate the cost function f (Xi t ).
Evaluate the cost function f (Xi t ). Update pbest and gbest according to (4) and (5) respectively.
Update particle velocity and position according to (2) and (3) respectively.
Update particle position according to (7).

End
Step 3.3 Sort all particles according to the fitness of their personal best positions.
Step 3.4 For k = 1, 2,…, NS th best performing particle, Mutation: Generate mutated vector Uk t according to (10). Crossover: Generate trial vector Tk t according to (11). Natural selection: Replace k th worst performing particle with trial vector Tk t .

End End
Step 4. Output gbest that holds the optimal path when the stop criteria is met or when tmax is reached.

Complexity Analysis
The time complexity of the proposed algorithms can be measured by counting the number of primitive operations in the algorithm. By referring to the pseudocode of the proposed algorithms, the number of operations can be counted as follows: -In Step 2, initialization contributes one operation for N times.

Benchmark Functions
Metaheuristic algorithms such as the PSO-based algorithms can be evaluated empirically by comparing their performance in solving a set of objective function problems. In addition to the AUV path planning problem, a number of non-linear continuous function problems were used to study and benchmark the characteristics of the proposed algorithms. According to the "no free lunch" (NFL) theorem [29], the development and evaluation of an algorithm for a specific problem should be based on the benchmark function problems of similar class and properties, because the algorithm performance will not be consistent for every kind of problem. Thus, these benchmark functions were selected based on their resemblances to the properties of path planning problem. The selected benchmark functions should have the following properties: -Multimodal with deceptive local minima and one global minimum, because the path planning problem usually consists of multiple suboptimal paths and an optimal path. -Multi-dimensional, because the dimensionality of the path planning problem is dependent on the number of control waypoints along the path. Four test functions were chosen for benchmarking in this study. These minimization problem functions, which are commonly used to evaluate the characteristics of optimization algorithms, were found to exhibit the abovementioned properties. The information on the selected benchmark functions are given in Table 1. The dimensions of all functions are set to 20 in this study.

Empirical Study on Parameter Selection
In SDEPSO, SDEAPSO and SDEQPSO, the number of best performing particles that undergo the DE operation and the number of worst performing particles that will be replaced during the natural selection are determined by the selective factor S. Thus, S can be manipulated to control the diversity of the population. In order to study the effects of S on the algorithm performance, an empirical study is conducted on SDEPSO by using a range of S. The selective factor S is a positive number that is less than 1.0. Note that when S = 0, the algorithm will not be hybridized with DE at all; while S = 1 means the DE operation will be conducted on the entire swarm, and the entire swarm will be replaced during the natural selection, meaning all the solutions generated from the PSO operation will be discarded, which is undesirable. Therefore, the empirical study includes S values ranging from 0 to 0.9, meaning that 0%-90% of the particles will undergo the DE operation; the results for S = 0 are included for comparison purposes.
Through a 1000-run Monte Carlo simulation with 100 (max) iterations and a population size of 150 particles, the performance of SDEPSO under different S settings is evaluated by solving the optimization problems of the benchmark functions and the path planning problem in 2D and 3D scenarios; the formulation of the path planning problem is described in Section 4.
Prior to evaluating the algorithm performance, Shapiro-Wilk test was performed to examine the normality of the obtained simulation data. The normality test revealed that the data was not normally distributed. Hence, the median was used as the indicator for solution quality. The median of fitness obtained (Med.) and the best known fitness (Best) for each setting of S were obtained for all problems and tabulated in Table 2. A lower fitness value indicates a higher solution quality and hence a stronger search ability. The best-performing results for each of the problems are in bold in Table 2. It can be observed from the results that the behaviour of the algorithms varies greatly as S increases, and the variations are not consistent for all problems. The best results for the majority of the problems are identified to be in the range of S = 0.1 -0.3, except for problem F4. Such results can be explained by the geometry of the Schwefel function F4, which has all its local minima and the global minimum spread far apart from one another. Effective optimization of this function requires an algorithm that promotes larger solution diversity (higher S), so that it provides a jumping-out ability to prevent trapping in deceptive local minima. This actually complies with the NFL theorem, which suggests that no single algorithm can generate better performance than any other algorithms for every problem. In fact, the improved algorithm performance in one class of problem is not necessarily consistent in all kinds of problems; instead, it is exactly traded with performance in another class of problem [29]. Although all the function problems selected for benchmarking purposes have similar properties (they are all multimodal and multi-dimensional), the geometry of the problems are different. Therefore, the setting of S should be adjusted accordingly for different optimization problems. Based on this empirical study, it can be deduced that the optimal setting of S for the majority of the tested problems is in the range of 0.1 -0.3. More specifically for the path planning problem, the setting of S = 0.3 was found to be appropriate and effective.

Benchmark Study
The benchmark functions were used to evaluate and benchmark the proposed algorithms in this study. Through a 1000-run Monte Carlo simulation with 100 (max) iterations and a population size of 150 particles, the performances of the proposed algorithms in solving the optimization problems of the four benchmark functions were compared with other existing PSO-based algorithms. At each run, the initial particle positions for all problems were randomly generated based on the uniform distribution within the boundary intervals given in Table 3.
As the data was not normally distributed according to the Shapiro-Wilk test, the Kruskal-Wallis test [34], which is a non-parametric ANOVA (analysis of variance), was used with a significance level of 0.05 to rank the algorithm performance based on the solution qualities (fitness obtained). The ranking procedure used the Holm-Bonferroni "stepdown" approach [35], which is best suited for all pairwise comparisons when the confidence intervals are not needed and sample sizes are equal [11]. The algorithms are given the same rank if they are not statistically different from one another. The medians (Med.) of fitness obtained, the ANOVA ranks (#R) and the medians of computational time required were tabulated in Table 3. The medians of the top two best-performing results for each problem are in bold. The overall performances of the algorithms are given by their total ranks, which are calculated from the summation of the ranks of the algorithm for all problems.
Based on the results, it can be seen that there is no single algorithm that can achieve the best results for all problems; this observation agrees with the NFL theory. For the Griewank function (F 1 ), DEQPSO produced the best result. In fact, APSO, SDEAPSO, QPSO, DEQPSO, and SDEQPSO algorithms were found to be producing satisfactory results, indicating that the adaptive mechanism and quantum behaviour of the particles are beneficial for solving this problem. DEPSO and SDEPSO algorithms produced equally good performance for the Rastrigin function (F 2 ). For the Ackley function (F 3 ), the QPSO-based algorithms, i.e. QPSO, DEQPSO and SDEQPSO produced the best performance, followed by the adaptive PSO-based algorithms, i.e. APSO and SDEAPSO. As far as the Schwefel function (F 4 ) is concerned, only DEPSO, SDEPSO and SDEAPSO are able to generate satisfactory results, while all the other algorithms seem to have inferior performances.
The total ranking of the algorithms reveal that DEQPSO achieved better overall performance than other algorithms. The second-best performing algorithms are found to be DEPSO and SDEAPSO. Most importantly, the results for all problems show that the fully DE-hybridized algorithms, i.e. DEPSO and DEQPSO required significantly higher computational time to obtain the solutions, while the selectively DE-hybridized algorithms are able to maintain a reasonably similar computational requirement as the PSO, QPSO and APSO algorithms.

PROBLEM FORMULATION FOR PATH PLANNING
The AUV path planning problem is formulated in this section. Throughout the formulation, the AUV is assumed to have constant thrust, and hence constant water reference velocity.

Path Formulation
In this paper, the primary objective of the AUV path planner is to solve a multimodal non-linear optimization problem, in which the optimal path among a group of potential paths for the AUV to travel 102 towards a target location through the ocean environment is required to be determined. Each potential path of the AUV comprises a series of nodes along the path from the start point to the target (end) point. Controlling and optimizing the coordinates of the path nodes will yield the optimized path for the AUV. The start point and the endpoint of the path are not involved in the optimization because all the potential paths share the same start and end locations.
In PSO-based algorithm, each potential path solution for the problem is modelled as an individual particle in the swarm population. The swarm population is denoted by a matrix X = [X 1 , X 2 ,…, X N ] T , where X is the position vector of the particles and N is the number of particles in the swarm. The entries of the position vectors for the particles represent the coordinates of the path nodes. Assuming every path consists of n+2 nodes including the start point and endpoint, the number of nodes involved in the optimization is n. In order to record the coordinates of n nodes, the entries of the position vector for a particle in 2D problem space will have 2n dimensions, while a particle in 3D will have 3n dimensions. Thus, the respective position vectors of the i th particle at t th iteration for 2D and 3D problems are: Based on the path nodes including the start and end points, B-spline geometry is used to construct the AUV path. B-spline are parametric curves generated from a series of connected piecewise polynomials [36], which are suitable for modelling the AUV path because of its continuity for smooth path and locality for path alteration without loss of continuity. The path nodes act as the control points for the B-spline curve according to the following curve function, which gives the output vector P(u) representing a B-spline curve with k+1 order in the form of discretised waypoints. Given the total number of control points is n+2, the total number of piecewise polynomials is one less than the number of control points, which is n+1.
where x i are the control points, u is the non-decreasing knot sequence contained in a knot vector U = [u 0 , …, u i, …, u n+k+2 ], and B i,k (u) are the piecewise polynomial basis functions of k degree defined by Cox de Boor recursion [37] as follows.
The continuity of the spline is fully dependent on the basis functions. Hence, it can be noted from (19) that the control points, i.e. path nodes can be adjusted during the path optimization process without affecting the spline continuity.

Evaluation Functions
When implementing PSO-based algorithms in an optimization problem, it is critical to develop the suitable cost evaluation functions to measure the fitness of the particles based on their respective solutions. Due to the high computational efficiency of PSO-based algorithms, the evaluation functions usually contribute to the majority of the computational time [11]. The functions are developed based on the optimization criteria of the problem. They must closely resemble the physical conditions of the problem space to provide an accurate cost representation model for finding the optimal solution. For path planning, which is a minimization problem, a lower cost/fitness indicates a better solution. The main criteria for evaluating the AUV path are: -Minimum length or travel time required to reach the target -Minimum exposure to the threats -Compliance with physical motion limitations of AUV As the optimum of all criteria does not necessarily coincide, a trade-off between these criteria can be established using a weighting scheme with multiple evaluation functions, which include a main evaluation function to measure the path length/time cost, a function to measure the threat cost along the path, and functions to measure the compliance of the path with respect to the AUV motion limitations. Thus, the fitness of a particle/path X i can be given by a combination of several evaluation functions F k for different criteria, with each criterion weighted by a cost factor f k .
( ) ∑ ( ) * + (22) where k refers to different evaluation functions and K is the total number of functions for the problem.

Path Travel Time Cost
The main evaluation function for path planning problem is to measure the path cost based on its length or time to travel on the path. This study focuses on finding an optimal path that is capable of taking advantage of favourable current to assist the AUV motion, while avoiding the less favourable current to achieve a shorter travel time. For this purpose, a travel-time-based evaluation function is developed in this study.
Based on previous formulation, a given path X i can be represented as a series of path nodes or alternatively in the form of discretised waypoints P = [p i,1 , p i,2 , … , p i,m ], where P is the output from B-spline function and m is the total number of discretised waypoints. The travel time cost F 1 along a path can be determined by finding the sum of discretised time required to travel on each small path segment that connects the consecutive discretised waypoints in P.
where V g is the ground reference velocity of the AUV, which is the resultant AUV velocity under the effect of surrounding ocean current. The contribution of current on the AUV can be obtained by projecting the current velocity V c in the direction of the AUV water reference velocity V a , which is essentially the direction of the path vector. Thus, V g is given by the sum of V a and the contribution of V c as shown in (24).

Threat Cost
The obstacles avoidance ability of the path planner relies on the threat cost evaluation function, in which the exposure of the path to threats/obstacles is measured. All threats in the problem space are modelled as ellipses (or circles if the major axis and minor axis are equal) under 2D condition, and as ellipsoids (or spheres if all the principal axes are equal) under 3D condition with their principal axes aligned with the coordinate axes. A threat cost evaluation method based on the intersection between the path and the threats is employed in this study.
Assuming  (25). The parametric equation of a path segment that connects two consecutive waypoints p i, j = (x 1 , y 1 , z 1 ) and p i, j+1 = (x 2 , y 2 , z 2 ) can be written as (26). The cost evaluation in 2D takes a similar approach, except that the dimension reduction in 2D reduces the number of variables and hence simplifies the computation.
Substituting (26) into (25) yields the following equations, which are expressed in terms of s. The intersection of the path with the threat can be evaluated by obtaining the discriminant ξ of (27) according to (31).
A safety margin is added to the principal axes of all threat regions so that the AUV will not conflict with the threat when ξ = 0, i.e. the path is tangent to the threat region. When ξ > 0, the path will conflict with the threat if the roots s 1 and s 2 given by (32) are within the range of 0 ≤ s 1 , s 2 ≤ 1.

√ (32)
If the path conflicts with the threat, the threat cost will be proportional to the length of the path segment contained in the threat region as given in (33). The intersection points, S 1 and S 2 can be determined by solving (27) using the obtained s 1 and s 2 , and substituting them back into (26).

Physical Motion Limitations
The considerations for physical motion limitations of AUV should include its yaw (turning) and pitch motions at a given forward speed. Evaluation functions are developed to check the compliance of the path with respect to these limitations, and to penalise the cost if any of the limitations is violated. To check the path compliance with the yaw limitation, the turning angle of the path in the x-y plane is measured and compared against the maximum allowable turning angle ψ max . Considering two consecutive path segments that consist of three waypoints p i, j , p i, j+1 and p i, j+2 (refer to Figure 1), the turning angle ψ can be obtained from the cosine function as shown in (34). The first part of the function is the scalar projection of the second path segment on the first segment in the x-y plane, while the second part is the length of the second path segment in the x-y plane. The cost F 3 for violating the yaw limitation is obtained from the calculated turning angle as shown in (35).
For the pitch motion, the instantaneous pitch angle θ and the change in pitch θ of the AUV at any point should not exceed their respective maximum values (θ max & θ max ). Referring to Figure 1, θ can be determined using the basic tangent function as shown in (36). Next, θ can be calculated using (37).
From the calculated pitch, the cost F 4 for violating θ max and the cost F 5 for θ max can be obtained as shown in (38) and (39) respectively.

SIMULATIONS
The performance of the proposed algorithms is evaluated in the AUV path planning problem under different scenarios in this section.

Simulation Setup
The path planning of the AUV was conducted in a 1000-run basis Monte Carlo simulation under a 2D scenario, followed by a 3D scenario. The machine used has Intel Core i5-6300U CPU @ 2.4GHz with 8GB RAM. The problem spaces of the simulations were assumed to be a current field that consists of 500×500 square grids for 2D, and 500×500×500 cube grids for 3D, with each side of the grid equivalent to 1 metre. Non-uniform ocean current and static obstacles of different sizes are present in the problem space. The current field was generated based on the data obtained from the field experiment conducted at Beauty Point, Tasmania, Australia.
The AUV is required to travel from a starting point to a target with a pre-set water reference velocity of 1.5m/s. Based on the properties of REMUS 100 AUV, the safety margin used in the threat computation is set to 1 metre, while the angles ψ max , θ max and θ max are set to 30, 45 and 10 respectively. The cost factor for the path travel time f 1 was set to be 1.0, and the other cost factors f 2f 5 were all set to be 0.25, so that all costs except the travel time cost have similar impact on the solutions. Hence, when the path solution is not violating the threat exposure (f 2 ) and the physical motion limitations (f 3f 5 ), the fitness value of the solution directly represents the time required for the AUV to travel on the corresponding path.
In each simulation run, the maximum number of iterations for the algorithm was set to 100 with a pre-defined stopping threshold. This means the algorithm will be iterated up to a maximum number of 100, but will be stopped whenever the solution difference between iterations is less than the pre-set threshold. The population size of all algorithms was set to 150 particles, with each particle consisting of 5 path nodes, meaning each particle has 10 dimensions for the 2D problem and 15 dimensions for the 3D problem. All algorithm parameters were set to be their respective suggested values as discussed in Section 2. For comparison purposes, another path planning technique, RRT* and other metaheuristic algorithms, including Ant Colony Optimization (ACO) [38], Firefly Algorithm (FA) [16], Differential Evolution (DE) and Genetic Algorithm (GA) [9], are also tested in this study.

Simulation Results
The performances of the algorithms are compared based on the following properties: solution qualities, stabilities, convergence behaviours, and computational requirements. These properties can be evaluated by studying the fitness values of the solutions obtained and the computational time required to obtain the solutions. The fitness value of a solution is simply the time required (cost) for the AUV to reach the endpoint from the starting point by travelling on the path corresponding to the solution. Therefore, a lower fitness value indicates a higher solution quality and hence a stronger search ability.
The Monte Carlo simulation results of the 2D and 3D scenarios are graphed and compared in boxplots as shown in Figure 2 and Figure 3. The data was not normally distributed based on the Shapiro-Wilk normality test. In the boxplots, the medians of the data are represented by the red horizontal line; the blue boxes indicate the range of 25th to 75th percentile; the black whiskers indicate the acceptable data range. For the boxplots of fitness values, the extreme lowest end of each whisker gives the individual best fitness obtained by each algorithm over the 1000-run simulation, and the green cross sign represents the best known (lowest) fitness value among all algorithms in the simulations. The acceptable data ranges and percentile ranges are indicators for the stabilities of the algorithms performances, while the medians give information about the solution qualities and search abilities of the algorithms. The Kruskal-Wallis ANOVA procedure with a significance level of 0.05 was used to rank the solution qualities (fitness values) based on the Holm-Bonferroni stepdown method. The algorithms are given the same rank if they are not statistically different from one another. Detailed results of the path planning simulation, including the median of fitness obtained (Med.), the best known fitness (Best), the interquartile range (IQR), the ANOVA rank (#R), the median of computational time (T) and the total ranks, are tabulated in Table 4. The total ranks are calculated from the summation of the ranks for the 2D and 3D scenarios. The ranking of the algorithms does not consider the impact of computational time.
Based on Figure 2, Figure 3 and Table 4, almost all the PSO-based algorithms have better solution quality than RRT* and other metaheuristic algorithms, with the exception of standard PSO being outperformed by FA. Despite having lower solution quality, RRT* has the shortest computational time in both 2D and 3D scenarios. It can also be seen that all variants of PSO and QPSO produced better solution qualities than the standard PSO and QPSO. DEPSO and DEQPSO outperformed all other algorithms by achieving the lowest medians for fitness value in both 2D and 3D. The total ranks of DEPSO and DEQPSO suggest that the two fully DE-hybridized algorithms are able to produce the top two best solution qualities for path planning problem. However, the computational time of DEPSO and DEQPSO are significantly higher than all the other algorithms due to the high computational requirements of the greedy selection operator. The solution qualities of SDEAPSO and SDEQPSO are second to the fully DE-hybridized algorithms; they are ranked similarly in 2D based on the ANOVA ranking. APSO has better solution quality than SDEPSO in 2D. It is worth noting that APSO has the lowest interquartile range is both 2D and 3D, indicating the highest stability among all the algorithms. In the 3D scenario, SDEQPSO is ranked slightly higher than SDEAPSO, while SDEPSO is ranked similar to APSO. The total ranks of the overall performance in both 2D and 3D reveal that SDEQPSO and SDEAPSO are ranked as the third and the fourth respectively. More importantly, the computational times of the two selectively DE-hybridized algorithms are significantly lower than the fully DE-hybridized algorithms and very close to other PSO-based algorithms. These indicate the higher computational efficiency of SDEQPSO and SDEAPSO in solving the path planning problem because they are able to produce solution quality that is very close to DEPSO and DEQPSO, while having a significantly lower computational requirement. In terms of problem size, the computational time required by the path planner is considered short, particularly in comparison to the computational time required for estimating the ocean environment based on the AUV sensory measurements.

VEHICLE PATH VALIDATION
For validation purposes, the path solutions generated by the AUV path planner were used as a reference trajectory for a dynamic model of REMUS 100. This section briefly explains the dynamic model and the path following controller used.

Dynamic Model
Based on Fossen"s vectorial representation [39] and SNAME (Society of Naval Architects and Marine Engineers) standard formulation, the 6 DOF equation of motion for a typical AUV can be modelled as shown in (40) and (41).
where R (η 2 ) and T (η 2 ) are the rotation matrices between inertial and body-fixed reference frames for the translational velocities and angular velocities respectively. η includes the position η 1 and the orientation η 2 of the vehicle with respect to the inertial reference frame, while the derivative of η in (40) represents its rate of change. includes the translational velocities and the rotational velocities of the vehicle with respect to the body-fixed reference frame as described in the vectors in (42).
In (41), M and C( ) describe the inertial and Coriolis matrices (including rigid body and added mass) respectively, D( ) is the hydrodynamics damping matrix, g(η) is the hydrostatics restoring forces, and τ describes the control forces from the actuators. This study uses the REMUS 100 model derived from (40)-(42) by [40]. The hydrodynamics coefficients calculated in [40] are used in the vehicle model.

Path Following Controller
The path following controller of the AUV model used the integral line-of-sight (iLOS) guidance law to set the yaw and pitch angles for following the trajectory generated by the path planner. The iLOS guidance law described by [41] allows the AUV to shape the convergence towards the path in the presence of ocean current and environmental disturbance. The desired iLOS yaw angle (heading) ψ d and pitch angle θ d can be given as follows: where e is the cross-track error, h is the vertical-track error, K i,y and K i,z are the integral gains, and Δ y and Δ z represent the look-ahead distances for iLOS heading and pitch respectively. The integral terms of cross-track error e int and vertical-track error h int will produce non-zero ψ d and θ d even when the AUV is on the planned path, allowing the vehicle to counteract any effects of ocean current with the necessary side-slip and pitch angles. The rates of integral terms ė int and int will reduce the integral action with large cross-track and vertical-track errors (i.e. vehicle is far from the planned path), in order to minimize the risk of integrator wind-up.

Validation Results
The feasibility of the path solutions is first checked by comparing against the motion limitation of REMUS 100, which has a minimum turning radius of 8.1 metres in the worst case scenario [42]. The curvature radius of a feasible path must be higher than the minimum turning radius. The paths generated by SDEQPSO satisfy the AUV motion limitation as shown in Figure 4.
Next, the 2D and 3D solutions generated by SDEQPSO were validated by comparing against the simulated paths in Figure 5. The AUV is required to travel from the starting point (green square) to the target (pink star) without running into obstacles, while trying to take advantage of the favourable current to assist the AUV motion. In the 2D results, the blue-coloured regions indicate the favourable current while the red-coloured regions denote the less favourable current. In both results, the solid sections of the planned paths indicate that the favourable ocean current has a positive effect on the AUV motion while the dashed sections suggest otherwise. It can be observed that the paths are able to follow the favourable current and avoid the less favourable current to achieve a shorter travel time. The simulated paths closely resemble the planned paths in both scenarios. The cross-track errors of the simulated paths relative to the planned paths for the 2D and 3D scenarios are graphed in Figure 6. The errors for both scenarios are well below 1 metre, proving that the AUV was able to follow the planned paths closely. Hence, the simulation results showed that the path solutions generated by the proposed algorithm are smooth and feasible for the path planning application. Figure 6. Cross-track error of simulated path relative to planned path

CONCLUSION
By selectively hybridizing with differential evolution, this paper presents new variants of PSO algorithm with improved search ability for the global minimum path of an AUV without increasing the computational requirements. The proposed algorithms were benchmarked against other algorithms in an offline AUV path planner because if the proposed algorithms can provide better computational efficiency to demonstrate the minimum capability of a path planner, then they will outperform the tested algorithms in the online path planner. Based on the Monte Carlo simulations and ANOVA procedures, the SDEAPSO and SDEQPSO algorithms were able to achieve a similar performance to DEPSO and DEQPSO algorithms in terms of solution quality and stability, while having a significantly lower computational requirement. Most importantly, the simulation results showed that the planned paths in both the 2D and 3D scenarios are smooth, feasible and able to account for a priori known environment.
The PSO-based algorithms proposed in this study are most efficient for solving nondeterministic polynomial time (NP)-hard problem, such as the path planning problem. Although the simulation assumed a priori known environment to represent the minimum capability of a path planner, the algorithms can be adapted to a more realistic operational condition in future work due to the demonstrably high computational efficiency, which is suitable for solving compute-intensive problems such as path re-planning in highly dynamic environments. The future extension of this work will include developing a path re-planning algorithm that can deal with a priori unknown environment.