Parallel genetic algorithm based on construction of gene pool in the ordinary network for TSP
Xiaoqin Fan
General Education Department, Guangzhou Panyu Polytechnic, Guangzhou 511583, China
*Corresponding Author
Xiaoqin Fan
Article History: | Received: 20.06.2022 | Accepted: 24.07.2022 | Published: 25.08.2022|
Copyright @ 2022: This is an open-access article distributed under the terms of the Creative Commons Attribution license which permits unrestricted use, distribution, and reproduction in any medium for non commercial use (NonCommercial, or CC-BY-NC) provided the original author and source are credited.
TSP (Travelling Salesman Problem), as an ideal problem,its research findings are not to be applied directly, but widely translated into many combinatorial optimized problems, such as cargo distribution of large chain stores, PCB drilling, genetic detection and etc.. They can be all abstracted to TSP problem.Therefore,the study of TSP problems and their solutions are of great significance both theoretically and practically.
The perfect method to solve the TSP problem is global search method. Due to the limitation of computer operating ability, when n is relatively big, with global search method, it seems impossible to find out the accurate optimal solution but only approximate solution. So far, there is no effective algorithm to solve this kind of problem, thus any simplified method to solve TSP will attract much attention and evaluation. Many scholars have great interest in TSP and many methods solving TSP emerg, including evolutionary algorithm [1,2].
Genetic Algorithm for TSP is of higher efficiency [3,4], but because of the increasing “n”, the numbers of the cities, using genetic algorithm to solve TSP resulting worse solution and decreased convergence rate. Here I propose parallel genetic algorithm to address this issue. However, huge investment in large parallel computer system is the barrier for expansion. When the number of cities in a TSP is relatively big, a certain number of parallel computers are needed for the tasks.
So this paper focuses on the construction of local gene pool and greedy gene pool under the common computer network, by using greedy algorithm and the improved Inver-over operator[5]in the child process, implementing dynamic optimization to the gene pool. In the main process, hybrid genetic algorithm based on the gene pool is applied to dynamically update the elite groups, replacing genes from the gene-group with genes from gene-library to improve the speed of evolutionary algorithms and the quality of the result.
This paper is divided into five sctions.
The first section introduces the research background, the main contents, the purpose and the significance of the research.
The Second section is background knowledge. TSP, genetic algorithm, the present situations and the existing problems of using genetic algorithm to solve TSP are briefly introduced.
The third section deals with the method to construct gene pool. It mainly introduces the method to build partial gene pool and greedy gene pool, and the algorithm thought of greedy gene pool.
The fourth section describes improvement of Inver - over operator
The fifth section presents the thought of greedy gene pool and the method of implementing parallel algorithm.
Based on normal computer simulation experiment under the general network, the sixth section selects some samples from TSP case library (TSPLIB) which is internationlly agreed, to verify the validity of the algorithm described.
The last section comes to a conclusion, and points out the deficiency of this article and the direction of further research.
Introduction of TSP and Genetic Algorithms
TSP long-standing problem is a typical combinatorial optimization problem and has proven to be NP-complete problem. The study has attracted many scholars and there have been a large number of algorithms for solving TSP. Among them is evolutionary algorithm [1, 2]. Known as the traveling salesman problem or Wayfaring Salesman Problem, TSP problem may be summarized as follows: There are n cities. Starting from a given city, how does a traveling salesman find out the shortest way back to the starting city after having visited all the n cities? Its mathematical model is as follows: Given a graph , the edge , a non-negative weights , find Hamiltonian cycle , making the total weight the smallest, where [3,4]. It is well-known combinatorial optimization problems of mathematics field.
TSP was first mentioned in 1800. Between 1920s and 1950s of the 20th century, people began to realize that TSP is an NP problem [5,6]; in 1954, optimal solution to TSP of 42 cities is obtained. Since 1954, the scale of optimal solutions to TSP is larger and larger. Instances with up to 13509 towns were managed to be exactly solved in the United States in 1998. An optimal tour through a 15,112-town instance in Germany is computed in 2001. Nevertheless, the cost of the project is huge. According to the report, to solve the TSP problem between 15112 towns in the United States, 110 computers with 500 MHZ compaq Ev6Alpha processor from Rice University and Princeton University were put into the process. These 110 connected computers spent 22.6 years in total. In May 2004, the Swedish obtained the optimal solution of 24978 towns.
TSP problem became increasingly popular in scientific circles in Europe and the USA. Notable contributions were made by George Dantzig, Delbert Ray Fulkerson and Selmer M. Johnson at the RAND Corporation in Santa Monica, who expressed the problem as an integer linear program and developed the cutting plane method for its solution. With these new methods they solved an instance with 49 cities to optimality by constructing a tour and proving that no other tour could be shorter. In the following decades, the problem was studied by many researchers from mathematics, computer science, chemistry, physics, and other sciences [6].
Richard M. Karp showed in 1972 that the Hamiltonian cycle problem was NP-complete, which implies the NP-hardness of TSP. This supplied a mathematical explanation for the apparent computational difficulty of finding optimal tours.
Great progress was made in the late 1970s and 1980, when Grötschel, Padberg, Rinaldi and others managed to exactly solve instances with up to 2392 cities, using cutting planes and branch-and-bound.
In the 1990s, Applegate, Bixby, Chvátal, and Cook developed the program Concorde that has been used in many recent record solutions. Gerhard Reinelt published the TSPLIB in 1991, a collection of benchmark instances of varying difficulty, which has been used by many research groups for comparing results. In 2006, Cook and others computed an optimal tour through an 85,900-city instance given by a microchip layout problem, currently the largest solved TSPLIB instance. For many other instances with millions of cities, solutions can be found that are guaranteed to be within 2-3% of an optimal.
The total number of possible paths of TSP and the number of cities are increased by factorial number; therefore, it is difficult to find out the optimal solution. As for this problems, no matter the traditional dynamic programming, branch and bound method, greedy method or other recent methods, like intelligent optimization algorithms (tabu search, simulated annealing, genetic algorithm and artificial neural networks, ant algorithm) and their hybrid algorithm are of lower efficiency, and higher cost.
Genetic algorithm is a stochastic simulation of biological evolutionary mechanisms global search and optimization methods[6]. It automatically obtains and optimizes the search space, and adaptively control the search process in order to achieve the optimal solution. General procedures for genetic algorithm optimization problems are as follows:
1) Initialization
The population size depends on the nature of the problem, but typically contains several hundreds or thousands of possible solutions. Often, the initial population is generated randomly, allowing the entire range of possible solutions (the search space). Occasionally, the solutions may be "seeded" in areas where optimal solutions are likely to be found[7].
2) Selection
During each successive generation, a proportion of the existing population is selected to breed a new generation. Individual solutions are selected through a fitness-based process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected. Certain selection methods rate the fitness of each solution and preferentially select the best solutions. Other methods rate only a random sample of the population, as the former process may be very time-consuming.
The fitness function is defined over the genetic representation and measures the quality of the represented solution. The fitness function is always problem dependent. For instance, in the knapsack problem one wants to maximize the total value of objects that can be put in a knapsack of some fixed capacity.
3) Genetic operators
The next step is to generate a second generation population of solutions from those selected through a combination of genetic operators: crossover (also called recombination), and mutation.
4) Termination
This generational process is repeated until a termination condition has been reached. Common terminating conditions are:
1. A solution is found that satisfies minimum criteria.
2. Fixed number of generations is reached.
As genetic algorithm is not affected by the limitation of search space, requirements such as continuity, conductivity and unimodality are not necessary. Its robustness and implicit parallelism make it is widely used in solving the complex problems that are difficult to solve with the traditional methods [7], such as combinatorial optimization, pattern recognition, computer network optimization, etc. People have been using genetic algorithm to solve large-scale traveling salesman problem [6].
Construction of greedy gene pool
Set that is a feasible path for point , the total length of the loop is , where, is the jth point, is the Euclidean distance between point and point . By using this algorithm, the total length of the loop can be used to evaluate the individual[9].
Set , the coordinates of point i and point j are and respectively, then the Euclidean distance between i and j is . Set that , then D is the square of .
Set . For each point , according to the size of , sort the th line of the corresponding elements in in accordance with the order from small to large, and insert into the first column in , expand to a phalanx , calling the phalanx local gene pool .
After the generation of local gene pool , the first m columns were selected from to compose n-6 new matrixes this matrixes vary in the number of points, for example, there are m total different points in , which respectively reflects the local situation in the name of a certain point as the center, while the th line in constitutes point i’s neighbor set points. In the algorithm, each can be optimized by using greedy algorithm, and according to the computer distribution, it can be separately placed on different computers to run independently, and thus get a new , that is, the greedy gene pool. This n-6 matrix point number which varies, for example, a total of 6 B6 in different points 7 points different from B7, They are represented in the center of which is a point of local conditions, such as matrix Bm constitute the first line of the point i i neighbor point set. Algorithm runs, for each matrix greedy algorithm implementation, and according to the actual situation of the computer running the distribution will be different on different computers Bm run independently greedy algorithm, matrix , where is greedy gene pool.
TSP problem with the greedy algorithm, the time required does not exceed .In general, solution obtained with greedy algorithm is not the optimal for TSP problem. It is necessary to optimize the greedy gene pool in the child process, and to get the best genes from the main process for dynamic update of the local gene pool.
Since in the evolvement process, the genes involved in genetic operators promoter is mainly from the individuals, thus the quality of the evolved individual determines the efficiency of the algorithm. If individual’s fitness values are poor, the overall performance of the algorithm will be affected, especially for TSP. Although the greedy algorithm does not guarantee optimal solution, we can use it to generate relatively good initial population, and thus greatly improve parallel TSP algorithm performance of the sub-process evolution, solving the speed, quality solution[10].
Paper[11] presents the evolution algorithm for TSP by constructing the gene pool which improves both the accuracy and the speed. But its point-centered gene pool constructs gene chip from the near points, without considering the relationship among the points which are among the gene fragments. So the algorithm is running fast early, but in the latter part of the evolution of computing, the gene pool almost had no effect on group.
Because the front part of the gene constructed in the algorithm is of fine locality, only the first m individual genes are extracted to construct local gene library, reducing the length of the gene, thereby improving the efficiency of the algorithm.
Improved Inver-over operator
Guo's algorithm[12-14] is one of the fastest algorithms for solving TSP which proposed Inver-over operator. Inver-over operator has characteristics of both crossover and mutation which takes full advantage of the information community, and it is much higher in speed and quality than other simple crossover. But the experiment also shows that when the number of cities becomes relatively larger, its global optimization capability dramatically declines. As a result, many scholars have studied Inver-over operator and try to improve it [5, 10]. This paper improves Inver-over operator by labeling the all points near every vertex as its neighboring point set, initializing each point set, choosing the next search space instructively [11]. The main steps are as follows:
Initialize the group P using the neighbor set of points
While (stop condition not satisfied)
{
For (each individual Si of groups)
{
S’= Si;
Select a vertex c from S randomly;
While (true)
{
Generate a random number p (0 ≤ p ≤ 1);
if (p<pc) /*pc is a constant*/
{
Select vertex c’ from the remaining selected vertices from S’;
}
else
{
Randomly choose an individual from P;
Mark the selected-individual c’s next vertex as c;
}
if (the vertices c and c’ in S’ are adjacent)
{
break;
}
else
{
Flipped vertex c to the next vertex to the vertices between c’, c = c';
}
} /* end while*/
if (the fitness value of S’≤ Si)
{
Si = S’;
}
} / * end for*/
} / * end while*/
Parallel algorithm basing on the greedy gene pool
Based on the greedy evolution of the gene pool, parallel algorithm sets a computer as the mainframe of the entire system, which runs the main process and as the center of data exchange in the evaluation process. The main process optimizes the elite by using the hybrid genetic algorithm based on gene pool and Guo Tao algorithm[11]. The basic steps are as follows:
(1) to generate an initial local gene pool and greedy gene pool, and to deposit these genes into the database for sharing;(2) to write the initial control information into an asynchronous control information database, to prohibit the work machines to read and when the algorithm starts; (3) to generate and , and deposit them into their corresponding database;(4) to divide the working machines based on the n-size of the TSP problem so that it can deal separately with one or more lines in and ;(5) to write control information into an asynchronous control information database, allowing working machine to read and ; and store the best individuals in the elite group into the database for the sharing of the working machines;(6) to read from the database the evolution elite groups and generate the optimal solution, and then deposit it into its database s after setting every other algebra.
Other computers in the network as a working machine run sub-process and read the TSP from the mainframe and obtain the corresponding gene pool. Sub process only solves point-centered domain. The main steps are:
to copy all the data from the mainframe to the working machine and extract genes from the corresponding local gene pool according to the task assigned or set (2) to generate a number m randomly, implement the greedy algorithm on , so that it becomes greedy gene pool ;(3) to implement improved Inver-over algorithms on ;(4) to store in the mainframe to obtain the optimal solution;(5) to obtain the gene chip with the length of m and for hybridization [15]; store optimal gene in the mainframe, turn to (3).
Simulation
Multiple TSP problems are selected for testing from the paper[16] and international general TSP instance library TSPLIB[17]. Algorithm uses the C # programming language in Microsoft Visual Studio 2019 Preview platform, which is configured to host CPU Intel Core i7-1165G7@4.70GHz, memory is 16 GB. The operating system is Windows10. The database software is Microsoft SQL Server 2019. The working machine connects mainframe database through ADO.NET, carrying out the process on 50 computers in LAN. Table 1 lists the optimal value. The optimal paths for the first 6 problems are as in Figure 1-6, while the optimal paths for problem pr2392 are too many to be listed.
Figure Image is available at PDF file
Figure1 Oliver30 optimal path Figure2 att48 optimal path
Figure Image is available at PDF file
Figure3 eil76 optimal path Figure4 chn144 optimal path
Figure Image is available at PDF file
Figure5 a280 optimal path Figure6 TSP pcb442 optimal path
Table Image is available at PDF file
Conclusion
This paper proposes an algorithm to solve the problems using computers in general network instead of parallel computer, which is economic convenient and fast.
Simulation shows that the proposed algorithm is feasible. When the computer configuration is low, it is an effective way to solve tough problems. When the scale of TSP is too large, the algorithm converges relatively slow and the algorithm needs further improvement.
Baraglia R,Hidalgo J I,Perego R. A hybrid heuristic for the traveling salesman problem.IEEE Trans. On Evolutionary Computation,2001;5:613-622
Wen Yi,PanDa-zhi. Improved Genetic Algoithm for Traveling Salesman Problem [J]. Computer Science, 2016(S1): 90-92.
Sun Wen-bin,Wang Jiang. An Algorithm for TSP Problem Based on Genetic Algorithm and Multi-optimization Operration [J]. Geography and Geo-Information Science, 2016, 32(4): 1-4.
Yang H, Kang LS, Chen YP. A gene-pool based genetic algorithm for TSP. Wuhan University Journal of Natural Sciences, 2003; 8: 217-223.
Zhai Fan,Xie Xian-hua. Study on optimal robot task scheduling based on genetic algorithms [J]. Mathematics in practice and thory, 2020, 50(15): 143-154.
Clarke G, Wright J W. Scheduling of Vehicles from a Central Depot to a Number of Delivery Points. Opera. Res. 1964; 12:568-581.
Li MQ, Ji-song Kou. Basic theory and application of genetic algorithms. Beijing: Science Press, 2002.
Simin Yang, Fengjun Wang. Research on Solving TSP with Genetic Algorithm or Branch and Bound Method[J]. Computer Science and Application, 2020, 10(9): 1609-1617.
Luo Zhiyuan,Feng Shuo, Liu Xiaofeng, et al.Method of area coverage path planning of multiunmanned cleaning vehicles based on step by step genetic algorithm [J]. Journal of Electronic Measurement and Instrumentation, 2020, 32(8): 43-50.
Wang,Zhen,Liu Rumin,Zhu Yangguang, et al. Improved genetic algorithm for solving TSP problem[J]. Electronic Measurement Technology, 2019, 42(23):91-96.
Chen Siyuan,LIN Piyuan,HUANG Peijie. Pointer Network Improved Genetic Algorithm for Solving Traveling Salesmen Problem [J]. Computer Engineering and Applications, 2020, 56(19):231-236.
Hassanat A B, Prasath V B, Abbadi M A, et al. An improved genetic algorithm with a new initialization mechanism based on regression techniques[J]. Information, 2018, 9(7): 167.
Fu C, Zhang L, Wang X, et al. Solving TSP problem with improved genetic algorithm[C]//AIP Conference Proceedings. AIP Publishing LLC, 2018, 1967(1): 040057.
Agrawal M, Jain V. Applying Improved Genetic Algorithm to Solve Travelling Salesman Problem[C]//2020 Second International Conference on Inventive Research in Computing Applications (ICIRCA). IEEE, 2020: 1194-1197.
Akter S, Nahar N, ShahadatHossain M, et al. A new crossover technique to improve genetic algorithm and its application to TSP[C]//2019 International Conference on Electrical, Computer and Communication Engineering (ECCE). IEEE, 2019: 1-6.