Background & Summary

With the continuous development of sensor, wireless communication and other technologies, the performance of UAVs has been improved. Due to their small size and long range, UAVs have been widely used in military and civilian fields, such as attack and confrontation1, mineral exploration2, agricultural monitoring3, traffic patrol4, etc. However, due to limitations in payload, volume, mass, and other factors, the shortcomings of single UAVs, such as low mission success rate and insufficient system resilience, have become increasingly prominent. Through real-time collaborative information sharing and resource complementarity, UAV clusters can effectively compensate for the shortcomings of single UAVs, improve operational efficiency and success rate, and have gradually attracted widespread attention from researchers5.

Accurate navigation information is the key to achieving intelligent collaboration for UAV clusters. In most environments, individual UAVs in a cluster rely on satellite navigation to obtain positioning information. However, when an individual UAV is unable to receive satellite signals due to device failure or external interference, its positioning accuracy will be affected, significantly reducing the coordination effectiveness of the entire cluster6,7. Furthermore, in satellite-denied environments, information exchange between UAVs in a cluster is crucial for maintaining formation and avoiding collisions. UAVs can communicate with each other through collaborative ranging, enhancing the reliability and stability of the navigation system for the entire cluster8,9. Therefore, it is necessary to explore collaborative navigation techniques for UAV clusters under satellite-denied conditions.

In satellite-denied environments, radio frequency (RF)-based collaborative navigation techniques for UAV clusters occupy an important position due to their universality and low cost compared to complex vision and optical positioning systems10,11. One type of RF positioning method utilizes the geometric characteristics of RF links between devices to estimate positions, including distance-based techniques such as Time of Arrival (TOA), Time Difference of Arrival (TDOA), and Two-Way Ranging (TWR)12,13,14, as well as angle-based techniques such as Angle of Arrival (AOA)15. Compared to angle-based techniques, distance-based techniques offer advantages in terms of accuracy and stability. Another type of RF positioning technology relies on scene analysis, also known as RF fingerprint positioning methods16, which have limited flexibility and are mostly used for indoor planar positioning.

Ultra-wideband (UWB) devices send and exchange information through extremely short pulse sequences (<1 ns), which occupy a signal bandwidth greater than 500 MHz. This gives UWB signals high temporal resolution, significantly reducing the impact of non-line-of-sight propagation on measurements17. UWB systems can therefore measure the time of flight (TOF) accurately and support precise navigation and positioning based on time-based ranging.

TOA relies on node devices with internal clocks that are perfectly synchronized with the main network clock. The internal clocks of all devices in the network are synchronized with a master clock, and the TOF is calculated by subtracting the transmit timestamp from the receive timestamp. However, mismatches in node clock synchronization can introduce significant ranging errors, with each nanosecond representing approximately 30 cm of ranging error18. When time synchronization is only available between reference base station nodes and not between UAV nodes, TDOA can be deployed. UAV nodes send data frames to all base station nodes within the communication range, and the base station nodes mark the received messages with global timestamps and report the receive timestamps to the network center. Based on the differences in the base station node timestamps, the position of the UAV node can be estimated, which requires a significant amount of computing power to support the entire process19. TWR technology provides extensive measurements by continuously timestamping messages during round-trip propagation. The overall ranging process uses four timestamps to eliminate local clock biases and calculate TOF, enabling TOF measurements without clock synchronization between nodes20. Compared to TOA and TDOA, TWR ranging technology does not require any time synchronization and is more suitable for use in low-power, large-scale networks.

To map ranging measurements to a three-dimensional position, a common ranging-based positioning algorithm is multi-point positioning (multilateration). This algorithm describes the range between the UAV node and each reference base station node as a Euclidean distance:

$$\left\{\begin{array}{c}{\left({x}_{1}-x\right)}^{2}+{\left({y}_{1}-y\right)}^{2}+{\left({z}_{1}-z\right)}^{2}={d}_{1}^{2}\\ {\left({x}_{2}-x\right)}^{2}+{\left({y}_{2}-y\right)}^{2}+{\left({z}_{2}-z\right)}^{2}={d}_{2}^{2}\\ \vdots \\ {\left({x}_{n}-x\right)}^{2}+{\left({y}_{n}-y\right)}^{2}+{\left({z}_{n}-z\right)}^{2}={d}_{n}^{2}\end{array}\right.$$
(1)

where \({d}_{n}\) represents the distance between the UAV node located at the \(\left(x,y,z\right)\) coordinates and the preset nth base station node located at the \(\left({x}_{n},{y}_{n},{z}_{n}\right)\) coordinates. Expanding Eq. (1) gives:

$$\left\{\begin{array}{c}-2{x}_{1}x-2{y}_{1}y-2{z}_{1}z+{x}^{2}+{y}^{2}+{z}^{2}={d}_{1}^{2}-{x}_{1}^{2}-{y}_{1}^{2}-{z}_{1}^{2}\\ -2{x}_{2}x-2{y}_{2}y-2{z}_{2}z+{x}^{2}+{y}^{2}+{z}^{2}={d}_{2}^{2}-{x}_{2}^{2}-{y}_{2}^{2}-{z}_{2}^{2}\\ \vdots \\ -2{x}_{n}x-2{y}_{n}y-2{z}_{n}z+{x}^{2}+{y}^{2}+{z}^{2}={d}_{n}^{2}-{x}_{n}^{2}-{y}_{n}^{2}-{z}_{n}^{2}\end{array}\right.$$
(2)

To linearize the obtained nonlinear equations, a new variable is introduced:

$$R={x}^{2}+{y}^{2}+{z}^{2}.$$
(3)

The position of the UAV node is then estimated as part of the unknown vector:

$$\widehat{\varphi }={\left[\begin{array}{cccc}x & y & z & R\end{array}\right]}^{T}.$$
(4)

The resulting linear equation set \(\Gamma \widehat{\varphi }={\rm{H}}\) is expressed in matrix form, with:

$$\Gamma =\left[\begin{array}{cccc}-2{x}_{1} & -2{y}_{1} & -2{z}_{1} & 1\\ -2{x}_{2} & -2{y}_{2} & -2{z}_{2} & 1\\ \vdots & \vdots & \vdots & \vdots \\ -2{x}_{n} & -2{y}_{n} & -2{z}_{n} & 1\end{array}\right].$$
(5)
$${\rm{H}}=\left[\begin{array}{c}{d}_{1}^{2}-{x}_{1}^{2}-{y}_{1}^{2}-{z}_{1}^{2}\\ {d}_{2}^{2}-{x}_{2}^{2}-{y}_{2}^{2}-{z}_{2}^{2}\\ \vdots \\ {d}_{n}^{2}-{x}_{n}^{2}-{y}_{n}^{2}-{z}_{n}^{2}\end{array}\right].$$
(6)

This linear equation set can be solved for the position estimate using the least squares (LS) method in normal-equation form:

$$\widehat{\varphi }={\left({\Gamma }^{T}\Gamma \right)}^{-1}{\Gamma }^{T}{\rm{H}}.$$
(7)

When the distance between devices increases, the TWR error tends to increase due to clock deviation and multipath effects. In a multipath environment, LS is limited because it assigns equal weights to all base station nodes. Weighted Least Squares (WLS) improves positioning performance by assigning different weights to the base station nodes. The WLS position estimate \({\widehat{\varphi }}_{wls}\) and its weight matrix Θ are expressed as follows:

$${\widehat{\varphi }}_{wls}={\left({\Gamma }^{T}\Theta \Gamma \right)}^{-1}{\Gamma }^{T}\Theta {\rm{H}}.$$
(8)
$$\Theta =\left[\begin{array}{cccc}\frac{1}{{d}_{1}} & 0 & \cdots & 0\\ 0 & \frac{1}{{d}_{2}} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{{d}_{n}}\end{array}\right].$$
(9)
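As a minimal sketch of Eqs. (5)-(9), the following Python code builds Γ and H from the base station coordinates and measured ranges, then solves the normal equations. The function names `solve_linear` and `estimate_position`, the box-shaped base station layout, and the test position are illustrative assumptions and are not taken from the dataset code. Passing no weights reproduces the LS estimate of Eq. (7); passing \(1/{d}_{i}\) reproduces the WLS estimate of Eq. (8).

```python
import math


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: base stations may be coplanar")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_position(anchors, distances, weights=None):
    """Return (x, y, z) via Eq. (7) when weights is None, else via Eq. (8)."""
    if weights is None:
        weights = [1.0] * len(anchors)  # unit weights reduce WLS to plain LS
    # Rows of Gamma (Eq. 5) and entries of H (Eq. 6).
    gamma = [[-2.0 * ax, -2.0 * ay, -2.0 * az, 1.0] for ax, ay, az in anchors]
    h = [d * d - ax * ax - ay * ay - az * az
         for (ax, ay, az), d in zip(anchors, distances)]
    # Normal equations: (Gamma^T Theta Gamma) phi = Gamma^T Theta H.
    lhs = [[sum(w * g[i] * g[j] for g, w in zip(gamma, weights)) for j in range(4)]
           for i in range(4)]
    rhs = [sum(w * g[i] * hi for g, hi, w in zip(gamma, h, weights)) for i in range(4)]
    x, y, z, _r = solve_linear(lhs, rhs)  # R is estimated but not returned
    return x, y, z


if __name__ == "__main__":
    # Hypothetical layout: eight base stations at the corners of a 20 x 18 x 10 m box.
    anchors = [(x, y, z) for x in (0.0, 20.0) for y in (0.0, 18.0) for z in (0.0, 10.0)]
    truth = (5.0, 7.0, 3.0)
    ranges = [math.dist(a, truth) for a in anchors]  # noise-free ranges
    print(estimate_position(anchors, ranges))
    print(estimate_position(anchors, ranges, [1.0 / d for d in ranges]))
```

Note that, as in Eq. (4), R is treated as an independent unknown, so the constraint of Eq. (3) is not enforced; at least four non-coplanar base stations are needed for the 4 × 4 normal matrix to be invertible.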

WLS can enhance the estimation performance of UAV nodes, but it cannot adapt to unstable measurement noise. To facilitate the analysis and utilization of noise and error issues in datasets, the positioning Root Mean Square Error (RMSE) is expressed using the following equation:

$$RMSE=\sqrt{\frac{1}{3}\left[{\left({x}_{\widehat{\varphi }}-{x}_{\varphi }\right)}^{2}+{\left({y}_{\widehat{\varphi }}-{y}_{\varphi }\right)}^{2}+{\left({z}_{\widehat{\varphi }}-{z}_{\varphi }\right)}^{2}\right]}.$$
(10)
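Eq. (10) can be evaluated for a single node as in the short sketch below. The function name `positioning_rmse` is an illustrative assumption. Because of the 1/3 factor, this value equals the Euclidean position error divided by \(\sqrt{3}\).

```python
import math


def positioning_rmse(estimate, truth):
    """Eq. (10): root of the mean squared error over the x, y, and z components."""
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return math.sqrt(squared / 3.0)


if __name__ == "__main__":
    # Hypothetical estimate and reference trajectory point, in meters.
    print(positioning_rmse((5.1, 6.9, 3.2), (5.0, 7.0, 3.0)))
```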

Currently, there are many excellent datasets for positioning and navigation in different environments, such as the UWB positioning and tracking dataset for indoor scenes21, the vehicle autonomous driving and navigation dataset for outdoor scenes22, and the visual positioning dataset relying on image matching23. However, there is a relatively small number of datasets available for the field of drone navigation and positioning, especially for the challenging problem of collaborative navigation for cluster drones under satellite navigation denial conditions.

The purpose of this article is to provide the scientific and engineering community with a well-described dataset that can be used to develop algorithms for mitigating cooperative positioning errors in UAV clusters, thereby improving the navigation accuracy of UAV formations. This dataset was created to analyze distance-based cooperative positioning algorithms for indoor and outdoor UAV cluster nodes. Data measurements were conducted in two different environments, with two cluster flight formations planned for specific tasks in each environment, and each formation set up with trajectories resembling those of air traffic. Each location point in the dataset can be used to evaluate steady-state positioning algorithms, as well as continuous dynamic positioning algorithms. Each set of trajectory points in the dataset can be utilized to evaluate cooperative navigation algorithms for clusters, as well as algorithms for cluster formation maintenance, multi-UAV path planning, and multi-source fusion navigation and positioning for UAV clusters.

From the measurement equipment, measurement network, and UWB ranging program to the two environments and four formations used during dataset generation, we systematically describe the measurement methods. In the data recording section, we describe the data format, how the data are organized in practice, and the organization of the dataset folders, including descriptions of the floor plans. Additionally, we include the technical validation scripts and the sources used to prepare the numerical results presented in the paper. The technical validation section provides several perspectives on the dataset, visually presenting the quality of the data. The usage instructions section explains the provided materials through practical examples of using the dataset, which can be fully replicated by running the scripts included in the technical validation subfolder of the dataset.

Methods

Overall measurement architecture

For this measurement task, UWB modules were selected as the basic measurement tool. Base station-multi-UAV cooperative systems were established for indoor and outdoor UAV cluster applications. A measurement procedure was formulated according to the requirements of cooperative navigation applications. Reliable hardware was employed for formation flight, and efficient code was used to complete data reception, followed by the organization of the various elements within the dataset. The basic measurement structure for constructing the dataset is shown in Fig. 1.

Fig. 1

(a) Schematic diagram of measurement framework, (b) Actual measurement layout of outdoor environment, (c) Actual measurement layout of indoor measurement environment.

This dataset is constructed based on the standard Cartesian coordinate system. In both indoor and outdoor measurement scenarios, a local Cartesian coordinate system is established for specific areas. This system serves as the basis for setting the positions of the reference base station nodes and predefining the trajectory coordinates of the drone nodes. Subsequent node positioning calibration and the specific measurement process are all conducted within this reference frame. By achieving the conversion between the self-established local Cartesian coordinate system and the spherical coordinate system, the dataset can be applied to cluster navigation applications globally.

For UWB domain measurements, the commercial Nooploop LinkTrack P-B module was selected. This module is based on the principle of symmetric bilateral two-way ranging using ultra-wideband signals. It has a small measurement delay, high ranging accuracy, and supports multiple functions such as navigation, positioning, timing, and communication. It is also equipped with an offline microserver, enabling automatic wireless networking and communication transmission between modules.

The specific scenarios of navigation and positioning are the primary considerations for the utilization of this dataset. Through investigating the current application environments of UAV clusters, this paper considers outdoor cooperative formation and indoor cooperative formation separately, and plans two different UAV trajectories for each to meet the diverse needs of cluster missions.

In the implementation of the measurement scheme, there are always two types of UWB devices: base station node devices and UAV node devices. The base station node devices are UWB devices with static (fixed) positions, while the UAV node devices are UWB devices with dynamically changing (moving) positions. These two types of devices form a UWB base station system and a UWB UAV cluster system. The hardware modules in both systems send and receive signals to measure the distance between modules, thereby enabling collaborative navigation estimation for the UAV cluster formation.

The construction of this dataset includes a variety of scenarios and multiple communication channels. Therefore, a measurement procedure that involves grouping, batching, and simultaneous synchronization was adopted. The measurement coordinate system was planned in advance, and the reception code was debugged to prevent non-experimental factors from affecting the measurement data. During the dataset construction, the UWB measurement program was deployed on the PC side, with a USB data cable used to connect to the measurement network. Communication with the hardware was achieved through a serial port, and commands to start and end the measurement were issued to complete the data reception and storage.

UWB ranging

The UWB positioning system estimates the distance between the UAV node and the base station node by exchanging ranging messages. Among the message exchange methods, double-sided two-way ranging (DS-TWR) is a widely used approach that calculates the distance while reducing the time drift caused by clock deviation between UWB devices. This method can be divided into two ranging processes, using two round-trip times to calculate the flight time. Specifically, the reply end of the first round-trip measurement is used as the initiation end of the second round-trip measurement, which helps reduce errors. The ranging principle is shown in Fig. 2.

Fig. 2

Schematic diagram of ranging principle.

During the DS-TWR ranging message exchange process, timestamps \({T}_{{\rm{U}}1}\) to \({T}_{{\rm{U}}3}\) and \({T}_{{\rm{B}}1}\) to \({T}_{{\rm{B}}3}\) are generated. The UAV node sends the captured timestamps \({T}_{{\rm{U}}1}\), \({T}_{{\rm{U}}2}\), and \({T}_{{\rm{U}}3}\) from the second message to the base station node. After receiving this message, the base station node uses its stored timestamps \({T}_{{\rm{B}}1}\), \({T}_{{\rm{B}}2}\), and \({T}_{{\rm{B}}3}\) to obtain the round-trip time (\({T}_{roundU}\)) and reply time (\({T}_{replyU}\)) of the UAV node, as well as the round-trip time (\({T}_{roundB}\)) and reply time (\({T}_{replyB}\)) of the base station node. The propagation time (TOF) of the UWB message is then calculated as follows:

$$TOF=\frac{{T}_{roundU}\ast {T}_{roundB}-{T}_{replyU}\ast {T}_{replyB}}{{T}_{roundU}+{T}_{roundB}+{T}_{replyU}+{T}_{replyB}}.$$
(11)

The calculations for \({T}_{roundU}\), \({T}_{replyU}\), \({T}_{roundB}\), and \({T}_{replyB}\) are as follows:

$$\left\{\begin{array}{l}{T}_{roundU}={T}_{{\rm{U}}2}-{T}_{{\rm{U}}1}\\ {T}_{replyU}={T}_{{\rm{B}}2}-{T}_{{\rm{B}}1}\\ {T}_{roundB}={T}_{{\rm{B}}3}-{T}_{{\rm{B}}2}\\ {T}_{replyB}={T}_{{\rm{U}}3}-{T}_{{\rm{U}}2}\end{array}\right.$$
(12)

Based on the TOF and the speed of light c, the distance between the UAV node and the base station node can be obtained:

$$Distance=TOF\ast c.$$
(13)
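Eqs. (11)-(13) can be combined into a single function, as in the following minimal sketch. The function name `ds_twr_distance` and the simulated timestamps are illustrative assumptions; the two clocks are modeled with a constant offset and no drift. The example shows that the offset cancels, so no clock synchronization between the nodes is needed.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def ds_twr_distance(t_u1, t_u2, t_u3, t_b1, t_b2, t_b3):
    """Distance from six DS-TWR timestamps (seconds), following Eqs. (11)-(13)."""
    round_u = t_u2 - t_u1  # Eq. (12), measured on the UAV clock
    reply_u = t_b2 - t_b1  # Eq. (12), measured on the base station clock
    round_b = t_b3 - t_b2
    reply_b = t_u3 - t_u2
    tof = (round_u * round_b - reply_u * reply_b) / (
        round_u + round_b + reply_u + reply_b
    )  # Eq. (11)
    return tof * C  # Eq. (13)


if __name__ == "__main__":
    # Simulated exchange: 10 m separation, base station clock offset by 0.5 s.
    tof = 10.0 / C
    offset, delay_1, delay_2 = 0.5, 300e-6, 280e-6
    t_u1 = 0.0
    t_b1 = t_u1 + tof + offset      # base station receives the first message
    t_b2 = t_b1 + delay_1           # base station replies
    t_u2 = t_b2 - offset + tof      # UAV receives the reply
    t_u3 = t_u2 + delay_2           # UAV sends the final message
    t_b3 = t_u3 + tof + offset      # base station receives the final message
    print(ds_twr_distance(t_u1, t_u2, t_u3, t_b1, t_b2, t_b3))
```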

Measurement environment

Considering the typical application environment of the UAV cluster, two measurement environments were selected for the measurement activities, and two formation flight tasks were planned for each measurement environment.

Measurement Environment 0 is an outdoor building environment. An open area between buildings was selected, with a length of 20 m, a width of 18 m, and a height of 10 m. Plants, plastics, insulating materials, and concrete walls are distributed between the buildings. The UAV cluster formation completed two types of tasks in this environment: straight formation flight and climbing formation flight. The schematic diagrams of the two formations in Measurement Environment 0 are shown in Fig. 3a and b, and the specifically planned flight trajectory points are shown in Fig. 4a and b.

Fig. 3

(a) Schematic diagram of straight formation in Environment 0(Outdoor), (b) Schematic diagram of dispersed formation in Environment 1(Indoor), (c) Schematic diagram of ascending formation in Environment 0, (d) Schematic diagram of circular formation in Environment 1, (e) Legend.

Fig. 4

(a) Schematic diagram of flight trajectory of straight formation, (b) Schematic diagram of flight trajectory of ascending formation, (c) Schematic diagram of flight trajectory of dispersed formation, (d) Schematic diagram of flight trajectory of circular formation, (e) Legend.

Measurement Environment 1 is designated as an indoor venue. The selected indoor venue has a length of 18 m, a width of 15 m, and a height of 8 m. The venue has a complex internal structure, with multiple types and quantities of electronic equipment and metal reflectors. The UAV cluster formation completed two types of tasks in this environment: dispersed security protection and circular patrol inspection. The schematic diagrams of the two formations in Measurement Environment 1 are shown in Fig. 3c and d, and the specifically planned flight trajectory points are shown in Fig. 4c and d.

UWB base station - UAV system

The LinkTrack P-B UWB module supports communication interfaces such as UART and USB, typically using the NLink protocol. During the measurement process, the NLINK_LINKTRACK_ANCHOR_FRAME0 data transmission protocol is used. The measurement sequence under this protocol consists of Frame Header, Function Mark, Data, and Sum Check, with a total length of 896 bytes. Among them, Frame Header and Function Mark are fixed values; Data represents the transmitted data content; and Sum Check is the lowest byte of the sum of Frame Header, Function Mark, and Data (i.e., the sum of all the preceding bytes). Additionally, this protocol follows the little-endian mode, with low bytes before high bytes, and the transmission content includes network system timestamps, coordinates of all UAV nodes and their distances to the base station node, node IDs, and other information. The schematic diagram of the data frame structure is shown in Fig. 5.

Fig. 5

Schematic diagram of data frame structure.
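The Sum Check rule and the little-endian byte order described above can be illustrated with the following sketch. The function names, the field offset, and the synthetic frame are hypothetical and serve only to demonstrate the two operations; the actual field layout should be taken from the device manual.

```python
FRAME_LENGTH = 896  # total frame length stated in the text, in bytes


def checksum_ok(frame: bytes) -> bool:
    """True if the last byte equals the lowest byte of the sum of all preceding bytes."""
    return len(frame) == FRAME_LENGTH and (sum(frame[:-1]) & 0xFF) == frame[-1]


def read_le_uint(frame: bytes, offset: int, size: int) -> int:
    """Decode an unsigned integer stored low byte first (little-endian)."""
    return int.from_bytes(frame[offset:offset + size], byteorder="little")


def build_synthetic_frame(value: int, offset: int) -> bytes:
    """Build a zero-filled test frame holding one 4-byte value (not a real NLink frame)."""
    body = bytearray(FRAME_LENGTH - 1)
    body[offset:offset + 4] = value.to_bytes(4, byteorder="little")
    return bytes(body) + bytes([sum(body) & 0xFF])


if __name__ == "__main__":
    HYPOTHETICAL_OFFSET = 10  # placeholder position, not the real field offset
    frame = build_synthetic_frame(15100, HYPOTHETICAL_OFFSET)
    print(checksum_ok(frame), read_le_uint(frame, HYPOTHETICAL_OFFSET, 4))
```

Checking the length and the Sum Check before decoding any field lets a parser discard truncated or corrupted frames early.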

The antenna gain of LinkTrack P-B UWB has an adjustable range of 0 to 33.5 dB. In this dataset, the gain for data collection is set to 33.5 dB to extend the communication and measurement range as far as possible. This module supports multiple baud rates and measurement refresh frequencies; for this measurement, the baud rate is set to 921600 and the measurement refresh frequency to 50 Hz, to minimize the impact of measurement time on ranging during the measurement process.

The communication channels of LinkTrack P-B UWB are formed by a combination of radio frequency channels and encoding. It supports 6 radio frequency channels and 2 encoding methods, totaling 12 combinations. For the specific application scenarios of this dataset, channels 2, 3, 4, and 5 are used as the system communication channels for specific measurements. Table 1 shows the selected UWB parameter combinations used to create the dataset.

Table 1 The selected UWB parameter combinations.

Time synchronization for LinkTrack P-B UWB includes both intra-network time synchronization and system time synchronization. The time synchronization between UWB devices within the network is achieved through communication between the internal micro-server hardware of the modules, with synchronization accuracy reaching the nanosecond level, conforming to ISO9001 standards. The specific details of the synchronization are protected by commercial copyright. The time synchronization between the UWB network and the PC system is implemented by the data reception program, with an accuracy of milliseconds. This millisecond-level time synchronization is used for information exchange between the networked devices and the PC end, and it satisfies the requirements for dataset construction in two different scenarios. The specific process is as follows: When the first positioning or ranging frame is received from the UWB output, the PC time is recorded as 10:42:50.300 on June 15, 2024. The system_time in this frame of UWB data, converted to decimal, is 5100, which corresponds to 5.1 seconds. This establishes the corresponding relationship between the two clocks. If a subsequent UWB data frame with a system_time of 15100 is received, then the PC time for receiving this frame would be 10:43:00.300 on June 15, 2024.
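The clock correspondence in this example can be sketched as follows. The function name `make_clock_mapper` is an illustrative assumption, as is the treatment of system_time as a millisecond counter (consistent with 5100 corresponding to 5.1 seconds); counter wrap-around is not handled.

```python
from datetime import datetime, timedelta


def make_clock_mapper(pc_time_first_frame, system_time_first_ms):
    """Return a function mapping a UWB system_time (ms) to PC time."""
    def to_pc_time(system_time_ms):
        elapsed = timedelta(milliseconds=system_time_ms - system_time_first_ms)
        return pc_time_first_frame + elapsed
    return to_pc_time


if __name__ == "__main__":
    # Values from the example in the text.
    first_pc_time = datetime(2024, 6, 15, 10, 42, 50, 300000)
    to_pc_time = make_clock_mapper(first_pc_time, 5100)
    print(to_pc_time(15100))
```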

All parameters have been briefly described, and for detailed explanations, please refer to the device manual and standard documentation24.

The UWB base station system used to create this dataset comprises 8 reference base station nodes, numbered 1-8. Each node is equipped with a LinkTrack P-B UWB module, a USB power cable, and a 5.5 V regulated mobile power supply, with the module attribute set to Anchor. In each test environment, the base station nodes are placed at manually pre-determined reference positions. Using a laser rangefinder and based on the preset coordinates, we determine the placement distances of the base station modules along the X, Y, and Z axes of the environmental coordinate system. After the base station modules are placed, we conduct multiple inspections with the laser rangefinder to ensure that the absolute reference position accuracy of the base station nodes in the measurement coordinate system is within 1 cm. The physical reference base station node is shown in Fig. 6a.

Fig. 6

(a) Physical image of reference base station node, (b) Physical image of drone node.

The UWB UAV cluster system used to create this dataset consists of 7 drone nodes, numbered 1-7. Each node is equipped with a LinkTrack P-B UWB module, connected with a USB power cable and powered by a 5.5 V regulated mobile power supply, with the module attribute set to Tag. The hardware module of the drone node is fixed inside the storage compartment of a six-rotor drone and flies according to the planned formation trajectory during the measurement process. Each drone is also equipped with a three-axis laser rangefinder (with a resolution of 1 mm, a measurement accuracy of ±1.5 cm, a standard deviation of less than 5 cm, and equipped with a level), with each axis’s laser completing the laser ranging for the corresponding dimension. The laser ranging results are displayed on the host computer using the software provided by the manufacturer of the three-axis laser rangefinder, ensuring that the drone is as close as possible to the planned real trajectory points in the measurement space. When the three-axis ranging results are consistent with the preset coordinates, the measurement program is initiated to start collecting UWB device data. The physical UAV node is shown in Fig. 6b. It should be noted that the drones used in the construction of this dataset are equipped with GPS, IMU, and visual equipment; however, the GPS/IMU/visual data are only used to assist in controlling the movement and attitude of the drones and are not part of the dataset construction. They do not communicate with the UWB devices or the PC, and their position accuracy information is not related to the dataset construction.

Measurement program

The measurement tasks in this article can be broadly classified into four categories: two outdoor environment categories (straight formation and climbing formation) and two indoor environment categories (dispersed formation and circular formation). The measurement procedures for each category are consistent. This section takes the outdoor environment straight formation task category as an example to illustrate the system measurement procedures of this dataset. The basic measurement procedures are shown in Fig. 7.

Fig. 7

Measurement program.

Step 1: Select an outdoor building scene, collect and analyze the physical characteristics of the area, design the distribution and reference positions of the base station nodes based on actual needs, plan the flight trajectory of the UAV straight formation nodes (30 sets of trajectory points are planned in a single formation, with each set containing seven UAV node trajectory positions), and make corrections in the building environment. Save the corrected reference positions and trajectory positions in .mat file format (each formation task has one flight trajectory, with 30 sets of trajectory points per trajectory, and each set of trajectory points has seven position coordinates represented by a 3 × 7 matrix). Data collection during the measurement process is carried out uniformly at equal displacement intervals, making the dataset suitable for evaluating multi-target timing tracking algorithms.

Step 2: Place eight base station nodes at the reference positions in the measurement environment. The base station nodes should not be collinear and should be placed as close to the edges of the environment as possible to ensure coverage of the entire measurement area. Place seven UAV nodes at the planned starting points of the trajectories. During the measurement process, the UAVs will fly according to the planned trajectories, ensuring the orientation of the three-axis rangefinder to ensure that the actual positions of the UAV nodes are as close as possible to the preset true values.

Step 3: Complete the measurement data reception and parsing code according to the relevant settings of the UWB module and debug the data reception program code. Power on all equipment and check the status of the equipment required for measurement. Test whether the measurement system can perform data measurement tasks through joint debugging of software and hardware. Once all parts of the measurement system are confirmed to be correct, start the official measurement task. Before the official measurement begins, perform a ranging experiment with seven distance lengths in each scenario and keep the experimental results for subsequent ranging error analysis.

Step 4: Start the measurement application to begin cluster formation measurement. For a single measurement task with a set of preset trajectory points (a total of 30 sets of trajectory points, with each set containing seven UAV node trajectory positions), under a single communication channel, the measurement application will cyclically traverse eight reference base station nodes and seven UAV nodes, obtaining a total of 56 distance measurements for the base station node-UAV node combinations, thus forming a ranging matrix. This ranging matrix describes the real state of the UAV cluster nodes at a set of trajectory points in a trajectory segment under a single measurement task. By cycling through the four communication channel lists of the LinkTrack P-B UWB, four ranging matrices with 224 measurements for a set of trajectory points are obtained. Therefore, for an entire trajectory of a measurement task, 30 files can be recorded, with each file containing 224 raw measurement data. Table 2 shows the total number of measurements performed in a specific environment before data preprocessing.

Table 2 The total number of measurements.

Step 5: Check the measured data records to ensure the completeness of the dataset composition. Parse the ranging data frames to obtain decimal distance data, analyze the ranging errors, compute the position solution of the UAV cluster nodes, calculate the distances between the UAV cluster nodes, and analyze the positioning errors. Similarly, for an entire trajectory of a measurement task, 30 files can be recorded, with each file containing four base station node-UAV node ranging matrices (each matrix with a size of 8 × 7), four calculated positions of the UAV cluster nodes (each matrix with a size of 3 × 7), and four UAV node-UAV node ranging matrices (each matrix with a size of 7 × 7). All files are saved in .mat format, totaling 1440 file records.

Step 6: Technical verification of the measurement data was conducted, a manual for using the dataset was compiled, the elements for dataset usage were completed, and the dataset code was standardized, preparing for the maintenance and subsequent development of the dataset.

Data collection and processing

The UWB measurement application of a personal computer connects to the UWB device of base station node 1 using a USB serial port. After being powered on, each measurement node realizes automatic networking and communication through a microserver, and waits for commands from the UWB measurement application.

The measurement application uses a data frame reception code to save the original measurement data frames to a corresponding folder. A data frame parsing code is used to read the data frames and convert the bitstreams into decimal base station node-drone node distance measurements. A ranging error analysis code verifies the distribution of ranging errors during the measurement process. A data preprocessing code performs an initial quality estimation of the ranging data. A preliminary positioning solution code calculates the actual flight position of the drone node in real-time. A node distance calculation code determines the distance information between drone nodes. A positioning error analysis code verifies the positioning error distribution obtained from the measurements. Specific code names and purposes are provided in the data recording section.

The data frame reception code uses the Serial tool function, which has the following relevant attributes for this measurement task: the port number is consistent with base station node 1, the BaudRate is set to 921600, consistent with the UWB settings. The InputBufferSize is set to 1024 to ensure that at least one complete data frame can be received. The DataBits is set to 8 for convenient byte-sized data saving. The data frames are bitstreams consisting of 0s and 1s, saved in 8-bit lengths.

The data frame parsing code reads the saved data frames, verifies the start byte and checksum based on their positions, parses the ranging bitstreams in little-endian mode according to the selected data protocol format, converts the hexadecimal measurement results to decimal measurement data, and saves them in matrix form in the corresponding folder.

The ranging error analysis code compares the ranging results with the true distances, calculates the ranging errors in two application environments, and provides a ranging error histogram to support subsequent data quality estimation. The data preprocessing code performs an initial quality estimation of the ranging data based on the ranging error results, identifies abnormal distance data in continuous motion trajectories, and compensates and corrects them using the mean and variance of the ranging errors.

The preliminary positioning solution code uses the parsed decimal ranging data as a basis, employs a simple least squares method to achieve position calculation, and saves the results. While there are many collaborative positioning algorithms available, such as Kalman filter algorithms and factor graph algorithms, the use of the classical least squares algorithm ensures the originality of the positioning data in the dataset, facilitating subsequent mining and innovative research by dataset users.

The positioning error analysis code analyzes the positioning errors by comparing the RMSE between the least squares position solution and the pre-set trajectory point position data, and provides positioning error histograms for four types of measurement tasks. The node distance calculation code calculates the Euclidean distance between drone nodes in three-dimensional space and saves the results. The ranging information between drone nodes and between base station nodes and drone nodes together constitute the collaborative ranging data in this dataset.
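Both calculations are short enough to sketch in Python. The function names are illustrative. The RMSE shown is the usual root mean square of the three-dimensional position error over a set of points, which is assumed here to correspond to the error measure used in the paper.

```python
import math
from itertools import combinations


def pairwise_distances(positions):
    """Euclidean distance between every pair of nodes, keyed by (i, j), i < j."""
    return {
        (i, j): math.dist(positions[i], positions[j])
        for i, j in combinations(range(len(positions)), 2)
    }


def rmse(estimated, reference):
    """Root mean square 3-D error between estimated and reference positions."""
    squared = [math.dist(e, r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Three hypothetical drone positions (m).
nodes = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
print(pairwise_distances(nodes))

# Hypothetical estimates compared against pre-set trajectory points.
print(rmse([(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)], [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]))
```

For the seven drone nodes of this dataset, `pairwise_distances` would return 21 node-to-node distances per set of trajectory points.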

It is worth noting that the measurement system built in this paper undergoes strict signal synchronization and communication synchronization. All modules are equipped with the same specification of antennas and auxiliary equipment, theoretically avoiding UWB hardware signal transmission and reception delays and delays caused by additional hardware. Therefore, no additional measurement offset compensation is required in terms of measurement principles.

Data Records

The dataset is publicly available on Figshare25.

The dataset archive is named Data_set.zip. The main directory in the Data_set archive contains the complete dataset content, and its hierarchical structure is shown in Fig. 8. The Data_set directory includes five folders: Environment0, Environment1, Raw_data, Technical_validation, and Code. The first two folders contain all measurement-related data for each environment. The third folder, Raw_data, contains all the files for the original measurements. The fourth folder, Technical_validation, contains the results of the technical validation scripts. The last folder, Code, contains all the code used for the measurements.

Fig. 8. Dataset hierarchy.

Each environment folder contains an Anchors.mat file that lists the reference base station nodes (A1–A8) and their positions in the selected environment. Each reference base station node position is given by its x-axis, y-axis, and z-axis coordinates, in meters. Next, there is an Environment.png file that illustrates the specific measurement environment. Most importantly, there is a Flying_path folder that contains two types of simulated flight paths for cluster formations. Each flight path includes 30 sets of trajectory points, recorded as a list of drone node positions with x-axis, y-axis, and z-axis values, in meters.

The Raw_data folder contains all the measurements collected in both environments. It holds two environment folders, Environment0 and Environment1, each containing two cluster formation flight type folders (Flying_straight and Flying_climb for Environment0; Flying_disperse and Flying_circle for Environment1). Each formation flight type folder corresponds to the trajectory points of one simulated flight path and holds 30 sets of measurement data, stored in sequence from 1 to 30. Each set of trajectory points is measured on four communication channels, so each of its four subfolders contains four files, one per channel. The first subfolder, Original_data, records the original measurement data frames without any processing. Its files are named data_hex_orgin_chi.mat, where chi denotes the system measurement channel number. The second subfolder, Distance_Anchor_Label, stores the measured distances between each of the seven drone nodes and the eight reference base station nodes. Its files are named Dis_anchor_label_chi.mat. The third subfolder, Distance_Label_Label, stores the measured distances between the seven drone nodes. Its files are named Dis_label_label_chi.mat. The last subfolder, Position_Label, stores the real-time positioning solutions of the seven drone nodes during the flight. Its files are named Position_label_chi.mat.
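As a convenience for navigating this hierarchy, the following Python sketch builds the path of a single measurement file. Several details are assumptions rather than facts from the dataset: that the 30 trajectory point folders are named with plain numbers, that the channel suffix takes the form `ch<number>`, and that the files carry a `.mat` extension like Anchors.mat. The function name `measurement_path` is hypothetical, and the names should be checked against the downloaded archive before use.

```python
from pathlib import Path

# Flight type folders per environment, as listed in the description above.
FLIGHTS = {
    "Environment0": ("Flying_straight", "Flying_climb"),
    "Environment1": ("Flying_disperse", "Flying_circle"),
}

# Subfolder name -> file name prefix.
PREFIXES = {
    "Original_data": "data_hex_orgin",
    "Distance_Anchor_Label": "Dis_anchor_label",
    "Distance_Label_Label": "Dis_label_label",
    "Position_Label": "Position_label",
}


def measurement_path(root, env, flight, point, kind, channel):
    """Return the assumed path of one measurement file inside Raw_data."""
    if flight not in FLIGHTS.get(env, ()):
        raise ValueError(f"{flight!r} is not a flight type of {env!r}")
    if not 1 <= point <= 30:
        raise ValueError("trajectory point sets are numbered 1 to 30")
    if kind not in PREFIXES:
        raise ValueError(f"unknown subfolder {kind!r}")
    # Assumed naming: "<prefix>_ch<channel>.mat" inside a folder named "<point>".
    name = f"{PREFIXES[kind]}_ch{channel}.mat"
    return Path(root) / "Raw_data" / env / flight / str(point) / kind / name


p = measurement_path("Data_set", "Environment0", "Flying_climb", 12,
                     "Distance_Anchor_Label", 2)
print(p.as_posix())
```

The resulting files could then be read with a MAT-file reader such as `scipy.io.loadmat`, provided they are not stored in the HDF5-based v7.3 format.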

The Technical_validation folder contains the technical validation results of the dataset, in two parts: ranging error estimation and positioning error estimation. The subfolder Range_error contains ranging error estimates from the two environments. Under each environment folder, there are line-of-sight (LOS) and non-line-of-sight (NLOS) error estimates, with histograms of ranging errors for the different measurement channels. The subfolder Position_error contains positioning error estimates for the two environments, covering the different cluster formation flight types in each.

The Code folder contains all the code and functions for the entire process of data measurement and subsequent processing, divided into two parts according to the measurement environments: Environment0 and Environment1. The Environment0 folder contains two subfolders, “Flying_straight” and “Flying_climb”, and each subfolder includes code for data frame reception, data preprocessing, data parsing, ranging error analysis, ranging error compensation, positioning calculation, positioning error analysis, etc. The Environment1 folder is organized in the same way.

To make the dataset easier to use for research purposes, we additionally provide a method for organizing the continuous trajectory nodes of the drones. We also provide a dataset user manual, which is publicly released on GitHub along with the program code.

Technical Validation

Ranging error analysis

The ranging error increases with the severity of the obstruction along the LOS path and with the distance between devices. Obstacles that block the line of sight but can be penetrated by radio frequency signals do not completely block UWB signals, but they alter the propagation speed and attenuate the signals. Other obstacles prevent UWB signals from continuing along the direct path. Blocked UWB signals can be deflected onto other propagation paths that are longer than the direct path, and some UWB signals reach their destination through diffraction. All these effects increase the propagation delay of the original UWB signal and disperse the propagation delays of the multipath components, thus increasing the range measured by TOF under NLOS conditions. We refer to the difference between the measured range and the actual range as the ranging error. In environments with longer distances and more reflective and obstructive obstacles, the expected dispersion of ranging errors under NLOS conditions is greater. To demonstrate the impact of the environment on the dispersion of ranging errors, ranging error histograms were created for all measurement environments.

We conducted the ranging error analysis for both indoor and outdoor environments based on the actual measurement scenarios. Experiments were carried out separately for LOS and NLOS conditions according to the signal propagation characteristics. Taking the LOS condition in the indoor environment as an example, the experiment used seven test distances: 3.0 m, 6.0 m, 9.0 m, 12.0 m, 15.0 m, 18.0 m, and 21.0 m. Two UWB nodes were used for distance measurement, one fixed and the other mobile. A laser rangefinder was used to establish the true physical distance between the two nodes. By placing the mobile node at the different distances, UWB ranging results at each distance were obtained. The difference between the measured distance and the physical distance was saved and analyzed as the ranging error. This process was executed sequentially on the four system measurement channels. In the NLOS condition, a wooden cube with a side length of 1.0 m was placed between the fixed node and the mobile node. The cube’s surface and interior contained irregular metal sheets, and the same measurement process was carried out. The estimation of the ranging error for the outdoor environment was similar. It should be noted that during dataset construction, the UWB ranging errors were not used to process the experimental data, in order to retain the original flight ranging data. The dataset is therefore also suitable for scholars engaged in error analysis.
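The error computation described above amounts to subtracting the reference distance from each UWB measurement and then summarizing the result. A minimal Python sketch follows. The sample values and the 0.05 m bin width are invented for illustration and are not taken from the dataset.

```python
import math
import statistics
from collections import Counter


def ranging_errors(measured, true_distance):
    """Ranging error = measured range minus laser-verified distance (m)."""
    return [m - true_distance for m in measured]


def error_histogram(errors, bin_width):
    """Count errors per bin; returns (left_edge, count) pairs in order."""
    counts = Counter(math.floor(e / bin_width) for e in errors)
    return [(index * bin_width, counts[index]) for index in sorted(counts)]


# Hypothetical UWB readings taken at the 3.0 m test distance.
readings = [3.02, 3.07, 2.96, 3.11, 3.04]
errors = ranging_errors(readings, 3.0)

print("mean error (m):", statistics.fmean(errors))
print("std of error (m):", statistics.pstdev(errors))
for left_edge, count in error_histogram(errors, 0.05):
    print(f"[{left_edge:+.2f}, {left_edge + 0.05:+.2f}) m: {count}")
```

Repeating this for each test distance, channel, and LOS or NLOS condition gives the per-condition error distributions that the histograms summarize.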

Figures 9 and 10 show the error histograms of seven sets of ranging data in outdoor building environments and indoor venue environments, with each environment splitting the ranging error histograms into two parts. The first histogram shows the ranging error under LOS conditions, and the second histogram shows the ranging error under NLOS conditions.

Fig. 9. (a) Ranging error under LOS conditions in Environment 0 (Outdoor), (b) Ranging error under NLOS conditions in Environment 0.

Fig. 10. (a) Ranging error under LOS conditions in Environment 1 (Indoor), (b) Ranging error under NLOS conditions in Environment 1.

In outdoor building environments, the dispersion of ranging errors under LOS conditions is low, and the overall error range remains small. The main reason is that there is less electromagnetic interference outdoors, so UWB signals are less affected. However, under NLOS conditions in outdoor buildings, the thick reinforced concrete walls at specific outdoor locations attenuate UWB signals significantly and introduce relatively large transmission delays, which causes some large ranging errors. In indoor venue environments, the ranging error range under LOS conditions is smaller than under NLOS conditions, but it is still larger than under LOS conditions in outdoor buildings. The main reason is that indoor venues contain many metal surfaces, which have high reflectivity and affect the propagation of UWB signals. Additionally, various types of electromagnetic interference are present indoors, which significantly degrades the transmission quality of UWB signals. The ranging error range under NLOS conditions in indoor venues is larger than under LOS conditions, but smaller than under NLOS conditions in outdoor buildings. Indoor environments contain more metal objects and transmissive materials, and many NLOS signals are reflected onto other reflective surfaces, extending the actual propagation time (distance) of the signal. However, indoor walls are thinner than outdoor walls, so signals penetrate them more easily, resulting in less signal attenuation and a narrower overall error distribution.

Positioning error analysis

The results of positioning calculation are typically dependent on measurement data and positioning algorithms. The previous section has discussed the error conditions of measurement data, but the impact of positioning algorithms on positioning accuracy is often difficult to analyze uniformly due to inconsistencies in PC parameters. In this case, it is necessary to study the differences between the positioning calculation results and the actual positions, which we refer to as positioning error analysis. After data collection is completed, UWB ranging data from each drone node to the reference base station nodes are obtained. The raw ranging data is not preprocessed, and the weighted least squares method is used to calculate the drone node positions. Formula (10) is used to calculate the comprehensive mean square error across the three dimensions, with the mean square error serving as the positioning error. We have created positioning error histograms for all measurement environments.

Figure 11 shows the overall positioning error histograms for outdoor building environments and indoor venue environments. Each environment splits the positioning error histograms into two parts based on the cluster formation flight tasks. For indoor venue environments, the first histogram shows the positioning errors for dispersed formation flights, and the second histogram shows the positioning errors for circular formation flights. For outdoor building environments, the first histogram shows the positioning errors for straight formation flights, and the second histogram shows the positioning errors for ascending formation flights.

Fig. 11. (a) Positioning error of dispersed formation in Environment 1 (Indoor), (b) Positioning error of circular formation in Environment 1, (c) Positioning error of straight formation in Environment 0 (Outdoor), (d) Positioning error of ascending formation in Environment 0.

In indoor scenarios, the positioning errors for dispersed formation flights are mainly distributed between 0.5 m and 1.0 m, while the positioning errors for circular formation flights are mainly distributed between 0.0 m and 0.5 m. A small number of trajectory points have positioning errors exceeding 1.0 m, mainly because electromagnetic interference causes sudden indoor ranging errors, which enlarge the errors of some measurements and affect the distribution of the positioning results. In outdoor scenarios, the positioning errors for straight formation flights are mainly distributed between 0.0 m and 1.0 m, with a certain number of trajectory points having positioning errors exceeding 2.0 m. The increase in non-line-of-sight conditions is the main reason for the larger errors. The positioning errors for ascending formation flights are mostly distributed between 0.5 m and 1.0 m. During this formation flight, the drones pass many thick cement walls, and changes in flight altitude increase the overall positioning errors.

Usage Notes

The primary uses of this dataset are to develop and validate indoor and outdoor multi-target positioning technologies, enhance and improve single-target and multi-target continuous motion navigation algorithms, enhance the collaborative navigation capabilities of UAV cluster formations, improve the robustness of UAV cluster formation maintenance algorithms, develop and expand high-performance cluster path planning algorithms, and verify and enhance multi-source fusion navigation algorithms for clusters.

The dataset contains UWB ranging information and position information between multiple UAV nodes and base station nodes, which can be used for the development and validation of large-scale node positioning algorithms indoors and outdoors, such as the WLS positioning algorithm26. The UAV nodes are movable during the measurement process, and the dataset includes four formation flight trajectories recorded separately in two environments, one outdoor and one indoor. Because data are collected at predefined, equally spaced trajectory points along each flight path, navigation and tracking during continuous motion can be studied.

In addition to the ranging information between UAV nodes and base station nodes, the dataset also provides distance information between multiple UAV nodes during flight. By synergistically utilizing these two types of ranging information and fully realizing the transfer of positional information confidence, it can be applied to collaborative positioning algorithms to improve the accuracy of cluster collaborative navigation, such as Kalman filtering algorithms and factor graph collaborative positioning algorithms27,28. BP neural networks can also be used to estimate ranging errors and positioning errors under specific conditions.

The data records in the dataset are collected from multiple UAVs along multiple trajectories at equally spaced points. Through the analysis and calculation of the ranging data, the dataset faithfully reflects the changes in the team configuration during the movement of the UAV cluster, making it well suited to UAV cluster formation maintenance algorithms based on distance and position measurements. Since the number of measurements in the dataset supports deep network training, formation flight can also be modeled as a Markov process, with deep learning and reinforcement learning, such as the DQN algorithm29, introduced to achieve continuous formation maintenance.

The dataset records flight trajectories for different tasks in different environments, and different flight trajectories avoid a certain number of obstacles, which is reflected in the concentrated changes in the position data of each UAV. Therefore, various cluster path planning algorithms and strategies can be developed and evaluated on the dataset to enhance the ability of the cluster formation to recognize, avoid obstacles, and navigate safely.

The dataset consists of UWB data obtained from measurements. UWB, as a commonly used navigation source, is increasingly being integrated with other navigation sources, such as GNSS and IMU, across various fields to enhance navigation accuracy30. Therefore, this dataset can be used as one navigation source in multi-source fusion navigation algorithms, such as the multi-navigation source fusion algorithm based on information geometry31, to verify and evaluate the quality of fusion and the positioning performance.

Indeed, the work presented in this paper also has certain limitations. When constructing the dataset, only 30 trajectory points were selected, which is insufficient for long-duration and long-distance navigation tasks, failing to reflect the timeliness and fault tolerance of the algorithm. Furthermore, the dataset only includes UWB measurement data; when applying it to multi-source fusion navigation, the issue of time synchronization needs to be considered additionally, which is not conducive to the efficient development of fusion navigation methods. In subsequent research, we plan to expand the dataset, using more time steps to create a cluster navigation dataset for long duration and long-distance travel, and to collect IMU and visual sensor data, so that the dataset can better serve the technology of multi-source information fusion navigation.

Readers interested in the overall construction process of this dataset can explore the individual technologies in more depth according to their own interests, or examine the results directly without running all the time-consuming and resource-intensive experiments.