Introduction

Cells and their interactions within the human body are both complex and diverse, which necessitates the focus on single-cell analysis in fields such as genomics1,2, transcriptomics3,4, proteomics5,6,7, and metabolomics8,9. These single-cell studies enable the discovery of gene regulation mechanisms and protein expression dynamics, which have completely transformed how we understand the role of cell heterogeneity in health and disease10,11,12. Many technologies and platforms have been developed for single-cell analysis, such as flow cytometry13, microwell microfluidics14, microdroplet microfluidics15, optoelectronic tweezers16, and digital microfluidics (DMF)17 (Table S1). A DMF system offers several benefits over other platforms, as it can simultaneously perform sample separation, real-time manipulation, and in situ analysis in parallel on a two-dimensional surface18,19,20. Furthermore, DMF systems are compatible with various detection modalities (e.g., optical or electrochemical detection), making them particularly useful in single-cell research applications21,22,23. A DMF system comprises a DMF driver and a DMF chip, which manipulates the droplets. Most DMF chips feature a passive-matrix (PM) structure, which utilizes a grid of large electrodes, each individually connected to a control line, to produce droplet movement. However, PM-DMF systems are constrained by their wiring designs and, therefore, their electrode density, resulting in a limited number of individually controlled droplets24. To address this issue, researchers have developed active-matrix digital microfluidic (AM-DMF) systems with integrated thin-film transistors at each pixel, allowing each pixel to be individually addressed by scanning the row and column control signals and enabling the parallel manipulation of thousands of cell-containing droplets25,26,27.

On AM-DMF systems, achieving intelligent single-cell sample manipulation (SCSM) remains a great challenge. SCSM often includes generating nanolitre-scale cell-captured droplets, sorting single-cell droplets, and assigning the sorted droplets to desired locations for further experiments. At present, the SCSM workflow continues to rely heavily on manual input, particularly in the manual editing of droplet paths and the manual sorting of single-cell droplets. As the number of droplets increases, these processes become increasingly time-consuming and inefficient. It is therefore essential to develop automated workflows for efficient and reliable SCSM. Existing studies have demonstrated the successful application of artificial intelligence (AI) to automation design in biology28,29,30 and microfluidics31,32,33,34,35,36,37. However, few studies have applied AI to AM-DMF systems. In our previous work, we integrated AM-DMF technology with AI to achieve automated biosample determination38, which supports single-cell recognition. However, cell recognition can be affected by oil bubbles within droplets. Moreover, the droplet path design was inefficient due to manual editing processes.

Previous studies on droplet path planning primarily focused on clearly defined, bounded problems within specific scenarios. Compiling-based approaches reduce the droplet routing problem to well-known problems, such as integer linear programming39 and Boolean satisfiability40. These algorithms are complete, optimal, and efficient for small-scale problems. However, their computational complexity grows exponentially with problem scale. Moreover, they primarily aim to minimize completion time and are not suitable for other optimization criteria, such as minimizing path crossover to reduce the potential for cross-contamination. Numerous priority-based methods that sequentially generate routes for individual droplets have been proposed, with varying priority rules and strategies to avoid detours or deadlocks41,42. They are fast and flexible, and easily integrate with other goals such as minimizing active electrodes or cross-contamination. However, their effectiveness has been demonstrated only in small-scale scenarios. Reinforcement learning-based methods have been proposed to deal with online dynamic situations such as movement failure caused by electrode degradation43,44. These techniques are neither complete nor optimal and face challenges with sparse rewards and unstable training. In recent years, large language models (LLMs) have become a powerful tool that has been widely adopted in both academia and industry, due to their unprecedented performance and flexibility in a range of applications45,46. They have shown great potential in robotic-based platforms by simplifying and minimizing the labor effort45,47,48. The emergent capabilities of LLMs provide new perspectives for droplet path planning. They enable users to obtain solutions that meet specific requirements through human–computer interaction without redefining the problem. Given the experimental conditions and objectives, LLMs can generate corresponding path planning strategies.
However, their application in droplet path planning still faces challenges, including high data requirements and substantial computational resources for training and inference. In this work, we take the first step forward and develop an AM-DMF platform that realizes fully automated biological procedures for intelligent SCSM. The platform employs an AI-based model for sorting single-cell droplets and an LLM-based droplet path generation (DPG) model for generating nanolitre-scale droplets and planning the paths of single-cell droplets to designated locations. Based on a fully programmable AM-DMF system, we achieve a breakthrough for SCSM by combining LLMs and object detection technologies, significantly enhancing experimental efficiency and broadening the horizons of AI applications in the life sciences. The automated workflow combines the DPG model and the cell recognition model for intelligent SCSM, as illustrated in Fig. 1. By integrating these methods, our AM-DMF platform enhances the accuracy and reliability of SCSM while also increasing the flexibility and automation of its workflow. This work can be summarized according to the following research highlights.

Fig. 1: AI-enabled high-throughput SCSM on an AM-DMF system.

The automated workflow combines the DPG model and the cell recognition model for intelligent SCSM

Advancement of the platform

As a pioneering work, we introduce a fully automated AM-DMF platform for SCSM based on LLMs that processes 1600–1700 droplets/h, achieves a single-cell sample generation rate of over 25%, and attains a model identification precision exceeding 98%.

Novelty of the proposed method

We have addressed several challenges in the DMF field: a three-class detection method that enhances cell recognition accuracy by identifying oil bubbles, a droplet movement method that recognizes cells concealed at the edges of droplets, and a DPG model that automatically generates workflows, eliminating the need for manual editing.

Relevance and potential to advance new biological applications

The proposed method serves as a powerful platform for intelligent SCSM, supporting a wider range of applications in single-cell research and extending its utility to various research areas, such as biological sciences and chemistry.

Results and discussion

Distinguishing cells and oil bubbles

Droplets are enveloped in the medium oil on the AM-DMF chip. During droplet transportation, the thin oil film sandwiched between the droplet and the chip may break up, forming oil bubbles49. Oil bubbles under droplets form due to the instability of an entrapped thin oil film caused by electrostatic pressure and surface tension, with their size influenced by the applied voltage, as confirmed through both theoretical analyses and numerical simulations50. During the experiments evaluating the previously proposed model for cell recognition38, we observed that some oil bubbles and cells exhibited similar appearances, causing the cell recognition algorithm to misidentify oil bubbles as cells. This misidentification problem can be challenging to correct, even for a human observer. To increase the accuracy of the model in terms of recognizing cells, we developed a three-class method that builds on the previous two-class recognition algorithm (droplets, cells) by adding an oil bubble identification mechanism. To ensure the accuracy of data labeling, we implemented a rigorous annotation process. Each image was independently annotated by multiple experienced experts. Only images with unanimous annotations were included in the training dataset, while those with discrepancies were discarded. The three-class detection model was trained for 100 epochs. Figure 2a–f shows the training and test set loss functions for the droplet, cell, and oil bubble detection tasks. The results indicate that the three-class detection model has converged and achieved optimal performance. Figure 2g shows the confusion matrix of the three-class model. To further verify the effectiveness of our proposed three-class method, we carried out a model comparison experiment on the same dataset to test the two-class and three-class methods separately.
The two-class model was trained on a dataset containing only droplets and cells, whereas the dataset for the three-class model included an additional category for manually labeled bubbles. The results shown in Fig. 2h reveal that the two-class method achieved an average precision on the test set \({\rm{AP}}_{50}^{\rm{test}}\) of 96.4% for cells, whereas the three-class method slightly increased this value to 96.8%, representing an improvement of 0.4%. In terms of \({\rm{AP}}_{75}^{\rm{test}}\), the two-class method achieved 90.9% for cells, with the three-class method enhancing this value to 91.9%, representing an improvement of 1.0%. Finally, regarding the overall \({\rm{AP}}_{50:95}^{\rm{test}}\), the two-class method reached 74.8% for cells, and the three-class method improved this score to 75.5%, representing an improvement of 0.7%. Table S2 compares the cell category recognition performance of two-class and three-class methods across different models. As illustrated in Fig. 2i, the two-class detection model identified 5 cells, whereas the three-class detection model identified 4 cells and 13 oil bubbles, revealing that one of the oil bubbles was misclassified as a cell in the two-class detection results. Figure 2j shows that in the two-class detection task, 5 cells were detected, whereas in the three-class detection task, 3 cells and 19 oil bubbles were identified, correcting the misclassification of two oil bubbles as cells in the two-class detection task. Finally, Fig. 2k indicates that two-class detection resulted in 2 cells, whereas three-class detection resulted in 1 cell and 4 oil bubbles, avoiding the mislabelling of an oil bubble as a cell in the two-class detection task.
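The AP metrics above are defined with respect to an intersection-over-union (IoU) threshold between predicted and ground-truth boxes; a prediction counts as a true positive at AP50 if IoU ≥ 0.50 and at the stricter AP75 only if IoU ≥ 0.75. A minimal sketch of the IoU computation (illustrative only; the paper does not specify its evaluation code):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

AP50:95 averages the AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05, which is why it is the most demanding of the three metrics.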

Fig. 2: Comparative analysis of the prediction results produced by the two-class and three-class models.

The loss functions of the cell detection model on the training dataset (a–c) and test dataset (d–f). g The confusion matrix of the three-class model. h Cell category recognition performance comparison between two-class and three-class models. i–k The red boxes indicate the droplets predicted by the model, the pink boxes represent the cells predicted by the model, and the orange boxes denote the oil bubbles predicted by the model

Multi-scenario cell recognition testing and potential application analysis

In practice, cell recognition models may encounter various challenging situations, such as unclear images or experiments conducted on different chip substrates. When screening single-cell samples, the autofocus algorithm of the AM-DMF system is capable of controlling the imaging plane of the microscope within ±25 μm of the focal plane. We conducted a series of tests examining images with varying focus levels to assess the performance of the model. Figure 3a shows a schematic diagram of the model’s predictions produced for droplets (red boxes), cells (pink boxes), and oil bubbles (orange boxes) at various imaging clarity levels. Specifically, we conducted 20 groups of tests. In each group, we fixed the x–y position of the system to target droplets on the chip while moving the microscope along the z-axis to vary the focus of the captured image. The clearest image was taken to define the focal plane, after which the microscope was moved upwards and downwards at the same position in 5 μm steps, up to 100 μm in each direction. The results of the droplet recognition test are shown in Fig. 3b. The model could correctly detect droplets across all 20 positions, regardless of the image clarity level. The model maintained this accuracy for an average distance of 100 μm above and below the focal plane. Figure 3c depicts the results of the cell recognition tests. The model exhibited a slight accuracy variance across the 20 positions for images with different clarity levels. The model could correctly detect cells at average distances of 78 μm above the focal plane and 72.75 μm below it. Notably, within the controllable focusing range of ±25 μm of the AM-DMF system, the model demonstrated high accuracy in terms of recognizing droplets and cells, laying the foundation for a fully automated algorithm that can manipulate high-throughput single-cell samples in the next section.
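The sweep procedure above can be sketched as follows; `detect_at` is a hypothetical hook (not part of the platform's API) that reports whether the model detects the target correctly at a given z-offset, in μm, from the focal plane:

```python
def focus_tolerance(detect_at, step_um=5, max_um=100):
    """Largest contiguous z-offsets above and below the focal plane at
    which detection still succeeds, probed in 5 um steps up to 100 um."""
    def limit(sign):
        last = 0
        for z in range(step_um, max_um + step_um, step_um):
            if not detect_at(sign * z):
                break
            last = z
        return last
    return limit(+1), limit(-1)  # (above focal plane, below focal plane)
```

Averaging the per-position results of such a sweep over the 20 groups yields the tolerance figures reported above (100 μm for droplets; 78 μm above and 72.75 μm below for cells).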

Fig. 3: Droplet and cell recognition performance tests at various imaging planes, as well as droplet and cell data analysis.

a Model’s predictions were produced for droplets (red boxes), cells (pink boxes), and oil bubbles (orange boxes) at different imaging planes. b Model precision rates achieved for droplet recognition at various distances from the focal plane. c Model precision rates achieved for cell recognition at various distances from the focal plane. Marginal histograms of droplet (d) and cell (e) heights and widths obtained by the cell recognition model and manual measurements

Additionally, we carried out real-time analysis of droplet morphological features, cell morphological features, and cell density on the AM-DMF system using the cell recognition model. Specifically, we analyzed 212 locations under a high-magnification field, covering 212 droplets and 310 cells. Figure 3d presents the marginal histograms of droplet heights and widths obtained from the model’s predictions and manual measurements, demonstrating the effectiveness of our proposed model. More than 80% of the droplets have diameters ranging from 110 to 130 μm. A linear regression analysis of the droplet height and width data resulted in an aspect ratio of 0.9702, indicating excellent circularity of the droplets. This demonstrates that our AM-DMF chip offers consistent electrode control. Similarly, Fig. 3e depicts the marginal histograms of cell heights and widths obtained from the model’s predictions and manual measurements, demonstrating the effectiveness of our proposed model. More than 80% of the cells have diameters ranging from 7 to 11 μm. A linear regression analysis of the cell height and width data from the model revealed an aspect ratio of 0.9754, suggesting a high level of cell circularity. This finding demonstrates that the tested cells are uniform in shape and exhibit high viability, indicating that DMF actuation is gentle on cells. Additionally, as shown in Fig. S1, we conducted 28 sets of cell density tests. The results demonstrate the significant potential of our platform in measuring these parameters, contributing to the analysis of electrode control uniformity, droplet morphological features, cell morphological features, and cell density.
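The aspect-ratio estimates can be reproduced with a least-squares fit of height against width; this sketch assumes the regression was constrained through the origin, which the paper does not state explicitly:

```python
def aspect_ratio(widths, heights):
    """Zero-intercept least-squares slope of height vs. width;
    a slope near 1.0 indicates near-circular objects."""
    num = sum(w * h for w, h in zip(widths, heights))
    den = sum(w * w for w in widths)
    return num / den
```

Applied to the model's width/height measurements, such a fit produced slopes of 0.9702 for droplets and 0.9754 for cells, per the figures above.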

Moreover, we conducted additional tests to assess the robustness of the three-class cell detection model. Figure 4a displays the model predictions produced on various AM-DMF chip backgrounds, demonstrating its good performance. Figure 4b–f illustrates the precision, recall, mAP50, mAP75, and mAP50:95 functions achieved by the three-class recognition model on the test set. Figure 4g presents the performance metrics achieved by the model on the test set. We conducted testing on 1482 locations under a high-magnification field, encompassing 10,574 boxes. Overall, the model achieved a precision of 92.7%, a recall of 93.4%, an mAP50 of 96.1%, an mAP75 of 83.4%, and an mAP50:95 of 76.4%. For the droplet category (1485 instances), the precision was 99.7%, the recall 99.6%, the AP50 99.4%, the AP75 99.4%, and the AP50:95 98.3%. For the cell category (2157 instances), the precision reached 93.1%, the recall 95.2%, the AP50 96.8%, the AP75 91.9%, and the AP50:95 75.5%. For the bubble category (6932 instances), the precision was 85.3%, the recall 85.5%, the AP50 92.1%, the AP75 59.1%, and the AP50:95 55.3%. The results demonstrate that the model exhibits robust performance in recognizing droplets, cells, and oil bubbles. Although the performance metrics for the oil bubble category are less favorable than those for the droplet and cell categories, this difference is expected, as the dataset does not label minute oil bubbles that do not impact cell recognition.
In practical applications, we can increase the confidence threshold (e.g., above 0.45) to filter out minute bubbles, which typically have low confidence scores, thereby minimizing their impact on downstream analysis. We conducted further tests to evaluate the platform’s performance on various cell types, as shown in Fig. S2. The results show the platform’s applicability across a broader range of cell types.
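The thresholding step can be sketched as a simple post-filter over the detector output (the dictionary field names here are assumptions for illustration, not the platform's actual API):

```python
def filter_minute_bubbles(detections, conf_threshold=0.45):
    """Drop low-confidence bubble boxes, which typically correspond to
    minute oil bubbles; droplet and cell boxes pass through unchanged."""
    return [d for d in detections
            if d["cls"] != "bubble" or d["conf"] >= conf_threshold]
```

Because minute bubbles tend to score below the threshold while genuine cells and droplets score well above it, this filter removes most spurious bubble boxes without affecting cell counting.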

Fig. 4: Model generalizability testing.

a The model’s predictions were evaluated across different AM-DMF chip backgrounds. The precision metric function (b), the recall metric function (c), the mAP function at 0.5 IoU (d), the mAP function at 0.75 IoU (e), and the mAP function at a range of 0.5–0.95 IoU (f) of the model on the test dataset. g Model performance metrics values

Recognizing cells concealed at the edges of droplets

Theoretically, the encapsulation of cells within droplets is expected to follow the Poisson distribution51, and the positions of the cells are distributed randomly within the droplets. Consequently, we noticed that cells positioned near the edges of droplets were often obscured, making it very difficult for both the model and manual inspection to identify them. To determine whether cells were at the edge, we performed multiple movements of 500 droplets in four directions: up, left, down, and right (Fig. S3). If additional cells were detected during any of these movements, we concluded that cells were located at the edge of the droplet. As illustrated in Fig. 5a, a statistical analysis revealed that approximately 76% of all droplets exhibited no cells at their edges, while around 24% of the droplets contained cells at their edges. Within the subset of droplets containing cells at their edges, 79.2% had one cell present, 16.7% had two cells present, and 4.2% had more than two cells present. Because cells can be concealed at droplet edges, a chosen single-cell droplet may actually contain two or more cells, which affects the reliability of droplet-based single-cell research. Therefore, we propose a droplet movement method to improve cell recognition performance. This approach reduces model detection failures for cells located at the edges of droplets. By applying a voltage to the electrodes adjacent to a droplet, a movement was induced for the target droplet. This movement displaces cells from the edge of the droplet to its interior, owing to the difference in their momenta, enhancing the detection ability of the algorithm. Figure 5b shows that the model detected 1 cell and 7 oil bubbles before the droplet movement occurred. In comparison, the model identified 2 cells and 7 oil bubbles after the droplet moved to the left, effectively resolving the issue of cells hidden at the droplet edge.
Similarly, Fig. 5c shows that prior to droplet movement, the model detected 0 cells and 12 oil bubbles. However, after shifting the droplet upward, the model detected 1 cell and 7 oil bubbles, thus avoiding the problem of cells being obscured by the droplet contour and being undetected by the model.
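The four-direction check can be sketched as follows, with `detect` and `move` as hypothetical hooks into the recognition model and the electrode driver (both are assumptions; the paper does not publish this routine):

```python
def count_cells_with_jog(droplet, detect, move):
    """Jog the droplet one electrode in each direction and take the
    maximum cell count observed, exposing cells hidden at the edge."""
    best = detect(droplet)
    for direction in ("up", "left", "down", "right"):
        move(droplet, direction)   # actuate the adjacent electrode
        best = max(best, detect(droplet))
    return best
```

Taking the maximum over all positions is what corrects cases like Fig. 5b, where a cell only becomes visible after the droplet shifts left.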

Fig. 5: A droplet movement method for detecting cells concealed at the edges of droplets.

a A statistical analysis of the percentage of cell positions distributed within the droplets. b Before moving the droplet to the left, the model detected one cell; after the droplet moved, two cells were detected. c Before moving the droplet upward, the model detected no cells; after the droplet moved, one cell was detected

Intelligent DPG

An automated path design method for DMF systems can significantly reduce the time required for manual path design, simplifying the task of planning and executing experimental procedures and enhancing operational efficiency. Therefore, we propose a DPG model to automatically design experimental paths. DPG is a Llama 3-based droplet path generation model that can automatically generate droplet movement and splitting paths. We utilized Llama 3-8B as the pre-trained model and employed the low-rank adaptation (LoRA) fine-tuning technique, which freezes the weights of the pre-trained Llama 3-8B model and trains only low-rank adapter matrices. Figure 6 shows the droplet paths generated by the DPG model through a series of simulations and experiments. The DPG model was trained for 31,432 steps. Figure 6a, b illustrates the training and evaluation loss functions during the training process. We compared the performance of fine-tuned Llama 3.2-3B and Llama 3-8B. The results indicate that fine-tuned Llama 3-8B demonstrates superior performance in the droplet path generation task. Figure 6c illustrates the training and evaluation results of the DPG model. Throughout the training process, the total floating point operations (flops) reached 3.76 × 10^18, with a training loss of 4.65 × 10^−3. The training runtime lasted for 41,903 s, achieving an average processing rate of 3.001 samples/s and 0.75 steps/s. In the evaluation phase, the evaluation loss was recorded as 1.69 × 10^−6, with a runtime of 979 s and an average processing rate of 14.267 samples/s and 7.134 steps/s. These results indicate that the model achieved strong performance in both the training and evaluation processes. Figure 6d shows the droplet movement paths generated by DPG in the y-axis direction, the x-axis direction, and a combination of both the x- and y-axis directions.
Figure 6e displays subdroplet generation paths generated by DPG, illustrating the process of generating subdroplets with widths of 2, heights of 2, row spacings of 2, and column spacings of 4. Figure 6f depicts the droplet movement and splitting paths generated by DPG, illustrating the process from the starting droplet (row 60, column 36) to the end droplet (row 54, column 42). At the end droplet position, subdroplets were generated with widths of 1, heights of 2, row spacings of 2, and column spacings of 2. Figure 6g shows the paths generated by DPG for droplets assigned to designated locations. Five designated droplets were assigned to specified locations with a column movement of −4, a row movement of −12, and a subdroplet column spacing of 3 in a 4 × 4 droplet array. In the example depicted in Fig. 6g, with regard to the output time issue of the DPG model, we propose a new droplet path output format. As illustrated in Tables S3 and S4, the original format required 2882 characters, whereas our new format only demands 788 characters, representing a reduction by a factor of 3.66. This substantial decrease in the number of output characters greatly enhances the temporal efficiency of the path-planning process executed by the model. Additional details regarding the user prompts, DPG model responses, the concepts used in the prompts and responses, and a series of simulations can be found in Tables S4–S9 and Movie S1. We conducted a series of tests to evaluate the performance of the DPG model in generating droplet paths. As shown in Fig. 6h, the user prompt and the corresponding responses from the DPG, GPT-4o, Gemini 2.0, and DeepSeek-V3 models are presented. Of these, only the DPG model successfully generated the correct path, while the other models failed to generate the correct path.
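LoRA keeps the pretrained weight matrix W frozen and learns only a low-rank correction, so the adapted weight is W' = W + (α/r)·BA. A numerical sketch of that update (the matrix sizes here are illustrative toy values, not Llama 3-8B's):

```python
import numpy as np

def lora_adapt(W, A, B, alpha, r):
    """Adapted weight W' = W + (alpha / r) * B @ A, where only the
    low-rank factors A (r x k) and B (d x r) are trained."""
    return W + (alpha / r) * (B @ A)

d, k, r = 8, 8, 2
W = np.zeros((d, k))   # frozen pretrained weight (stand-in)
A = np.ones((r, k))    # trainable down-projection
B = np.ones((d, r))    # trainable up-projection
W_prime = lora_adapt(W, A, B, alpha=4, r=r)
```

Because only A and B (2·d·r parameters per layer instead of d·k) are trained, fine-tuning an 8B-parameter model becomes tractable on a single GPU.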

Fig. 6: Droplet paths generated by the DPG model.

a The training loss function of the DPG model. b The evaluation loss function of the DPG model. c Statistics on the training and evaluation results of the DPG model. d Droplet movement paths are generated by DPG in the y-axis direction, the x-axis direction, and both the x- and y-axis directions. e Subdroplet generation paths generated by DPG. f Both the droplet movement and splitting paths were generated by DPG. g Paths generated by DPG for droplets assigned to designated locations. h The user prompt and corresponding responses from the DPG, GPT-4o, Gemini 2.0, and DeepSeek-V3 models

The solution for intelligent SCSM

We developed a solution for high-precision and high-throughput SCSM, addressing the challenges traditionally associated with manual operation, which often leads to inconsistencies and inefficiencies. The proposed solution can achieve a precision rate of over 98% in terms of cell identification and a single-cell sample generation rate exceeding 25%. The proposed solution is detailed in Movie S2 and illustrated in the flowchart shown in Fig. 7a, including generating a droplet array, sorting out single-cell droplets, and moving the single-cell droplets to the desired locations. Figure 7b illustrates the user prompt, the response of the DPG model, and the droplet array generation test conducted on the AM-DMF chip. We generated an 8 × 8 array of subdroplets. Then, the single-cell sample sorting stage was performed by combining the three-class method with the droplet movement method (Fig. 7a, c). A motion control card (MCC) guides the microscope to the specified droplet via the x- and y-axes. The z-axis of the MCC, combined with the microscope module, activates the autofocus function to ensure image clarity. These images were fed into the cell recognition model for forward inference, with cell counts recorded via postprocessing algorithms such as non-maximum suppression (NMS). For single-cell detection, the process involved manipulating the position of the droplet: if more than one cell was detected at a position, the process was stopped, and the camera was moved to the next location. Otherwise, the droplet was moved to the right, and the camera captured a new image for cell counting. If more than one cell still remained, the process was stopped; otherwise, the droplet was moved to the left. After three attempts, if no cells were detected, the process moved on. If a single cell was identified, its location was recorded for further analysis.
Finally, the DPG model generated paths according to the user prompt for guiding single-cell samples to their designated locations after the sorting workflow was completed (Fig. 7d). The DPG model is designed to automatically generate experimental paths. Further details on addressing real-world variability factors, such as electrode failures or unsuccessful droplet splitting events, can be found in our previous work38.
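The sorting decision logic described above can be sketched as a simplified loop; `detect` and `move` are hypothetical hooks, and the exact retry sequence is an assumption based on the text:

```python
def is_single_cell(droplet, detect, move, max_checks=3):
    """Return True only if exactly one cell is seen across up to three
    checks with a jog between them; droplets showing more than one cell
    are rejected immediately, and empty droplets are skipped."""
    best = 0
    for i, direction in enumerate((None, "right", "left")):
        if i >= max_checks:
            break
        if direction is not None:
            move(droplet, direction)   # jog to expose edge-hidden cells
        n = detect(droplet)
        if n > 1:
            return False               # multi-cell droplet: move on
        best = max(best, n)
    return best == 1                   # record location only for one cell
```

Droplets for which this returns True are the single-cell samples whose coordinates are handed to the DPG model for path generation.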

Fig. 7: Intelligent SCSM workflow.

a A fully automated algorithmic flowchart for SCSM. b The droplet path generated with 8 × 8 subdroplets by the DPG model on the AM-DMF chip (128 rows × 128 columns). c The process of sorting the droplets containing one cell by combining the three-class method with the droplet movement method. d The path for the assigned single-cell sample to their designated locations is generated by the DPG model

Conclusion

In this work, we take the first step forward and present a platform for intelligent SCSM via an AM-DMF system, employing a DPG model and a cell recognition model in combination with specialized pre- and postprocessing techniques. This platform offers a fully automated solution for efficiently manipulating single-cell samples, thus addressing the current demands of single-cell research and significantly outperforming existing platforms in terms of efficiency and performance. We conducted a comprehensive intelligent SCSM experiment, including generating paths for droplets containing cells via the DPG model, performing cell recognition via the three-class classification method and the droplet movement method, and generating paths for assigning single-cell droplets to designated locations based on the DPG model. The proposed method achieved a single-cell sample generation rate of over 25% and a model identification precision rate exceeding 98%. The novel contributions of the proposed method include (1) an AI-based algorithm capable of detecting and distinguishing between droplets, cells, and oil bubbles, which improved the cell recognition accuracy of the model by 1.0% in terms of the \({\rm{AP}}_{75}^{\rm{test}}\) metric; (2) a technique that leverages droplet movement to recognize cells obscured by droplet edges, given that approximately 24% of all droplets contained cells at their edges; (3) a DPG model for automatically generating experimental paths, replacing the manual design process; (4) a fully automated solution for SCSM; and (5) two comprehensive annotated datasets for an AM-DMF chip. We anticipate that the integration and proliferation of AM-DMF technology, LLMs, and object detection technologies, combined with the ongoing improvement and expansion of related functional modules, will establish the AM-DMF system as a powerful platform for SCSM.
This platform is expected to support a wide range of applications in single-cell research and extend its utility to various research areas, such as biological sciences and chemistry.

Methods and materials

Platform and reagents

The AM-DMF system (DM sys) was developed by Guangdong ACXEL Micro & Nano Tech (Foshan, China) and ACX Instruments Ltd. (Cambridge, UK) (as shown in Fig. S4). In this study, the AM-DMF chip contains 16,384 electrodes (128 × 128), with an electrode pitch of 250 μm and a two-plate gap size of 50 μm (as shown in Fig. S5). A control voltage of 45 V was applied to the droplets throughout the experiments. The medium oil used in this study was silicone oil (2 cSt) obtained from Dow Corporate. Human cervical cancer (HeLa) cells were obtained from Cellverse Bioscience Technology Co., Ltd. The HeLa cells were cultured in a cell culture incubator (5% CO2, 75% atmosphere, 37 °C). The concentration of the HeLa cells was 4.5 × 10^5 cells/ml.

Data collection and composition

In this work, the original dataset used for training and testing the cell recognition model was obtained from images and videos. After the droplets were split via the AM-DMF chip, a microscope with a 7.5× objective lens was used to observe the cell-containing droplets. A high-magnification microscope camera with a pixel resolution of 2448 × 2048 captured the cell images. We focused on fields of view with resolutions of 2048 × 2048 or 1280 × 1280 for analysis purposes. The images were displayed and saved using custom software. The exposure time was set to 20 ms, and the gain was set to 500%. We randomly divided the samples, allocating 70% to the training set and 30% to the test set. A comprehensive overview of the dataset is provided in Table 1, which includes the total number of labels and images for the cell detection models. The manual dataset annotations, which categorized droplets, cells, and oil bubbles, comprised 4940 images, 4950 droplet labels, 7102 cell labels, and 22,718 bubble labels. The training set consisted of 3458 images, with 3465 droplet category labels, 4945 cell category labels, and 15,786 bubble category labels. The testing set included 1482 images, with 1485 droplet category labels, 2157 cell category labels, and 6932 bubble category labels.
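The random 70/30 partition can be sketched as follows (the seed and shuffling routine are assumptions; the paper does not state how the split was implemented):

```python
import random

def split_dataset(samples, train_frac=0.7, seed=0):
    """Shuffle and partition samples into training and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

Applied to the 4940 annotated images, a 70/30 split yields the 3458-image training set and 1482-image test set reported in Table 1.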

Table 1 Model annotation summary for the dataset

For the DPG model dataset, we initially prepared 140,000 data instances: 50,000 droplet movement paths, 50,000 droplet splitting paths, and 40,000 combined movement-and-splitting paths. After filtering out noncompliant data, 139,701 instances were retained for training and testing the DPG model. All droplet paths and prompts were saved as text files. Finally, these 139,701 sets of droplet paths and prompts were organized, through postprocessing, into a single JSON file to form a conversational-style dataset.
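The postprocessing step can be sketched as below, assuming a chat-style schema with user/assistant roles; the exact field names are not specified in the text and are assumed here:

```python
import json

def to_conversation(prompt, path_text):
    """Wrap one (prompt, droplet path) pair as a chat-style record."""
    return {
        "conversations": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": path_text},
        ]
    }

def build_dataset(pairs, out_path):
    """Collect all prompt/path pairs into a single JSON file."""
    records = [to_conversation(p, t) for p, t in pairs]
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```

A single JSON file of such records is the format commonly consumed by instruction-tuning pipelines for decoder-only LLMs.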

Model architecture

Figure S6 shows the detailed architecture of the three-class detector. The model’s output stage provides class probabilities for droplets, cells, and oil bubbles. The inference time of the cell recognition model is approximately 350 ms on an Intel Core i7 processor; further details regarding the cell recognition model can be found in our previous work38. GPT-4o (Omni) was launched in May 2024 by OpenAI52. Gemini was launched in December 2023 by the Gemini Team, Google53,54. DeepSeek-V3 was launched in December 2024 by Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.55. As shown in Fig. S7, the DPG model was developed on the basis of the Llama 3 architecture and the LoRA technique. Llama 3 is an open-source LLM with a decoder-only transformer architecture that was launched in April 2024 by Meta56. LoRA is a fine-tuning technique that freezes the weights of a pretrained LLM and injects trainable low-rank matrix factorizations into each layer of the transformer architecture57, which reduces the number of trainable parameters for the droplet path generation task. On an Nvidia GeForce RTX 4090, the time from user input to the first output token is approximately 2 s. After that, the DPG model outputs the path line by line, where each line corresponds to one movement step for all droplets; each line is generated faster than the corresponding droplet movement step is executed.
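The LoRA idea described above, freezing the pretrained weights and adding a trainable low-rank update, can be illustrated with a minimal PyTorch sketch; this is a conceptual example, not the DPG model's actual adapter code:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update.

    Output: y = W x + (alpha / r) * B(A x), where W is frozen and only
    A (r x in_features) and B (out_features x r) are trained.
    """
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze pretrained weights
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # update starts at zero
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))
```

Because B is initialized to zero, the wrapped layer initially reproduces the pretrained output exactly, and training only moves the small A/B matrices.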

Model training parameter setup

The cell recognition model was implemented in Python 3.9.0 with the PyTorch 2.0.1 framework on a machine equipped with a 24-GB Nvidia GeForce RTX 4090 graphics card; the cell recognition model has 8,625,065 parameters in total. The DPG model was implemented in Python 3.12.0 with the PyTorch 2.3.1 framework on a machine equipped with two 24-GB Nvidia GeForce RTX 4090 GPUs. All the models were run on an Intel® Xeon® Gold 6133 CPU @ 2.50 GHz with 125 GB of RAM under Ubuntu 20.04.5 LTS. The Llama 3-8B model has 8,051,232,768 parameters. In the DPG model, the LoRA rank was set to 8; consequently, the DPG model comprises 20,971,520 trainable parameters, representing 0.26% of the total number of parameters. The other experimental parameter settings used during training for the cell recognition model and the DPG model (LoRA hyperparameters) are listed in Table S10.
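The quoted trainable-parameter count can be sanity-checked with simple arithmetic. The per-layer breakdown below is an assumption (rank-8 adapters on all seven linear projections in each of Llama 3-8B's 32 blocks, with hidden size 4096, grouped-query KV dimension 1024, and MLP dimension 14336), not a configuration stated in the text:

```python
# Fraction of trainable parameters quoted in the text.
total_params = 8_051_232_768          # Llama 3-8B
lora_params = 20_971_520              # trainable LoRA parameters

print(f"{lora_params / total_params:.2%}")   # -> 0.26%

# One per-layer breakdown consistent with the count (an assumption):
# a rank-8 A/B pair on q/k/v/o and the three MLP projections.
r, d, kv, ffn = 8, 4096, 1024, 14336
per_layer = r * ((d + d)              # q projection
                 + 2 * (d + kv)       # k and v projections
                 + (d + d)            # o projection
                 + 2 * (d + ffn)      # gate and up projections
                 + (ffn + d))         # down projection
assert 32 * per_layer == lora_params
```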

Model performance evaluation metrics

The research community often relies on average precision (AP) as the primary metric for comparing object detection models. AP is derived from precision and recall: precision is the number of true positives divided by the sum of the true positives and false positives, whereas recall is the number of true positives divided by the sum of the true positives and false negatives. AP50 is the AP at an intersection-over-union (IoU) threshold of 0.5; AP75 is the AP at an IoU threshold of 0.75; AP50:95 is the AP averaged over IoU thresholds ranging from 0.5 to 0.95; and the mAP for object detection is the mean of the AP over all object classes. The AP is defined as:

$${{AP}}=\mathop{\sum }\limits_{i=1}^{101}{\left\{\left[{{Recall}}\left(i\right)-{{Recall}}\left(i+1\right)\right]\times {{Precision}}\left(i\right)\right\}}_{{IoU}}$$
(1)

The precision and recall metrics are defined as follows:

$${{Precision}}=\frac{{TP}}{{TP}+{FP}}$$
(2)
$${{Recall}}=\frac{{TP}}{{TP}+{FN}}$$
(3)

where TP represents true positives, FP represents false positives, and FN represents false negatives.
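Eqs. (1)–(3) can be implemented directly. The sketch below assumes the recall values are supplied at discrete sample points in descending order, and omits the 101-point sampling and per-IoU sweep of the full evaluation:

```python
def precision_recall(tp, fp, fn):
    """Eqs. (2) and (3): precision and recall from detection counts."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recalls, precisions):
    """Eq. (1): precision weighted by the drop in recall between
    consecutive sample points (recalls sorted in descending order)."""
    return sum((recalls[i] - recalls[i + 1]) * precisions[i]
               for i in range(len(recalls) - 1))

# Example: 8 true positives, 2 false positives, 2 false negatives
p, r = precision_recall(8, 2, 2)   # -> (0.8, 0.8)
```

Each term of the sum is a rectangle under the precision-recall curve, so AP approximates the area under that curve at the chosen IoU threshold.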