Data format
The dataset was formatted as an M × N matrix in which the rows represent the sampling sites S1, S2, S3, …, SM and the columns represent the phytoplankton species G1, G2, G3, …, GN. Each element [i, j] represents the occurrence of species j in sample i (Table 1).
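As an illustration of this layout, the following minimal sketch builds a site-by-species matrix from a list of (site, species) records. The records, the function name build_occurrence_matrix, and the coding of occurrence as 1 (recorded) or 0 (not recorded) are all assumptions made for the example; they are not taken from the dataset itself, which may store counts or other occurrence measures.

```python
# Hypothetical (site, species) records; labels and values are illustrative only.
records = [
    ("S1", "G1"),
    ("S1", "G3"),
    ("S2", "G2"),
    ("S3", "G1"),
    ("S3", "G2"),
]


def build_occurrence_matrix(records):
    """Return (sites, species, matrix) for a list of (site, species) pairs.

    matrix[i][j] is 1 if species j was recorded at site i, otherwise 0.
    Coding occurrence as presence/absence is an assumption of this sketch.
    """
    # Row and column labels, sorted lexicographically for a stable order.
    sites = sorted({site for site, _ in records})
    species = sorted({sp for _, sp in records})

    # Map each label to its row or column position.
    site_index = {s: i for i, s in enumerate(sites)}
    species_index = {g: j for j, g in enumerate(species)}

    # Start from an M x N matrix of zeros, then mark each recorded occurrence.
    matrix = [[0] * len(species) for _ in sites]
    for site, sp in records:
        matrix[site_index[site]][species_index[sp]] = 1
    return sites, species, matrix


sites, species, matrix = build_occurrence_matrix(records)

# Print the matrix with species as column headers and sites as row labels.
print("    " + " ".join(species))
for site, row in zip(sites, matrix):
    print(site + "  " + "  ".join(str(v) for v in row))
```

Here M = 3 sites and N = 3 species, so matrix[0][2] holds the occurrence of species G3 in sample S1. Note that the indices in the code are zero-based, whereas the element [i, j] in the text is numbered from 1.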