iobjectspy.ml.vision package

Module contents

class iobjectspy.ml.vision.DataPreparation

Bases: object

Entry point for image data preparation.

static create_training_data(input_data, input_label, label_class_field, output_path, output_name, training_data_format, tile_format='jpg', tile_size_x=1024, tile_size_y=1024, tile_offset_x=512, tile_offset_y=512, tile_start_index=0, save_nolabel_tiles=False, input_compare_data=None, **kwargs)

Training data generation

Generates training sample image tiles of the specified size from the input imagery and its labeled vector data.
The output includes pictures, annotations, and meta-information. Picture and annotation file names correspond one to one.
Parameters:
  • input_data (str) – input image data, support image file
  • input_label (str or DatasetVector) – input vector label data, support vector dataset
  • label_class_field (str or None) – name of the field that holds the label categories. If ‘None’ is specified, all labels belong to the same category.
  • output_path (str) – output training data storage path
  • training_data_format (str) – output training data format; four formats are supported: ‘VOC’, ‘MULTI_C’, ‘BINARY_C’, ‘SCENE_C’.
  • tile_format (str) – image tile format; supports ‘tif’, ‘jpg’, ‘png’, and ‘origin’ (the original format)
  • tile_size_x (int) – tile size in x direction
  • tile_size_y (int) – tile size in y direction
  • tile_offset_x (int) – tile offset in the x direction
  • tile_offset_y (int) – tile offset in y direction
  • tile_start_index (int) – starting index used to name the tiles. Defaults to 0; set it to -1 when calling this function on multiple images.
  • save_nolabel_tiles (bool) – whether to save tiles without labels
Returns:

None
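How the tile size and tile offset parameters interact can be pictured with a short sketch. It assumes, which the description above does not state, that the offset is the step between the origins of consecutive tiles, so the defaults (1024 and 512) would give tiles that overlap by half. The helper `tile_origins` is hypothetical and not part of iobjectspy, and because the handling of partial tiles at the image border is not documented here, the sketch simply drops them.

```python
def tile_origins(image_size, tile_size=1024, tile_offset=512):
    """Return the start coordinates of full tiles along one axis.

    Assumption: tile_offset is the step between consecutive tile origins.
    Tiles that would extend past the image edge are dropped in this sketch.
    """
    if tile_size <= 0 or tile_offset <= 0:
        raise ValueError("tile_size and tile_offset must be positive")
    last_start = image_size - tile_size
    return list(range(0, last_start + 1, tile_offset))


# A 3000-pixel-wide image with the default tile size and offset.
xs = tile_origins(3000)
print(xs)
```

Calling the helper once per axis and combining the two lists gives the full grid of tile origins under the same assumption.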

VOC format:
./VOC
./VOC/Annotations/000000001.xml label tiles
./VOC/Images/000000001.jpg image tiles
./VOC/ImageSets/Main/train.txt, val.txt, test.txt, trainval.txt tile names for the training, validation, test, and combined training and validation datasets, respectively
./VOC/VOC.sda training data configuration file
MULTI_C format:
./MULTI_C
./MULTI_C/Images/00000000.tif image tiles
./MULTI_C/Masks/00000000.png label tiles
./MULTI_C/MULTI_C.sda training data configuration file
BINARY_C format:
./BINARY_C
./BINARY_C/Images/00000000.tif image tiles
./BINARY_C/Masks/00000000.png label tiles
./BINARY_C/BINARY_C.sda training data configuration file
SCENE_C format:
./SCENE_C
./SCENE_C/0/00000000.tif image tiles
./SCENE_C/1/00000000.png image tiles
./SCENE_C/2/00000000.tif image tiles
….
./SCENE_C/scene_classification.csv maps each saved image file path to its category
./SCENE_C/SCENE_C.sda training data configuration file
class iobjectspy.ml.vision.ImageryEvaluation

Bases: object

static binary_classification(inference_data, ground_truth_data, inference_class_value_field=None, ground_truth_class_value_field=None, metric_type=None, out_data='', out_data_name='metric')

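Evaluation interface for imagery binary classification models. Computes metrics from the input ground-truth label data and predicted label data; supports image-to-image and vector-to-vector comparison.

Parameters:
  • inference_data (str or DatasetVector) – Required. Inference result dataset; the input vector region dataset comes from model inference (object_detect_infer).
  • ground_truth_data (str or DatasetVector) – Required. Ground-truth label dataset; the input vector region dataset comes from the ground-truth labels.
  • inference_class_value_field (str or None) – Optional. Name of the class field in the inference result data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • ground_truth_class_value_field (str or None) – Optional. Name of the class field in the ground-truth data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • metric_type (str or None) – Optional. Name of the metric to compute. Defaults to None, in which case all metrics of this function are output. Supported values: PA, IoU, F1, Kappa.
  • out_data (str or Datasource or DatasourceConnectionInfo) – Optional. Output file (or datasource) path.
  • out_data_name (str) – Optional. Output file (or dataset) name.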
Returns:

tuple (metric_type ,[dict, dict, …])
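The four metric names can be illustrated with their textbook definitions on a 2x2 confusion matrix. This is a minimal sketch, not the library's implementation: `binary_metrics` is a hypothetical helper, the counts are made up, and iobjectspy may aggregate or report the values differently.

```python
def binary_metrics(tp, fp, fn, tn):
    """Textbook PA, IoU, F1 and Cohen's Kappa from confusion counts.

    Assumes non-degenerate counts (no zero denominators).
    """
    total = tp + fp + fn + tn
    pa = (tp + tn) / total                 # pixel accuracy
    iou = tp / (tp + fp + fn)              # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn)       # harmonic mean of P and R
    # Expected agreement by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (pa - pe) / (1 - pe)
    return {"PA": pa, "IoU": iou, "F1": f1, "Kappa": kappa}


metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```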

static general_change_detection(inference_data, ground_truth_data, inference_class_value_field=None, ground_truth_class_value_field=None, metric_type=None, out_data='', out_data_name='metric')

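Evaluation interface for imagery general change detection models. Computes metrics from the input ground-truth label data and predicted label data; supports image-to-image and vector-to-vector comparison.

Parameters:
  • inference_data (str or DatasetVector) – Required. Inference result dataset; the input vector region dataset comes from model inference (object_detect_infer).
  • ground_truth_data (str or DatasetVector) – Required. Ground-truth label dataset; the input vector region dataset comes from the ground-truth labels.
  • inference_class_value_field (str or None) – Optional. Name of the class field in the inference result data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • ground_truth_class_value_field (str or None) – Optional. Name of the class field in the ground-truth data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • metric_type (str or None) – Optional. Name of the metric to compute. Defaults to None, in which case all metrics of this function are output. Supported values: PA, IoU, F1, Kappa.
  • out_data (str or Datasource or DatasourceConnectionInfo) – Optional. Output file (or datasource) path.
  • out_data_name (str) – Optional. Output file (or dataset) name.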
Returns:

tuple (metric_type ,[dict, dict, …])

static multi_classification(inference_data, ground_truth_data, inference_class_value_field=None, ground_truth_class_value_field=None, metric_type=None, out_data='', out_data_name='metric')

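Evaluation interface for imagery ground object classification models. Computes metrics from the input ground-truth label data and predicted label data; supports image-to-image and vector-to-vector comparison.

Parameters:
  • inference_data (str or DatasetVector) – Required. Inference result dataset; the input vector region dataset comes from model inference (object_detect_infer).
  • ground_truth_data (str or DatasetVector) – Required. Ground-truth label dataset; the input vector region dataset comes from the ground-truth labels.
  • inference_class_value_field (str or None) – Optional. Name of the class field in the inference result data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • ground_truth_class_value_field (str or None) – Optional. Name of the class field in the ground-truth data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • metric_type (str or None) – Optional. Name of the metric to compute. Defaults to None, in which case all metrics of this function are output. Supported values: F1, CPA, IoU, Kappa, mPA, mIoU.
  • out_data (str or Datasource or DatasourceConnectionInfo) – Optional. Output file (or datasource) path.
  • out_data_name (str) – Optional. Output file (or dataset) name.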
Returns:

tuple (metric_type ,[dict, dict, …])
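For the multi-class case, per-class IoU and mIoU are commonly derived from a confusion matrix. The sketch below uses the usual definitions and a made-up 3-class matrix; `per_class_iou` is a hypothetical helper, and the convention that rows are ground truth and columns are predictions is an assumption of this example, not something the interface above specifies.

```python
def per_class_iou(confusion):
    """Per-class IoU from a square confusion matrix (list of lists).

    Assumption: confusion[i][j] counts samples of true class i that
    were predicted as class j.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        truth_total = sum(confusion[c])                      # row sum
        pred_total = sum(confusion[r][c] for r in range(n))  # column sum
        union = truth_total + pred_total - tp
        ious.append(tp / union if union else 0.0)
    return ious


confusion = [
    [5, 1, 0],
    [2, 6, 2],
    [0, 1, 3],
]
ious = per_class_iou(confusion)
miou = sum(ious) / len(ious)  # mIoU is the unweighted mean over classes
print(ious, miou)
```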

static object_detection(inference_data, ground_truth_data, inference_class_value_field=None, ground_truth_class_value_field='category', metric_type=None, out_data='', out_data_name='metric', iou_thr='')

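Evaluation interface for imagery object detection models. Computes metrics from the input predicted label dataset and ground-truth label dataset (only regular axis-aligned rectangles are supported).

Parameters:
  • inference_data (str or DatasetVector) – Required. Inference result dataset; the input vector region dataset comes from model inference (object_detect_infer).
  • ground_truth_data (str or DatasetVector) – Required. Ground-truth label dataset; the input vector region dataset comes from the ground-truth labels.
  • inference_class_value_field (str or None) – Optional. Name of the class field in the inference result data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • ground_truth_class_value_field (str or None) – Optional. Name of the class field in the ground-truth data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • metric_type (str or None) – Optional. Name of the metric to compute. Defaults to None, in which case all metrics of this function are output. Supported values: F1, Recall, Precision, AP, mAP.
  • out_data (str or Datasource or DatasourceConnectionInfo) – Optional. Output file (or datasource) path.
  • out_data_name (str) – Optional. Output file (or dataset) name.
  • iou_thr (float) – Optional. IoU (Intersection over Union) threshold used to evaluate the accuracy of the object detection model. The numerator is the overlap area between the inferred bounding box and the ground-truth bounding box; the denominator is the area of the union of the two boxes. The value must lie in the range [0, 1]. Example: 0.5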
Returns:

tuple (metric_type ,[dict, dict, …])
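The IoU ratio described for iou_thr can be written out for two axis-aligned rectangles. This is a minimal sketch of the standard formula; `box_iou` and the (xmin, ymin, xmax, ymax) tuple layout are assumptions of the example, not the library's data model.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the overlap; zero when the boxes do not meet.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


inferred = (0, 0, 10, 10)
truth = (5, 0, 15, 10)
iou = box_iou(inferred, truth)
# With a threshold of 0.5, this pair would not count as a match.
print(iou, iou >= 0.5)
```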

static object_extraction(inference_data, ground_truth_data, inference_class_value_field=None, ground_truth_class_value_field='category', metric_type=None, out_data='', out_data_name='metric', iou_thr='')

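Evaluation interface for imagery object extraction models. Computes metrics from the input predicted label dataset and ground-truth label dataset.

Parameters:
  • inference_data (str or DatasetVector) – Required. Inference result dataset; the input vector region dataset comes from model inference (object_detect_infer).
  • ground_truth_data (str or DatasetVector) – Required. Ground-truth label dataset; the input vector region dataset comes from the ground-truth labels.
  • inference_class_value_field (str or None) – Optional. Name of the class field in the inference result data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • ground_truth_class_value_field (str or None) – Optional. Name of the class field in the ground-truth data. If None, the ‘value’ field is looked up by default; if that field does not exist, all records are treated as one class.
  • metric_type (str or None) – Optional. Name of the metric to compute. Defaults to None, in which case all metrics of this function are output. Supported values: F1, Recall, Precision, AP, mAP.
  • out_data (str or Datasource or DatasourceConnectionInfo) – Optional. Output file (or datasource) path.
  • out_data_name (str) – Optional. Output file (or dataset) name.
  • iou_thr (float) – Optional. IoU (Intersection over Union) threshold used to evaluate the accuracy of the object detection model. The numerator is the overlap area between the inferred bounding box and the ground-truth bounding box; the denominator is the area of the union of the two boxes. The value must lie in the range [0, 1]. Example: 0.5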
Returns:

tuple (metric_type ,[dict, dict, …])
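Once inferred objects have been matched to ground-truth objects at a given IoU threshold, Precision, Recall and F1 follow from three counts. The sketch below uses the standard definitions with made-up counts; `detection_scores` is a hypothetical helper and the matching step itself is not shown.

```python
def detection_scores(tp, fp, fn):
    """Precision, Recall and F1 from matched/unmatched object counts.

    tp: inferred objects matched to a ground-truth object
    fp: inferred objects with no match
    fn: ground-truth objects that were missed
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"Precision": precision, "Recall": recall, "F1": f1}


scores = detection_scores(tp=8, fp=2, fn=4)
print(scores)
```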

super_resolution(*args, **kwargs)
class iobjectspy.ml.vision.ImageryTrainer(train_data_path, config, epoch, batch_size, lr, output_model_path, output_model_name, backbone_name, backbone_weight_path=None, log_path='./', reload_model=False, pretrained_model_path=None, gpus=[0], init_data=False, **kwargs)

Bases: object

Entry point for imagery model training.

Parameters:
  • train_data_path (str) – training data path
  • config (str) – configuration file path
  • epoch (int) – number of training epochs
  • batch_size (int) – batch size
  • lr (float) – learning rate
  • output_model_path (str) – output model file path
  • output_model_name (str) – output model file name
  • backbone_name (str) – backbone network name
  • backbone_weight_path (str or None) – path to the backbone network weight file. If None, the model weights are initialized randomly
  • log_path (str) – output path for logs and checkpoints
  • reload_model (bool) – whether to reload the checkpoint model trained previously
  • pretrained_model_path (str or None) – pre-trained model path (optional)
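A call might be assembled as in the following sketch, which only uses the constructor parameters listed above and the `binary_classify_train()` method of this class. Every value is an illustrative placeholder (paths, model name, backbone name, epoch count and so on are assumptions, not recommended settings), and the helper functions are hypothetical. The import is deferred so the argument dictionary can be built and inspected without iObjectsPy installed.

```python
def build_trainer_kwargs(train_data_path, output_model_path):
    """Collect the documented constructor arguments in one dict.

    Every value except the two paths passed in is a placeholder.
    """
    return {
        "train_data_path": train_data_path,
        "config": "path/to/config_file",     # placeholder path
        "epoch": 10,                         # illustrative value
        "batch_size": 4,                     # illustrative value
        "lr": 0.001,                         # illustrative value
        "output_model_path": output_model_path,
        "output_model_name": "my_model",     # placeholder name
        "backbone_name": "<backbone_name>",  # placeholder, fill in yourself
        "backbone_weight_path": None,        # None: random initialization
        "log_path": "./",
        "reload_model": False,
        "pretrained_model_path": None,
    }


def train_binary_classifier(kwargs):
    """Run binary classification training; needs iObjectsPy installed."""
    from iobjectspy.ml.vision import ImageryTrainer  # imported lazily

    ImageryTrainer(**kwargs).binary_classify_train()


kwargs = build_trainer_kwargs("path/to/BINARY_C", "path/to/models")
print(sorted(kwargs))
```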
binary_classify_train()

Entry point for binary classification model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

general_change_detection_train()

Entry point for general change detection model training.

The trained model is stored under the ‘output_model_path’ folder.

Returns:None
multi_classify_train()

Entry point for ground object classification model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

object_detect_train()

Entry point for object detection model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

object_extract_train()

Entry point for object extraction model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

scene_classify_train()

Entry point for scene classification model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

super_resolution_train()

Entry point for super-resolution reconstruction model training.

The trained model is stored under the ‘output_model_path’ folder.

Returns:
class iobjectspy.ml.vision.PictureTrainer(train_data_path, config, epoch, batch_size, lr, output_model_path, output_model_name, backbone_name, backbone_weight_path=None, log_path='./', reload_model=False, pretrained_model_path=None, gpus=[0], init_data=False, **kwargs)

Bases: object

Function entry for picture data model training.

Parameters:
  • train_data_path (str) – training data path
  • config (str) – configuration file path
  • epoch (int) – number of training epochs
  • batch_size (int) – batch size
  • lr (float) – learning rate
  • output_model_path (str) – output model file path
  • output_model_name (str) – output model file name
  • backbone_name (str) – backbone network name
  • backbone_weight_path (str or None) – path to the backbone network weight file. If None, the model weights are initialized randomly
  • log_path (str) – output path for logs and checkpoints
  • reload_model (bool) – whether to reload the checkpoint model trained previously
  • pretrained_model_path (str or None) – pre-trained model path (optional)
object_detect_train()

Entry point for object detection model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

object_extraction_train()

Entry point for object detection model training.

The trained model is stored under the ‘output_model_path’ folder.

Returns:None
picture_classify_train()

Entry point for picture classification model training. The trained model is stored under the ‘output_model_path’ folder.

Returns:None

picture_object_extraction_train()

Entry point for picture segmentation model training, based on Mask R-CNN.

The trained model is stored under the ‘output_model_path’ folder.

Returns:None
class iobjectspy.ml.vision.ImageryInference(model_path, gpus=[0], batch_size=1, **kwargs)

Bases: object

Entry point for imagery model inference.

Parameters:model_path (str) – saved model path
binary_classify_infer(input_data, out_data, out_dataset_name, offset, result_type, infer_region=None, **kwargs)

Binary classification for satellite images. Supports image files such as tif and img (Erdas Image) as well as jpg and png; the output is a binary raster or vector file. Also supports image datasets in SuperMap SDX; the classification result is then a vector or raster dataset.

An optional keyword parameter ‘dsm_dataset’ supplies DSM data matching the image, which enables extraction of building surfaces based on DOM and DSM. Both the image and the DSM can be generated from oblique photography data with SuperMap iDesktop:

Open ‘3D scene’, then use ‘3D analysis’ -> generate DOM and ‘3D analysis’ -> generate DSM. A resolution of 0.1 m is recommended.
Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output file (or dataset) name
  • offset (int) – image block offset, i.e. the size of the overlap between blocks. Large images are divided into blocks for prediction, and the overlap improves prediction accuracy.
  • result_type (str) – type of the returned result; vector regions and rasters are supported: ‘region’ or ‘grid’
Returns:

Dataset name
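A typical call sequence might look like the sketch below: validate result_type, load the model, run inference, and release the model with the `close()` method documented for this class. The wrapper `run_binary_inference` is hypothetical, the default offset of 100 is an arbitrary illustrative value, and the import is deferred so the argument check works without iObjectsPy installed.

```python
def run_binary_inference(model_path, input_data, out_data, out_dataset_name,
                         offset=100, result_type="region"):
    """Hypothetical wrapper around ImageryInference.binary_classify_infer."""
    # The documentation lists exactly these two result types.
    if result_type not in ("region", "grid"):
        raise ValueError("result_type must be 'region' or 'grid'")

    from iobjectspy.ml.vision import ImageryInference  # imported lazily

    inference = ImageryInference(model_path)
    try:
        return inference.binary_classify_infer(
            input_data, out_data, out_dataset_name, offset, result_type
        )
    finally:
        # Release the loaded model even if inference fails.
        inference.close()
```

Wrapping the call in try/finally is a design choice of this sketch so that the loaded model is always released.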

close()

Close the loaded model

general_changedet_infer(input_data, input_compare_data, out_data, out_dataset_name, offset, result_type, infer_region=None, **kwargs)

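General change detection for remote sensing imagery. Supports image files such as tif and img (Erdas Image); the result is a binary raster or vector file. Also supports image datasets in SuperMap SDX; the result is then a vector or raster dataset.

Parameters:
  • input_data – data to be inferred
  • input_compare_data – comparison data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output file (or dataset) name
  • offset (int) – image block offset, i.e. the size of the overlap between blocks. Large images are predicted block by block, and the overlap improves the prediction at block edges
  • result_type (str) – type of the returned result; vector regions and rasters are supported: ‘region’ or ‘grid’
  • infer_region (vector or bounds or list(bounds or str)) – inference extent, given as a vector region or corner coordinates. Defaults to None
Returns:

list of datasets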

multi_classify_infer(input_data, out_data, out_dataset_name, offset, result_type, infer_region=None, **kwargs)

Multi-class (ground object) classification for remote sensing images. Supports image files such as tif and img (Erdas Image); the output is a binary raster or vector file. Also supports image datasets in SuperMap SDX; the classification output is then a vector or raster dataset.

Parameters:
  • input_data (str or Dataset) – dataset to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – path of the output file (datasource)
  • out_dataset_name (str) – name of the output file (or dataset)
  • offset (int) – image block offset, i.e. the size of the overlap between blocks. Large images are divided into blocks for prediction, and the overlap improves prediction accuracy.
  • result_type (str) – type of the returned result; vector regions and rasters are supported: ‘region’ or ‘grid’
Returns:

dataset name

object_classify_infer(input_data, input_region, field_name='class_type', **kwargs)

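Object classification for imagery. Supports tif image data together with the corresponding vector dataset of image regions.

Parameters:
  • input_data (str) – data to be inferred
  • input_region (DatasetVector) – the corresponding vector region dataset
  • field_name (str) – name of the output class field
Returns:

dataset name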

object_detect_infer(input_data, out_data, out_dataset_name, category_name, infer_region=None)

Object detection for imagery data

Supports image files such as tif and img (Erdas Image). The detection results are vector region datasets.
Note:
  • When ‘input_data’ is the data to be detected and ‘out_data’ is the output file path or a datasource, ‘out_dataset_name’ is the file name.
  • When ‘input_data’ is the file path to be detected and ‘out_data’ is the output file path or a datasource, ‘out_dataset_name’ has no effect and the dataset name is taken from the ‘input_data’ file list.
Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output data (or dataset) name
  • category_name (list[str] or str) – categories for object detection, support multi-categories detection
  • nms_thresh (float) – ‘nms’ threshold
  • score_thresh (float) – category score threshold
Returns:

None
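The roles of score_thresh and nms_thresh can be illustrated with generic greedy non-maximum suppression. This is a sketch of the common technique, not a confirmed description of how iobjectspy filters detections; the function names, the (xmin, ymin, xmax, ymax) box layout and the threshold values are assumptions of the example.

```python
def _iou(a, b):
    """IoU of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_thresh=0.5, nms_thresh=0.3):
    """Return indices of boxes kept after score filtering and NMS."""
    # 1. Drop low-confidence boxes, then visit the rest best-first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # 2. Keep a box only if it does not overlap a kept box too much.
        if all(_iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.2]
print(greedy_nms(boxes, scores))
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, while the fourth box is removed earlier by the score filter.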

object_extract_infer(input_data, out_data, out_dataset_name, return_bbox=False, infer_region=None, **kwargs)

Object extraction for remote sensing images. Supports image files such as tif and img (Erdas Image); the output is a vector file. Also supports image datasets in SuperMap SDX; the result is then a vector file.

Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output data (or dataset) name
  • score_thresh (float) – category score threshold
  • nms_thresh (float) – ‘nms’ threshold
  • return_bbox (bool) – whether to return the minimum bounding rectangle of the object
Returns:

dataset name

prompt_segmentation_infer(input_data, prompt_type, input_prompt_data, out_data, out_dataset_name, tile_size=1024, offset=0, sample_method=None, **kwargs)

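Object extraction from remote sensing imagery based on the SAM model. Supports image files such as tif and img (Erdas Image); the result is a vector file. Also supports image datasets in SuperMap SDX; the result is vector data.

Parameters:
  • input_data (DatasetImage) – data to be inferred
  • prompt_type (str) – prompt type; currently only a region dataset prompt (polygon) and no prompt (noprompt) are supported
  • input_prompt_data (DatasetVector for the 'polygon' prompt, None for the 'noprompt' prompt) – input vector prompt data source
  • out_data (str or Datasource) – output file (Udbx) path
  • out_dataset_name (str) – output dataset name
  • tile_size (int) – tile size. Defaults to 1024
  • offset (int) – tile overlap. Defaults to 0
  • sample_method (SAMSamplePointsEnum enumeration) – sampling point strategy; defaults to SAMSamplePointsEnum.HOMOGENEOUS for polygon and SAMSamplePointsEnum.UNIFORM for noprompt
Returns:

dataset name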

scene_classify_infer(input_data, out_data, out_dataset_name, result_type, infer_region=None, **kwargs)

Scene classification for remote sensing images. Supports image files such as tif and img (Erdas Image); the output is a binary raster or vector file. Also supports image datasets in SuperMap SDX; the output is then a vector or raster dataset.

Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output data (or dataset) name
  • result_type (str) – type of the returned result; vector regions and rasters are supported: ‘region’ or ‘grid’
Returns:

dataset name

super_resolution_infer(input_data, out_data, out_dataset_name, **kwargs)

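Super-resolution reconstruction for remote sensing imagery. Supports image files such as tif and img (Erdas Image); the result is an image file. Also supports image datasets in SuperMap SDX; the result is an image file.

Parameters:
  • input_data (ImageFile or DatasetImage or ImageFile List) – data to be inferred
  • out_data (str) – output folder path or datasource path
  • out_dataset_name (str) – output data name
Returns:

list of dataset paths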

class iobjectspy.ml.vision.PictureInference(model_path, gpus=[0], batch_size=1, **kwargs)

Bases: object

Entry point for picture model inference.

Parameters:model_path (str) – saved model path
close()

Close the loaded model

object_detect_infer(input_data, out_data, out_dataset_name, category_name)

Object detection for pictures

Supports picture files such as jpg and png. The outputs are ‘xml’ files.
Note:
  • When ‘input_data’ is the data to be detected and ‘out_data’ is the output file path, ‘out_dataset_name’ is the file name.
  • When ‘input_data’ is the file path to be detected and ‘out_data’ is the output file path, ‘out_dataset_name’ has no effect and the dataset name is taken from the ‘input_data’ file list.
Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output data (or dataset) name
  • category_name (list[str] or str) – categories for object detection, support multi-categories detection
  • nms_thresh (float) – ‘nms’ threshold
  • score_thresh (float) – category score threshold
Returns:

None

object_extract_infer(input_data, out_visual_result, out_txt_result)

Object extraction for pictures. Support picture files such as jpg, png. The output is a vector file.

Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output data (or dataset) name
  • score_thresh (float) – category score threshold
  • nms_thresh (float) – ‘nms’ threshold
  • return_bbox (bool) – whether to return the minimum bounding rectangle of the object
Returns:

dataset name

picture_classify_infer(input_data, out_data, out_dataset_name, **kwargs)

Picture classification. Supports picture files such as jpg and png. The output is xml files.

Parameters:
  • input_data (str or Dataset) – data to be inferred
  • out_data (str or Datasource or DatasourceConnectionInfo) – output file (or datasource) path
  • out_dataset_name (str) – output data (or dataset) name
  • result_type (str) – type of the returned result; vector and raster are supported: ‘region’ or ‘grid’
Returns:

dataset name

class iobjectspy.ml.vision.ModelConverter(model_path, output_model_path, **kwargs)

Bases: object

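Entry point for model conversion. Supports exporting intermediate models such as torchscript and onnx, and supports multiple inference engines including Libtorch, ONNX Runtime, TensorRT, Ncnn, Openppl, and OpenVINO.

Parameters:
  • model_path (str) – model storage path
  • output_model_path (str) – model output path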
pytorch2onnx()

ONNX model conversion.

pytorch2torchscript()

TorchScript model conversion.
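A conversion call might be wrapped as in the sketch below. It relies only on the ModelConverter constructor and the `pytorch2onnx()` method listed above; the wrapper `convert_to_onnx` and its up-front path check are hypothetical additions for illustration, and the import is deferred so the check runs without iObjectsPy installed.

```python
import os


def convert_to_onnx(model_path, output_model_path):
    """Hypothetical wrapper: check the input, then convert to ONNX."""
    # Fail early with a clear error if the source model is missing.
    if not os.path.exists(model_path):
        raise FileNotFoundError(model_path)

    from iobjectspy.ml.vision import ModelConverter  # imported lazily

    ModelConverter(model_path, output_model_path).pytorch2onnx()
    return output_model_path
```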