Skip to content

efficientnet

mindnlp.transformers.models.efficientnet.image_processing_efficientnet

Image processor class for EfficientNet.

mindnlp.transformers.models.efficientnet.image_processing_efficientnet.EfficientNetImageProcessor

Bases: BaseImageProcessor

Constructs a EfficientNet image processor.

PARAMETER DESCRIPTION
do_resize

Whether to resize the image's (height, width) dimensions to the specified size. Can be overridden by do_resize in preprocess.

TYPE: `bool`, *optional*, defaults to `True` DEFAULT: True

size

346, "width": 346}): Size of the image afterresize. Can be overridden bysizeinpreprocess`.

TYPE: `Dict[str, int]` *optional*, defaults to `{"height" DEFAULT: None

resample

Resampling filter to use if resizing the image. Can be overridden by resample in preprocess.

TYPE: `PILImageResampling` filter, *optional*, defaults to 0 DEFAULT: NEAREST

do_center_crop

Whether to center crop the image. If the input size is smaller than crop_size along any edge, the image is padded with 0's and then center cropped. Can be overridden by do_center_crop in preprocess.

TYPE: `bool`, *optional*, defaults to `False` DEFAULT: False

crop_size

289, "width": 289}): Desired output size when applying center-cropping. Can be overridden bycrop_sizeinpreprocess`.

TYPE: `Dict[str, int]`, *optional*, defaults to `{"height" DEFAULT: None

rescale_factor

Scale factor to use if rescaling the image. Can be overridden by the rescale_factor parameter in the preprocess method.

TYPE: `int` or `float`, *optional*, defaults to `1/255` DEFAULT: 1 / 255

rescale_offset

Whether to rescale the image between [-scale_range, scale_range] instead of [0, scale_range]. Can be overridden by the rescale_factor parameter in the preprocess method.

TYPE: `bool`, *optional*, defaults to `False` DEFAULT: False

do_rescale

Whether to rescale the image by the specified scale rescale_factor. Can be overridden by the do_rescale parameter in the preprocess method.

TYPE: `bool`, *optional*, defaults to `True` DEFAULT: True

do_normalize

Whether to normalize the image. Can be overridden by the do_normalize parameter in the preprocess method.

TYPE: `bool`, *optional*, defaults to `True` DEFAULT: True

image_mean

Mean to use if normalizing the image. This is a float or list of floats the length of the number of channels in the image. Can be overridden by the image_mean parameter in the preprocess method.

TYPE: `float` or `List[float]`, *optional*, defaults to `IMAGENET_STANDARD_MEAN` DEFAULT: None

image_std

Standard deviation to use if normalizing the image. This is a float or list of floats the length of the number of channels in the image. Can be overridden by the image_std parameter in the preprocess method.

TYPE: `float` or `List[float]`, *optional*, defaults to `IMAGENET_STANDARD_STD` DEFAULT: None

include_top

Whether to rescale the image again. Should be set to True if the inputs are used for image classification.

TYPE: `bool`, *optional*, defaults to `True` DEFAULT: True

Source code in mindnlp/transformers/models/efficientnet/image_processing_efficientnet.py
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
class EfficientNetImageProcessor(BaseImageProcessor):
    r"""
    Constructs a EfficientNet image processor.

    Args:
        do_resize (`bool`, *optional*, defaults to `True`):
            Whether to resize the image's (height, width) dimensions to the specified `size`. Can be overridden by
            `do_resize` in `preprocess`.
        size (`Dict[str, int]` *optional*, defaults to `{"height": 346, "width": 346}`):
            Size of the image after `resize`. Can be overridden by `size` in `preprocess`.
        resample (`PILImageResampling` filter, *optional*, defaults to 0):
            Resampling filter to use if resizing the image. Can be overridden by `resample` in `preprocess`.
        do_center_crop (`bool`, *optional*, defaults to `False`):
            Whether to center crop the image. If the input size is smaller than `crop_size` along any edge, the image
            is padded with 0's and then center cropped. Can be overridden by `do_center_crop` in `preprocess`.
        crop_size (`Dict[str, int]`, *optional*, defaults to `{"height": 289, "width": 289}`):
            Desired output size when applying center-cropping. Can be overridden by `crop_size` in `preprocess`.
        rescale_factor (`int` or `float`, *optional*, defaults to `1/255`):
            Scale factor to use if rescaling the image. Can be overridden by the `rescale_factor` parameter in the
            `preprocess` method.
        rescale_offset (`bool`, *optional*, defaults to `False`):
            Whether to rescale the image between [-scale_range, scale_range] instead of [0, scale_range]. Can be
            overridden by the `rescale_factor` parameter in the `preprocess` method.
        do_rescale (`bool`, *optional*, defaults to `True`):
            Whether to rescale the image by the specified scale `rescale_factor`. Can be overridden by the `do_rescale`
            parameter in the `preprocess` method.
        do_normalize (`bool`, *optional*, defaults to `True`):
            Whether to normalize the image. Can be overridden by the `do_normalize` parameter in the `preprocess`
            method.
        image_mean (`float` or `List[float]`, *optional*, defaults to `IMAGENET_STANDARD_MEAN`):
            Mean to use if normalizing the image. This is a float or list of floats the length of the number of
            channels in the image. Can be overridden by the `image_mean` parameter in the `preprocess` method.
        image_std (`float` or `List[float]`, *optional*, defaults to `IMAGENET_STANDARD_STD`):
            Standard deviation to use if normalizing the image. This is a float or list of floats the length of the
            number of channels in the image. Can be overridden by the `image_std` parameter in the `preprocess` method.
        include_top (`bool`, *optional*, defaults to `True`):
            Whether to rescale the image again. Should be set to True if the inputs are used for image classification.
    """
    model_input_names = ["pixel_values"]

    def __init__(
        self,
        do_resize: bool = True,
        size: Dict[str, int] = None,
        resample: PILImageResampling = PIL.Image.NEAREST,
        do_center_crop: bool = False,
        crop_size: Dict[str, int] = None,
        rescale_factor: Union[int, float] = 1 / 255,
        rescale_offset: bool = False,
        do_rescale: bool = True,
        do_normalize: bool = True,
        image_mean: Optional[Union[float, List[float]]] = None,
        image_std: Optional[Union[float, List[float]]] = None,
        include_top: bool = True,
        **kwargs,
    ) -> None:
        """
        Initializes an instance of the EfficientNetImageProcessor class.

        Args:
            do_resize (bool, optional): Whether to resize the image. Defaults to True.
            size (Dict[str, int], optional): The target size for resizing the image. Defaults to None.
            resample (PILImageResampling, optional): The resampling filter to use when resizing the image.
                Defaults to PIL.Image.NEAREST.
            do_center_crop (bool, optional): Whether to perform a center crop on the image. Defaults to False.
            crop_size (Dict[str, int], optional): The size of the center crop. Defaults to None.
            rescale_factor (Union[int, float], optional): The factor by which to rescale the image pixel values.
                Defaults to 1 / 255.
            rescale_offset (bool, optional): Whether to offset the rescaled image pixel values. Defaults to False.
            do_rescale (bool, optional): Whether to rescale the image pixel values. Defaults to True.
            do_normalize (bool, optional): Whether to normalize the image pixel values. Defaults to True.
            image_mean (Optional[Union[float, List[float]]], optional): The mean pixel values used for normalization.
                Defaults to None.
            image_std (Optional[Union[float, List[float]]], optional):
                The standard deviation of pixel values used for normalization. Defaults to None.
            include_top (bool, optional): Whether to include the top layers of the EfficientNet model. Defaults to True.
            **kwargs: Additional keyword arguments.

        Returns:
            None

        Raises:
            None
        """
        super().__init__(**kwargs)
        size = size if size is not None else {"height": 346, "width": 346}
        size = get_size_dict(size)
        crop_size = crop_size if crop_size is not None else {"height": 289, "width": 289}
        crop_size = get_size_dict(crop_size, param_name="crop_size")

        self.do_resize = do_resize
        self.size = size
        self.resample = resample
        self.do_center_crop = do_center_crop
        self.crop_size = crop_size
        self.do_rescale = do_rescale
        self.rescale_factor = rescale_factor
        self.rescale_offset = rescale_offset
        self.do_normalize = do_normalize
        self.image_mean = image_mean if image_mean is not None else IMAGENET_STANDARD_MEAN
        self.image_std = image_std if image_std is not None else IMAGENET_STANDARD_STD
        self.include_top = include_top
        self._valid_processor_keys = [
            "images",
            "do_resize",
            "size",
            "resample",
            "do_center_crop",
            "crop_size",
            "do_rescale",
            "rescale_factor",
            "rescale_offset",
            "do_normalize",
            "image_mean",
            "image_std",
            "include_top",
            "return_tensors",
            "data_format",
            "input_data_format",
        ]

    # Copied from transformers.models.vit.image_processing_vit.ViTImageProcessor.resize with PILImageResampling.BILINEAR->PILImageResampling.NEAREST
    def resize(
        self,
        image: np.ndarray,
        size: Dict[str, int],
        resample: PILImageResampling = PILImageResampling.NEAREST,
        data_format: Optional[Union[str, ChannelDimension]] = None,
        input_data_format: Optional[Union[str, ChannelDimension]] = None,
        **kwargs,
    ) -> np.ndarray:
        """
        Resize an image to `(size["height"], size["width"])`.

        Args:
            image (`np.ndarray`):
                Image to resize.
            size (`Dict[str, int]`):
                Dictionary in the format `{"height": int, "width": int}` specifying the size of the output image.
            resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.NEAREST`):
                `PILImageResampling` filter to use when resizing the image e.g. `PILImageResampling.NEAREST`.
            data_format (`ChannelDimension` or `str`, *optional*):
                The channel dimension format for the output image. If unset, the channel dimension format of the input
                image is used. Can be one of:

                - `"channels_first"` or `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
                - `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num_channels) format.
                - `"none"` or `ChannelDimension.NONE`: image in (height, width) format.
            input_data_format (`ChannelDimension` or `str`, *optional*):
                The channel dimension format for the input image. If unset, the channel dimension format is inferred
                from the input image. Can be one of:

                - `"channels_first"` or `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
                - `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num_channels) format.
                - `"none"` or `ChannelDimension.NONE`: image in (height, width) format.

        Returns:
            `np.ndarray`: The resized image.
        """
        size = get_size_dict(size)
        if "height" not in size or "width" not in size:
            raise ValueError(f"The `size` dictionary must contain the keys `height` and `width`. Got {size.keys()}")
        output_size = (size["height"], size["width"])
        return resize(
            image,
            size=output_size,
            resample=resample,
            data_format=data_format,
            input_data_format=input_data_format,
            **kwargs,
        )

    def rescale(
        self,
        image: np.ndarray,
        scale: Union[int, float],
        offset: bool = True,
        data_format: Optional[Union[str, ChannelDimension]] = None,
        input_data_format: Optional[Union[str, ChannelDimension]] = None,
        **kwargs,
    ):
        """
        Rescale an image by a scale factor.

        If `offset` is `True`, the image has its values rescaled by `scale` and then offset by 1. If `scale` is
        1/127.5, the image is rescaled between [-1, 1].
            image = image * scale - 1

        If `offset` is `False`, and `scale` is 1/255, the image is rescaled between [0, 1].
            image = image * scale

        Args:
            image (`np.ndarray`):
                Image to rescale.
            scale (`int` or `float`):
                Scale to apply to the image.
            offset (`bool`, *optional*):
                Whether to scale the image in both negative and positive directions.
            data_format (`str` or `ChannelDimension`, *optional*):
                The channel dimension format of the image. If not provided, it will be the same as the input image.
            input_data_format (`ChannelDimension` or `str`, *optional*):
                The channel dimension format of the input image. If not provided, it will be inferred.
        """
        rescaled_image = rescale(
            image, scale=scale, data_format=data_format, input_data_format=input_data_format, **kwargs
        )

        if offset:
            rescaled_image = rescaled_image - 1

        return rescaled_image

    def preprocess(
        self,
        images: ImageInput,
        do_resize: bool = None,
        size: Dict[str, int] = None,
        resample=None,
        do_center_crop: bool = None,
        crop_size: Dict[str, int] = None,
        do_rescale: bool = None,
        rescale_factor: float = None,
        rescale_offset: bool = None,
        do_normalize: bool = None,
        image_mean: Optional[Union[float, List[float]]] = None,
        image_std: Optional[Union[float, List[float]]] = None,
        include_top: bool = None,
        return_tensors: Optional[Union[str, TensorType]] = None,
        data_format: ChannelDimension = ChannelDimension.FIRST,
        input_data_format: Optional[Union[str, ChannelDimension]] = None,
        **kwargs,
    ) -> PIL.Image.Image:
        """
        Preprocess an image or batch of images.

        Args:
            images (`ImageInput`):
                Image to preprocess. Expects a single or batch of images with pixel values ranging from 0 to 255. If
                passing in images with pixel values between 0 and 1, set `do_rescale=False`.
            do_resize (`bool`, *optional*, defaults to `self.do_resize`):
                Whether to resize the image.
            size (`Dict[str, int]`, *optional*, defaults to `self.size`):
                Size of the image after `resize`.
            resample (`PILImageResampling`, *optional*, defaults to `self.resample`):
                PILImageResampling filter to use if resizing the image Only has an effect if `do_resize` is set to
                `True`.
            do_center_crop (`bool`, *optional*, defaults to `self.do_center_crop`):
                Whether to center crop the image.
            crop_size (`Dict[str, int]`, *optional*, defaults to `self.crop_size`):
                Size of the image after center crop. If one edge the image is smaller than `crop_size`, it will be
                padded with zeros and then cropped
            do_rescale (`bool`, *optional*, defaults to `self.do_rescale`):
                Whether to rescale the image values between [0 - 1].
            rescale_factor (`float`, *optional*, defaults to `self.rescale_factor`):
                Rescale factor to rescale the image by if `do_rescale` is set to `True`.
            rescale_offset (`bool`, *optional*, defaults to `self.rescale_offset`):
                Whether to rescale the image between [-scale_range, scale_range] instead of [0, scale_range].
            do_normalize (`bool`, *optional*, defaults to `self.do_normalize`):
                Whether to normalize the image.
            image_mean (`float` or `List[float]`, *optional*, defaults to `self.image_mean`):
                Image mean.
            image_std (`float` or `List[float]`, *optional*, defaults to `self.image_std`):
                Image standard deviation.
            include_top (`bool`, *optional*, defaults to `self.include_top`):
                Rescales the image again for image classification if set to True.
            return_tensors (`str` or `TensorType`, *optional*):
                The type of tensors to return. Can be one of:

                - `None`: Return a list of `np.ndarray`.
                - `TensorType.TENSORFLOW` or `'tf'`: Return a batch of type `tf.Tensor`.
                - `TensorType.PYTORCH` or `'pt'`: Return a batch of type `torch.Tensor`.
                - `TensorType.NUMPY` or `'np'`: Return a batch of type `np.ndarray`.
                - `TensorType.JAX` or `'jax'`: Return a batch of type `jax.numpy.ndarray`.
            data_format (`ChannelDimension` or `str`, *optional*, defaults to `ChannelDimension.FIRST`):
                The channel dimension format for the output image. Can be one of:

                - `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
                - `ChannelDimension.LAST`: image in (height, width, num_channels) format.
            input_data_format (`ChannelDimension` or `str`, *optional*):
                The channel dimension format for the input image. If unset, the channel dimension format is inferred
                from the input image. Can be one of:

                - `"channels_first"` or `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
                - `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num_channels) format.
                - `"none"` or `ChannelDimension.NONE`: image in (height, width) format.
        """
        do_resize = do_resize if do_resize is not None else self.do_resize
        resample = resample if resample is not None else self.resample
        do_center_crop = do_center_crop if do_center_crop is not None else self.do_center_crop
        do_rescale = do_rescale if do_rescale is not None else self.do_rescale
        rescale_factor = rescale_factor if rescale_factor is not None else self.rescale_factor
        rescale_offset = rescale_offset if rescale_offset is not None else self.rescale_offset
        do_normalize = do_normalize if do_normalize is not None else self.do_normalize
        image_mean = image_mean if image_mean is not None else self.image_mean
        image_std = image_std if image_std is not None else self.image_std
        include_top = include_top if include_top is not None else self.include_top

        size = size if size is not None else self.size
        size = get_size_dict(size)
        crop_size = crop_size if crop_size is not None else self.crop_size
        crop_size = get_size_dict(crop_size, param_name="crop_size")

        images = make_list_of_images(images)

        validate_kwargs(captured_kwargs=kwargs.keys(), valid_processor_keys=self._valid_processor_keys)

        if not valid_images(images):
            raise ValueError(
                "Invalid image type. Must be of type PIL.Image.Image, numpy.ndarray, "
                "torch.Tensor, tf.Tensor or jax.ndarray."
            )
        validate_preprocess_arguments(
            do_rescale=do_rescale,
            rescale_factor=rescale_factor,
            do_normalize=do_normalize,
            image_mean=image_mean,
            image_std=image_std,
            do_center_crop=do_center_crop,
            crop_size=crop_size,
            do_resize=do_resize,
            size=size,
            resample=resample,
        )
        # All transformations expect numpy arrays.
        images = [to_numpy_array(image) for image in images]

        if is_scaled_image(images[0]) and do_rescale:
            logger.warning_once(
                "It looks like you are trying to rescale already rescaled images. If the input"
                " images have pixel values between 0 and 1, set `do_rescale=False` to avoid rescaling them again."
            )

        if input_data_format is None:
            # We assume that all images have the same channel dimension format.
            input_data_format = infer_channel_dimension_format(images[0])

        if do_resize:
            images = [
                self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
                for image in images
            ]

        if do_center_crop:
            images = [
                self.center_crop(image=image, size=crop_size, input_data_format=input_data_format) for image in images
            ]

        if do_rescale:
            images = [
                self.rescale(
                    image=image, scale=rescale_factor, offset=rescale_offset, input_data_format=input_data_format
                )
                for image in images
            ]

        if do_normalize:
            images = [
                self.normalize(image=image, mean=image_mean, std=image_std, input_data_format=input_data_format)
                for image in images
            ]

        if include_top:
            images = [
                self.normalize(image=image, mean=0, std=image_std, input_data_format=input_data_format)
                for image in images
            ]

        images = [
            to_channel_dimension_format(image, data_format, input_channel_dim=input_data_format) for image in images
        ]

        data = {"pixel_values": images}
        return BatchFeature(data=data, tensor_type=return_tensors)

mindnlp.transformers.models.efficientnet.image_processing_efficientnet.EfficientNetImageProcessor.__init__(do_resize=True, size=None, resample=PIL.Image.NEAREST, do_center_crop=False, crop_size=None, rescale_factor=1 / 255, rescale_offset=False, do_rescale=True, do_normalize=True, image_mean=None, image_std=None, include_top=True, **kwargs)

Initializes an instance of the EfficientNetImageProcessor class.

PARAMETER DESCRIPTION
do_resize

Whether to resize the image. Defaults to True.

TYPE: bool DEFAULT: True

size

The target size for resizing the image. Defaults to None.

TYPE: Dict[str, int] DEFAULT: None

resample

The resampling filter to use when resizing the image. Defaults to PIL.Image.NEAREST.

TYPE: PILImageResampling DEFAULT: NEAREST

do_center_crop

Whether to perform a center crop on the image. Defaults to False.

TYPE: bool DEFAULT: False

crop_size

The size of the center crop. Defaults to None.

TYPE: Dict[str, int] DEFAULT: None

rescale_factor

The factor by which to rescale the image pixel values. Defaults to 1 / 255.

TYPE: Union[int, float] DEFAULT: 1 / 255

rescale_offset

Whether to offset the rescaled image pixel values. Defaults to False.

TYPE: bool DEFAULT: False

do_rescale

Whether to rescale the image pixel values. Defaults to True.

TYPE: bool DEFAULT: True

do_normalize

Whether to normalize the image pixel values. Defaults to True.

TYPE: bool DEFAULT: True

image_mean

The mean pixel values used for normalization. Defaults to None.

TYPE: Optional[Union[float, List[float]]] DEFAULT: None

image_std

The standard deviation of pixel values used for normalization. Defaults to None.

TYPE: Optional[Union[float, List[float]]] DEFAULT: None

include_top

Whether to include the top layers of the EfficientNet model. Defaults to True.

TYPE: bool DEFAULT: True

**kwargs

Additional keyword arguments.

DEFAULT: {}

RETURNS DESCRIPTION
None

None

Source code in mindnlp/transformers/models/efficientnet/image_processing_efficientnet.py
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
def __init__(
    self,
    do_resize: bool = True,
    size: Dict[str, int] = None,
    resample: PILImageResampling = PIL.Image.NEAREST,
    do_center_crop: bool = False,
    crop_size: Dict[str, int] = None,
    rescale_factor: Union[int, float] = 1 / 255,
    rescale_offset: bool = False,
    do_rescale: bool = True,
    do_normalize: bool = True,
    image_mean: Optional[Union[float, List[float]]] = None,
    image_std: Optional[Union[float, List[float]]] = None,
    include_top: bool = True,
    **kwargs,
) -> None:
    """
    Initializes an instance of the EfficientNetImageProcessor class.

    Args:
        do_resize (bool, optional): Whether to resize the image. Defaults to True.
        size (Dict[str, int], optional): The target size for resizing the image. Defaults to None.
        resample (PILImageResampling, optional): The resampling filter to use when resizing the image.
            Defaults to PIL.Image.NEAREST.
        do_center_crop (bool, optional): Whether to perform a center crop on the image. Defaults to False.
        crop_size (Dict[str, int], optional): The size of the center crop. Defaults to None.
        rescale_factor (Union[int, float], optional): The factor by which to rescale the image pixel values.
            Defaults to 1 / 255.
        rescale_offset (bool, optional): Whether to offset the rescaled image pixel values. Defaults to False.
        do_rescale (bool, optional): Whether to rescale the image pixel values. Defaults to True.
        do_normalize (bool, optional): Whether to normalize the image pixel values. Defaults to True.
        image_mean (Optional[Union[float, List[float]]], optional): The mean pixel values used for normalization.
            Defaults to None.
        image_std (Optional[Union[float, List[float]]], optional):
            The standard deviation of pixel values used for normalization. Defaults to None.
        include_top (bool, optional): Whether to include the top layers of the EfficientNet model. Defaults to True.
        **kwargs: Additional keyword arguments.

    Returns:
        None

    Raises:
        None
    """
    super().__init__(**kwargs)
    size = size if size is not None else {"height": 346, "width": 346}
    size = get_size_dict(size)
    crop_size = crop_size if crop_size is not None else {"height": 289, "width": 289}
    crop_size = get_size_dict(crop_size, param_name="crop_size")

    self.do_resize = do_resize
    self.size = size
    self.resample = resample
    self.do_center_crop = do_center_crop
    self.crop_size = crop_size
    self.do_rescale = do_rescale
    self.rescale_factor = rescale_factor
    self.rescale_offset = rescale_offset
    self.do_normalize = do_normalize
    self.image_mean = image_mean if image_mean is not None else IMAGENET_STANDARD_MEAN
    self.image_std = image_std if image_std is not None else IMAGENET_STANDARD_STD
    self.include_top = include_top
    self._valid_processor_keys = [
        "images",
        "do_resize",
        "size",
        "resample",
        "do_center_crop",
        "crop_size",
        "do_rescale",
        "rescale_factor",
        "rescale_offset",
        "do_normalize",
        "image_mean",
        "image_std",
        "include_top",
        "return_tensors",
        "data_format",
        "input_data_format",
    ]

mindnlp.transformers.models.efficientnet.image_processing_efficientnet.EfficientNetImageProcessor.preprocess(images, do_resize=None, size=None, resample=None, do_center_crop=None, crop_size=None, do_rescale=None, rescale_factor=None, rescale_offset=None, do_normalize=None, image_mean=None, image_std=None, include_top=None, return_tensors=None, data_format=ChannelDimension.FIRST, input_data_format=None, **kwargs)

Preprocess an image or batch of images.

PARAMETER DESCRIPTION
images

Image to preprocess. Expects a single or batch of images with pixel values ranging from 0 to 255. If passing in images with pixel values between 0 and 1, set do_rescale=False.

TYPE: `ImageInput`

do_resize

Whether to resize the image.

TYPE: `bool`, *optional*, defaults to `self.do_resize` DEFAULT: None

size

Size of the image after resize.

TYPE: `Dict[str, int]`, *optional*, defaults to `self.size` DEFAULT: None

resample

PILImageResampling filter to use if resizing the image Only has an effect if do_resize is set to True.

TYPE: `PILImageResampling`, *optional*, defaults to `self.resample` DEFAULT: None

do_center_crop

Whether to center crop the image.

TYPE: `bool`, *optional*, defaults to `self.do_center_crop` DEFAULT: None

crop_size

Size of the image after center crop. If one edge the image is smaller than crop_size, it will be padded with zeros and then cropped

TYPE: `Dict[str, int]`, *optional*, defaults to `self.crop_size` DEFAULT: None

do_rescale

Whether to rescale the image values between [0 - 1].

TYPE: `bool`, *optional*, defaults to `self.do_rescale` DEFAULT: None

rescale_factor

Rescale factor to rescale the image by if do_rescale is set to True.

TYPE: `float`, *optional*, defaults to `self.rescale_factor` DEFAULT: None

rescale_offset

Whether to rescale the image between [-scale_range, scale_range] instead of [0, scale_range].

TYPE: `bool`, *optional*, defaults to `self.rescale_offset` DEFAULT: None

do_normalize

Whether to normalize the image.

TYPE: `bool`, *optional*, defaults to `self.do_normalize` DEFAULT: None

image_mean

Image mean.

TYPE: `float` or `List[float]`, *optional*, defaults to `self.image_mean` DEFAULT: None

image_std

Image standard deviation.

TYPE: `float` or `List[float]`, *optional*, defaults to `self.image_std` DEFAULT: None

include_top

Rescales the image again for image classification if set to True.

TYPE: `bool`, *optional*, defaults to `self.include_top` DEFAULT: None

return_tensors

The type of tensors to return. Can be one of:

  • None: Return a list of np.ndarray.
  • TensorType.TENSORFLOW or 'tf': Return a batch of type tf.Tensor.
  • TensorType.PYTORCH or 'pt': Return a batch of type torch.Tensor.
  • TensorType.NUMPY or 'np': Return a batch of type np.ndarray.
  • TensorType.JAX or 'jax': Return a batch of type jax.numpy.ndarray.

TYPE: `str` or `TensorType`, *optional* DEFAULT: None

data_format

The channel dimension format for the output image. Can be one of:

  • ChannelDimension.FIRST: image in (num_channels, height, width) format.
  • ChannelDimension.LAST: image in (height, width, num_channels) format.

TYPE: `ChannelDimension` or `str`, *optional*, defaults to `ChannelDimension.FIRST` DEFAULT: FIRST

input_data_format

The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:

  • "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  • "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
  • "none" or ChannelDimension.NONE: image in (height, width) format.

TYPE: `ChannelDimension` or `str`, *optional* DEFAULT: None

Source code in mindnlp/transformers/models/efficientnet/image_processing_efficientnet.py
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
def preprocess(
    self,
    images: ImageInput,
    do_resize: bool = None,
    size: Dict[str, int] = None,
    resample=None,
    do_center_crop: bool = None,
    crop_size: Dict[str, int] = None,
    do_rescale: bool = None,
    rescale_factor: float = None,
    rescale_offset: bool = None,
    do_normalize: bool = None,
    image_mean: Optional[Union[float, List[float]]] = None,
    image_std: Optional[Union[float, List[float]]] = None,
    include_top: bool = None,
    return_tensors: Optional[Union[str, TensorType]] = None,
    data_format: ChannelDimension = ChannelDimension.FIRST,
    input_data_format: Optional[Union[str, ChannelDimension]] = None,
    **kwargs,
) -> PIL.Image.Image:
    """
    Preprocess an image or batch of images.

    Args:
        images (`ImageInput`):
            Image to preprocess. Expects a single or batch of images with pixel values ranging from 0 to 255. If
            passing in images with pixel values between 0 and 1, set `do_rescale=False`.
        do_resize (`bool`, *optional*, defaults to `self.do_resize`):
            Whether to resize the image.
        size (`Dict[str, int]`, *optional*, defaults to `self.size`):
            Size of the image after `resize`.
        resample (`PILImageResampling`, *optional*, defaults to `self.resample`):
            PILImageResampling filter to use if resizing the image Only has an effect if `do_resize` is set to
            `True`.
        do_center_crop (`bool`, *optional*, defaults to `self.do_center_crop`):
            Whether to center crop the image.
        crop_size (`Dict[str, int]`, *optional*, defaults to `self.crop_size`):
            Size of the image after center crop. If one edge the image is smaller than `crop_size`, it will be
            padded with zeros and then cropped
        do_rescale (`bool`, *optional*, defaults to `self.do_rescale`):
            Whether to rescale the image values between [0 - 1].
        rescale_factor (`float`, *optional*, defaults to `self.rescale_factor`):
            Rescale factor to rescale the image by if `do_rescale` is set to `True`.
        rescale_offset (`bool`, *optional*, defaults to `self.rescale_offset`):
            Whether to rescale the image between [-scale_range, scale_range] instead of [0, scale_range].
        do_normalize (`bool`, *optional*, defaults to `self.do_normalize`):
            Whether to normalize the image.
        image_mean (`float` or `List[float]`, *optional*, defaults to `self.image_mean`):
            Image mean.
        image_std (`float` or `List[float]`, *optional*, defaults to `self.image_std`):
            Image standard deviation.
        include_top (`bool`, *optional*, defaults to `self.include_top`):
            Rescales the image again for image classification if set to True.
        return_tensors (`str` or `TensorType`, *optional*):
            The type of tensors to return. Can be one of:

            - `None`: Return a list of `np.ndarray`.
            - `TensorType.TENSORFLOW` or `'tf'`: Return a batch of type `tf.Tensor`.
            - `TensorType.PYTORCH` or `'pt'`: Return a batch of type `torch.Tensor`.
            - `TensorType.NUMPY` or `'np'`: Return a batch of type `np.ndarray`.
            - `TensorType.JAX` or `'jax'`: Return a batch of type `jax.numpy.ndarray`.
        data_format (`ChannelDimension` or `str`, *optional*, defaults to `ChannelDimension.FIRST`):
            The channel dimension format for the output image. Can be one of:

            - `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
            - `ChannelDimension.LAST`: image in (height, width, num_channels) format.
        input_data_format (`ChannelDimension` or `str`, *optional*):
            The channel dimension format for the input image. If unset, the channel dimension format is inferred
            from the input image. Can be one of:

            - `"channels_first"` or `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
            - `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num_channels) format.
            - `"none"` or `ChannelDimension.NONE`: image in (height, width) format.
    """
    do_resize = do_resize if do_resize is not None else self.do_resize
    resample = resample if resample is not None else self.resample
    do_center_crop = do_center_crop if do_center_crop is not None else self.do_center_crop
    do_rescale = do_rescale if do_rescale is not None else self.do_rescale
    rescale_factor = rescale_factor if rescale_factor is not None else self.rescale_factor
    rescale_offset = rescale_offset if rescale_offset is not None else self.rescale_offset
    do_normalize = do_normalize if do_normalize is not None else self.do_normalize
    image_mean = image_mean if image_mean is not None else self.image_mean
    image_std = image_std if image_std is not None else self.image_std
    include_top = include_top if include_top is not None else self.include_top

    size = size if size is not None else self.size
    size = get_size_dict(size)
    crop_size = crop_size if crop_size is not None else self.crop_size
    crop_size = get_size_dict(crop_size, param_name="crop_size")

    images = make_list_of_images(images)

    validate_kwargs(captured_kwargs=kwargs.keys(), valid_processor_keys=self._valid_processor_keys)

    if not valid_images(images):
        raise ValueError(
            "Invalid image type. Must be of type PIL.Image.Image, numpy.ndarray, "
            "torch.Tensor, tf.Tensor or jax.ndarray."
        )
    validate_preprocess_arguments(
        do_rescale=do_rescale,
        rescale_factor=rescale_factor,
        do_normalize=do_normalize,
        image_mean=image_mean,
        image_std=image_std,
        do_center_crop=do_center_crop,
        crop_size=crop_size,
        do_resize=do_resize,
        size=size,
        resample=resample,
    )
    # All transformations expect numpy arrays.
    images = [to_numpy_array(image) for image in images]

    if is_scaled_image(images[0]) and do_rescale:
        logger.warning_once(
            "It looks like you are trying to rescale already rescaled images. If the input"
            " images have pixel values between 0 and 1, set `do_rescale=False` to avoid rescaling them again."
        )

    if input_data_format is None:
        # We assume that all images have the same channel dimension format.
        input_data_format = infer_channel_dimension_format(images[0])

    if do_resize:
        images = [
            self.resize(image=image, size=size, resample=resample, input_data_format=input_data_format)
            for image in images
        ]

    if do_center_crop:
        images = [
            self.center_crop(image=image, size=crop_size, input_data_format=input_data_format) for image in images
        ]

    if do_rescale:
        images = [
            self.rescale(
                image=image, scale=rescale_factor, offset=rescale_offset, input_data_format=input_data_format
            )
            for image in images
        ]

    if do_normalize:
        images = [
            self.normalize(image=image, mean=image_mean, std=image_std, input_data_format=input_data_format)
            for image in images
        ]

    if include_top:
        images = [
            self.normalize(image=image, mean=0, std=image_std, input_data_format=input_data_format)
            for image in images
        ]

    images = [
        to_channel_dimension_format(image, data_format, input_channel_dim=input_data_format) for image in images
    ]

    data = {"pixel_values": images}
    return BatchFeature(data=data, tensor_type=return_tensors)

mindnlp.transformers.models.efficientnet.image_processing_efficientnet.EfficientNetImageProcessor.rescale(image, scale, offset=True, data_format=None, input_data_format=None, **kwargs)

Rescale an image by a scale factor.

If offset is True, the image has its values rescaled by scale and then offset by 1. If scale is 1/127.5, the image is rescaled between [-1, 1]. image = image * scale - 1

If offset is False, and scale is 1/255, the image is rescaled between [0, 1]. image = image * scale

PARAMETER DESCRIPTION
image

Image to rescale.

TYPE: `np.ndarray`

scale

Scale to apply to the image.

TYPE: `int` or `float`

offset

Whether to scale the image in both negative and positive directions.

TYPE: `bool`, *optional* DEFAULT: True

data_format

The channel dimension format of the image. If not provided, it will be the same as the input image.

TYPE: `str` or `ChannelDimension`, *optional* DEFAULT: None

input_data_format

The channel dimension format of the input image. If not provided, it will be inferred.

TYPE: `ChannelDimension` or `str`, *optional* DEFAULT: None

Source code in mindnlp/transformers/models/efficientnet/image_processing_efficientnet.py
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
def rescale(
    self,
    image: np.ndarray,
    scale: Union[int, float],
    offset: bool = True,
    data_format: Optional[Union[str, ChannelDimension]] = None,
    input_data_format: Optional[Union[str, ChannelDimension]] = None,
    **kwargs,
):
    """
    Rescale an image by a scale factor.

    If `offset` is `True`, the image has its values rescaled by `scale` and then offset by 1. If `scale` is
    1/127.5, the image is rescaled between [-1, 1].
        image = image * scale - 1

    If `offset` is `False`, and `scale` is 1/255, the image is rescaled between [0, 1].
        image = image * scale

    Args:
        image (`np.ndarray`):
            Image to rescale.
        scale (`int` or `float`):
            Scale to apply to the image.
        offset (`bool`, *optional*):
            Whether to scale the image in both negative and positive directions.
        data_format (`str` or `ChannelDimension`, *optional*):
            The channel dimension format of the image. If not provided, it will be the same as the input image.
        input_data_format (`ChannelDimension` or `str`, *optional*):
            The channel dimension format of the input image. If not provided, it will be inferred.
    """
    rescaled_image = rescale(
        image, scale=scale, data_format=data_format, input_data_format=input_data_format, **kwargs
    )

    if offset:
        rescaled_image = rescaled_image - 1

    return rescaled_image

mindnlp.transformers.models.efficientnet.image_processing_efficientnet.EfficientNetImageProcessor.resize(image, size, resample=PILImageResampling.NEAREST, data_format=None, input_data_format=None, **kwargs)

Resize an image to (size["height"], size["width"]).

PARAMETER DESCRIPTION
image

Image to resize.

TYPE: `np.ndarray`

size

Dictionary in the format {"height": int, "width": int} specifying the size of the output image.

TYPE: `Dict[str, int]`

resample

PILImageResampling filter to use when resizing the image e.g. PILImageResampling.NEAREST.

TYPE: `PILImageResampling`, *optional*, defaults to `PILImageResampling.NEAREST` DEFAULT: NEAREST

data_format

The channel dimension format for the output image. If unset, the channel dimension format of the input image is used. Can be one of:

  • "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  • "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
  • "none" or ChannelDimension.NONE: image in (height, width) format.

TYPE: `ChannelDimension` or `str`, *optional* DEFAULT: None

input_data_format

The channel dimension format for the input image. If unset, the channel dimension format is inferred from the input image. Can be one of:

  • "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  • "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
  • "none" or ChannelDimension.NONE: image in (height, width) format.

TYPE: `ChannelDimension` or `str`, *optional* DEFAULT: None

RETURNS DESCRIPTION
ndarray

np.ndarray: The resized image.

Source code in mindnlp/transformers/models/efficientnet/image_processing_efficientnet.py
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
def resize(
    self,
    image: np.ndarray,
    size: Dict[str, int],
    resample: PILImageResampling = PILImageResampling.NEAREST,
    data_format: Optional[Union[str, ChannelDimension]] = None,
    input_data_format: Optional[Union[str, ChannelDimension]] = None,
    **kwargs,
) -> np.ndarray:
    """
    Resize an image to `(size["height"], size["width"])`.

    Args:
        image (`np.ndarray`):
            Image to resize.
        size (`Dict[str, int]`):
            Dictionary in the format `{"height": int, "width": int}` specifying the size of the output image.
        resample (`PILImageResampling`, *optional*, defaults to `PILImageResampling.NEAREST`):
            `PILImageResampling` filter to use when resizing the image e.g. `PILImageResampling.NEAREST`.
        data_format (`ChannelDimension` or `str`, *optional*):
            The channel dimension format for the output image. If unset, the channel dimension format of the input
            image is used. Can be one of:

            - `"channels_first"` or `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
            - `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num_channels) format.
            - `"none"` or `ChannelDimension.NONE`: image in (height, width) format.
        input_data_format (`ChannelDimension` or `str`, *optional*):
            The channel dimension format for the input image. If unset, the channel dimension format is inferred
            from the input image. Can be one of:

            - `"channels_first"` or `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
            - `"channels_last"` or `ChannelDimension.LAST`: image in (height, width, num_channels) format.
            - `"none"` or `ChannelDimension.NONE`: image in (height, width) format.

    Returns:
        `np.ndarray`: The resized image.
    """
    size = get_size_dict(size)
    if "height" not in size or "width" not in size:
        raise ValueError(f"The `size` dictionary must contain the keys `height` and `width`. Got {size.keys()}")
    output_size = (size["height"], size["width"])
    return resize(
        image,
        size=output_size,
        resample=resample,
        data_format=data_format,
        input_data_format=input_data_format,
        **kwargs,
    )