bros

mindnlp.transformers.models.bros.configuration_bros.BrosConfig

Bases: PretrainedConfig

This is the configuration class to store the configuration of a [BrosModel] or a [TFBrosModel]. It is used to instantiate a Bros model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a similar configuration to that of the Bros jinho8345/bros-base-uncased architecture.

Configuration objects inherit from [PretrainedConfig] and can be used to control the model outputs. Read the documentation from [PretrainedConfig] for more information.

PARAMETER DESCRIPTION
vocab_size

Vocabulary size of the Bros model. Defines the number of different tokens that can be represented by the inputs_ids passed when calling [BrosModel] or [TFBrosModel].

TYPE: `int`, *optional*, defaults to 30522 DEFAULT: 30522

hidden_size

Dimensionality of the encoder layers and the pooler layer.

TYPE: `int`, *optional*, defaults to 768 DEFAULT: 768

num_hidden_layers

Number of hidden layers in the Transformer encoder.

TYPE: `int`, *optional*, defaults to 12 DEFAULT: 12

num_attention_heads

Number of attention heads for each attention layer in the Transformer encoder.

TYPE: `int`, *optional*, defaults to 12 DEFAULT: 12

intermediate_size

Dimensionality of the "intermediate" (often named feed-forward) layer in the Transformer encoder.

TYPE: `int`, *optional*, defaults to 3072 DEFAULT: 3072

hidden_act

The non-linear activation function (function or string) in the encoder and pooler. If string, "gelu", "relu", "silu" and "gelu_new" are supported.

TYPE: `str` or `Callable`, *optional*, defaults to `"gelu"` DEFAULT: 'gelu'

hidden_dropout_prob

The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.

TYPE: `float`, *optional*, defaults to 0.1 DEFAULT: 0.1

attention_probs_dropout_prob

The dropout ratio for the attention probabilities.

TYPE: `float`, *optional*, defaults to 0.1 DEFAULT: 0.1

max_position_embeddings

The maximum sequence length that this model might ever be used with. Typically set this to something large just in case (e.g., 512 or 1024 or 2048).

TYPE: `int`, *optional*, defaults to 512 DEFAULT: 512

type_vocab_size

The vocabulary size of the token_type_ids passed when calling [BrosModel] or [TFBrosModel].

TYPE: `int`, *optional*, defaults to 2 DEFAULT: 2

initializer_range

The standard deviation of the truncated_normal_initializer for initializing all weight matrices.

TYPE: `float`, *optional*, defaults to 0.02 DEFAULT: 0.02

layer_norm_eps

The epsilon used by the layer normalization layers.

TYPE: `float`, *optional*, defaults to 1e-12 DEFAULT: 1e-12

pad_token_id

The index of the padding token in the token vocabulary.

TYPE: `int`, *optional*, defaults to 0 DEFAULT: 0

dim_bbox

The dimension of the bounding box coordinates, i.e. 8 values per token covering the four corner points (x0, y0, x1, y0, x1, y1, x0, y1).

TYPE: `int`, *optional*, defaults to 8 DEFAULT: 8

bbox_scale

The scale factor of the bounding box coordinates.

TYPE: `float`, *optional*, defaults to 100.0 DEFAULT: 100.0

n_relations

The number of relations for the SpadeEE (entity extraction) and SpadeEL (entity linking) heads.

TYPE: `int`, *optional*, defaults to 1 DEFAULT: 1

classifier_dropout_prob

The dropout ratio for the classifier head.

TYPE: `float`, *optional*, defaults to 0.1 DEFAULT: 0.1

Example
>>> from transformers import BrosConfig, BrosModel
...
>>> # Initializing a BROS jinho8345/bros-base-uncased style configuration
>>> configuration = BrosConfig()
...
>>> # Initializing a model from the jinho8345/bros-base-uncased style configuration
>>> model = BrosModel(configuration)
...
>>> # Accessing the model configuration
>>> configuration = model.config
Source code in mindnlp/transformers/models/bros/configuration_bros.py
class BrosConfig(PretrainedConfig):
    r"""
    This is the configuration class to store the configuration of a [`BrosModel`] or a [`TFBrosModel`]. It is used to
    instantiate a Bros model according to the specified arguments, defining the model architecture. Instantiating a
    configuration with the defaults will yield a similar configuration to that of the Bros
    [jinho8345/bros-base-uncased](https://huggingface.co/jinho8345/bros-base-uncased) architecture.

    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
    documentation from [`PretrainedConfig`] for more information.

    Args:
        vocab_size (`int`, *optional*, defaults to 30522):
            Vocabulary size of the Bros model. Defines the number of different tokens that can be represented by the
            `inputs_ids` passed when calling [`BrosModel`] or [`TFBrosModel`].
        hidden_size (`int`, *optional*, defaults to 768):
            Dimensionality of the encoder layers and the pooler layer.
        num_hidden_layers (`int`, *optional*, defaults to 12):
            Number of hidden layers in the Transformer encoder.
        num_attention_heads (`int`, *optional*, defaults to 12):
            Number of attention heads for each attention layer in the Transformer encoder.
        intermediate_size (`int`, *optional*, defaults to 3072):
            Dimensionality of the "intermediate" (often named feed-forward) layer in the Transformer encoder.
        hidden_act (`str` or `Callable`, *optional*, defaults to `"gelu"`):
            The non-linear activation function (function or string) in the encoder and pooler. If string, `"gelu"`,
            `"relu"`, `"silu"` and `"gelu_new"` are supported.
        hidden_dropout_prob (`float`, *optional*, defaults to 0.1):
            The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.
        attention_probs_dropout_prob (`float`, *optional*, defaults to 0.1):
            The dropout ratio for the attention probabilities.
        max_position_embeddings (`int`, *optional*, defaults to 512):
            The maximum sequence length that this model might ever be used with. Typically set this to something large
            just in case (e.g., 512 or 1024 or 2048).
        type_vocab_size (`int`, *optional*, defaults to 2):
            The vocabulary size of the `token_type_ids` passed when calling [`BrosModel`] or [`TFBrosModel`].
        initializer_range (`float`, *optional*, defaults to 0.02):
            The standard deviation of the truncated_normal_initializer for initializing all weight matrices.
        layer_norm_eps (`float`, *optional*, defaults to 1e-12):
            The epsilon used by the layer normalization layers.
        pad_token_id (`int`, *optional*, defaults to 0):
            The index of the padding token in the token vocabulary.
        dim_bbox (`int`, *optional*, defaults to 8):
            The dimension of the bounding box coordinates. (x0, y0, x1, y0, x1, y1, x0, y1)
        bbox_scale (`float`, *optional*, defaults to 100.0):
            The scale factor of the bounding box coordinates.
        n_relations (`int`, *optional*, defaults to 1):
            The number of relations for SpadeEE(entity extraction), SpadeEL(entity linking) head.
        classifier_dropout_prob (`float`, *optional*, defaults to 0.1):
            The dropout ratio for the classifier head.

    Example:
        ```python
        >>> from transformers import BrosConfig, BrosModel
        ...
        >>> # Initializing a BROS jinho8345/bros-base-uncased style configuration
        >>> configuration = BrosConfig()
        ...
        >>> # Initializing a model from the jinho8345/bros-base-uncased style configuration
        >>> model = BrosModel(configuration)
        ...
        >>> # Accessing the model configuration
        >>> configuration = model.config
        ```
    """
    model_type = "bros"

    def __init__(
        self,
        vocab_size=30522,
        hidden_size=768,
        num_hidden_layers=12,
        num_attention_heads=12,
        intermediate_size=3072,
        hidden_act="gelu",
        hidden_dropout_prob=0.1,
        attention_probs_dropout_prob=0.1,
        max_position_embeddings=512,
        type_vocab_size=2,
        initializer_range=0.02,
        layer_norm_eps=1e-12,
        pad_token_id=0,
        dim_bbox=8,
        bbox_scale=100.0,
        n_relations=1,
        classifier_dropout_prob=0.1,
        **kwargs,
    ):
        """
        Initializes an instance of the BrosConfig class.

        Args:
            self: The instance of the class.
            vocab_size (int, optional): The size of the vocabulary. Defaults to 30522.
            hidden_size (int, optional): The size of the hidden layer. Defaults to 768.
            num_hidden_layers (int, optional): The number of hidden layers. Defaults to 12.
            num_attention_heads (int, optional): The number of attention heads. Defaults to 12.
            intermediate_size (int, optional): The size of the intermediate layer. Defaults to 3072.
            hidden_act (str, optional): The activation function for the hidden layers. Defaults to 'gelu'.
            hidden_dropout_prob (float, optional): The dropout probability for the hidden layers. Defaults to 0.1.
            attention_probs_dropout_prob (float, optional): The dropout probability for the attention probabilities. Defaults to 0.1.
            max_position_embeddings (int, optional): The maximum number of position embeddings. Defaults to 512.
            type_vocab_size (int, optional): The size of the type vocabulary. Defaults to 2.
            initializer_range (float, optional): The range for weight initialization. Defaults to 0.02.
            layer_norm_eps (float, optional): The epsilon value for layer normalization. Defaults to 1e-12.
            pad_token_id (int, optional): The ID for padding token. Defaults to 0.
            dim_bbox (int, optional): The dimension of the bounding box. Defaults to 8.
            bbox_scale (float, optional): The scale factor for the bounding box. Defaults to 100.0.
            n_relations (int, optional): The number of relations. Defaults to 1.
            classifier_dropout_prob (float, optional): The dropout probability for the classifier. Defaults to 0.1.
            **kwargs: Additional keyword arguments.

        Returns:
            None.

        Raises:
            None.
        """
        super().__init__(
            vocab_size=vocab_size,
            hidden_size=hidden_size,
            num_hidden_layers=num_hidden_layers,
            num_attention_heads=num_attention_heads,
            intermediate_size=intermediate_size,
            hidden_act=hidden_act,
            hidden_dropout_prob=hidden_dropout_prob,
            attention_probs_dropout_prob=attention_probs_dropout_prob,
            max_position_embeddings=max_position_embeddings,
            type_vocab_size=type_vocab_size,
            initializer_range=initializer_range,
            layer_norm_eps=layer_norm_eps,
            pad_token_id=pad_token_id,
            **kwargs,
        )

        self.dim_bbox = dim_bbox
        self.bbox_scale = bbox_scale
        self.n_relations = n_relations
        self.dim_bbox_sinusoid_emb_2d = self.hidden_size // 4
        self.dim_bbox_sinusoid_emb_1d = self.dim_bbox_sinusoid_emb_2d // self.dim_bbox
        self.dim_bbox_projection = self.hidden_size // self.num_attention_heads
        self.classifier_dropout_prob = classifier_dropout_prob
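The last few assignments above derive the bounding-box embedding sizes from `hidden_size`, `num_attention_heads` and `dim_bbox`. A quick arithmetic check with the default hyperparameters (plain Python, nothing framework-specific):

```python
hidden_size = 768
num_attention_heads = 12
dim_bbox = 8

dim_bbox_sinusoid_emb_2d = hidden_size // 4                      # 192
dim_bbox_sinusoid_emb_1d = dim_bbox_sinusoid_emb_2d // dim_bbox  # 24 dims per box coordinate
dim_bbox_projection = hidden_size // num_attention_heads         # 64, the per-head size

print(dim_bbox_sinusoid_emb_2d, dim_bbox_sinusoid_emb_1d, dim_bbox_projection)
```

With the defaults this yields a 192-dimensional 2D sinusoidal box embedding (24 dimensions for each of the 8 box coordinates) and a 64-dimensional per-attention-head projection.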

mindnlp.transformers.models.bros.configuration_bros.BrosConfig.__init__(vocab_size=30522, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, hidden_act='gelu', hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=2, initializer_range=0.02, layer_norm_eps=1e-12, pad_token_id=0, dim_bbox=8, bbox_scale=100.0, n_relations=1, classifier_dropout_prob=0.1, **kwargs)

Initializes an instance of the BrosConfig class.

PARAMETER DESCRIPTION
self

The instance of the class.

vocab_size

The size of the vocabulary. Defaults to 30522.

TYPE: int DEFAULT: 30522

hidden_size

The size of the hidden layer. Defaults to 768.

TYPE: int DEFAULT: 768

num_hidden_layers

The number of hidden layers. Defaults to 12.

TYPE: int DEFAULT: 12

num_attention_heads

The number of attention heads. Defaults to 12.

TYPE: int DEFAULT: 12

intermediate_size

The size of the intermediate layer. Defaults to 3072.

TYPE: int DEFAULT: 3072

hidden_act

The activation function for the hidden layers. Defaults to 'gelu'.

TYPE: str DEFAULT: 'gelu'

hidden_dropout_prob

The dropout probability for the hidden layers. Defaults to 0.1.

TYPE: float DEFAULT: 0.1

attention_probs_dropout_prob

The dropout probability for the attention probabilities. Defaults to 0.1.

TYPE: float DEFAULT: 0.1

max_position_embeddings

The maximum number of position embeddings. Defaults to 512.

TYPE: int DEFAULT: 512

type_vocab_size

The size of the type vocabulary. Defaults to 2.

TYPE: int DEFAULT: 2

initializer_range

The range for weight initialization. Defaults to 0.02.

TYPE: float DEFAULT: 0.02

layer_norm_eps

The epsilon value for layer normalization. Defaults to 1e-12.

TYPE: float DEFAULT: 1e-12

pad_token_id

The ID for padding token. Defaults to 0.

TYPE: int DEFAULT: 0

dim_bbox

The dimension of the bounding box. Defaults to 8.

TYPE: int DEFAULT: 8

bbox_scale

The scale factor for the bounding box. Defaults to 100.0.

TYPE: float DEFAULT: 100.0

n_relations

The number of relations. Defaults to 1.

TYPE: int DEFAULT: 1

classifier_dropout_prob

The dropout probability for the classifier. Defaults to 0.1.

TYPE: float DEFAULT: 0.1

**kwargs

Additional keyword arguments.

DEFAULT: {}

RETURNS DESCRIPTION

None.

Source code in mindnlp/transformers/models/bros/configuration_bros.py
def __init__(
    self,
    vocab_size=30522,
    hidden_size=768,
    num_hidden_layers=12,
    num_attention_heads=12,
    intermediate_size=3072,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    max_position_embeddings=512,
    type_vocab_size=2,
    initializer_range=0.02,
    layer_norm_eps=1e-12,
    pad_token_id=0,
    dim_bbox=8,
    bbox_scale=100.0,
    n_relations=1,
    classifier_dropout_prob=0.1,
    **kwargs,
):
    """
    Initializes an instance of the BrosConfig class.

    Args:
        self: The instance of the class.
        vocab_size (int, optional): The size of the vocabulary. Defaults to 30522.
        hidden_size (int, optional): The size of the hidden layer. Defaults to 768.
        num_hidden_layers (int, optional): The number of hidden layers. Defaults to 12.
        num_attention_heads (int, optional): The number of attention heads. Defaults to 12.
        intermediate_size (int, optional): The size of the intermediate layer. Defaults to 3072.
        hidden_act (str, optional): The activation function for the hidden layers. Defaults to 'gelu'.
        hidden_dropout_prob (float, optional): The dropout probability for the hidden layers. Defaults to 0.1.
        attention_probs_dropout_prob (float, optional): The dropout probability for the attention probabilities. Defaults to 0.1.
        max_position_embeddings (int, optional): The maximum number of position embeddings. Defaults to 512.
        type_vocab_size (int, optional): The size of the type vocabulary. Defaults to 2.
        initializer_range (float, optional): The range for weight initialization. Defaults to 0.02.
        layer_norm_eps (float, optional): The epsilon value for layer normalization. Defaults to 1e-12.
        pad_token_id (int, optional): The ID for padding token. Defaults to 0.
        dim_bbox (int, optional): The dimension of the bounding box. Defaults to 8.
        bbox_scale (float, optional): The scale factor for the bounding box. Defaults to 100.0.
        n_relations (int, optional): The number of relations. Defaults to 1.
        classifier_dropout_prob (float, optional): The dropout probability for the classifier. Defaults to 0.1.
        **kwargs: Additional keyword arguments.

    Returns:
        None.

    Raises:
        None.
    """
    super().__init__(
        vocab_size=vocab_size,
        hidden_size=hidden_size,
        num_hidden_layers=num_hidden_layers,
        num_attention_heads=num_attention_heads,
        intermediate_size=intermediate_size,
        hidden_act=hidden_act,
        hidden_dropout_prob=hidden_dropout_prob,
        attention_probs_dropout_prob=attention_probs_dropout_prob,
        max_position_embeddings=max_position_embeddings,
        type_vocab_size=type_vocab_size,
        initializer_range=initializer_range,
        layer_norm_eps=layer_norm_eps,
        pad_token_id=pad_token_id,
        **kwargs,
    )

    self.dim_bbox = dim_bbox
    self.bbox_scale = bbox_scale
    self.n_relations = n_relations
    self.dim_bbox_sinusoid_emb_2d = self.hidden_size // 4
    self.dim_bbox_sinusoid_emb_1d = self.dim_bbox_sinusoid_emb_2d // self.dim_bbox
    self.dim_bbox_projection = self.hidden_size // self.num_attention_heads
    self.classifier_dropout_prob = classifier_dropout_prob

mindnlp.transformers.models.bros.modeling_bros.BrosPreTrainedModel

Bases: PreTrainedModel

An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained models.

Source code in mindnlp/transformers/models/bros/modeling_bros.py
class BrosPreTrainedModel(PreTrainedModel):
    """
    An abstract class to handle weights initialization and a simple interface for downloading and loading pretrained
    models.
    """
    config_class = BrosConfig
    base_model_prefix = "bros"

    def _init_weights(self, cell):
        """Initialize the weights"""
        if isinstance(cell, nn.Linear):
            # Slightly different from the TF version which uses truncated_normal for initialization
            # cf https://github.com/pytorch/pytorch/pull/5617
            ops.initialize(cell.weight, Normal(self.config.initializer_range))
            if cell.bias is not None:
                ops.initialize(cell.bias, 'zeros')
        elif isinstance(cell, nn.Embedding):
            ops.initialize(cell.weight, Normal(self.config.initializer_range))
            if cell.padding_idx is not None:
                cell.weight[cell.padding_idx] = 0
        elif isinstance(cell, nn.LayerNorm):
            ops.initialize(cell.bias, 'zeros')
            ops.initialize(cell.weight, 'ones')

mindnlp.transformers.models.bros.modeling_bros.BrosModel

Bases: BrosPreTrainedModel

A BrosModel represents a Bros language model that is used for various natural language processing tasks. It is designed to handle inputs with both text and bounding box information and provides a comprehensive set of functionalities for processing and encoding text data.

ATTRIBUTE DESCRIPTION
config

The configuration object that stores the model's hyperparameters and settings.

embeddings

An instance of BrosTextEmbeddings that handles the word embeddings for the input text.

bbox_embeddings

An instance of BrosBboxEmbeddings that handles the encoding of bounding box information.

encoder

An instance of BrosEncoder that performs the main encoding operations on the input.

pooler

An optional instance of BrosPooler that performs pooling operations on the encoded sequence.

METHOD DESCRIPTION
__init__

Initializes a BrosModel instance with the given configuration.

get_input_embeddings

Returns the word embeddings used for input text.

set_input_embeddings

Sets the word embeddings used for input text to the given value.

_prune_heads

Prunes specific attention heads in the model.

forward

Constructs the model with the given input and returns the encoded sequence and other optional outputs.

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosModel
... 
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
... 
>>> model = BrosModel.from_pretrained("jinho8345/bros-base-uncased")
... 
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
... 
>>> outputs = model(**encoding)
>>> last_hidden_states = outputs.last_hidden_state
Source code in mindnlp/transformers/models/bros/modeling_bros.py
class BrosModel(BrosPreTrainedModel):

    """
    A BrosModel represents a Bros language model that is used for various natural language processing tasks. 
    It is designed to handle inputs with both text and bounding box information and provides a comprehensive set of 
    functionalities for processing and encoding text data.

    Attributes:
        config: The configuration object that stores the model's hyperparameters and settings.
        embeddings: An instance of BrosTextEmbeddings that handles the word embeddings for the input text.
        bbox_embeddings: An instance of BrosBboxEmbeddings that handles the encoding of bounding box information.
        encoder: An instance of BrosEncoder that performs the main encoding operations on the input.
        pooler: An optional instance of BrosPooler that performs pooling operations on the encoded sequence.

    Methods:
        __init__: Initializes a BrosModel instance with the given configuration.
        get_input_embeddings: Returns the word embeddings used for input text.
        set_input_embeddings: Sets the word embeddings used for input text to the given value.
        _prune_heads: Prunes specific attention heads in the model.
        forward: Constructs the model with the given input and returns the encoded sequence and other optional outputs.

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosModel
        ... 
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ... 
        >>> model = BrosModel.from_pretrained("jinho8345/bros-base-uncased")
        ... 
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ... 
        >>> outputs = model(**encoding)
        >>> last_hidden_states = outputs.last_hidden_state
        ```
    """
    def __init__(self, config, add_pooling_layer=True):
        """
        Initializes the BrosModel.

        Args:
            self (BrosModel): The instance of the BrosModel class.
            config (object): The configuration object containing model parameters and settings.
            add_pooling_layer (bool): A flag indicating whether to include a pooling layer in the model.

        Returns:
            None.

        Raises:
            None
        """
        super().__init__(config)
        self.config = config

        self.embeddings = BrosTextEmbeddings(config)
        self.bbox_embeddings = BrosBboxEmbeddings(config)
        self.encoder = BrosEncoder(config)

        self.pooler = BrosPooler(config) if add_pooling_layer else None

        self.init_weights()

    def get_input_embeddings(self):
        """
        This method retrieves the input embeddings from the BrosModel class.

        Args:
            self: BrosModel instance. The self parameter is a reference to the current instance of the class. 
                It is used to access the attributes and methods of the class within the method.

        Returns:
            None: This method does not return any value explicitly, 
                as it directly returns the input embeddings from the BrosModel class.

        Raises:
            No specific exceptions are documented to be raised by this method.
        """
        return self.embeddings.word_embeddings

    def set_input_embeddings(self, value):
        """
        Sets the input embeddings for the BrosModel.

        Args:
            self (BrosModel): The instance of the BrosModel class.
            value (object): The input embeddings value to be set for the BrosModel. 
                It should be of the appropriate type and format compatible with the word_embeddings attribute of the embeddings object.

        Returns:
            None.

        Raises:
            None
        """
        self.embeddings.word_embeddings = value

    def _prune_heads(self, heads_to_prune):
        """
        Prunes heads of the model. heads_to_prune: dict of {layer_num: list of heads to prune in this layer} See base
        class PreTrainedModel
        """
        for layer, heads in heads_to_prune.items():
            self.encoder.layer[layer].attention.prune_heads(heads)

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        bbox: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        token_type_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        encoder_hidden_states: Optional[mindspore.Tensor] = None,
        encoder_attention_mask: Optional[mindspore.Tensor] = None,
        past_key_values: Optional[List[mindspore.Tensor]] = None,
        use_cache: Optional[bool] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]:
        r"""

        Returns:
            Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]

        Example:
            ```python
            >>> import torch
            >>> from transformers import BrosProcessor, BrosModel
            ...
            >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> model = BrosModel.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
            >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
            >>> encoding["bbox"] = bbox
            ...
            >>> outputs = model(**encoding)
            >>> last_hidden_states = outputs.last_hidden_state
            ```
        """
        output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
        output_hidden_states = (
            output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
        )
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        if self.config.is_decoder:
            use_cache = use_cache if use_cache is not None else self.config.use_cache
        else:
            use_cache = False

        if input_ids is not None and inputs_embeds is not None:
            raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
        elif input_ids is not None:
            input_shape = input_ids.shape
        elif inputs_embeds is not None:
            input_shape = inputs_embeds.shape[:-1]
        else:
            raise ValueError("You have to specify either input_ids or inputs_embeds")

        if bbox is None:
            raise ValueError("You have to specify bbox")

        batch_size, seq_length = input_shape

        # past_key_values_length
        past_key_values_length = past_key_values[0][0].shape[2] if past_key_values is not None else 0

        if attention_mask is None:
            attention_mask = ops.ones(*input_shape)

        if token_type_ids is None:
            if hasattr(self.embeddings, "token_type_ids"):
                buffered_token_type_ids = self.embeddings.token_type_ids[:, :seq_length]
                buffered_token_type_ids_expanded = ops.broadcast_to(buffered_token_type_ids, (batch_size, seq_length))
                token_type_ids = buffered_token_type_ids_expanded
            else:
                token_type_ids = ops.zeros(*input_shape, dtype=mindspore.int64)

        # We can provide a self-attention mask of dimensions [batch_size, from_seq_length, to_seq_length]
        # ourselves in which case we just need to make it broadcastable to all heads.
        extended_attention_mask: mindspore.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)

        # If a 2D or 3D attention mask is provided for the cross-attention
        # we need to make broadcastable to [batch_size, num_heads, seq_length, seq_length]
        if self.config.is_decoder and encoder_hidden_states is not None:
            encoder_batch_size, encoder_sequence_length, _ = encoder_hidden_states.shape
            encoder_hidden_shape = (encoder_batch_size, encoder_sequence_length)
            if encoder_attention_mask is None:
                encoder_attention_mask = ops.ones(*encoder_hidden_shape)
            encoder_extended_attention_mask = self.invert_attention_mask(encoder_attention_mask)
        else:
            encoder_extended_attention_mask = None

        # Prepare head mask if needed
        # 1.0 in head_mask indicate we keep the head
        # attention_probs has shape bsz x n_heads x N x N
        # input head_mask has shape [num_heads] or [num_hidden_layers x num_heads]
        # and head_mask is converted to shape [num_hidden_layers x batch x num_heads x seq_length x seq_length]
        head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)

        embedding_output = self.embeddings(
            input_ids=input_ids,
            position_ids=position_ids,
            token_type_ids=token_type_ids,
            inputs_embeds=inputs_embeds,
            past_key_values_length=past_key_values_length,
        )

        # if bbox has 2 points (4 float tensors) per token, convert it to 4 points (8 float tensors) per token
        if bbox.shape[-1] == 4:
            bbox = bbox[:, :, [0, 1, 2, 1, 2, 3, 0, 3]]
        scaled_bbox = bbox * self.config.bbox_scale
        bbox_position_embeddings = self.bbox_embeddings(scaled_bbox)

        encoder_outputs = self.encoder(
            embedding_output,
            bbox_pos_emb=bbox_position_embeddings,
            attention_mask=extended_attention_mask,
            head_mask=head_mask,
            encoder_hidden_states=encoder_hidden_states,
            encoder_attention_mask=encoder_extended_attention_mask,
            past_key_values=past_key_values,
            use_cache=use_cache,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )
        sequence_output = encoder_outputs[0]
        pooled_output = self.pooler(sequence_output) if self.pooler is not None else None

        if not return_dict:
            return (sequence_output, pooled_output) + encoder_outputs[1:]

        return BaseModelOutputWithPoolingAndCrossAttentions(
            last_hidden_state=sequence_output,
            pooler_output=pooled_output,
            past_key_values=encoder_outputs.past_key_values,
            hidden_states=encoder_outputs.hidden_states,
            attentions=encoder_outputs.attentions,
            cross_attentions=encoder_outputs.cross_attentions,
        )
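The docstring example above uses the Hugging Face `torch`/`transformers` API. The sketch below instead builds a small model directly from a `BrosConfig` and feeds it random inputs. It assumes `BrosConfig` and `BrosModel` are re-exported from `mindnlp.transformers` (the class paths documented on this page); treat it as an illustrative sketch rather than a verified recipe.

```python
import numpy as np
import mindspore
from mindnlp.transformers import BrosConfig, BrosModel  # assumed re-export of the classes documented here

config = BrosConfig(num_hidden_layers=2)   # small model so the forward pass is quick
model = BrosModel(config)

batch_size, seq_len = 1, 6
input_ids = mindspore.Tensor(
    np.random.randint(0, config.vocab_size, (batch_size, seq_len)), mindspore.int64
)
# one normalized (x0, y0, x1, y1) box per token; forward() expands 4-value boxes to 8 values
bbox = mindspore.Tensor(
    np.tile([[[0.1, 0.1, 0.4, 0.2]]], (batch_size, seq_len, 1)), mindspore.float32
)

outputs = model(input_ids=input_ids, bbox=bbox)
print(outputs.last_hidden_state.shape)     # (1, 6, 768)
```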

mindnlp.transformers.models.bros.modeling_bros.BrosModel.__init__(config, add_pooling_layer=True)

Initializes the BrosModel.

PARAMETER DESCRIPTION
self

The instance of the BrosModel class.

TYPE: BrosModel

config

The configuration object containing model parameters and settings.

TYPE: object

add_pooling_layer

A flag indicating whether to include a pooling layer in the model.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION

None.

Source code in mindnlp/transformers/models/bros/modeling_bros.py
def __init__(self, config, add_pooling_layer=True):
    """
    Initializes the BrosModel.

    Args:
        self (BrosModel): The instance of the BrosModel class.
        config (object): The configuration object containing model parameters and settings.
        add_pooling_layer (bool): A flag indicating whether to include a pooling layer in the model.

    Returns:
        None.

    Raises:
        None
    """
    super().__init__(config)
    self.config = config

    self.embeddings = BrosTextEmbeddings(config)
    self.bbox_embeddings = BrosBboxEmbeddings(config)
    self.encoder = BrosEncoder(config)

    self.pooler = BrosPooler(config) if add_pooling_layer else None

    self.init_weights()

mindnlp.transformers.models.bros.modeling_bros.BrosModel.forward(input_ids=None, bbox=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, encoder_hidden_states=None, encoder_attention_mask=None, past_key_values=None, use_cache=None, output_attentions=None, output_hidden_states=None, return_dict=None)

RETURNS DESCRIPTION
Union[Tuple[Tensor], BaseModelOutputWithPoolingAndCrossAttentions]

Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosModel
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
...
>>> model = BrosModel.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)
>>> last_hidden_states = outputs.last_hidden_state
Source code in mindnlp/transformers/models/bros/modeling_bros.py
def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    bbox: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    token_type_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    encoder_hidden_states: Optional[mindspore.Tensor] = None,
    encoder_attention_mask: Optional[mindspore.Tensor] = None,
    past_key_values: Optional[List[mindspore.Tensor]] = None,
    use_cache: Optional[bool] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = None,
) -> Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]:
    r"""

    Returns:
        Union[Tuple[mindspore.Tensor], BaseModelOutputWithPoolingAndCrossAttentions]

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosModel
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> model = BrosModel.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        >>> last_hidden_states = outputs.last_hidden_state
        ```
    """
    output_attentions = output_attentions if output_attentions is not None else self.config.output_attentions
    output_hidden_states = (
        output_hidden_states if output_hidden_states is not None else self.config.output_hidden_states
    )
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    if self.config.is_decoder:
        use_cache = use_cache if use_cache is not None else self.config.use_cache
    else:
        use_cache = False

    if input_ids is not None and inputs_embeds is not None:
        raise ValueError("You cannot specify both input_ids and inputs_embeds at the same time")
    elif input_ids is not None:
        input_shape = input_ids.shape
    elif inputs_embeds is not None:
        input_shape = inputs_embeds.shape[:-1]
    else:
        raise ValueError("You have to specify either input_ids or inputs_embeds")

    if bbox is None:
        raise ValueError("You have to specify bbox")

    batch_size, seq_length = input_shape

    # past_key_values_length
    past_key_values_length = past_key_values[0][0].shape[2] if past_key_values is not None else 0

    if attention_mask is None:
        attention_mask = ops.ones(*input_shape)

    if token_type_ids is None:
        if hasattr(self.embeddings, "token_type_ids"):
            buffered_token_type_ids = self.embeddings.token_type_ids[:, :seq_length]
            buffered_token_type_ids_expanded = ops.broadcast_to(buffered_token_type_ids, (batch_size, seq_length))
            token_type_ids = buffered_token_type_ids_expanded
        else:
            token_type_ids = ops.zeros(*input_shape, dtype=mindspore.int64)

    # We can provide a self-attention mask of dimensions [batch_size, from_seq_length, to_seq_length]
    # ourselves in which case we just need to make it broadcastable to all heads.
    extended_attention_mask: mindspore.Tensor = self.get_extended_attention_mask(attention_mask, input_shape)

    # If a 2D or 3D attention mask is provided for the cross-attention
    # we need to make broadcastable to [batch_size, num_heads, seq_length, seq_length]
    if self.config.is_decoder and encoder_hidden_states is not None:
        encoder_batch_size, encoder_sequence_length, _ = encoder_hidden_states.shape
        encoder_hidden_shape = (encoder_batch_size, encoder_sequence_length)
        if encoder_attention_mask is None:
            encoder_attention_mask = ops.ones(*encoder_hidden_shape)
        encoder_extended_attention_mask = self.invert_attention_mask(encoder_attention_mask)
    else:
        encoder_extended_attention_mask = None

    # Prepare head mask if needed
    # 1.0 in head_mask indicate we keep the head
    # attention_probs has shape bsz x n_heads x N x N
    # input head_mask has shape [num_heads] or [num_hidden_layers x num_heads]
    # and head_mask is converted to shape [num_hidden_layers x batch x num_heads x seq_length x seq_length]
    head_mask = self.get_head_mask(head_mask, self.config.num_hidden_layers)

    embedding_output = self.embeddings(
        input_ids=input_ids,
        position_ids=position_ids,
        token_type_ids=token_type_ids,
        inputs_embeds=inputs_embeds,
        past_key_values_length=past_key_values_length,
    )

    # if bbox has 2 points (4 float tensors) per token, convert it to 4 points (8 float tensors) per token
    if bbox.shape[-1] == 4:
        bbox = bbox[:, :, [0, 1, 2, 1, 2, 3, 0, 3]]
    scaled_bbox = bbox * self.config.bbox_scale
    bbox_position_embeddings = self.bbox_embeddings(scaled_bbox)

    encoder_outputs = self.encoder(
        embedding_output,
        bbox_pos_emb=bbox_position_embeddings,
        attention_mask=extended_attention_mask,
        head_mask=head_mask,
        encoder_hidden_states=encoder_hidden_states,
        encoder_attention_mask=encoder_extended_attention_mask,
        past_key_values=past_key_values,
        use_cache=use_cache,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )
    sequence_output = encoder_outputs[0]
    pooled_output = self.pooler(sequence_output) if self.pooler is not None else None

    if not return_dict:
        return (sequence_output, pooled_output) + encoder_outputs[1:]

    return BaseModelOutputWithPoolingAndCrossAttentions(
        last_hidden_state=sequence_output,
        pooler_output=pooled_output,
        past_key_values=encoder_outputs.past_key_values,
        hidden_states=encoder_outputs.hidden_states,
        attentions=encoder_outputs.attentions,
        cross_attentions=encoder_outputs.cross_attentions,
    )
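The index list `[0, 1, 2, 1, 2, 3, 0, 3]` used above turns a 2-point `(x0, y0, x1, y1)` box into its four corner points before scaling by `bbox_scale`. A small numpy illustration of that step (the input box is made up):

```python
import numpy as np

# hypothetical batch of 1 sequence with 1 token, box in (x0, y0, x1, y1) format
bbox = np.array([[[0.10, 0.20, 0.45, 0.30]]])            # shape (1, 1, 4)
expanded = bbox[:, :, [0, 1, 2, 1, 2, 3, 0, 3]]          # shape (1, 1, 8)
# corners: (x0, y0) (x1, y0) (x1, y1) (x0, y1) -> top-left, top-right, bottom-right, bottom-left
print(expanded)
```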

mindnlp.transformers.models.bros.modeling_bros.BrosModel.get_input_embeddings()

This method retrieves the input embeddings from the BrosModel class.

PARAMETER DESCRIPTION
self

BrosModel instance. The self parameter is a reference to the current instance of the class. It is used to access the attributes and methods of the class within the method.

RETURNS DESCRIPTION
The word embedding module (`self.embeddings.word_embeddings`) used to embed the input tokens.

Source code in mindnlp/transformers/models/bros/modeling_bros.py
def get_input_embeddings(self):
    """
    This method retrieves the input embeddings from the BrosModel class.

    Args:
        self: BrosModel instance. The self parameter is a reference to the current instance of the class. 
            It is used to access the attributes and methods of the class within the method.

    Returns:
        None: This method does not return any value explicitly, 
            as it directly returns the input embeddings from the BrosModel class.

    Raises:
        No specific exceptions are documented to be raised by this method.
    """
    return self.embeddings.word_embeddings

mindnlp.transformers.models.bros.modeling_bros.BrosModel.set_input_embeddings(value)

Sets the input embeddings for the BrosModel.

PARAMETER DESCRIPTION
self

The instance of the BrosModel class.

TYPE: BrosModel

value

The input embeddings value to be set for the BrosModel. It should be of the appropriate type and format compatible with the word_embeddings attribute of the embeddings object.

TYPE: object

RETURNS DESCRIPTION

None.

Source code in mindnlp/transformers/models/bros/modeling_bros.py
def set_input_embeddings(self, value):
    """
    Sets the input embeddings for the BrosModel.

    Args:
        self (BrosModel): The instance of the BrosModel class.
        value (object): The input embeddings value to be set for the BrosModel. 
            It should be of the appropriate type and format compatible with the word_embeddings attribute of the embeddings object.

    Returns:
        None.

    Raises:
        None
    """
    self.embeddings.word_embeddings = value

mindnlp.transformers.models.bros.modeling_bros.BrosForTokenClassification

Bases: BrosPreTrainedModel

BrosForTokenClassification is a class for token classification tasks using the Bros model. It inherits from BrosPreTrainedModel and is designed to be used for token classification tasks such as named entity recognition or part-of-speech tagging.

RETURNS DESCRIPTION
TokenClassifierOutput

A data class that holds the outputs of the BrosForTokenClassification model.

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosForTokenClassification
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
...
>>> model = BrosForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)
Source code in mindnlp/transformers/models/bros/modeling_bros.py
class BrosForTokenClassification(BrosPreTrainedModel):

    """
    BrosForTokenClassification is a class for token classification tasks using the Bros model.
    It inherits from BrosPreTrainedModel and is designed to be used for token classification tasks such as named
    entity recognition or part-of-speech tagging.

    Returns:
        TokenClassifierOutput: A data class that holds the outputs of the BrosForTokenClassification model.

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosForTokenClassification
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> model = BrosForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        ```
    """
    _keys_to_ignore_on_load_unexpected = [r"pooler"]

    def __init__(self, config):
        """
        Initializes an instance of the BrosForTokenClassification class.

        Args:
            self (BrosForTokenClassification): The object itself.
            config (BrosConfig): The configuration object containing various settings.

        Returns:
            None

        Raises:
            None
        """
        super().__init__(config)
        self.num_labels = config.num_labels

        self.bros = BrosModel(config)
        classifier_dropout = (
            config.classifier_dropout if hasattr(config, "classifier_dropout") else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(p=classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        self.init_weights()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        bbox: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        bbox_first_token_mask: Optional[mindspore.Tensor] = None,
        token_type_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        labels: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple[mindspore.Tensor], TokenClassifierOutput]:
        r"""

        Returns:
            `Union[Tuple[mindspore.Tensor], TokenClassifierOutput]`

        Example:
            ```python
            >>> import torch
            >>> from transformers import BrosProcessor, BrosForTokenClassification
            ...
            >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> model = BrosForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
            >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
            >>> encoding["bbox"] = bbox
            ...
            >>> outputs = model(**encoding)
            ```
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bros(
            input_ids,
            bbox=bbox,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        sequence_output = outputs[0]

        sequence_output = self.dropout(sequence_output)
        logits = self.classifier(sequence_output)

        loss = None
        if labels is not None:
            if bbox_first_token_mask is not None:
                bbox_first_token_mask = bbox_first_token_mask.view(-1)
                loss = F.cross_entropy(
                    logits.view(-1, self.num_labels)[bbox_first_token_mask], labels.view(-1)[bbox_first_token_mask]
                )
            else:
                loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return TokenClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )
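When `bbox_first_token_mask` is given, the loss in the forward pass above is computed only on the first sub-token of each word. A numpy sketch of that masking step (the logits, labels and mask are invented for illustration):

```python
import numpy as np

num_labels = 3
logits = np.random.randn(1, 4, num_labels)                     # (batch, seq_len, num_labels)
labels = np.array([[0, 2, 2, 1]])                              # one label per token
bbox_first_token_mask = np.array([[True, False, True, True]])  # keep only first sub-tokens

mask = bbox_first_token_mask.reshape(-1)
masked_logits = logits.reshape(-1, num_labels)[mask]           # (3, num_labels)
masked_labels = labels.reshape(-1)[mask]                       # (3,)
print(masked_logits.shape, masked_labels.shape)
```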

mindnlp.transformers.models.bros.modeling_bros.BrosForTokenClassification.__init__(config)

Initializes an instance of the BrosForTokenClassification class.

PARAMETER DESCRIPTION
self

The object itself.

TYPE: BrosForTokenClassification

config

The configuration object containing various settings.

TYPE: BrosConfig

RETURNS DESCRIPTION

None

Source code in mindnlp/transformers/models/bros/modeling_bros.py
def __init__(self, config):
    """
    Initializes an instance of the BrosForTokenClassification class.

    Args:
        self (BrosForTokenClassification): The object itself.
        config (BrosConfig): The configuration object containing various settings.

    Returns:
        None

    Raises:
        None
    """
    super().__init__(config)
    self.num_labels = config.num_labels

    self.bros = BrosModel(config)
    classifier_dropout = (
        config.classifier_dropout if hasattr(config, "classifier_dropout") else config.hidden_dropout_prob
    )
    self.dropout = nn.Dropout(p=classifier_dropout)
    self.classifier = nn.Linear(config.hidden_size, config.num_labels)

    self.init_weights()

mindnlp.transformers.models.bros.modeling_bros.BrosForTokenClassification.forward(input_ids=None, bbox=None, attention_mask=None, bbox_first_token_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, labels=None, output_attentions=None, output_hidden_states=None, return_dict=None)

RETURNS DESCRIPTION
Union[Tuple[Tensor], TokenClassifierOutput]

Union[Tuple[mindspore.Tensor], TokenClassifierOutput]

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosForTokenClassification
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
...
>>> model = BrosForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)
Source code in mindnlp/transformers/models/bros/modeling_bros.py
def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    bbox: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    bbox_first_token_mask: Optional[mindspore.Tensor] = None,
    token_type_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    labels: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = None,
) -> Union[Tuple[mindspore.Tensor], TokenClassifierOutput]:
    r"""

    Returns:
        `Union[Tuple[mindspore.Tensor], TokenClassifierOutput]`

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosForTokenClassification
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> model = BrosForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        ```
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    outputs = self.bros(
        input_ids,
        bbox=bbox,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    sequence_output = outputs[0]

    sequence_output = self.dropout(sequence_output)
    logits = self.classifier(sequence_output)

    loss = None
    if labels is not None:
        if bbox_first_token_mask is not None:
            bbox_first_token_mask = bbox_first_token_mask.view(-1)
            loss = F.cross_entropy(
                logits.view(-1, self.num_labels)[bbox_first_token_mask], labels.view(-1)[bbox_first_token_mask]
            )
        else:
            loss = F.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))

    if not return_dict:
        output = (logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

    return TokenClassifierOutput(
        loss=loss,
        logits=logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )

mindnlp.transformers.models.bros.modeling_bros.BrosSpadeEEForTokenClassification

Bases: BrosPreTrainedModel

This class represents a BrosSpadeEEForTokenClassification model for token classification tasks. It is a subclass of BrosPreTrainedModel.

The BrosSpadeEEForTokenClassification model consists of a BrosModel backbone and two token classifiers: initial_token_classifier and subsequent_token_classifier. The initial_token_classifier is used to classify the initial tokens in the input sequence, while the subsequent_token_classifier is used to classify the subsequent tokens.

The class provides a 'forward' method that takes various input tensors such as input_ids, bbox, attention_mask, token_type_ids, etc. It returns the predicted initial token logits and subsequent token logits. Optionally, it can also return hidden states and attentions if specified.

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosSpadeEEForTokenClassification
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
>>> model = BrosSpadeEEForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)

Please note that the docstring above is a summary of the class functionality and does not include method signatures or additional details.
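
As a hedged follow-up (an illustrative sketch, not taken from the source), the BrosSpadeOutput returned by the example above can be decoded roughly as follows: initial_token_logits gives a class per token, and subsequent_token_logits gives, for each token, the index of its predicted subsequent token, where an index equal to the sequence length is read here as "no subsequent token".

>>> initial_preds = outputs.initial_token_logits.argmax(dim=-1)        # (batch_size, seq_len)
>>> subsequent_preds = outputs.subsequent_token_logits.argmax(dim=-1)  # (batch_size, seq_len), values in [0, seq_len]
>>> seq_len = encoding["input_ids"].shape[-1]
>>> links = [(i, int(j)) for i, j in enumerate(subsequent_preds[0]) if int(j) != seq_len]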

Source code in mindnlp/transformers/models/bros/modeling_bros.py
class BrosSpadeEEForTokenClassification(BrosPreTrainedModel):

    """
    This class represents a BrosSpadeEEForTokenClassification model for token classification tasks.
    It is a subclass of BrosPreTrainedModel.

    The BrosSpadeEEForTokenClassification model consists of a BrosModel backbone and two token classifiers:
    initial_token_classifier and subsequent_token_classifier. The initial_token_classifier is used to
    classify the initial tokens in the input sequence, while the subsequent_token_classifier is used to classify the subsequent tokens.

    The class provides a 'forward' method that takes various input tensors such as input_ids, bbox, attention_mask,
    token_type_ids, etc. It returns the predicted initial token logits and subsequent token
    logits. Optionally, it can also return hidden states and attentions if specified.

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosSpadeEEForTokenClassification
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        >>> model = BrosSpadeEEForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        ```

    Please note that the docstring above is a summary of the class functionality and does not include method signatures or additional details.
    """
    _keys_to_ignore_on_load_unexpected = [r"pooler"]

    def __init__(self, config):
        """
        Initializes a BrosSpadeEEForTokenClassification instance.

        Args:
            self: The instance of the class.
            config:
                A configuration object containing the model configuration parameters.

                - Type: object
                - Purpose: The configuration for initializing the model.
                - Restrictions: Must contain the attributes 'num_labels', 'n_relations', 'hidden_size' and 'hidden_dropout_prob'; 'classifier_dropout' is used instead of 'hidden_dropout_prob' when present.

        Returns:
            None.

        Raises:
            AttributeError: If the 'config' object does not contain the required attributes.
            ValueError: If the 'config' attributes have invalid values or types.
            TypeError: If the 'config' parameter is not of type object.
        """
        super().__init__(config)
        self.config = config
        self.num_labels = config.num_labels
        self.n_relations = config.n_relations
        self.backbone_hidden_size = config.hidden_size

        self.bros = BrosModel(config)
        classifier_dropout = (
            config.classifier_dropout if hasattr(config, "classifier_dropout") else config.hidden_dropout_prob
        )

        # Initial token classification for Entity Extraction (NER)
        self.initial_token_classifier = nn.Sequential(
            nn.Dropout(p=classifier_dropout),
            nn.Linear(config.hidden_size, config.hidden_size),
            nn.Dropout(p=classifier_dropout),
            nn.Linear(config.hidden_size, config.num_labels),
        )

        # Subsequent token classification for Entity Extraction (NER)
        self.subsequent_token_classifier = BrosRelationExtractor(config)

        self.init_weights()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        bbox: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        bbox_first_token_mask: Optional[mindspore.Tensor] = None,
        token_type_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        initial_token_labels: Optional[mindspore.Tensor] = None,
        subsequent_token_labels: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple[mindspore.Tensor], BrosSpadeOutput]:
        r"""

        Returns:
            Union[Tuple[mindspore.Tensor], BrosSpadeOutput]

        Example:
            ```python
            >>> import torch
            >>> from transformers import BrosProcessor, BrosSpadeEEForTokenClassification
            ...
            >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> model = BrosSpadeEEForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
            >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
            >>> encoding["bbox"] = bbox
            ...
            >>> outputs = model(**encoding)
            ```
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bros(
            input_ids=input_ids,
            bbox=bbox,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        last_hidden_states = outputs[0]
        last_hidden_states = last_hidden_states.swapaxes(0, 1)
        initial_token_logits = self.initial_token_classifier(last_hidden_states).swapaxes(0, 1)
        subsequent_token_logits = self.subsequent_token_classifier(last_hidden_states, last_hidden_states).squeeze(0)

        # make subsequent token (sequence token classification) mask
        inv_attention_mask = 1 - attention_mask
        batch_size, max_seq_length = inv_attention_mask.shape
        invalid_token_mask = ops.cat([inv_attention_mask, ops.zeros(batch_size, 1).astype(inv_attention_mask.dtype)], dim=1).bool()
        subsequent_token_logits = subsequent_token_logits.masked_fill(
            invalid_token_mask[:, None, :], float(ops.finfo(subsequent_token_logits.dtype).min)
        )
        self_token_mask = ops.eye(max_seq_length, max_seq_length + 1).bool()
        subsequent_token_logits = subsequent_token_logits.masked_fill(
            self_token_mask[None, :, :], float(ops.finfo(subsequent_token_logits.dtype).min)
        )
        subsequent_token_mask = attention_mask.view(-1).bool()

        loss = None
        if initial_token_labels is not None and subsequent_token_labels is not None:
            # get initial token loss
            initial_token_labels = initial_token_labels.view(-1)
            if bbox_first_token_mask is not None:
                bbox_first_token_mask = bbox_first_token_mask.view(-1)
                initial_token_loss = F.cross_entropy(
                    initial_token_logits.view(-1, self.num_labels)[bbox_first_token_mask],
                    initial_token_labels[bbox_first_token_mask],
                )
            else:
                initial_token_loss = F.cross_entropy(initial_token_logits.view(-1, self.num_labels), initial_token_labels)

            subsequent_token_labels = subsequent_token_labels.view(-1)
            subsequent_token_loss = F.cross_entropy(
                subsequent_token_logits.view(-1, max_seq_length + 1)[subsequent_token_mask],
                subsequent_token_labels[subsequent_token_mask],
            )

            loss = initial_token_loss + subsequent_token_loss

        if not return_dict:
            output = (initial_token_logits, subsequent_token_logits) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return BrosSpadeOutput(
            loss=loss,
            initial_token_logits=initial_token_logits,
            subsequent_token_logits=subsequent_token_logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

mindnlp.transformers.models.bros.modeling_bros.BrosSpadeEEForTokenClassification.__init__(config)

Initializes a BrosSpadeEEForTokenClassification instance.

PARAMETER DESCRIPTION
self

The instance of the class.

config

A configuration object containing the model configuration parameters.

  • Type: object
  • Purpose: The configuration for initializing the model.
  • Restrictions: Must contain the attributes 'num_labels', 'n_relations', 'hidden_size' and 'hidden_dropout_prob'; 'classifier_dropout' is used instead of 'hidden_dropout_prob' when present.

RETURNS DESCRIPTION

None.

RAISES DESCRIPTION
AttributeError

If the 'config' object does not contain the required attributes.

ValueError

If the 'config' attributes have invalid values or types.

TypeError

If the 'config' parameter is not of type object.

Source code in mindnlp/transformers/models/bros/modeling_bros.py
def __init__(self, config):
    """
    Initializes a BrosSpadeEEForTokenClassification instance.

    Args:
        self: The instance of the class.
        config:
            A configuration object containing the model configuration parameters.

            - Type: object
            - Purpose: The configuration for initializing the model.
            - Restrictions: Must contain the attributes 'num_labels', 'n_relations', 'hidden_size' and 'hidden_dropout_prob'; 'classifier_dropout' is used instead of 'hidden_dropout_prob' when present.

    Returns:
        None.

    Raises:
        AttributeError: If the 'config' object does not contain the required attributes.
        ValueError: If the 'config' attributes have invalid values or types.
        TypeError: If the 'config' parameter is not of type object.
    """
    super().__init__(config)
    self.config = config
    self.num_labels = config.num_labels
    self.n_relations = config.n_relations
    self.backbone_hidden_size = config.hidden_size

    self.bros = BrosModel(config)
    classifier_dropout = (
        config.classifier_dropout if hasattr(config, "classifier_dropout") else config.hidden_dropout_prob
    )

    # Initial token classification for Entity Extraction (NER)
    self.initial_token_classifier = nn.Sequential(
        nn.Dropout(p=classifier_dropout),
        nn.Linear(config.hidden_size, config.hidden_size),
        nn.Dropout(p=classifier_dropout),
        nn.Linear(config.hidden_size, config.num_labels),
    )

    # Subsequent token classification for Entity Extraction (NER)
    self.subsequent_token_classifier = BrosRelationExtractor(config)

    self.init_weights()

mindnlp.transformers.models.bros.modeling_bros.BrosSpadeEEForTokenClassification.forward(input_ids=None, bbox=None, attention_mask=None, bbox_first_token_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, initial_token_labels=None, subsequent_token_labels=None, output_attentions=None, output_hidden_states=None, return_dict=None)

RETURNS DESCRIPTION
Union[Tuple[Tensor], BrosSpadeOutput]

Union[Tuple[mindspore.Tensor], BrosSpadeOutput]

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosSpadeEEForTokenClassification
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
...
>>> model = BrosSpadeEEForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)
Source code in mindnlp/transformers/models/bros/modeling_bros.py
def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    bbox: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    bbox_first_token_mask: Optional[mindspore.Tensor] = None,
    token_type_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    initial_token_labels: Optional[mindspore.Tensor] = None,
    subsequent_token_labels: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = None,
) -> Union[Tuple[mindspore.Tensor], BrosSpadeOutput]:
    r"""

    Returns:
        Union[Tuple[mindspore.Tensor], BrosSpadeOutput]

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosSpadeEEForTokenClassification
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> model = BrosSpadeEEForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        ```
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    outputs = self.bros(
        input_ids=input_ids,
        bbox=bbox,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    last_hidden_states = outputs[0]
    last_hidden_states = last_hidden_states.swapaxes(0, 1)
    initial_token_logits = self.initial_token_classifier(last_hidden_states).swapaxes(0, 1)
    subsequent_token_logits = self.subsequent_token_classifier(last_hidden_states, last_hidden_states).squeeze(0)

    # make subsequent token (sequence token classification) mask
    inv_attention_mask = 1 - attention_mask
    batch_size, max_seq_length = inv_attention_mask.shape
    invalid_token_mask = ops.cat([inv_attention_mask, ops.zeros(batch_size, 1).astype(inv_attention_mask.dtype)], dim=1).bool()
    subsequent_token_logits = subsequent_token_logits.masked_fill(
        invalid_token_mask[:, None, :], float(ops.finfo(subsequent_token_logits.dtype).min)
    )
    self_token_mask = ops.eye(max_seq_length, max_seq_length + 1).bool()
    subsequent_token_logits = subsequent_token_logits.masked_fill(
        self_token_mask[None, :, :], float(ops.finfo(subsequent_token_logits.dtype).min)
    )
    subsequent_token_mask = attention_mask.view(-1).bool()

    loss = None
    if initial_token_labels is not None and subsequent_token_labels is not None:
        # get initial token loss
        initial_token_labels = initial_token_labels.view(-1)
        if bbox_first_token_mask is not None:
            bbox_first_token_mask = bbox_first_token_mask.view(-1)
            initial_token_loss = F.cross_entropy(
                initial_token_logits.view(-1, self.num_labels)[bbox_first_token_mask],
                initial_token_labels[bbox_first_token_mask],
            )
        else:
            initial_token_loss = F.cross_entropy(initial_token_logits.view(-1, self.num_labels), initial_token_labels)

        subsequent_token_labels = subsequent_token_labels.view(-1)
        subsequent_token_loss = F.cross_entropy(
            subsequent_token_logits.view(-1, max_seq_length + 1)[subsequent_token_mask],
            subsequent_token_labels[subsequent_token_mask],
        )

        loss = initial_token_loss + subsequent_token_loss

    if not return_dict:
        output = (initial_token_logits, subsequent_token_logits) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

    return BrosSpadeOutput(
        loss=loss,
        initial_token_logits=initial_token_logits,
        subsequent_token_logits=subsequent_token_logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )

mindnlp.transformers.models.bros.modeling_bros.BrosSpadeELForTokenClassification

Bases: BrosPreTrainedModel

This class represents a Bros Spade Entity Linking model for token classification.

The BrosSpadeELForTokenClassification class is a subclass of BrosPreTrainedModel used for token classification tasks. It defines its own __init__ and forward methods on top of that base class.

ATTRIBUTE DESCRIPTION
config

The configuration object used to initialize the model.

num_labels

The number of labels for token classification.

n_relations

The number of relations used in the model.

backbone_hidden_size

The hidden size of the model's backbone.

bros

An instance of the BrosModel class.

entity_linker

An instance of the BrosRelationExtractor class.

METHOD DESCRIPTION
__init__

Initializes the BrosSpadeELForTokenClassification object with the given config.

forward

Constructs the model and performs token classification.

RETURNS DESCRIPTION

Conditional returns:

  • If return_dict is False:

    • A tuple containing the logits and other model outputs.
  • If return_dict is True:

    • An instance of the TokenClassifierOutput class containing the loss, logits, hidden states, and attentions.
Example
>>> import torch
>>> from transformers import BrosProcessor, BrosSpadeELForTokenClassification
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
...
>>> model = BrosSpadeELForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)
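
As a hedged sketch (not from the original docstring), the entity-linking logits returned above have shape (batch_size, seq_len, seq_len + 1); taking an argmax over the last dimension gives, for each token, the index of the token it is predicted to link to, with an index of seq_len read here as "no link".

>>> seq_len = encoding["input_ids"].shape[-1]
>>> link_preds = outputs.logits.argmax(dim=-1)  # (batch_size, seq_len)
>>> pairs = [(i, int(j)) for i, j in enumerate(link_preds[0]) if int(j) != seq_len]
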
Source code in mindnlp/transformers/models/bros/modeling_bros.py
class BrosSpadeELForTokenClassification(BrosPreTrainedModel):

    """
    This class represents a Bros Spade Entity Linking model for token classification.

    The BrosSpadeELForTokenClassification class is a subclass of BrosPreTrainedModel used for token classification
    tasks. It defines its own __init__ and forward methods on top of that base class.

    Attributes:
        config: The configuration object used to initialize the model.
        num_labels: The number of labels for token classification.
        n_relations: The number of relations used in the model.
        backbone_hidden_size: The hidden size of the model's backbone.
        bros: An instance of the BrosModel class.
        entity_linker: An instance of the BrosRelationExtractor class.

    Methods:
        __init__(self, config): Initializes the BrosSpadeELForTokenClassification object with the given config.
        forward: Constructs the model and performs token classification.

    Returns:
        Conditional returns:

            - If return_dict is False:

                - A tuple containing the logits and other model outputs.

            - If return_dict is True:

                - An instance of the TokenClassifierOutput class containing the loss, logits, hidden states, and attentions.

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosSpadeELForTokenClassification
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> model = BrosSpadeELForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        ```
    """
    _keys_to_ignore_on_load_unexpected = [r"pooler"]

    def __init__(self, config):
        """
        Initializes an instance of the BrosSpadeELForTokenClassification class.

        Args:
            self: The object instance.
            config: An instance of the BrosConfig class containing the configuration parameters.
                It should have the following attributes:

                - num_labels (int): The number of possible labels for token classification.
                - n_relations (int): The number of possible relations.
                - hidden_size (int): The hidden size of the model.

        Returns:
            None.

        Raises:
            None.

        Note:
            This method initializes the BrosSpadeELForTokenClassification instance by setting the provided configuration parameters.
            It also initializes the bros model and the entity linker for relation extraction.
            The method init_weights() is called to initialize the weights of the model.
        """
        super().__init__(config)
        self.config = config
        self.num_labels = config.num_labels
        self.n_relations = config.n_relations
        self.backbone_hidden_size = config.hidden_size

        self.bros = BrosModel(config)
        # (config.classifier_dropout if hasattr(config, "classifier_dropout") else config.hidden_dropout_prob)

        self.entity_linker = BrosRelationExtractor(config)

        self.init_weights()

    def forward(
        self,
        input_ids: Optional[mindspore.Tensor] = None,
        bbox: Optional[mindspore.Tensor] = None,
        attention_mask: Optional[mindspore.Tensor] = None,
        bbox_first_token_mask: Optional[mindspore.Tensor] = None,
        token_type_ids: Optional[mindspore.Tensor] = None,
        position_ids: Optional[mindspore.Tensor] = None,
        head_mask: Optional[mindspore.Tensor] = None,
        inputs_embeds: Optional[mindspore.Tensor] = None,
        labels: Optional[mindspore.Tensor] = None,
        output_attentions: Optional[bool] = None,
        output_hidden_states: Optional[bool] = None,
        return_dict: Optional[bool] = None,
    ) -> Union[Tuple[mindspore.Tensor], TokenClassifierOutput]:
        r"""
        Returns:
            Union[Tuple[mindspore.Tensor], TokenClassifierOutput]

        Example:
            ```python
            >>> import torch
            >>> from transformers import BrosProcessor, BrosSpadeELForTokenClassification
            ...
            >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> model = BrosSpadeELForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
            ...
            >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
            >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
            >>> encoding["bbox"] = bbox
            ...
            >>> outputs = model(**encoding)
            ```
        """
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bros(
            input_ids=input_ids,
            bbox=bbox,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
            position_ids=position_ids,
            head_mask=head_mask,
            inputs_embeds=inputs_embeds,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )

        last_hidden_states = outputs[0]
        last_hidden_states = last_hidden_states.swapaxes(0, 1)

        logits = self.entity_linker(last_hidden_states, last_hidden_states).squeeze(0)

        loss = None
        if labels is not None:
            batch_size, max_seq_length = attention_mask.shape

            self_token_mask = ops.eye(max_seq_length, max_seq_length + 1).bool()

            mask = bbox_first_token_mask.view(-1)
            bbox_first_token_mask = ops.cat(
                [
                    ~bbox_first_token_mask,
                    ops.zeros(batch_size, 1, dtype=mindspore.bool_),
                ],
                dim=1,
            )
            logits = logits.masked_fill(bbox_first_token_mask[:, None, :], float(ops.finfo(logits.dtype).min))
            logits = logits.masked_fill(self_token_mask[None, :, :], float(ops.finfo(logits.dtype).min))

            loss = F.cross_entropy(logits.view(-1, max_seq_length + 1)[mask], labels.view(-1)[mask])

        if not return_dict:
            output = (logits,) + outputs[2:]
            return ((loss,) + output) if loss is not None else output

        return TokenClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )

mindnlp.transformers.models.bros.modeling_bros.BrosSpadeELForTokenClassification.__init__(config)

Initializes an instance of the BrosSpadeELForTokenClassification class.

PARAMETER DESCRIPTION
self

The object instance.

config

An instance of the BrosConfig class containing the configuration parameters. It should have the following attributes:

  • num_labels (int): The number of possible labels for token classification.
  • n_relations (int): The number of possible relations.
  • hidden_size (int): The hidden size of the model.

RETURNS DESCRIPTION

None.

Note

This method initializes the BrosSpadeELForTokenClassification instance by setting the provided configuration parameters. It also initializes the bros model and the entity linker for relation extraction. The method init_weights() is called to initialize the weights of the model.

Source code in mindnlp/transformers/models/bros/modeling_bros.py
def __init__(self, config):
    """
    Initializes an instance of the BrosSpadeELForTokenClassification class.

    Args:
        self: The object instance.
        config: An instance of the BrosConfig class containing the configuration parameters.
            It should have the following attributes:

            - num_labels (int): The number of possible labels for token classification.
            - n_relations (int): The number of possible relations.
            - hidden_size (int): The hidden size of the model.

    Returns:
        None.

    Raises:
        None.

    Note:
        This method initializes the BrosSpadeELForTokenClassification instance by setting the provided configuration parameters.
        It also initializes the bros model and the entity linker for relation extraction.
        The method init_weights() is called to initialize the weights of the model.
    """
    super().__init__(config)
    self.config = config
    self.num_labels = config.num_labels
    self.n_relations = config.n_relations
    self.backbone_hidden_size = config.hidden_size

    self.bros = BrosModel(config)
    # (config.classifier_dropout if hasattr(config, "classifier_dropout") else config.hidden_dropout_prob)

    self.entity_linker = BrosRelationExtractor(config)

    self.init_weights()

mindnlp.transformers.models.bros.modeling_bros.BrosSpadeELForTokenClassification.forward(input_ids=None, bbox=None, attention_mask=None, bbox_first_token_mask=None, token_type_ids=None, position_ids=None, head_mask=None, inputs_embeds=None, labels=None, output_attentions=None, output_hidden_states=None, return_dict=None)

RETURNS DESCRIPTION
Union[Tuple[Tensor], TokenClassifierOutput]

Union[Tuple[mindspore.Tensor], TokenClassifierOutput]

Example
>>> import torch
>>> from transformers import BrosProcessor, BrosSpadeELForTokenClassification
...
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
...
>>> model = BrosSpadeELForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
...
>>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
>>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
>>> encoding["bbox"] = bbox
...
>>> outputs = model(**encoding)
Source code in mindnlp/transformers/models/bros/modeling_bros.py
def forward(
    self,
    input_ids: Optional[mindspore.Tensor] = None,
    bbox: Optional[mindspore.Tensor] = None,
    attention_mask: Optional[mindspore.Tensor] = None,
    bbox_first_token_mask: Optional[mindspore.Tensor] = None,
    token_type_ids: Optional[mindspore.Tensor] = None,
    position_ids: Optional[mindspore.Tensor] = None,
    head_mask: Optional[mindspore.Tensor] = None,
    inputs_embeds: Optional[mindspore.Tensor] = None,
    labels: Optional[mindspore.Tensor] = None,
    output_attentions: Optional[bool] = None,
    output_hidden_states: Optional[bool] = None,
    return_dict: Optional[bool] = None,
) -> Union[Tuple[mindspore.Tensor], TokenClassifierOutput]:
    r"""
    Returns:
        Union[Tuple[mindspore.Tensor], TokenClassifierOutput]

    Example:
        ```python
        >>> import torch
        >>> from transformers import BrosProcessor, BrosSpadeELForTokenClassification
        ...
        >>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> model = BrosSpadeELForTokenClassification.from_pretrained("jinho8345/bros-base-uncased")
        ...
        >>> encoding = processor("Hello, my dog is cute", add_special_tokens=False, return_tensors="pt")
        >>> bbox = torch.tensor([[[0, 0, 1, 1]]]).repeat(1, encoding["input_ids"].shape[-1], 1)
        >>> encoding["bbox"] = bbox
        ...
        >>> outputs = model(**encoding)
        ```
    """
    return_dict = return_dict if return_dict is not None else self.config.use_return_dict

    outputs = self.bros(
        input_ids=input_ids,
        bbox=bbox,
        attention_mask=attention_mask,
        token_type_ids=token_type_ids,
        position_ids=position_ids,
        head_mask=head_mask,
        inputs_embeds=inputs_embeds,
        output_attentions=output_attentions,
        output_hidden_states=output_hidden_states,
        return_dict=return_dict,
    )

    last_hidden_states = outputs[0]
    last_hidden_states = last_hidden_states.swapaxes(0, 1)

    logits = self.entity_linker(last_hidden_states, last_hidden_states).squeeze(0)

    loss = None
    if labels is not None:
        batch_size, max_seq_length = attention_mask.shape

        self_token_mask = ops.eye(max_seq_length, max_seq_length + 1).bool()

        mask = bbox_first_token_mask.view(-1)
        bbox_first_token_mask = ops.cat(
            [
                ~bbox_first_token_mask,
                ops.zeros(batch_size, 1, dtype=mindspore.bool_),
            ],
            dim=1,
        )
        logits = logits.masked_fill(bbox_first_token_mask[:, None, :], float(ops.finfo(logits.dtype).min))
        logits = logits.masked_fill(self_token_mask[None, :, :], float(ops.finfo(logits.dtype).min))

        loss = F.cross_entropy(logits.view(-1, max_seq_length + 1)[mask], labels.view(-1)[mask])

    if not return_dict:
        output = (logits,) + outputs[2:]
        return ((loss,) + output) if loss is not None else output

    return TokenClassifierOutput(
        loss=loss,
        logits=logits,
        hidden_states=outputs.hidden_states,
        attentions=outputs.attentions,
    )

mindnlp.transformers.models.bros.processing_bros.BrosProcessor

Bases: ProcessorMixin

Constructs a Bros processor which wraps a BERT tokenizer.

[BrosProcessor] offers all the functionalities of [BertTokenizerFast]. See the docstring of [~BrosProcessor.__call__] and [~BrosProcessor.decode] for more information.

PARAMETER DESCRIPTION
tokenizer

An instance of [BertTokenizerFast]. The tokenizer is a required input.

TYPE: `BertTokenizerFast`, *optional* DEFAULT: None
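
A short usage sketch (the text and printed output are illustrative): because the processor only wraps the tokenizer, encoding and decoding are simple round-trips, while bounding boxes must be attached to the encoding separately, as in the model examples above.

>>> from transformers import BrosProcessor
>>> processor = BrosProcessor.from_pretrained("jinho8345/bros-base-uncased")
>>> encoding = processor("Hello, my dog is cute", return_tensors="pt")
>>> processor.decode(encoding["input_ids"][0], skip_special_tokens=True)
'hello, my dog is cute'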

Source code in mindnlp/transformers/models/bros/processing_bros.py
class BrosProcessor(ProcessorMixin):
    r"""
    Constructs a Bros processor which wraps a BERT tokenizer.

    [`BrosProcessor`] offers all the functionalities of [`BertTokenizerFast`]. See the docstring of
    [`~BrosProcessor.__call__`] and [`~BrosProcessor.decode`] for more information.

    Args:
        tokenizer (`BertTokenizerFast`, *optional*):
            An instance of [`BertTokenizerFast`]. The tokenizer is a required input.
    """
    attributes = ["tokenizer"]
    tokenizer_class = ("BertTokenizer", "BertTokenizerFast")

    def __init__(self, tokenizer=None, **kwargs):
        """
        Initializes an instance of the BrosProcessor class.

        Args:
            self: The instance of the BrosProcessor class.
            tokenizer: The tokenizer used to prepare inputs. Although it defaults to None, a tokenizer must be provided; otherwise a ValueError is raised.

        Returns:
            None

        Raises:
            ValueError: If the tokenizer parameter is not specified.
        """
        if tokenizer is None:
            raise ValueError("You need to specify a `tokenizer`.")

        super().__init__(tokenizer)

    def __call__(
        self,
        text: Union[TextInput, PreTokenizedInput, List[TextInput], List[PreTokenizedInput]] = None,
        add_special_tokens: bool = True,
        padding: Union[bool, str, PaddingStrategy] = False,
        truncation: Union[bool, str, TruncationStrategy] = None,
        max_length: Optional[int] = None,
        stride: int = 0,
        pad_to_multiple_of: Optional[int] = None,
        return_token_type_ids: Optional[bool] = None,
        return_attention_mask: Optional[bool] = None,
        return_overflowing_tokens: bool = False,
        return_special_tokens_mask: bool = False,
        return_offsets_mapping: bool = False,
        return_length: bool = False,
        verbose: bool = True,
        return_tensors: Optional[Union[str, TensorType]] = None,
        **kwargs,
    ) -> BatchEncoding:
        """
        This method uses [`BertTokenizerFast.__call__`] to prepare text for the model.

        Please refer to the docstring of that method for more information.
        """
        encoding = self.tokenizer(
            text=text,
            add_special_tokens=add_special_tokens,
            padding=padding,
            truncation=truncation,
            max_length=max_length,
            stride=stride,
            pad_to_multiple_of=pad_to_multiple_of,
            return_token_type_ids=return_token_type_ids,
            return_attention_mask=return_attention_mask,
            return_overflowing_tokens=return_overflowing_tokens,
            return_special_tokens_mask=return_special_tokens_mask,
            return_offsets_mapping=return_offsets_mapping,
            return_length=return_length,
            verbose=verbose,
            return_tensors=return_tensors,
            **kwargs,
        )

        return encoding

    def batch_decode(self, *args, **kwargs):
        """
        This method forwards all its arguments to BertTokenizerFast's [`~PreTrainedTokenizer.batch_decode`]. Please
        refer to the docstring of this method for more information.
        """
        return self.tokenizer.batch_decode(*args, **kwargs)

    def decode(self, *args, **kwargs):
        """
        This method forwards all its arguments to BertTokenizerFast's [`~PreTrainedTokenizer.decode`]. Please refer to
        the docstring of this method for more information.
        """
        return self.tokenizer.decode(*args, **kwargs)

    @property
    def model_input_names(self):
        """
        This method returns a list of unique model input names used by the BrosProcessor's tokenizer.

        Args:
            self: BrosProcessor instance.
                The self parameter refers to the current BrosProcessor object.

        Returns:
            list: The tokenizer's model input names with duplicates removed, preserving order.

        Raises:
            No exceptions are raised within this method.
        """
        tokenizer_input_names = self.tokenizer.model_input_names
        return list(dict.fromkeys(tokenizer_input_names))

mindnlp.transformers.models.bros.processing_bros.BrosProcessor.model_input_names property

This method returns a list of unique model input names used by the BrosProcessor's tokenizer.

PARAMETER DESCRIPTION
self

BrosProcessor instance. The self parameter refers to the current BrosProcessor object.

RETURNS DESCRIPTION

list: The tokenizer's model input names with duplicates removed, preserving order.

mindnlp.transformers.models.bros.processing_bros.BrosProcessor.__call__(text=None, add_special_tokens=True, padding=False, truncation=None, max_length=None, stride=0, pad_to_multiple_of=None, return_token_type_ids=None, return_attention_mask=None, return_overflowing_tokens=False, return_special_tokens_mask=False, return_offsets_mapping=False, return_length=False, verbose=True, return_tensors=None, **kwargs)

This method uses [BertTokenizerFast.__call__] to prepare text for the model.

Please refer to the docstring of that method for more information.
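
A brief sketch (argument values are illustrative) of forwarding common tokenizer options through the processor call:

>>> batch = processor(
...     ["Hello, my dog is cute", "A second, slightly longer sentence"],
...     padding="max_length",
...     truncation=True,
...     max_length=16,
...     return_tensors="pt",
... )
>>> batch["input_ids"].shape
torch.Size([2, 16])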

Source code in mindnlp/transformers/models/bros/processing_bros.py
def __call__(
    self,
    text: Union[TextInput, PreTokenizedInput, List[TextInput], List[PreTokenizedInput]] = None,
    add_special_tokens: bool = True,
    padding: Union[bool, str, PaddingStrategy] = False,
    truncation: Union[bool, str, TruncationStrategy] = None,
    max_length: Optional[int] = None,
    stride: int = 0,
    pad_to_multiple_of: Optional[int] = None,
    return_token_type_ids: Optional[bool] = None,
    return_attention_mask: Optional[bool] = None,
    return_overflowing_tokens: bool = False,
    return_special_tokens_mask: bool = False,
    return_offsets_mapping: bool = False,
    return_length: bool = False,
    verbose: bool = True,
    return_tensors: Optional[Union[str, TensorType]] = None,
    **kwargs,
) -> BatchEncoding:
    """
    This method uses [`BertTokenizerFast.__call__`] to prepare text for the model.

    Please refer to the docstring of that method for more information.
    """
    encoding = self.tokenizer(
        text=text,
        add_special_tokens=add_special_tokens,
        padding=padding,
        truncation=truncation,
        max_length=max_length,
        stride=stride,
        pad_to_multiple_of=pad_to_multiple_of,
        return_token_type_ids=return_token_type_ids,
        return_attention_mask=return_attention_mask,
        return_overflowing_tokens=return_overflowing_tokens,
        return_special_tokens_mask=return_special_tokens_mask,
        return_offsets_mapping=return_offsets_mapping,
        return_length=return_length,
        verbose=verbose,
        return_tensors=return_tensors,
        **kwargs,
    )

    return encoding

mindnlp.transformers.models.bros.processing_bros.BrosProcessor.__init__(tokenizer=None, **kwargs)

Initializes an instance of the BrosProcessor class.

PARAMETER DESCRIPTION
self

The instance of the BrosProcessor class.

tokenizer

The tokenizer used to prepare inputs. Although it defaults to None, a tokenizer must be provided; otherwise a ValueError is raised.

DEFAULT: None

RETURNS DESCRIPTION

None

RAISES DESCRIPTION
ValueError

If the tokenizer parameter is not specified.

Source code in mindnlp/transformers/models/bros/processing_bros.py
def __init__(self, tokenizer=None, **kwargs):
    """
    Initializes an instance of the BrosProcessor class.

    Args:
        self: The instance of the BrosProcessor class.
        tokenizer: The tokenizer used to prepare inputs. Although it defaults to None, a tokenizer must be provided; otherwise a ValueError is raised.

    Returns:
        None

    Raises:
        ValueError: If the tokenizer parameter is not specified.
    """
    if tokenizer is None:
        raise ValueError("You need to specify a `tokenizer`.")

    super().__init__(tokenizer)

mindnlp.transformers.models.bros.processing_bros.BrosProcessor.batch_decode(*args, **kwargs)

This method forwards all its arguments to BertTokenizerFast's [~PreTrainedTokenizer.batch_decode]. Please refer to the docstring of this method for more information.

Source code in mindnlp/transformers/models/bros/processing_bros.py
def batch_decode(self, *args, **kwargs):
    """
    This method forwards all its arguments to BertTokenizerFast's [`~PreTrainedTokenizer.batch_decode`]. Please
    refer to the docstring of this method for more information.
    """
    return self.tokenizer.batch_decode(*args, **kwargs)

mindnlp.transformers.models.bros.processing_bros.BrosProcessor.decode(*args, **kwargs)

This method forwards all its arguments to BertTokenizerFast's [~PreTrainedTokenizer.decode]. Please refer to the docstring of this method for more information.

Source code in mindnlp/transformers/models/bros/processing_bros.py
def decode(self, *args, **kwargs):
    """
    This method forwards all its arguments to BertTokenizerFast's [`~PreTrainedTokenizer.decode`]. Please refer to
    the docstring of this method for more information.
    """
    return self.tokenizer.decode(*args, **kwargs)