faknow.model.content_based.multi_modal

faknow.model.content_based.multi_modal.cafe

class faknow.model.content_based.multi_modal.cafe.CAFE(feature_dim=96, h_dim=64)[source]

Bases: AbstractModel

CAFE: Cross-modal Ambiguity Learning for Multimodal Fake News Detection, WWW 2022
paper: https://dl.acm.org/doi/10.1145/3485447.3511968
code: https://github.com/cyxanna/CAFE

__init__(feature_dim=96, h_dim=64)[source]
Parameters:
  • feature_dim (int) – feature dimension. Default=96

  • h_dim (int) – hidden dimension. Default=64

calculate_loss(data: Dict[str, Tensor]) Tensor[source]

process raw data using similarity_module, then calculate loss via CrossEntropyLoss

Parameters:

data (Dict[str, Tensor]) – batch data dict, including text, image and label

Returns:

loss

Return type:

torch.Tensor

forward(text: Tensor, image: Tensor) Tensor[source]
Parameters:
  • text (Tensor) – the raw text, shape=(batch_size, 30, 200)

  • image (Tensor) – the raw image, shape=(batch_size, 512)

Returns:

prediction of being fake news, shape=(batch_size, 2)

Return type:

Tensor

predict(data: Dict[str, Tensor]) Tensor[source]
Parameters:

data (Dict[str, Tensor]) – batch data dict, including text, image and label

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor
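
The softmax step that turns the two raw scores per sample into probabilities can be sketched in plain Python, without PyTorch. This only illustrates the arithmetic behind a (batch_size, 2) probability output; the helper name softmax_rows and the sample scores are hypothetical and not part of FaKnow.

```python
import math
from typing import List


def softmax_rows(logits: List[List[float]]) -> List[List[float]]:
    """Apply a numerically stable softmax to each row of a (batch_size, 2) list."""
    probs = []
    for row in logits:
        shift = max(row)  # subtract the row maximum to avoid overflow in exp
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


# Hypothetical raw scores for a batch of two samples.
batch_logits = [[0.0, 0.0], [2.0, -1.0]]
batch_probs = softmax_rows(batch_logits)
```

Each row of batch_probs sums to 1, so the two entries can be read as the probabilities of the two classes.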

training: bool

faknow.model.content_based.multi_modal.eann

class faknow.model.content_based.multi_modal.eann.EANN(event_num: int, embed_weight: Tensor, reverse_lambda=1.0, hidden_size=32)[source]

Bases: AbstractModel

EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection, KDD 2018
paper: https://dl.acm.org/doi/abs/10.1145/3219819.3219903
code: https://github.com/yaqingwang/EANN-KDD18

__init__(event_num: int, embed_weight: Tensor, reverse_lambda=1.0, hidden_size=32)[source]
Parameters:
  • event_num (int) – number of events

  • embed_weight (Tensor) – weights for word embedding layer, shape=(vocab_size, embedding_size)

  • reverse_lambda (float) – lambda for gradient reverse layer. Default=1

  • hidden_size (int) – size for hidden layers. Default=32
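
The role of reverse_lambda can be illustrated with a minimal, framework-free sketch of a gradient reversal layer: the forward pass is the identity, and the backward pass multiplies incoming gradients by -reverse_lambda. The class below is hypothetical and only shows the general idea; it is not FaKnow's implementation, which would normally rely on PyTorch autograd.

```python
from typing import List


class GradientReversalSketch:
    """Identity in the forward pass; flips and scales gradients in the backward pass."""

    def __init__(self, reverse_lambda: float = 1.0) -> None:
        self.reverse_lambda = reverse_lambda

    def forward(self, features: List[float]) -> List[float]:
        # Features pass through unchanged.
        return list(features)

    def backward(self, grad_output: List[float]) -> List[float]:
        # Upstream gradients are multiplied by -reverse_lambda.
        return [-self.reverse_lambda * g for g in grad_output]


layer = GradientReversalSketch(reverse_lambda=1.0)
features_out = layer.forward([0.2, -0.5])
grads_in = layer.backward([1.0, -2.0])
```

A larger reverse_lambda strengthens the adversarial signal that reaches the layers in front of the reversal point.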

calculate_loss(data: Dict[str, Any]) Dict[str, Tensor][source]

calculate total loss, classification loss and domain loss via CrossEntropyLoss, where total loss = classification loss + domain loss

Parameters:

data (Dict[str, Any]) – batch data dict

Returns:

loss dict, key: total_loss, class_loss, domain_loss

Return type:

Dict[str, Tensor]
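
How the three entries of the loss dict relate can be sketched in plain Python. The function names and sample scores are hypothetical, and the cross-entropy here is a hand-written stand-in for CrossEntropyLoss with mean reduction, not the library code.

```python
import math
from typing import Dict, List


def cross_entropy(logits: List[List[float]], targets: List[int]) -> float:
    """Mean cross-entropy over a batch of raw scores and integer class targets."""
    losses = []
    for row, target in zip(logits, targets):
        shift = max(row)  # stabilise the log-sum-exp
        log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in row))
        losses.append(log_sum_exp - row[target])
    return sum(losses) / len(losses)


def eann_style_losses(class_logits, labels, domain_logits, domains) -> Dict[str, float]:
    """Build a dict with the documented keys; total is the plain sum of both parts."""
    class_loss = cross_entropy(class_logits, labels)
    domain_loss = cross_entropy(domain_logits, domains)
    return {
        "total_loss": class_loss + domain_loss,
        "class_loss": class_loss,
        "domain_loss": domain_loss,
    }


# One sample, two classes, three hypothetical events.
losses = eann_style_losses(
    class_logits=[[0.0, 0.0]], labels=[1],
    domain_logits=[[0.0, 0.0, 0.0]], domains=[2],
)
```

With uniform scores, each part reduces to the logarithm of the number of classes, which makes the sum easy to check by hand.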

forward(token_id: Tensor, mask: Tensor, image: Tensor)[source]
Parameters:
  • token_id (Tensor) – text token ids

  • image (Tensor) – image pixels

  • mask (Tensor) – text masks

Returns:

  • class_output (Tensor): prediction of being fake news, shape=(batch_size, 2)

  • domain_output (Tensor): prediction of belonging to which domain, shape=(batch_size, domain_num)

Return type:

tuple

predict(data_without_label) Tensor[source]

predict the probability of being fake news

Parameters:

data_without_label (Dict[str, Any]) – batch data dict

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor

training: bool

faknow.model.content_based.multi_modal.hmcan

class faknow.model.content_based.multi_modal.hmcan.HMCAN(left_num_layers=2, left_num_heads=12, dropout=0.1, right_num_layers=2, right_num_heads=12, alpha=0.7, pre_trained_bert_name='bert-base-uncased')[source]

Bases: AbstractModel

HMCAN: Hierarchical Multi-modal Contextual Attention Network for fake news Detection, SIGIR 2021
paper: https://dl.acm.org/doi/10.1145/3404835.3462871
code: https://github.com/wangjinguang502/HMCAN

__init__(left_num_layers=2, left_num_heads=12, dropout=0.1, right_num_layers=2, right_num_heads=12, alpha=0.7, pre_trained_bert_name='bert-base-uncased')[source]
Parameters:
  • left_num_layers (int) – number of left Attention&FFN layers in the Contextual Transformer. Default=2

  • left_num_heads (int) – number of heads in the Multi-Head Attention layer of the left Attention&FFN. Default=12

  • dropout (float) – dropout rate. Default=0.1

  • right_num_layers (int) – number of right Attention&FFN layers in the Contextual Transformer. Default=2

  • right_num_heads (int) – number of heads in the Multi-Head Attention layer of the right Attention&FFN. Default=12

  • alpha (float) – weight of the first Attention&FFN layer's output. Default=0.7

  • pre_trained_bert_name (str) – name of the pretrained bert model. Default='bert-base-uncased'

calculate_loss(data: Dict[str, Any]) Tensor[source]

calculate total loss

Parameters:

data (Dict[str, Any]) – batch data dict

Returns:

total_loss

Return type:

Tensor

forward(token_id: Tensor, mask: Tensor, image: Tensor)[source]
Parameters:
  • token_id (Tensor) – text token ids

  • image (Tensor) – image pixels

  • mask (Tensor) – text masks

Returns:

prediction of being fake news, shape=(batch_size, 2)

Return type:

Tensor

predict(data_without_label: Dict[str, Any]) Tensor[source]

predict the probability of being fake news

Parameters:

data_without_label (Dict[str, Any]) – batch data dict

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor
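
Turning a (batch_size, 2) probability output into hard labels is a per-row argmax. The sketch below uses plain lists and a hypothetical helper to_labels; which index means "fake" depends on how the dataset encodes its labels and is not stated here.

```python
from typing import List


def to_labels(probs: List[List[float]]) -> List[int]:
    """Pick the index of the larger probability in each row (first index on ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Hypothetical output of predict() for three samples.
probabilities = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
labels = to_labels(probabilities)
```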

training: bool

faknow.model.content_based.multi_modal.mcan

class faknow.model.content_based.multi_modal.mcan.MCAN(bert: str, kernel_sizes: List[int] | None = None, num_channels: List[int] | None = None, model_dim=256, drop_and_bn='drop-bn', num_layers=1, num_heads=8, ffn_dim=2048, dropout=0.5)[source]

Bases: AbstractModel

Multimodal Fusion with Co-Attention Networks for Fake News Detection, ACL 2021
paper: https://aclanthology.org/2021.findings-acl.226/
code: https://github.com/wuyang45/MCAN_code

__init__(bert: str, kernel_sizes: List[int] | None = None, num_channels: List[int] | None = None, model_dim=256, drop_and_bn='drop-bn', num_layers=1, num_heads=8, ffn_dim=2048, dropout=0.5)[source]
Parameters:
  • bert (str) – bert model name

  • kernel_sizes (List[int]) – kernel sizes of DctCNN. Default=[3, 3, 3]

  • num_channels (List[int]) – number of channels of DctCNN. Default=[32, 64, 128]

  • model_dim (int) – model dimension. Default=256

  • drop_and_bn (str) – dropout and batch normalization. 'drop-bn', 'bn-drop', 'drop', 'bn' or None. Default='drop-bn'

  • num_layers (int) – number of co-attention layers. Default=1

  • num_heads (int) – number of heads in multi-head attention. Default=8

  • ffn_dim (int) – dimension of feed forward network. Default=2048

  • dropout (float) – dropout rate. Default=0.5
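
The signature declares kernel_sizes and num_channels as None while the parameter list documents list defaults. A common way to reconcile the two, sketched below under the assumption that the constructor does something similar, is to substitute the documented lists when None is passed. The helper resolve_dct_cnn_config is hypothetical.

```python
from typing import List, Optional, Tuple


def resolve_dct_cnn_config(
    kernel_sizes: Optional[List[int]] = None,
    num_channels: Optional[List[int]] = None,
) -> Tuple[List[int], List[int]]:
    """Fill in the documented defaults when the caller passes None."""
    if kernel_sizes is None:
        kernel_sizes = [3, 3, 3]  # documented default
    if num_channels is None:
        num_channels = [32, 64, 128]  # documented default
    return kernel_sizes, num_channels


default_kernels, default_channels = resolve_dct_cnn_config()
custom_kernels, _ = resolve_dct_cnn_config(kernel_sizes=[5, 5, 5])
```

Using None as the sentinel avoids sharing one mutable default list between calls, since a fresh list is created each time.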

calculate_loss(data) Tensor[source]

calculate loss via CrossEntropyLoss

Parameters:

data (dict) – batch data dict

Returns:

loss value

Return type:

Tensor

drop_bn_layer(x, part='dct')[source]

dropout and batch normalization

Parameters:
  • x (torch.Tensor) – input tensor

  • part (str) – 'dct', 'vgg' or 'bert'. Default='dct'
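
The values accepted by the drop_and_bn constructor argument read like an ordering of operations. The sketch below spells out one plausible interpretation: 'drop-bn' means dropout followed by batch normalization, and None means neither. This mapping is an assumption based on the option names, and layer_order is a hypothetical helper rather than part of MCAN.

```python
from typing import List, Optional


def layer_order(drop_and_bn: Optional[str]) -> List[str]:
    """Map a drop_and_bn setting to the assumed order of operations."""
    if drop_and_bn is None:
        return []
    names = {"drop": "dropout", "bn": "batch_norm"}
    parts = drop_and_bn.split("-")
    unknown = [p for p in parts if p not in names]
    # Reject unknown tokens and repeated tokens such as 'drop-drop'.
    if unknown or len(parts) != len(set(parts)):
        raise ValueError(f"unsupported drop_and_bn value: {drop_and_bn!r}")
    return [names[p] for p in parts]


orders = {option: layer_order(option) for option in ("drop-bn", "bn-drop", "drop", "bn", None)}
```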

forward(input_ids: Tensor, mask: Tensor, image: Tensor, dct_img: Tensor) Tensor[source]
Parameters:
  • input_ids (Tensor) – shape=(batch_size, max_len)

  • mask (Tensor) – shape=(batch_size, max_len)

  • image (Tensor) – transformed image tensor, shape=(batch_size, 3, 224, 224)

  • dct_img (Tensor) – DCT image tensor, shape=(batch_size, N*N, 250)

Returns:

output, shape=(batch_size, 2)

Return type:

Tensor

predict(data) Tensor[source]

predict the probability of being fake news

Parameters:

data (Dict[str, Any]) – batch data dict

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor

training: bool

faknow.model.content_based.multi_modal.mfan

class faknow.model.content_based.multi_modal.mfan.MFAN(word_vectors: Tensor, node_num: int, node_embedding: Tensor, adj_matrix: Tensor, dropout_rate=0.6)[source]

Bases: AbstractModel

MFAN: Multi-modal Feature-enhanced Attention Networks for Rumor Detection, IJCAI 2022
paper: https://www.ijcai.org/proceedings/2022/335
code: https://github.com/drivsaf/MFAN

__init__(word_vectors: Tensor, node_num: int, node_embedding: Tensor, adj_matrix: Tensor, dropout_rate=0.6)[source]
Parameters:
  • word_vectors (Tensor) – pretrained weights for word embedding

  • node_num (int) – number of nodes in graph

  • node_embedding (Tensor) – pretrained weights for node embedding

  • adj_matrix (Tensor) – adjacency matrix of the graph

  • dropout_rate (float) – dropout rate. Default=0.6

calculate_loss(data) Dict[str, Tensor][source]

calculate total loss, classification loss(via CrossEntropyLoss) and distance loss(via MSELoss), where total loss = classification loss + distance loss

Parameters:

data (Dict[str, Any]) – batch data dict

Returns:

loss dict, key: total_loss, class_loss, dis_loss

Return type:

Dict[str, Tensor]
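
The distance term and the way it enters the total can be sketched without PyTorch. The classification loss is taken as a given number here, the mean squared error is a hand-written stand-in for MSELoss with mean reduction, and the function names and sample values are hypothetical.

```python
from typing import Dict, List


def mse(a: List[List[float]], b: List[List[float]]) -> float:
    """Mean squared error over all elements of two equally shaped 2-D lists."""
    squared = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(squared) / len(squared)


def mfan_style_losses(class_loss: float, aligned_text, aligned_graph) -> Dict[str, float]:
    """Build a dict with the documented keys; total is the plain sum of both parts."""
    dis_loss = mse(aligned_text, aligned_graph)
    return {
        "total_loss": class_loss + dis_loss,
        "class_loss": class_loss,
        "dis_loss": dis_loss,
    }


# Two samples with a hypothetical embedding size of 2.
losses = mfan_style_losses(
    class_loss=0.5,
    aligned_text=[[1.0, 2.0], [0.0, 0.0]],
    aligned_graph=[[1.0, 0.0], [0.0, 2.0]],
)
```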

forward(post_id: Tensor, text: Tensor, image: Tensor)[source]
Parameters:
  • post_id (Tensor) – id of post, shape=(batch_size,)

  • text (Tensor) – token ids, shape=(batch_size, max_len)

  • image (Tensor) – shape=(batch_size, 3, width, height)

Returns:

  • class_output (Tensor): prediction of being fake news, shape=(batch_size, 2)

  • dist (List[Tensor]): aligned text and aligned graph, shape=(batch_size, embedding_size)

Return type:

tuple

predict(data_without_label) Tensor[source]

predict the probability of being fake news

Parameters:

data_without_label (Dict[str, Any]) – batch data dict

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor

training: bool

faknow.model.content_based.multi_modal.safe

class faknow.model.content_based.multi_modal.safe.SAFE(embedding_size: int = 300, conv_in_size: int = 32, filter_num: int = 128, cnn_out_size: int = 200, dropout: float = 0.0, loss_weights: List[float] | None = None)[source]

Bases: AbstractModel

SAFE: Similarity-Aware Multi-Modal Fake News Detection

__init__(embedding_size: int = 300, conv_in_size: int = 32, filter_num: int = 128, cnn_out_size: int = 200, dropout: float = 0.0, loss_weights: List[float] | None = None)[source]
Parameters:
  • embedding_size (int) – embedding size of text. Default=300

  • conv_in_size (int) – number of in channels in TextCNN. Default=32

  • filter_num (int) – number of filters in TextCNN. Default=128

  • cnn_out_size (int) – output size of FC layer in TextCNN. Default=200

  • dropout (float) – dropout rate. Default=0.0

  • loss_weights (List[float]) – list of loss weights. Default=[1.0, 1.0]

calculate_loss(data) Dict[str, Tensor][source]

Calculate the loss for the SAFE model.

Parameters:

data (Dict[str, Tensor]) – Input data containing ‘head’, ‘body’, ‘image’, and ‘label’ tensors.

Returns:

Dictionary containing computed losses.

Return type:

Dict[str, Tensor]

forward(head: Tensor, body: Tensor, image: Tensor)[source]
Parameters:
  • head (Tensor) – embedded title of post, shape=(batch_size, title_len, embedding_size)

  • body (Tensor) – embedded content of post, shape=(batch_size, content_len, embedding_size)

  • image (Tensor) – embedded sentence converted from post image, shape=(batch_size, sentence_len, embedding_size)

Returns:

  • class_output (Tensor): prediction of being fake news, shape=(batch_size, 2)

  • cos_dis_sim (Tensor): cosine dissimilarity and similarity scores between the text and image representations, shape=(batch_size, 2)

Return type:

tuple
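
A two-column output named cos_dis_sim suggests a pair of complementary scores per sample. The sketch below computes one such pair from a text vector and an image vector, using a cosine similarity rescaled to the range [0, 1]. Both the rescaling and the (dissimilarity, similarity) column order are assumptions made for illustration; cos_dis_sim_pair is a hypothetical helper, not SAFE's code.

```python
import math
from typing import List, Tuple


def cos_dis_sim_pair(text_vec: List[float], image_vec: List[float]) -> Tuple[float, float]:
    """Return (dissimilarity, similarity), with similarity rescaled to [0, 1].

    Both vectors must be non-zero, otherwise the norm product is zero.
    """
    dot = sum(t * v for t, v in zip(text_vec, image_vec))
    norm_product = math.sqrt(sum(t * t for t in text_vec)) * math.sqrt(
        sum(v * v for v in image_vec)
    )
    similarity = (dot + norm_product) / (2.0 * norm_product)
    return 1.0 - similarity, similarity


same = cos_dis_sim_pair([1.0, 0.0], [1.0, 0.0])
orthogonal = cos_dis_sim_pair([1.0, 0.0], [0.0, 1.0])
opposite = cos_dis_sim_pair([1.0, 0.0], [-1.0, 0.0])
```

Identical directions give a similarity of 1, opposite directions give 0, and orthogonal vectors land in the middle.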

predict(data_without_label) Tensor[source]

Perform prediction with the SAFE model.

Parameters:

data_without_label (Dict[str, Tensor]) – Input data containing ‘head’, ‘body’, and ‘image’ tensors.

Returns:

Predicted class output tensor.

Return type:

torch.Tensor

training: bool

faknow.model.content_based.multi_modal.spotfake

class faknow.model.content_based.multi_modal.spotfake.SpotFake(text_fc2_out: int = 32, text_fc1_out: int = 2742, dropout_p: float = 0.4, fine_tune_text_module: bool = False, img_fc1_out: int = 2742, img_fc2_out: int = 32, fine_tune_vis_module: bool = False, fusion_output_size: int = 35, loss_func=BCELoss(), pre_trained_bert_name='bert-base-uncased')[source]

Bases: AbstractModel

SpotFake: A Multi-modal Framework for Fake News Detection, BigMM 2019
paper: https://ieeexplore.ieee.org/document/8919302
code: https://github.com/shiivangii/SpotFake

__init__(text_fc2_out: int = 32, text_fc1_out: int = 2742, dropout_p: float = 0.4, fine_tune_text_module: bool = False, img_fc1_out: int = 2742, img_fc2_out: int = 32, fine_tune_vis_module: bool = False, fusion_output_size: int = 35, loss_func=BCELoss(), pre_trained_bert_name='bert-base-uncased')[source]
Parameters:
  • text_fc2_out (int) – size of the second fully connected layer of the text module. Default=32

  • text_fc1_out (int) – size of the first fully connected layer of the text module. Default=2742

  • dropout_p (float) – dropout rate. Default=0.4

  • fine_tune_text_module (bool) – whether to fine-tune the text module. Default=False

  • img_fc1_out (int) – size of the first fully connected layer of the visual module. Default=2742

  • img_fc2_out (int) – size of the second fully connected layer of the visual module. Default=32

  • fine_tune_vis_module (bool) – whether to fine-tune the visual module. Default=False

  • fusion_output_size (int) – size of the output layer after multimodal fusion. Default=35

  • loss_func – loss function. Default=nn.BCELoss()

  • pre_trained_bert_name (str) – pretrained bert name. Default='bert-base-uncased'

calculate_loss(data) Tensor[source]

calculate loss

Parameters:

data – batch data

Returns:

loss

Return type:

Tensor
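
Since the default loss_func is BCELoss, the quantity being minimised is a binary cross-entropy over probabilities. The plain-Python sketch below shows that arithmetic with mean reduction. It is a simplified stand-in, not the PyTorch implementation, and it assumes every probability lies strictly between 0 and 1; the function name and sample values are hypothetical.

```python
import math
from typing import List


def binary_cross_entropy(probs: List[float], targets: List[float]) -> float:
    """Mean binary cross-entropy; probs must already lie strictly between 0 and 1."""
    losses = [
        -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
        for p, t in zip(probs, targets)
    ]
    return sum(losses) / len(losses)


# An undecided prediction versus a confident, correct one.
loss_uncertain = binary_cross_entropy([0.5, 0.5], [1.0, 0.0])
loss_confident = binary_cross_entropy([0.9, 0.1], [1.0, 0.0])
```

The confident, correct prediction yields the smaller loss, which is the behaviour the training objective rewards.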

forward(text: Tensor, mask: Tensor, domain: Tensor)[source]

Forward pass of the SpotFake model.

Parameters:
  • text (Tensor) – Text input. shape=(batch_size, max_len)

  • mask (Tensor) – Attention mask. shape=(batch_size, max_len)

  • domain (Tensor) – Image input. shape=(batch_size, 3, 224, 224)

Returns:

Output predictions. shape=(8,)

Return type:

Tensor

predict(data_without_label)[source]

predict the probability of being fake news

Parameters:

data_without_label – batch data

Returns:

probability, shape=(batch_size, 2)

Return type:

Tensor

training: bool