multimodal

multimodal.run_eann

class multimodal.run_eann.TokenizerEANN(vocab: Dict[str, int], max_len=255, stop_words: List[str] | None = None, language='zh')[source]

Bases: object

Tokenizer for EANN

__init__(vocab: Dict[str, int], max_len=255, stop_words: List[str] | None = None, language='zh') None[source]
Parameters:
  • vocab (Dict[str, int]) – vocabulary of the corpus

  • max_len (int) – max length of the text, default=255

  • stop_words (List[str]) – stop words, default=None

  • language (str) – language of the corpus, ‘zh’ or ‘en’, default=’zh’
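
A minimal construction sketch; the vocabulary below is a toy placeholder, not a real corpus vocabulary:

    from multimodal.run_eann import TokenizerEANN

    vocab = {'[PAD]': 0, '[UNK]': 1, '新闻': 2}   # placeholder vocabulary
    tokenizer = TokenizerEANN(vocab, max_len=255, stop_words=None, language='zh')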

multimodal.run_eann.run_eann(train_path: str, vocab: Dict[str, int], stop_words: List[str], word_vectors: Tensor, language='zh', max_len=255, batch_size=100, event_num: int | None = None, lr=0.001, num_epochs=100, metrics: List | None = None, validate_path: str | None = None, test_path: str | None = None, device='cpu') None[source]

run EANN, including training, validation and testing. If validate_path and test_path are None, only training is performed.

Parameters:
  • train_path (str) – path of the training set

  • vocab (Dict[str, int]) – vocabulary of the corpus

  • stop_words (List[str]) – stop words

  • word_vectors (torch.Tensor) – word vectors

  • language (str) – language of the corpus, ‘zh’ or ‘en’, default=’zh’

  • max_len (int) – max length of the text, default=255

  • batch_size (int) – batch size, default=100

  • event_num (int) – number of events, default=None

  • lr (float) – learning rate, default=0.001

  • num_epochs (int) – number of epochs, default=100

  • metrics (List) – metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • validate_path (str) – path of the validation set, default=None

  • test_path (str) – path of the test set, default=None

  • device (str) – device, default=’cpu’
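
A hedged end-to-end sketch; the data paths, vocabulary, stop-word list, and word vectors below are placeholders that would normally be prepared from the corpus:

    import torch
    from multimodal.run_eann import run_eann

    vocab = {'[PAD]': 0, '[UNK]': 1, '新闻': 2}     # placeholder vocabulary
    word_vectors = torch.randn(len(vocab), 300)      # placeholder embeddings
    stop_words = []                                  # placeholder stop-word list

    run_eann(train_path='data/eann/train.json',      # placeholder paths
             vocab=vocab,
             stop_words=stop_words,
             word_vectors=word_vectors,
             language='zh',
             validate_path='data/eann/val.json',
             test_path='data/eann/test.json',
             device='cuda:0' if torch.cuda.is_available() else 'cpu')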

multimodal.run_eann.run_eann_from_yaml(path: str) None[source]

Run EANN from a YAML config file.

Parameters:

path (str) – path of the YAML config file
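
A minimal sketch; the config path is a placeholder, and the YAML keys are assumed to mirror the keyword arguments of run_eann:

    from multimodal.run_eann import run_eann_from_yaml

    # keys in the YAML file are assumed to mirror run_eann's parameters,
    # e.g. train_path, batch_size, lr, num_epochs, device
    run_eann_from_yaml('config/eann.yaml')   # placeholder path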

multimodal.run_eann.transform_eann(path: str) Tensor[source]

transform image to tensor for EANN

Parameters:

path (str) – image path

Returns:

tensor of the image, shape=(3, 224, 224)

Return type:

torch.Tensor
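
A short sketch; the image path is a placeholder:

    from multimodal.run_eann import transform_eann

    img = transform_eann('data/images/post_001.jpg')   # placeholder path
    print(img.shape)                                    # torch.Size([3, 224, 224])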

multimodal.run_mcan

multimodal.run_mcan.get_optimizer_mcan(model: MCAN, lr=0.0001, weight_decay=0.15, bert_lr=1e-05, vgg_lr=1e-05, dtc_lr=1e-05, fusion_lr=0.01, linear_lr=0.01, classifier_lr=0.01)[source]

generate optimizer for MCAN

Parameters:
  • model (MCAN) – MCAN model

  • lr (float) – learning rate, default=0.0001

  • weight_decay (float) – weight decay, default=0.15

  • bert_lr (float) – learning rate of bert, default=1e-5

  • vgg_lr (float) – learning rate of vgg, default=1e-5

  • dtc_lr (float) – learning rate of the DCT module, default=1e-5

  • fusion_lr (float) – learning rate of fusion layers, default=1e-2

  • linear_lr (float) – learning rate of linear layers, default=1e-2

  • classifier_lr (float) – learning rate of classifier layers, default=1e-2

Returns:

optimizer for MCAN

Return type:

torch.optim.Optimizer
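
A hedged sketch of the per-module learning rates; `model` stands for an already-constructed MCAN instance, whose constructor arguments are documented with the MCAN model class and omitted here:

    from multimodal.run_mcan import get_optimizer_mcan

    # `model` is a placeholder for a constructed MCAN instance
    optimizer = get_optimizer_mcan(model,
                                   lr=1e-4,
                                   weight_decay=0.15,
                                   bert_lr=1e-5,
                                   vgg_lr=1e-5,
                                   fusion_lr=1e-2,
                                   classifier_lr=1e-2)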

multimodal.run_mcan.process_dct_mcan(img: Tensor) Tensor[source]

process image with DCT (Discrete Cosine Transform) for MCAN

Parameters:

img (torch.Tensor) – image tensor to be processed

Returns:

DCT-processed image tensor

Return type:

torch.Tensor

multimodal.run_mcan.run_mcan(train_path: str, bert='bert-base-chinese', max_len=255, batch_size=16, num_epochs=100, metrics: List | None = None, validate_path: str | None = None, test_path: str | None = None, device='cpu', **optimizer_kargs)[source]

run MCAN

Parameters:
  • train_path (str) – path of training data

  • bert (str) – bert model, default=’bert-base-chinese’

  • max_len (int) – max length of text, default=255

  • batch_size (int) – batch size, default=16

  • num_epochs (int) – number of epochs, default=100

  • metrics (List) – metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] will be used, default=None

  • validate_path (str) – path of validation data, default=None

  • test_path (str) – path of test data, default=None

  • device (str) – device, default=’cpu’

  • **optimizer_kargs – additional keyword arguments for the optimizer
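
A hedged sketch; the data paths are placeholders, and the trailing keyword arguments are presumably forwarded to get_optimizer_mcan:

    import torch
    from multimodal.run_mcan import run_mcan

    run_mcan(train_path='data/mcan/train.json',        # placeholder paths
             bert='bert-base-chinese',
             max_len=255,
             batch_size=16,
             validate_path='data/mcan/val.json',
             test_path='data/mcan/test.json',
             device='cuda:0' if torch.cuda.is_available() else 'cpu',
             lr=1e-4, bert_lr=1e-5)                     # optimizer keyword arguments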

multimodal.run_mcan.run_mcan_from_yaml(path: str)[source]

Run MCAN from a YAML config file.

Parameters:

path (str) – path of the YAML config file

multimodal.run_mcan.text_preprocessing(texts: List[str])[source]

multimodal.run_mcan.transform_mcan(path: str) Dict[str, Tensor][source]

transform image to tensor for MCAN

Parameters:

path (str) – path of the image

Returns:

transformed image tensors with keys ‘vgg’ and ‘dct’

Return type:

Dict[str, torch.Tensor]
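
A short sketch; the image path is a placeholder:

    from multimodal.run_mcan import transform_mcan

    out = transform_mcan('data/images/post_001.jpg')    # placeholder path
    vgg_input, dct_input = out['vgg'], out['dct']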

multimodal.run_mfan

class multimodal.run_mfan.TokenizerMFAN(vocab: Dict[str, int], max_len=50, stop_words: List[str] | None = None, language='zh')[source]

Bases: object

Tokenizer for MFAN

__init__(vocab: Dict[str, int], max_len=50, stop_words: List[str] | None = None, language='zh') None[source]
Parameters:
  • vocab (Dict[str, int]) – vocabulary dict

  • max_len (int) – max length of text, default=50

  • stop_words (List[str]) – stop words list, default=None

  • language (str) – language of text, ‘zh’ or ‘en’, default=’zh’

multimodal.run_mfan.load_adj_matrix_mfan(path: str, node_num: int)[source]

load adjacency matrix for MFAN

Parameters:
  • path (str) – path of the adjacency list file

  • node_num (int) – number of nodes

Returns:

adjacency matrix, shape=(node_num, node_num)

Return type:

torch.Tensor
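
A minimal sketch; the adjacency-list path and node count are placeholders:

    from multimodal.run_mfan import load_adj_matrix_mfan

    node_num = 6963                                                     # placeholder
    adj_matrix = load_adj_matrix_mfan('data/mfan/graph.txt', node_num)  # placeholder path
    print(adj_matrix.shape)                                             # (node_num, node_num)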

multimodal.run_mfan.run_mfan(train_path: str, node_embedding: Tensor, node_num: int, adj_matrix: Tensor, vocab: Dict[str, int], word_vectors: Tensor, max_len=50, batch_size=64, num_epochs=20, lr=0.002, metrics: List | None = None, validate_path: str | None = None, test_path: str | None = None, device='cpu')[source]

run MFAN, including training, validation and testing. If validate_path and test_path are None, only training is performed.

Parameters:
  • train_path (str) – path of train data

  • node_embedding (torch.Tensor) – node embedding, shape=(node_num, node_embedding_dim)

  • node_num (int) – number of nodes

  • adj_matrix (torch.Tensor) – adjacency matrix, shape=(node_num, node_num)

  • vocab (Dict[str, int]) – vocabulary dict

  • word_vectors (torch.Tensor) – word vectors, shape=(vocab_size, word_vector_dim)

  • max_len (int) – max length of text, default=50

  • batch_size (int) – batch size, default=64

  • num_epochs (int) – number of epochs, default=20

  • lr (float) – learning rate, default=2e-3

  • metrics (List) – metrics to evaluate, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • validate_path (str) – path of validate data, default=None

  • test_path (str) – path of test data, default=None

  • device (str) – device to run, default=’cpu’
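
A hedged sketch that wires the inputs together; every tensor and path below is a placeholder that would normally come from the MFAN data preparation:

    import torch
    from multimodal.run_mfan import load_adj_matrix_mfan, run_mfan

    node_num = 6963                                      # placeholder
    node_embedding = torch.randn(node_num, 300)          # placeholder node embedding
    adj_matrix = load_adj_matrix_mfan('data/mfan/graph.txt', node_num)  # placeholder path
    vocab = {'[PAD]': 0, '[UNK]': 1, '谣言': 2}          # placeholder vocabulary
    word_vectors = torch.randn(len(vocab), 300)          # placeholder word vectors

    run_mfan(train_path='data/mfan/train.json',          # placeholder paths
             node_embedding=node_embedding,
             node_num=node_num,
             adj_matrix=adj_matrix,
             vocab=vocab,
             word_vectors=word_vectors,
             validate_path='data/mfan/val.json',
             test_path='data/mfan/test.json',
             device='cuda:0' if torch.cuda.is_available() else 'cpu')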

multimodal.run_mfan.run_mfan_from_yaml(path: str)[source]

Run MFAN from a YAML config file.

Parameters:

path (str) – path of the YAML config file

multimodal.run_mfan.transform_mfan(path: str) Tensor[source]

transform image to tensor for MFAN

Parameters:

path (str) – image path

Returns:

tensor of the image, shape=(3, 224, 224)

Return type:

torch.Tensor

multimodal.run_safe

multimodal.run_safe.run_safe(train_path: str, validate_path: str | None = None, test_path: str | None = None, embedding_size: int = 300, conv_in_size: int = 32, filter_num: int = 128, cnn_out_size: int = 200, dropout: float = 0.0, loss_weights: List[float] | None = None, batch_size=64, lr=0.00025, metrics: List | None = None, num_epochs=100, device='cpu')[source]

Train and evaluate the SAFE model.

Parameters:
  • train_path (str) – Path to the training data.

  • validate_path (str, optional) – Path to the validation data. Defaults to None.

  • test_path (str, optional) – Path to the test data. Defaults to None.

  • embedding_size (int, optional) – Size of the embedding for SAFE model. Defaults to 300.

  • conv_in_size (int, optional) – Size of the input for convolutional layer. Defaults to 32.

  • filter_num (int, optional) – Number of filters for convolutional layer. Defaults to 128.

  • cnn_out_size (int, optional) – Size of the output for the convolutional layer. Defaults to 200.

  • dropout (float, optional) – Dropout probability. Defaults to 0.0.

  • loss_weights (List[float], optional) – List of loss weights. Defaults to None.

  • batch_size (int, optional) – Batch size. Defaults to 64.

  • lr (float, optional) – Learning rate. Defaults to 0.00025.

  • metrics (List, optional) – List of evaluation metrics. Defaults to None.

  • num_epochs (int, optional) – Number of training epochs. Defaults to 100.

  • device (str, optional) – Device to run the training on (‘cpu’ or ‘cuda’). Defaults to ‘cpu’.
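
A hedged sketch; the data paths are placeholders and the remaining arguments restate the documented defaults:

    import torch
    from multimodal.run_safe import run_safe

    run_safe(train_path='data/safe/train.json',          # placeholder paths
             validate_path='data/safe/val.json',
             test_path='data/safe/test.json',
             embedding_size=300,
             filter_num=128,
             batch_size=64,
             lr=0.00025,
             device='cuda:0' if torch.cuda.is_available() else 'cpu')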

multimodal.run_safe.run_safe_from_yaml(path: str)[source]

Load SAFE configuration from YAML file and run the training and evaluation.

Parameters:

path (str) – Path to the YAML configuration file.

multimodal.run_spotfake

multimodal.run_spotfake.run_spotfake(train_path: str, validate_path: str | None = None, test_path: str | None = None, text_fc2_out: int = 32, text_fc1_out: int = 2742, dropout_p: float = 0.4, fine_tune_text_module: bool = False, img_fc1_out: int = 2742, img_fc2_out: int = 32, fine_tune_vis_module: bool = False, fusion_output_size: int = 35, loss_func=BCELoss(), pre_trained_bert_name='bert-base-uncased', batch_size=8, epochs=50, max_len=500, lr=3e-05, metrics: List | None = None, device='cpu')[source]

Train and evaluate the SpotFake model.

Parameters:
  • train_path (str) – Path to the training data.

  • validate_path (str, optional) – Path to the validation data. Defaults to None.

  • test_path (str, optional) – Path to the test data. Defaults to None.

  • text_fc2_out (int, optional) – Output size for the text FC2 layer. Defaults to 32.

  • text_fc1_out (int, optional) – Output size for the text FC1 layer. Defaults to 2742.

  • dropout_p (float, optional) – Dropout probability. Defaults to 0.4.

  • fine_tune_text_module (bool, optional) – Fine-tune text module. Defaults to False.

  • img_fc1_out (int, optional) – Output size for the image FC1 layer. Defaults to 2742.

  • img_fc2_out (int, optional) – Output size for the image FC2 layer. Defaults to 32.

  • fine_tune_vis_module (bool, optional) – Fine-tune visual module. Defaults to False.

  • fusion_output_size (int, optional) – Output size for the fusion layer. Defaults to 35.

  • loss_func (nn.Module, optional) – Loss function. Defaults to nn.BCELoss().

  • pre_trained_bert_name (str, optional) – Name of the pre-trained BERT model. Defaults to “bert-base-uncased”.

  • batch_size (int, optional) – Batch size. Defaults to 8.

  • epochs (int, optional) – Number of training epochs. Defaults to 50.

  • max_len (int, optional) – Maximum length for tokenization. Defaults to 500.

  • lr (float, optional) – Learning rate. Defaults to 3e-5.

  • metrics (List, optional) – List of evaluation metrics. Defaults to None.

  • device (str, optional) – Device to run the training on (‘cpu’ or ‘cuda’). Defaults to ‘cpu’.
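
A hedged sketch; the data paths are placeholders and the remaining arguments restate the documented defaults:

    import torch
    from multimodal.run_spotfake import run_spotfake

    run_spotfake(train_path='data/spotfake/train.json',  # placeholder paths
                 validate_path='data/spotfake/val.json',
                 test_path='data/spotfake/test.json',
                 pre_trained_bert_name='bert-base-uncased',
                 batch_size=8,
                 epochs=50,
                 max_len=500,
                 lr=3e-5,
                 device='cuda:0' if torch.cuda.is_available() else 'cpu')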

multimodal.run_spotfake.run_spotfake_from_yaml(path: str)[source]

Load SpotFake configuration from a YAML file and run the training and evaluation.

Parameters:

path (str) – Path to the YAML configuration file.

multimodal.run_spotfake.text_preprocessing(texts: List[str])[source]

Preprocess the given text.

  • Remove ‘@’ entity mentions (e.g., “@united”)

  • Correct character-entity errors (e.g., ‘&amp;’ to ‘&’)

Parameters:

texts (List[str]) – a list of texts to be processed.

Returns:

The preprocessed texts.

Return type:

List[str]
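
A small illustration of the two rules above; the exact output is an assumption based on the documented behaviour:

    from multimodal.run_spotfake import text_preprocessing

    cleaned = text_preprocessing(['@united thanks &amp; goodbye'])
    # expected to drop the '@united' mention and normalise '&amp;' to '&',
    # yielding something like ['thanks & goodbye']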

multimodal.run_spotfake.transform_spotfake(path: str) Tensor[source]

Transform the image data at the given path.

Parameters:

path (str) – Path to the image file.

Returns:

Transformed image data.

Return type:

torch.Tensor