faknow.model.content_based

faknow.model.content_based.endef

class faknow.model.content_based.endef.ENDEF(pre_trained_bert_name: str, base_model: AbstractModel, mlp_dims: List[int] | None = None, dropout_rate=0.2, entity_weight=0.1, loss_weight=0.2)[source]

Bases: AbstractModel

Generalizing to the Future: Mitigating Entity Bias in Fake News Detection, SIGIR 2022 paper: https://dl.acm.org/doi/10.1145/3477495.3531816 code: https://github.com/ICTMCG/ENDEF-SIGIR2022

__init__(pre_trained_bert_name: str, base_model: AbstractModel, mlp_dims: List[int] | None = None, dropout_rate=0.2, entity_weight=0.1, loss_weight=0.2)[source]
Parameters:
  • pre_trained_bert_name (str) – the name or local path of pre-trained bert model

  • base_model (AbstractModel) – the content-based base model used together with entity features

  • mlp_dims (List[int]) – a list of the dimensions in MLP layer; if None, [384] will be taken as default

  • dropout_rate (float) – dropout rate. Default=0.2

  • entity_weight (float) – the weight of the entity prediction during training. Default=0.1

  • loss_weight (float) – the weight of the entity loss in the total loss. Default=0.2
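
A minimal construction sketch (not taken from the library's docs): it assumes MDFEND, documented later in this module, as the content-based base model; the BERT name and domain count are illustrative placeholders.

    from faknow.model.content_based.endef import ENDEF
    from faknow.model.content_based.mdfend import MDFEND

    # assumption: MDFEND acts as the content-based base model;
    # 'bert-base-uncased' and domain_num=9 are placeholders
    base_model = MDFEND('bert-base-uncased', domain_num=9)

    model = ENDEF(
        pre_trained_bert_name='bert-base-uncased',
        base_model=base_model,
        mlp_dims=None,       # falls back to [384]
        dropout_rate=0.2,
        entity_weight=0.1,
        loss_weight=0.2,
    )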

calculate_loss(data) Tensor[source]

calculate loss via BCELoss

Parameters:

data (dict) – batch data dict

Returns:

loss value

Return type:

loss (Tensor)

dict_to_dict(inputs: Dict)[source]

flatten inputs into a single-layer dict if it is nested

Parameters:

inputs (Dict) – dict to be processed

forward(base_model_params: Dict, entity_token_id: Tensor, entity_mask: Tensor)[source]
Parameters:
  • base_model_params (Dict) – a dictionary containing all params that base_model.forward() needs

  • entity_token_id (Tensor) – entity's token ids from bert tokenizer, shape=(batch_size, max_len)

  • entity_mask (Tensor) – mask from bert tokenizer, shape=(batch_size, max_len)

Returns:

  • unbiased_prediction (Tensor): prediction of being fake, combining both biased_prediction and entity_prediction, shape=(batch_size,).

  • entity_prediction (Tensor): prediction based on entity features, shape=(batch_size,).

Return type:

tuple
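
A hedged forward-pass sketch with dummy tensors, again assuming an MDFEND base model, so that base_model_params carries the token_id, mask and domain keys MDFEND.forward() expects; batch_size and max_len are placeholders.

    import torch

    from faknow.model.content_based.endef import ENDEF
    from faknow.model.content_based.mdfend import MDFEND

    model = ENDEF('bert-base-uncased', MDFEND('bert-base-uncased', domain_num=9))

    batch_size, max_len = 4, 170  # placeholder sizes
    base_model_params = {
        'token_id': torch.randint(0, 30000, (batch_size, max_len)),
        'mask': torch.ones(batch_size, max_len, dtype=torch.long),
        'domain': torch.randint(0, 9, (batch_size,)),
    }
    entity_token_id = torch.randint(0, 30000, (batch_size, max_len))
    entity_mask = torch.ones(batch_size, max_len, dtype=torch.long)

    # returns (unbiased_prediction, entity_prediction), each of shape (batch_size,)
    unbiased_prediction, entity_prediction = model(
        base_model_params, entity_token_id, entity_mask
    )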

predict(data_without_label) Tensor[source]

predict the probability of being fake news

Parameters:

data_without_label (Dict[str, Any]) – batch data dict

Returns:

shape is the same as that of base_model's prediction

Return type:

Tensor

training: bool

faknow.model.content_based.m3fend

class faknow.model.content_based.m3fend.M3FEND(emb_dim, mlp_dims, dropout, semantic_num, emotion_num, style_num, LNN_dim, domain_num, dataset)[source]

Bases: AbstractModel

Memory-Guided Multi-View Multi-Domain Fake News Detection, TKDE 2022 paper: https://ieeexplore.ieee.org/document/9802916 code: https://github.com/ICTMCG/M3FEND

__init__(emb_dim, mlp_dims, dropout, semantic_num, emotion_num, style_num, LNN_dim, domain_num, dataset)[source]
Parameters:
  • emb_dim (int) – Dimensionality of the embeddings.

  • mlp_dims (List[int]) – List of dimensions for the MLP layers.

  • dropout (float) – Dropout probability.

  • semantic_num (int) – Number of semantic experts.

  • emotion_num (int) – Number of emotion experts.

  • style_num (int) – Number of style experts.

  • LNN_dim (int) – Dimensionality of the Latent Neural Network (LNN).

  • domain_num (int) – Number of domains.

  • dataset (str) – Dataset identifier (‘ch’ for Chinese, ‘en’ for English).
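
A construction sketch with illustrative hyperparameters; the expert counts, LNN_dim and domain_num below are placeholders rather than prescribed values.

    from faknow.model.content_based.m3fend import M3FEND

    model = M3FEND(
        emb_dim=768,        # embedding dimensionality
        mlp_dims=[384],     # MLP layer sizes
        dropout=0.2,
        semantic_num=7,     # number of semantic experts
        emotion_num=7,      # number of emotion experts
        style_num=2,        # number of style experts
        LNN_dim=50,         # latent neural network dimensionality
        domain_num=3,
        dataset='en',       # 'ch' for Chinese, 'en' for English
    )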

calculate_loss(batch_data) Tensor[source]

Calculate the loss for the M3FEND model.

Parameters:

batch_data (Dict[str, Tensor]) – input data containing 'content', 'content_masks', 'comments', 'comments_masks', 'content_emotion', 'comments_emotion', 'emotion_gap', 'style_feature', 'category' and 'label' tensors

Returns:

loss

Return type:

Tensor

forward(**kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_memory()[source]

Creates a domain memory for each domain via K-Means clustering; each domain memory contains the cluster centers of the sample features within that domain. This helps the model learn representative features within each domain and improves its adaptability to data from different domains.
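
A conceptual sketch of this clustering step, assuming scikit-learn's KMeans; the function and names are illustrative, not the library's exact implementation.

    import numpy as np
    from sklearn.cluster import KMeans

    def build_domain_memory(features_by_domain, memory_num=10):
        """Cluster each domain's sample features; the cluster centers
        become that domain's memory slots. Illustrative only."""
        memory = {}
        for domain, feats in features_by_domain.items():
            feats = np.stack(feats)                   # (n_samples, emb_dim)
            kmeans = KMeans(n_clusters=memory_num, n_init=10).fit(feats)
            memory[domain] = kmeans.cluster_centers_  # (memory_num, emb_dim)
        return memory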

predict(data_without_label) Tensor[source]

predict the probability of being fake news

Parameters:

data_without_label (Dict[str, Tensor]) – batch data dict

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor

save_feature(**kwargs)[source]

Saves the normalized features of all samples into the self.all_feature dictionary, grouped by their domain. Each key is an integer corresponding to a domain; the value is a list containing the features of all samples in that domain, each represented as a NumPy array.
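
A rough sketch of the grouping described above; the names are illustrative, not the library's.

    from collections import defaultdict

    import numpy as np

    all_feature = defaultdict(list)  # domain id -> list of per-sample features

    def save_features(features, domains):
        """Group normalized sample features by their domain id. Illustrative only."""
        for feat, domain in zip(features, domains):
            all_feature[int(domain)].append(np.asarray(feat))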

training: bool

write(**kwargs)[source]

class faknow.model.content_based.m3fend.MemoryNetwork(input_dim, emb_dim, domain_num, memory_num=10)[source]

Bases: Module

__init__(input_dim, emb_dim, domain_num, memory_num=10)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(feature, category)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

write(all_feature, category)[source]

faknow.model.content_based.m3fend.cal_length(x)[source]

faknow.model.content_based.m3fend.convert_to_onehot(label, batch_size, num)[source]

faknow.model.content_based.m3fend.norm(x)[source]

faknow.model.content_based.mdfend

class faknow.model.content_based.mdfend.MDFEND(pre_trained_bert_name: str, domain_num: int, mlp_dims: List[int] | None = None, dropout_rate=0.2, expert_num=5)[source]

Bases: AbstractModel

MDFEND: Multi-domain Fake News Detection, CIKM 2021 paper: https://dl.acm.org/doi/10.1145/3459637.3482139 code: https://github.com/kennqiang/MDFEND-Weibo21

__init__(pre_trained_bert_name: str, domain_num: int, mlp_dims: List[int] | None = None, dropout_rate=0.2, expert_num=5)[source]
Parameters:
  • pre_trained_bert_name (str) – the name or local path of pre-trained bert model

  • domain_num (int) – total number of all domains

  • mlp_dims (List[int]) – a list of the dimensions in MLP layer; if None, [384] will be taken as default

  • dropout_rate (float) – rate of Dropout layer, default=0.2

  • expert_num (int) – number of experts (TextCNNLayer instances), default=5
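
A minimal construction sketch; 'bert-base-uncased' is a placeholder, and domain_num=9 is only illustrative (it should match the number of domains in your data).

    from faknow.model.content_based.mdfend import MDFEND

    model = MDFEND(
        pre_trained_bert_name='bert-base-uncased',
        domain_num=9,
        mlp_dims=None,     # falls back to [384]
        dropout_rate=0.2,
        expert_num=5,
    )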

calculate_loss(data) Tensor[source]

calculate loss via BCELoss

Parameters:

data (dict) – batch data dict

Returns:

loss value

Return type:

loss (Tensor)

forward(token_id: Tensor, mask: Tensor, domain: Tensor) Tensor[source]
Parameters:
  • token_id (Tensor) – token ids from bert tokenizer, shape=(batch_size, max_len)

  • mask (Tensor) – mask from bert tokenizer, shape=(batch_size, max_len)

  • domain (Tensor) – domain id, shape=(batch_size,)

Returns:

the prediction of being fake, shape=(batch_size,)

Return type:

FloatTensor
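
A forward-pass sketch with dummy tensors; vocabulary size, batch size and sequence length are placeholders.

    import torch

    from faknow.model.content_based.mdfend import MDFEND

    model = MDFEND('bert-base-uncased', domain_num=9)

    batch_size, max_len = 4, 170
    token_id = torch.randint(0, 30000, (batch_size, max_len))
    mask = torch.ones(batch_size, max_len, dtype=torch.long)
    domain = torch.randint(0, 9, (batch_size,))

    output = model(token_id, mask, domain)  # shape=(batch_size,)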

predict(data_without_label) Tensor[source]

predict the probability of being fake news

Parameters:

data_without_label (Dict[str, Any]) – batch data dict

Returns:

two-class probability, shape=(batch_size, 2)

Return type:

Tensor

training: bool

faknow.model.content_based.textcnn

class faknow.model.content_based.textcnn.TextCNN(word_vectors: ~torch.Tensor, filter_num=100, kernel_sizes: ~typing.List[int] | None = None, activate_func: ~typing.Callable | None = <function relu>, dropout=0.5, freeze=False)[source]

Bases: AbstractModel

Convolutional Neural Networks for Sentence Classification, EMNLP 2014 paper: https://aclanthology.org/D14-1181/ code: https://github.com/yoonkim/CNN_sentence

__init__(word_vectors: ~torch.Tensor, filter_num=100, kernel_sizes: ~typing.List[int] | None = None, activate_func: ~typing.Callable | None = <function relu>, dropout=0.5, freeze=False)[source]
Parameters:
  • word_vectors (torch.Tensor) – weights of word embedding layer, shape=(vocab_size, embedding_size)

  • filter_num (int) – number of filters in conv layer. Default=100

  • kernel_sizes (List[int]) – a list of kernel sizes for TextCNNLayer. Default=[3, 4, 5]

  • activate_func (Callable) – activate function for TextCNNLayer. Default=relu

  • dropout (float) – drop out rate of fully connected layer. Default=0.5

  • freeze (bool) – whether to freeze weights in word embedding layer while training. Default=False
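
A construction sketch using random embeddings as a stand-in for pre-trained word vectors; the vocabulary and embedding sizes are placeholders.

    import torch

    from faknow.model.content_based.textcnn import TextCNN

    vocab_size, embedding_size = 5000, 300
    word_vectors = torch.randn(vocab_size, embedding_size)  # stand-in embeddings

    model = TextCNN(
        word_vectors,
        filter_num=100,
        kernel_sizes=None,  # falls back to [3, 4, 5]
        dropout=0.5,
        freeze=False,       # allow the embedding layer to be fine-tuned
    )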

calculate_loss(data) Tensor[source]

calculate loss via CrossEntropyLoss

Parameters:

data – batch data tuple

Returns:

loss

Return type:

torch.Tensor

forward(text: Tensor) Tensor[source]
Parameters:

text – batch data, shape=(batch_size, max_len)

Returns:

output, shape=(batch_size, 2)

Return type:

Tensor
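
A forward-pass sketch with dummy token indices, reusing the random-embedding construction from above.

    import torch

    from faknow.model.content_based.textcnn import TextCNN

    vocab_size, embedding_size = 5000, 300
    model = TextCNN(torch.randn(vocab_size, embedding_size))

    text = torch.randint(0, vocab_size, (4, 50))  # (batch_size, max_len) token ids
    output = model(text)                          # shape=(batch_size, 2)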

predict(data_without_label)[source]

predict the probability of being fake news

Parameters:

data_without_label – batch data

Returns:

softmax probability, shape=(batch_size, 2)

Return type:

Tensor

training: bool