faknow.run.social_context
faknow.run.social_context.run_bigcn
- faknow.run.social_context.run_bigcn.run_bigcn(train_data: List, data_path: str, tree_dic: Dict, val_data: List | None = None, test_data: List | None = None, batch_size=128, epochs=200, feature_size=5000, hidden_size=64, output_size=64, td_drop_rate=0.2, bu_drop_rate=0.2, lower=2, upper=100000, lr=0.0005, weight_decay=0.0001, metrics: List | None = None, device='cpu')[source]
run BiGCN, including training, validation and testing. If validation and testing data are not provided, only training is performed.
- Parameters:
train_data (List) – index list of training nodes.
tree_dic (Dict) – the dictionary of graph edges.
data_path (str) – path of data doc.
val_data (Optional[List]) – index list of validation nodes, default=None
test_data (Optional[List]) – index list of test nodes, default=None
batch_size (int) – batch size. default=128.
epochs (int) – epoch num. default=200.
feature_size (int) – the feature size of input. default=5000.
hidden_size (int) – the feature size of hidden embedding in RumorGCN. default=64.
output_size (int) – the feature size of output embedding in RumorGCN. default=64.
td_drop_rate (float) – the dropout rate of TDgraph. default=0.2.
bu_drop_rate (float) – the dropout rate of BUgraph. default=0.2.
lower (int) – the minimum of graph size. default=2.
upper (int) – the maximum of graph size. default=100000.
lr (float) – learning rate. default=0.0005.
weight_decay (float) – weight decay. default=0.0001.
metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
device (str) – device. default=’cpu’.
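A minimal invocation sketch for run_bigcn. The event ids, split sizes, and the contents of tree_dic below are hypothetical stand-ins; the actual tree_dic schema depends on the dataset loader, so the faknow call is left commented out.

```python
import random

# Toy event ids standing in for real dataset node indices (hypothetical).
event_ids = [f"event_{i}" for i in range(10)]
random.seed(0)
random.shuffle(event_ids)

# Index lists for training/validation/testing, as run_bigcn expects.
train_data = event_ids[:6]
val_data = event_ids[6:8]
test_data = event_ids[8:]

# tree_dic maps each event to its propagation-tree edges; empty dicts
# are placeholders for whatever structure the dataset loader produces.
tree_dic = {eid: {} for eid in event_ids}

# from faknow.run.social_context.run_bigcn import run_bigcn
# run_bigcn(train_data, data_path="./data", tree_dic=tree_dic,
#           val_data=val_data, test_data=test_data, device="cpu")
```

Since val_data and test_data are optional, omitting them runs training only, as noted above.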
faknow.run.social_context.run_dudef
faknow.run.social_context.run_ebgcn
- faknow.run.social_context.run_ebgcn.run_ebgcn(train_data: List, data_path: str, tree_dic: Dict, val_data: List | None = None, test_data: List | None = None, batch_size=128, input_size=5000, hidden_size=64, output_size=64, edge_num=2, dropout=0.5, num_class=4, edge_loss_weight=0.2, lr=0.0005, weight_decay=0.1, lr_scale_bu=5, lr_scale_td=1, metrics=None, num_epochs=200, device='cpu')[source]
run EBGCN, including training, validation and testing. If validation and testing data are not provided, only training is performed.
- Parameters:
train_data (List) – index list of training nodes.
tree_dic (Dict) – the dictionary of graph edges.
data_path (str) – path of data doc.
val_data (Optional[List]) – index list of validation nodes, default=None
test_data (Optional[List]) – index list of test nodes, default=None
batch_size (int) – batch size. default=128.
input_size (int) – the feature size of input. default=5000.
hidden_size (int) – the feature size of hidden embedding. default=64.
output_size (int) – the feature size of output embedding. default=64.
edge_num (int) – the num of edge type. default=2.
dropout (float) – dropout rate. default=0.5.
num_class (int) – the num of output classes. default=4.
edge_loss_weight (float) – the weight of edge loss. default=0.2.
lr (float) – learning rate. default=0.0005.
weight_decay (float) – weight decay. default=0.1.
lr_scale_bu (int) – learning rate scale for the bottom-up direction. default=5.
lr_scale_td (int) – learning rate scale for the top-down direction. default=1.
metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
num_epochs (int) – epoch num. default=200.
device (str) – device. default=’cpu’.
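A hypothetical keyword-argument sketch for run_ebgcn, grouping the settings documented above. train_data, data_path, and tree_dic are assumed to come from a prepared rumor-propagation dataset, so the call itself stays commented out.

```python
# Hypothetical hyper-parameters for run_ebgcn (values are the defaults
# listed above, shown explicitly for illustration).
ebgcn_kwargs = dict(
    batch_size=128,
    input_size=5000,       # input feature size
    hidden_size=64,
    output_size=64,
    edge_loss_weight=0.2,  # weight of the edge loss term
    lr=0.0005,
    lr_scale_bu=5,         # lr scale for the bottom-up direction
    lr_scale_td=1,         # lr scale for the top-down direction
    num_epochs=200,
    device="cpu",
)

# from faknow.run.social_context.run_ebgcn import run_ebgcn
# run_ebgcn(train_data, data_path, tree_dic, **ebgcn_kwargs)
```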
faknow.run.social_context.run_eddfn
- faknow.run.social_context.run_eddfn.run_eddfn(train_pool_input: Tensor, train_pool_label: Tensor, domain_embedding: Tensor, budget_size=0.8, num_h=10, batch_size=32, num_epochs=100, lr=0.02, metrics: List | None = None, device='cpu')[source]
run EDDFN
- Parameters:
train_pool_input (Tensor) – train pool input, shape=(train_pool_size, input_size)
train_pool_label (Tensor) – train pool label, shape=(train_pool_size, )
domain_embedding (Tensor) – domain embedding, shape=(train_pool_size, domain_size)
budget_size (float) – budget size, default=0.8
num_h (int) – number of hash functions, default=10
batch_size (int) – batch size, default=32
num_epochs (int) – number of epochs, default=100
lr (float) – learning rate, default=0.02
metrics (List) – evaluation metrics, if None, use default metrics, default=None
device (str) – device, default=’cpu’
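The tensor shape contract above can be made concrete with toy dimensions (all values here are hypothetical; only the shape relationships come from the docs):

```python
# Toy dimensions illustrating the expected shapes for run_eddfn.
train_pool_size, input_size, domain_size = 1000, 768, 9

expected_shapes = {
    "train_pool_input": (train_pool_size, input_size),
    "train_pool_label": (train_pool_size,),
    "domain_embedding": (train_pool_size, domain_size),
}

# from faknow.run.social_context.run_eddfn import run_eddfn
# run_eddfn(train_pool_input, train_pool_label, domain_embedding,
#           budget_size=0.8, num_h=10, device="cpu")
```

Note that all three tensors share the same leading dimension, train_pool_size.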
faknow.run.social_context.run_fang
- faknow.run.social_context.run_fang.run_fang(data_root: str, metrics=None, lr=0.001, weight_decay=0.0005, batch_size=32, num_epochs=100, input_size=100, embedding_size=16, num_stance=4, num_stance_hidden=4, timestamp_size=2, num_classes=2, dropout=0.0, device='cpu')[source]
run FANG, including training, validation and testing.
- Parameters:
data_root (str) – the data path, including entities.txt, entity_features.tsv, source_citation.tsv, source_publication.tsv, user_relationships.tsv, news_info.tsv, report.tsv, support_neutral.tsv, support_negative.tsv, deny.tsv. For an example, refer to dataset/example/FANG.
metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
lr (float) – learning rate. default=0.001.
weight_decay (float) – weight decay. default=0.0005.
batch_size (int) – batch size. default=32.
num_epochs (int) – epoch num. default=100.
input_size (int) – embedding size of raw feature. default=100.
embedding_size (int) – graphsage output size. default=16.
num_stance (int) – total num of stance. default=4.
num_stance_hidden (int) – stance embedding size; num_stance_hidden * num_stance must equal embedding_size. default=4.
timestamp_size (int) – timestamp embedding size. default=2.
num_classes (int) – label num. default=2.
dropout (float) – dropout rate. default=0.0.
device (str) – compute device. default=’cpu’.
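Since run_fang reads a fixed set of files from data_root, a small pre-flight check can catch a mislaid file before training starts. This helper is not part of faknow; it only reuses the file list documented above.

```python
import os

# Files run_fang expects under data_root, per the docs above.
REQUIRED_FILES = [
    "entities.txt", "entity_features.tsv", "source_citation.tsv",
    "source_publication.tsv", "user_relationships.tsv", "news_info.tsv",
    "report.tsv", "support_neutral.tsv", "support_negative.tsv", "deny.tsv",
]

def missing_fang_files(data_root: str) -> list:
    """Return the expected FANG input files absent from data_root."""
    return [f for f in REQUIRED_FILES
            if not os.path.exists(os.path.join(data_root, f))]

# from faknow.run.social_context.run_fang import run_fang
# assert not missing_fang_files("./dataset/example/FANG")
# run_fang("./dataset/example/FANG", device="cpu")
```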
faknow.run.social_context.run_gcnfn
- faknow.run.social_context.run_gcnfn.run_gcnfn(root: str, name: str, feature='content', splits=None, batch_size=128, epochs=100, hidden_size=128, lr=0.001, weight_decay=0.01, metrics: List | None = None, device='cpu')[source]
run GCNFN using UPFD dataset, including training, validation and testing. If validation and testing data are not provided, only training is performed.
- Parameters:
root (str) – Root directory where the dataset should be saved
name (str) – The name of the graph set ("politifact", "gossipcop").
feature (str) – The node feature type ("profile", "spacy", "bert", "content"). If set to "profile", the 10-dimensional node feature is composed of ten Twitter user profile attributes. If set to "spacy", the 300-dimensional node feature is composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. If set to "bert", the 768-dimensional node feature is composed of Twitter user historical tweets encoded by the bert-as-service. If set to "content", the 310-dimensional node feature is composed of a 300-dimensional "spacy" vector plus a 10-dimensional "profile" vector. default=’content’.
splits (List[str]) – dataset split, including ‘train’, ‘val’ and ‘test’. If None, [‘train’, ‘val’, ‘test’] will be used. default=None.
batch_size (int) – batch size, default=128
epochs (int) – number of epochs, default=100
hidden_size (int) – dimension of hidden layer, default=128
lr (float) – learning rate, default=0.001
weight_decay (float) – weight decay, default=0.01
metrics (List) – evaluation metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
device (str) – device, default=’cpu’
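The feature dimensions listed above can be summarized in a lookup table; the invocation below is a hypothetical sketch with placeholder paths, assuming a local copy of the UPFD dataset.

```python
# UPFD node-feature types and their dimensions, per the docs above.
FEATURE_DIMS = {
    "profile": 10,   # ten Twitter user profile attributes
    "spacy": 300,    # historical tweets via the spaCy word2vec encoder
    "bert": 768,     # historical tweets via bert-as-service
    "content": 310,  # 300-d "spacy" plus 10-d "profile"
}

# from faknow.run.social_context.run_gcnfn import run_gcnfn
# run_gcnfn(root="./data/UPFD", name="politifact", feature="content",
#           batch_size=128, epochs=100, device="cpu")
```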
faknow.run.social_context.run_gnncl
- faknow.run.social_context.run_gnncl.run_gnncl(root: str, name: str, feature='profile', splits=None, batch_size=128, max_nodes=500, lr=0.001, epochs=60, metrics: List | None = None, device='cpu')[source]
run GNNCL using UPFD dataset, including training, validation and testing. If validation and testing data are not provided, only training is performed.
- Parameters:
root (str) – Root directory where the dataset should be saved
name (str) – The name of the graph set ("politifact", "gossipcop").
feature (str) – The node feature type ("profile", "spacy", "bert", "content"). If set to "profile", the 10-dimensional node feature is composed of ten Twitter user profile attributes. If set to "spacy", the 300-dimensional node feature is composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. If set to "bert", the 768-dimensional node feature is composed of Twitter user historical tweets encoded by the bert-as-service. If set to "content", the 310-dimensional node feature is composed of a 300-dimensional "spacy" vector plus a 10-dimensional "profile" vector. default=’profile’.
splits (List[str]) – dataset split, including ‘train’, ‘val’ and ‘test’. If None, [‘train’, ‘val’, ‘test’] will be used. default=None.
batch_size (int) – batch size, default=128
max_nodes (int) – max number of nodes, default=500
lr (float) – learning rate, default=0.001
epochs (int) – number of epochs, default=60
metrics (List) – evaluation metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
device (str) – device, default=’cpu’
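A hypothetical argument sketch for run_gnncl; root is a placeholder path, and the comment on max_nodes is an assumption about how a per-graph node cap is typically used.

```python
# Hypothetical arguments for run_gnncl (values are the defaults above).
gnncl_kwargs = dict(
    root="./data/UPFD",
    name="gossipcop",    # or "politifact"
    feature="profile",   # 10-d user-profile node features (the default)
    max_nodes=500,       # assumed upper bound on nodes per graph
    batch_size=128,
    epochs=60,
    device="cpu",
)

# from faknow.run.social_context.run_gnncl import run_gnncl
# run_gnncl(**gnncl_kwargs)
```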
faknow.run.social_context.run_trustrd
- faknow.run.social_context.run_trustrd.run_trustrd(train_data: List, data_path: str, tree_dic: Dict, val_data: List | None = None, test_data: List | None = None, sigma_m=0.1, eta=0.4, zeta=0.02, drop_rate=0.4, input_feature=192, hidden_feature=64, num_classes=4, batch_size=128, pre_train_epoch=25, epochs=200, net_hidden_dim=64, net_gcn_layers=3, lr=0.0005, weight_decay=0.0001, metrics: List | None = None, device='cpu')[source]
run TrustRD, including training, validation and testing. If validation and testing data are not provided, only training is performed.
- Parameters:
train_data (List) – index list of training nodes.
tree_dic (Dict) – the dictionary of graph edges.
data_path (str) – path of data doc.
val_data (Optional[List]) – index list of validation nodes, default=None
test_data (Optional[List]) – index list of test nodes, default=None
sigma_m (float) – data perturbation Standard Deviation. default=0.1
eta (float) – data perturbation weight. default=0.4.
zeta (float) – parameter perturbation weight. default=0.02
drop_rate (float) – drop rate of edge. default=0.4.
input_feature (int) – the feature size of input. default=192.
hidden_feature (int) – the feature size of hidden embedding. default=64.
num_classes (int) – the num of class. default=4.
batch_size (int) – batch size. default=128.
pre_train_epoch (int) – pretrained epoch num. default=25.
epochs (int) – epoch num. default=200.
net_hidden_dim (int) – the feature size of hidden embedding. default=64.
net_gcn_layers (int) – the gcn encoder layer num. default=3.
lr (float) – learning rate. default=0.0005.
weight_decay (float) – weight decay. default=1e-4.
metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
device (str) – device. default=’cpu’.
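A hypothetical hyper-parameter sketch for run_trustrd, grouping the perturbation-related settings documented above; train_data, data_path, and tree_dic are assumed to come from a prepared dataset.

```python
# Hypothetical hyper-parameters for run_trustrd (defaults shown above).
trustrd_kwargs = dict(
    sigma_m=0.1,          # standard deviation of the data perturbation
    eta=0.4,              # data perturbation weight
    zeta=0.02,            # parameter perturbation weight
    drop_rate=0.4,        # edge drop rate
    pre_train_epoch=25,   # pre-training epochs before the main loop
    epochs=200,
    device="cpu",
)

# from faknow.run.social_context.run_trustrd import run_trustrd
# run_trustrd(train_data, data_path, tree_dic, **trustrd_kwargs)
```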
faknow.run.social_context.run_upfd
- faknow.run.social_context.run_upfd.run_upfd(root: str, name: str, feature='bert', splits=None, base_model='sage', batch_size=128, epochs=30, lr=0.01, weight_decay=0.01, metrics: List | None = None, device='cpu')[source]
run UPFD using UPFD dataset, including training, validation and testing. If validation and testing data are not provided, only training is performed.
- Parameters:
root (str) – Root directory where the dataset should be saved
name (str) – The name of the graph set ("politifact", "gossipcop").
feature (str) – The node feature type ("profile", "spacy", "bert", "content"). If set to "profile", the 10-dimensional node feature is composed of ten Twitter user profile attributes. If set to "spacy", the 300-dimensional node feature is composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. If set to "bert", the 768-dimensional node feature is composed of Twitter user historical tweets encoded by the bert-as-service. If set to "content", the 310-dimensional node feature is composed of a 300-dimensional "spacy" vector plus a 10-dimensional "profile" vector. default=’bert’.
splits (List[str]) – dataset split, including ‘train’, ‘val’ and ‘test’. If None, [‘train’, ‘val’, ‘test’] will be used, default=None.
base_model (str) – base model for UPFD, including ‘sage’, ‘gcn’, ‘gat’, ‘gcnfn’, default=’sage’
batch_size (int) – batch size, default=128
epochs (int) – number of epochs, default=30
lr (float) – learning rate, default=0.01
weight_decay (float) – weight decay, default=0.01
metrics (List) – evaluation metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None
device (str) – device, default=’cpu’
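A hypothetical invocation sketch for run_upfd; root is a placeholder for a local copy of the UPFD dataset, and the base-model tuple simply restates the choices documented above.

```python
# Valid base models for run_upfd, per the docs above.
BASE_MODELS = ("sage", "gcn", "gat", "gcnfn")

# Hypothetical invocation arguments (root is a placeholder path).
upfd_kwargs = dict(
    root="./data/UPFD",
    name="politifact",
    feature="bert",       # 768-d node features (the default)
    base_model="sage",
    epochs=30,
)
assert upfd_kwargs["base_model"] in BASE_MODELS

# from faknow.run.social_context.run_upfd import run_upfd
# run_upfd(**upfd_kwargs)
```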