faknow.run.social_context

faknow.run.social_context.run_bigcn

faknow.run.social_context.run_bigcn.run_bigcn(train_data: List, data_path: str, tree_dic: Dict, val_data: List | None = None, test_data: List | None = None, batch_size=128, epochs=200, feature_size=5000, hidden_size=64, output_size=64, td_drop_rate=0.2, bu_drop_rate=0.2, lower=2, upper=100000, lr=0.0005, weight_decay=0.0001, metrics: List | None = None, device='cpu')[source]

run BiGCN, including training, validation and testing. If validation and testing data are not provided, only training is performed.

Parameters:
  • train_data (List) – index list of training nodes.

  • tree_dic (Dict) – the dictionary of graph edges.

  • data_path (str) – path of data doc.

  • val_data (Optional[List]) – index list of validation nodes, default=None

  • test_data (Optional[List]) – index list of test nodes, default=None

  • batch_size (int) – batch size. default=128.

  • epochs (int) – epoch num. default=200.

  • feature_size (int) – the feature size of input. default=5000.

  • hidden_size (int) – the feature size of hidden embedding in RumorGCN. default=64.

  • output_size (int) – the feature size of output embedding in RumorGCN. default=64.

  • td_drop_rate (float) – the dropout rate of TDgraph. default=0.2.

  • bu_drop_rate (float) – the dropout rate of BUgraph. default=0.2.

  • lower (int) – the minimum graph size. default=2.

  • upper (int) – the maximum graph size. default=100000.

  • lr (float) – learning rate. default=0.0005.

  • weight_decay (float) – weight decay. default=0.0001.

  • metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • device (str) – device. default=’cpu’.
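A minimal usage sketch, assuming a Twitter15-style dataset; the index lists, data path, and tree dictionary below are illustrative, not part of the API:

```python
# Hypothetical inputs: event-id index lists for the train/val/test split.
train_data = [str(i) for i in range(800)]
val_data = [str(i) for i in range(800, 900)]
test_data = [str(i) for i in range(900, 1000)]

def main(tree_dic):
    # tree_dic: the graph-edge dictionary loaded from the dataset.
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_bigcn import run_bigcn
    run_bigcn(train_data=train_data,
              data_path='dataset/Twitter15',  # assumed dataset location
              tree_dic=tree_dic,
              val_data=val_data,
              test_data=test_data,
              batch_size=128,
              epochs=200,
              device='cpu')
```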

faknow.run.social_context.run_bigcn.run_bigcn_from_yaml(path: str)[source]

run BiGCN from yaml config file

Parameters:

path (str) – yaml config file path
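For illustration, a hypothetical config file; the key names are assumed to mirror run_bigcn's keyword arguments, so check the shipped example configs before relying on them:

```yaml
# bigcn.yaml (hypothetical) -- keys assumed to match run_bigcn's parameters
data_path: dataset/Twitter15
batch_size: 128
epochs: 200
feature_size: 5000
hidden_size: 64
output_size: 64
td_drop_rate: 0.2
bu_drop_rate: 0.2
lr: 0.0005
weight_decay: 0.0001
device: cpu
```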

faknow.run.social_context.run_dudef

faknow.run.social_context.run_ebgcn

faknow.run.social_context.run_ebgcn.run_ebgcn(train_data: List, data_path: str, tree_dic: Dict, val_data: List | None = None, test_data: List | None = None, batch_size=128, input_size=5000, hidden_size=64, output_size=64, edge_num=2, dropout=0.5, num_class=4, edge_loss_weight=0.2, lr=0.0005, weight_decay=0.1, lr_scale_bu=5, lr_scale_td=1, metrics=None, num_epochs=200, device='cpu')[source]

run EBGCN, including training, validation and testing. If val_data and test_data are None, only training is performed.

Parameters:
  • train_data (List) – index list of training nodes.

  • tree_dic (Dict) – the dictionary of graph edges.

  • data_path (str) – path of data doc.

  • val_data (Optional[List]) – index list of validation nodes, default=None

  • test_data (Optional[List]) – index list of test nodes, default=None

  • batch_size (int) – batch size. default=128.

  • input_size (int) – the feature size of input. default=5000.

  • hidden_size (int) – the feature size of hidden embedding. default=64.

  • output_size (int) – the feature size of output embedding. default=64.

  • edge_num (int) – the number of edge types. default=2.

  • dropout (float) – dropout rate. default=0.5.

  • num_class (int) – the number of output classes. default=4.

  • edge_loss_weight (float) – the weight of edge loss. default=0.2.

  • lr (float) – learning rate. default=0.0005.

  • weight_decay (float) – weight decay. default=0.1.

  • lr_scale_bu (int) – learning rate scale for the bottom-up direction. default=5.

  • lr_scale_td (int) – learning rate scale for the top-down direction. default=1.

  • metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • num_epochs (int) – epoch num. default=200.

  • device (str) – device. default=’cpu’.
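The calling pattern mirrors run_bigcn; a sketch with assumed inputs, highlighting the EBGCN-specific knobs:

```python
# Hypothetical event-id index lists; tree_dic comes from the dataset.
train_data = [str(i) for i in range(700)]
val_data = [str(i) for i in range(700, 800)]

def main(tree_dic):
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_ebgcn import run_ebgcn
    run_ebgcn(train_data=train_data,
              data_path='dataset/Twitter16',  # assumed dataset location
              tree_dic=tree_dic,
              val_data=val_data,
              edge_loss_weight=0.2,  # weight of the edge loss term
              lr_scale_bu=5,         # larger LR scale for the bottom-up pass
              lr_scale_td=1,
              device='cpu')
```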

faknow.run.social_context.run_ebgcn.run_ebgcn_from_yaml(path: str)[source]

run EBGCN from yaml config file

Parameters:

path (str) – yaml config file path

faknow.run.social_context.run_eddfn

faknow.run.social_context.run_eddfn.run_eddfn(train_pool_input: Tensor, train_pool_label: Tensor, domain_embedding: Tensor, budget_size=0.8, num_h=10, batch_size=32, num_epochs=100, lr=0.02, metrics: List | None = None, device='cpu')[source]

run EDDFN

Parameters:
  • train_pool_input (Tensor) – train pool input, shape=(train_pool_size, input_size)

  • train_pool_label (Tensor) – train pool label, shape=(train_pool_size, )

  • domain_embedding (Tensor) – domain embedding, shape=(train_pool_size, domain_size)

  • budget_size (float) – budget size, default=0.8

  • num_h (int) – number of hash functions, default=10

  • batch_size (int) – batch size, default=32

  • num_epochs (int) – number of epochs, default=100

  • lr (float) – learning rate, default=0.02

  • metrics (List) – evaluation metrics, if None, use default metrics, default=None

  • device (str) – device, default=’cpu’
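A sketch with randomly generated tensors of the documented shapes (the sizes are assumptions):

```python
# Hypothetical sizes for the documented tensor shapes.
TRAIN_POOL_SIZE, INPUT_SIZE, DOMAIN_SIZE = 1000, 300, 8

def main():
    # Imports deferred so the sketch stays self-contained.
    import torch
    from faknow.run.social_context.run_eddfn import run_eddfn
    train_pool_input = torch.randn(TRAIN_POOL_SIZE, INPUT_SIZE)   # (pool, input)
    train_pool_label = torch.randint(0, 2, (TRAIN_POOL_SIZE,))    # (pool,)
    domain_embedding = torch.randn(TRAIN_POOL_SIZE, DOMAIN_SIZE)  # (pool, domain)
    run_eddfn(train_pool_input, train_pool_label, domain_embedding,
              budget_size=0.8, batch_size=32, device='cpu')
```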

faknow.run.social_context.run_eddfn.run_eddfn_from_yaml(path: str)[source]

run EDDFN from yaml config file

Parameters:

path (str) – yaml config file path

faknow.run.social_context.run_fang

faknow.run.social_context.run_fang.run_fang(data_root: str, metrics=None, lr=0.001, weight_decay=0.0005, batch_size=32, num_epochs=100, input_size=100, embedding_size=16, num_stance=4, num_stance_hidden=4, timestamp_size=2, num_classes=2, dropout=0.0, device='cpu')[source]

run FANG, including training, validation and testing.

Parameters:
  • data_root (str) – the data path. including entities.txt, entity_features.tsv, source_citation.tsv, source_publication.tsv, user_relationships.tsv, news_info.tsv, report.tsv, support_neutral.tsv, support_negative.tsv, deny.tsv. example referring to dataset/example/FANG.

  • metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • lr (float) – learning rate. default=0.001.

  • weight_decay (float) – weight decay. default=5e-4.

  • batch_size (int) – batch size. default=32.

  • num_epochs (int) – epoch num. default=100.

  • input_size (int) – embedding size of raw feature. default=100.

  • embedding_size (int) – graphsage output size. default=16.

  • num_stance (int) – total number of stances. default=4.

  • num_stance_hidden (int) – the stance embedding size; num_stance_hidden * num_stance must equal embedding_size. default=4.

  • timestamp_size (int) – timestamp embedding size. default=2.

  • num_classes (int) – the number of classes. default=2.

  • dropout (float) – dropout rate. default=0.0.

  • device (str) – compute device. default=’cpu’.
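run_fang only takes the dataset directory; a sketch listing the files the reference says data_root must contain (the directory path is an assumption):

```python
# Files required under data_root, as listed above.
REQUIRED_FILES = [
    'entities.txt', 'entity_features.tsv', 'source_citation.tsv',
    'source_publication.tsv', 'user_relationships.tsv', 'news_info.tsv',
    'report.tsv', 'support_neutral.tsv', 'support_negative.tsv', 'deny.tsv',
]

def main():
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_fang import run_fang
    # num_stance_hidden * num_stance must equal embedding_size (4 * 4 = 16).
    run_fang(data_root='dataset/example/FANG',
             embedding_size=16, num_stance=4, num_stance_hidden=4,
             device='cpu')
```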

faknow.run.social_context.run_fang.run_fang_from_yaml(path: str)[source]

run FANG from yaml config file

Parameters:

path (str) – yaml config file path

faknow.run.social_context.run_gcnfn

faknow.run.social_context.run_gcnfn.run_gcnfn(root: str, name: str, feature='content', splits=None, batch_size=128, epochs=100, hidden_size=128, lr=0.001, weight_decay=0.01, metrics: List | None = None, device='cpu')[source]

run GCNFN using UPFD dataset, including training, validation and testing. If validation and testing data are not provided, only training is performed.

Parameters:
  • root (str) – Root directory where the dataset should be saved

  • name (str) – The name of the graph set ("politifact", "gossipcop")

  • feature (str) – The node feature type, one of "profile", "spacy", "bert" or "content". "profile": a 10-dimensional feature composed of ten Twitter user profile attributes. "spacy": a 300-dimensional feature composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. "bert": a 768-dimensional feature composed of Twitter user historical tweets encoded by bert-as-service. "content": a 310-dimensional feature composed of the 300-dimensional "spacy" vector concatenated with the 10-dimensional "profile" vector. default=’content’

  • splits (List[str]) – dataset split, including ‘train’, ‘val’ and ‘test’. If None, [‘train’, ‘val’, ‘test’] will be used. Default=None

  • batch_size (int) – batch size, default=128

  • epochs (int) – number of epochs, default=100

  • hidden_size (int) – dimension of hidden layer, default=128

  • lr (float) – learning rate, default=0.001

  • weight_decay (float) – weight decay, default=0.01

  • metrics (List) – evaluation metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • device (str) – device, default=’cpu’
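A usage sketch (the root path is an assumption); the documented feature dimensionalities imply the consistency check encoded below:

```python
# Node-feature dimensions as documented for the UPFD feature types.
FEATURE_DIMS = {'profile': 10, 'spacy': 300, 'bert': 768, 'content': 310}

def main():
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_gcnfn import run_gcnfn
    run_gcnfn(root='data/UPFD',          # assumed dataset directory
              name='politifact',         # or 'gossipcop'
              feature='content',         # 310-d = 300-d spacy + 10-d profile
              splits=['train', 'val', 'test'],
              device='cpu')
```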

faknow.run.social_context.run_gcnfn.run_gcnfn_from_yaml(path: str)[source]

run GCNFN from yaml config file

Parameters:

path (str) – yaml config file path

faknow.run.social_context.run_gnncl

faknow.run.social_context.run_gnncl.run_gnncl(root: str, name: str, feature='profile', splits=None, batch_size=128, max_nodes=500, lr=0.001, epochs=60, metrics: List | None = None, device='cpu')[source]

run GNNCL using UPFD dataset, including training, validation and testing. If validation and testing data are not provided, only training is performed.

Parameters:
  • root (str) – Root directory where the dataset should be saved

  • name (str) – The name of the graph set ("politifact", "gossipcop")

  • feature (str) – The node feature type, one of "profile", "spacy", "bert" or "content". "profile": a 10-dimensional feature composed of ten Twitter user profile attributes. "spacy": a 300-dimensional feature composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. "bert": a 768-dimensional feature composed of Twitter user historical tweets encoded by bert-as-service. "content": a 310-dimensional feature composed of the 300-dimensional "spacy" vector concatenated with the 10-dimensional "profile" vector. default=’profile’

  • splits (List[str]) – dataset split, including ‘train’, ‘val’ and ‘test’. If None, [‘train’, ‘val’, ‘test’] will be used. Default=None

  • batch_size (int) – batch size, default=128

  • max_nodes (int) – max number of nodes, default=500

  • lr (float) – learning rate, default=0.001

  • epochs (int) – number of epochs, default=60

  • metrics (List) – evaluation metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • device (str) – device, default=’cpu’
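The dataset interface is the same as run_gcnfn; a sketch showing the GNNCL-specific max_nodes cap (paths are assumptions):

```python
def main():
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_gnncl import run_gnncl
    run_gnncl(root='data/UPFD',     # assumed dataset directory
              name='gossipcop',     # or 'politifact'
              feature='profile',    # GNNCL defaults to the 10-d profile features
              max_nodes=500,        # upper bound on nodes per graph
              epochs=60,
              device='cpu')
```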

faknow.run.social_context.run_gnncl.run_gnncl_from_yaml(path: str)[source]

run GNNCL from yaml config file

Parameters:

path (str) – yaml config file path

faknow.run.social_context.run_trustrd

faknow.run.social_context.run_trustrd.run_trustrd(train_data: List, data_path: str, tree_dic: Dict, val_data: List | None = None, test_data: List | None = None, sigma_m=0.1, eta=0.4, zeta=0.02, drop_rate=0.4, input_feature=192, hidden_feature=64, num_classes=4, batch_size=128, pre_train_epoch=25, epochs=200, net_hidden_dim=64, net_gcn_layers=3, lr=0.0005, weight_decay=0.0001, metrics: List | None = None, device='cpu')[source]

run TrustRD, including training, validation and testing. If validation and testing data are not provided, only training is performed.

Parameters:
  • train_data (List) – index list of training nodes.

  • tree_dic (Dict) – the dictionary of graph edges.

  • data_path (str) – path of data doc.

  • val_data (Optional[List]) – index list of validation nodes, default=None

  • test_data (Optional[List]) – index list of test nodes, default=None

  • sigma_m (float) – data perturbation standard deviation. default=0.1.

  • eta (float) – data perturbation weight. default=0.4.

  • zeta (float) – parameter perturbation weight. default=0.02.

  • drop_rate (float) – drop rate of edge. default=0.4.

  • input_feature (int) – the feature size of input. default=192.

  • hidden_feature (int) – the feature size of hidden embedding. default=64.

  • num_classes (int) – the number of classes. default=4.

  • batch_size (int) – batch size. default=128.

  • pre_train_epoch (int) – number of pre-training epochs. default=25.

  • epochs (int) – epoch num. default=200.

  • net_hidden_dim (int) – the feature size of hidden embedding. default=64.

  • net_gcn_layers (int) – the number of GCN encoder layers. default=3.

  • lr (float) – learning rate. default=0.0005.

  • weight_decay (float) – weight decay. default=1e-4.

  • metrics (List) – metrics for evaluation, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • device (str) – device. default=’cpu’.
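The data interface matches run_bigcn; a sketch with assumed inputs, including the self-supervised pre-training stage:

```python
# Hypothetical event-id index lists; tree_dic comes from the dataset.
train_data = [str(i) for i in range(800)]
test_data = [str(i) for i in range(800, 1000)]

def main(tree_dic):
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_trustrd import run_trustrd
    run_trustrd(train_data=train_data,
                data_path='dataset/Weibo',   # assumed dataset location
                tree_dic=tree_dic,
                test_data=test_data,
                pre_train_epoch=25,  # self-supervised pre-training epochs
                epochs=200,
                device='cpu')
```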

faknow.run.social_context.run_trustrd.run_trustrd_from_yaml(path: str)[source]

run TrustRD from yaml config file

Parameters:

path (str) – yaml config file path

faknow.run.social_context.run_upfd

faknow.run.social_context.run_upfd.run_upfd(root: str, name: str, feature='bert', splits=None, base_model='sage', batch_size=128, epochs=30, lr=0.01, weight_decay=0.01, metrics: List | None = None, device='cpu')[source]

run UPFD using UPFD dataset, including training, validation and testing. If validation and testing data are not provided, only training is performed.

Parameters:
  • root (str) – Root directory where the dataset should be saved

  • name (str) – The name of the graph set ("politifact", "gossipcop")

  • feature (str) – The node feature type, one of "profile", "spacy", "bert" or "content". "profile": a 10-dimensional feature composed of ten Twitter user profile attributes. "spacy": a 300-dimensional feature composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. "bert": a 768-dimensional feature composed of Twitter user historical tweets encoded by bert-as-service. "content": a 310-dimensional feature composed of the 300-dimensional "spacy" vector concatenated with the 10-dimensional "profile" vector. default=’bert’

  • splits (List[str]) – dataset split, including ‘train’, ‘val’ and ‘test’. If None, [‘train’, ‘val’, ‘test’] will be used, default=None

  • base_model (str) – base model for UPFD, including ‘sage’, ‘gcn’, ‘gat’, ‘gcnfn’, default=’sage’

  • batch_size (int) – batch size, default=128

  • epochs (int) – number of epochs, default=30

  • lr (float) – learning rate, default=0.01

  • weight_decay (float) – weight decay, default=0.01

  • metrics (List) – evaluation metrics, if None, [‘accuracy’, ‘precision’, ‘recall’, ‘f1’] is used, default=None

  • device (str) – device, default=’cpu’
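A sketch over the documented base-model choices (the root path is an assumption):

```python
# Base encoders run_upfd accepts, as listed above.
BASE_MODELS = ('sage', 'gcn', 'gat', 'gcnfn')

def main(base_model='sage'):
    assert base_model in BASE_MODELS
    # Import deferred so the sketch stays self-contained.
    from faknow.run.social_context.run_upfd import run_upfd
    run_upfd(root='data/UPFD',      # assumed dataset directory
             name='politifact',     # or 'gossipcop'
             feature='bert',        # 768-d BERT-encoded user tweets
             base_model=base_model,
             epochs=30,
             device='cpu')
```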

faknow.run.social_context.run_upfd.run_upfd_from_yaml(path: str)[source]

run UPFD from yaml config file

Parameters:

path (str) – yaml config file path