dgld.utils.sample

This module provides utilities for sampling subgraphs with random walks.

class dgld.utils.sample.BaseSubGraphSampling[source]

Bases: object

An abstract class for writing transforms on subgraph sampling.
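
As an illustration only, the sketch below shows how a callable sampler base class and a trivial subclass could be laid out. The names `SamplerSketch` and `RepeatTargetSampler` are hypothetical and are not part of dgld; the only behaviour taken from this page is that a sampler is constructed with a `length` and called as `sampler(g, start_nodes)`.

```python
from abc import ABC, abstractmethod


class SamplerSketch(ABC):
    """Hypothetical stand-in for a sampler base class (not the dgld code)."""

    def __init__(self, length=4):
        self.length = length  # number of nodes in each sampled subgraph

    @abstractmethod
    def __call__(self, g, start_nodes):
        """Return one list of `length` node ids per start node."""


class RepeatTargetSampler(SamplerSketch):
    """Trivial subclass: each subgraph is the target node repeated."""

    def __call__(self, g, start_nodes):
        # The graph `g` is ignored here; a real sampler would walk it.
        return [[node] * self.length for node in start_nodes]


sampler = RepeatTargetSampler()
print(sampler(None, [1, 5]))
```

Subclasses only need to implement `__call__`; the base class cannot be instantiated directly.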

class dgld.utils.sample.CoLASubGraphSampling(length=4)[source]

Bases: BaseSubGraphSampling

Implements the local subgraph sampling strategy described in [CoLA: Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning](https://arxiv.org/abs/2103.00113): “we adopt random walk with restart (RWR) as local subgraph sampling strategy due to its usability and efficiency. we fixed the size 𝑆 of the sampled subgraph (number of nodes in the subgraph) to 4. For isolated nodes or the nodes which belong to a community with a size smaller than the predetermined subgraph size, we sample the available nodes repeatedly until an overlapping subgraph with the set size is obtained.”

Parameters

length (int) – size of subgraph

Examples

>>> import dgl
>>> from dgld.utils.sample import CoLASubGraphSampling
>>> cola_sampler = CoLASubGraphSampling()
>>> g = dgl.graph(([0, 1, 2, 3, 6], [1, 2, 3, 4, 0]))
>>> g = dgl.add_reverse_edges(g)
>>> g = dgl.add_self_loop(g)
>>> ans = cola_sampler(g, [1, 2, 3, 5])
>>> print(ans)  # the walks are random, so the sampled nodes can differ between runs
[[1, 0, 2, 3], [2, 1, 0, 6], [3, 1, 2, 0], [5, 5, 5, 5]]

class dgld.utils.sample.SLGAD_SubGraphSampling(length=4)[source]

Bases: BaseSubGraphSampling

Samples subgraphs with the strategy described in [CoLA: Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning](https://arxiv.org/abs/2103.00113): “we adopt random walk with restart (RWR) as local subgraph sampling strategy due to its usability and efficiency. we fixed the size 𝑆 of the sampled subgraph (number of nodes in the subgraph) to 4. For isolated nodes or the nodes which belong to a community with a size smaller than the predetermined subgraph size, we sample the available nodes repeatedly until an overlapping subgraph with the set size is obtained.”

Parameters

length (int) – size of subgraph
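
The quoted strategy can be sketched in plain Python. This is a minimal illustration, not the dgld implementation: the graph is a hypothetical adjacency dict, and the function name `rwr_subgraph` and the parameters `max_steps` and `rng` are assumptions introduced for the example.

```python
import random


def rwr_subgraph(adj, target, size=4, restart_prob=0.5, max_steps=100, rng=None):
    """Collect `size` node ids around `target` by random walk with restart.

    adj: dict mapping a node id to a list of neighbour ids (assumed format).
    """
    rng = rng or random.Random()
    visited = [target]  # the target node always comes first
    current = target
    for _ in range(max_steps):
        if len(visited) >= size:
            break
        neighbours = adj.get(current, [])
        if not neighbours or rng.random() < restart_prob:
            current = target  # restart: jump back to the target node
        else:
            current = rng.choice(neighbours)  # hop to a random neighbour
            if current not in visited:
                visited.append(current)
    # Isolated node or small community: re-sample the available nodes
    # until the subgraph reaches the requested size.
    while len(visited) < size:
        visited.append(rng.choice(visited))
    return visited


adj = {0: [1, 6], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: [], 6: [0]}
print(rwr_subgraph(adj, 1))
print(rwr_subgraph(adj, 5))  # node 5 is isolated
```

For the isolated node the only available node is the target itself, so the subgraph consists of that node repeated, which matches the `[5, 5, 5, 5]` entry in the CoLA example above.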

class dgld.utils.sample.UniformNeighborSampling(length=4)[source]

Bases: BaseSubGraphSampling

Samples neighbors uniformly at random to generate a subgraph.

Parameters

length (int) – the size of subgraph (default 4)
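
One plausible reading of uniform neighbor sampling is sketched below. It is an assumption made for illustration, not the behaviour of the dgld class: the graph is a hypothetical adjacency dict, the target node is placed first, and neighbors are drawn with replacement only when there are too few of them.

```python
import random


def uniform_neighbor_subgraph(adj, target, size=4, rng=None):
    """Return `target` followed by `size - 1` uniformly sampled neighbours."""
    rng = rng or random.Random()
    neighbours = [n for n in adj.get(target, []) if n != target]
    if not neighbours:
        return [target] * size  # isolated node: repeat the target
    if len(neighbours) >= size - 1:
        picked = rng.sample(neighbours, size - 1)  # without replacement
    else:
        # Too few neighbours: sample with replacement to reach the size.
        picked = [rng.choice(neighbours) for _ in range(size - 1)]
    return [target] + picked


adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0], 5: []}
print(uniform_neighbor_subgraph(adj, 0))
print(uniform_neighbor_subgraph(adj, 1))
```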

dgld.utils.sample.generate_random_walk(g, start_nodes, length, multi_length, restart_prob, Q=None)[source]

Generates random walks from a block of target nodes, accelerated with multithreading, and stores them in a Queue if one is given.

Parameters
  • g (dgl.graph) – the graph to generate random walk

  • start_nodes (list) – target nodes from which the random walks start

  • length (int) – the size of the subgraph (the sampler classes above default to 4)

  • multi_length (int) – multiple of the subgraph size to walk, so that more nodes are visited

  • restart_prob (float) – probability of restart, i.e. of returning to the target node after each hop

  • Q (multiprocessing.Queue) – Queue to store random walk, default None

Returns

rwl – random walk from target nodes

Return type

list[list]
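
To make the roles of `multi_length`, `restart_prob` and `Q` concrete, here is a minimal sketch over a plain adjacency dict. It is not the dgld code: the helper name `walks_for_block`, the `seed` argument, and the assumption that each walk takes `length * multi_length` hops are all introduced for illustration. A `queue.Queue` is used in the demonstration because it exposes the same `put`/`get` interface as `multiprocessing.Queue`.

```python
import queue
import random


def walks_for_block(adj, start_nodes, length, multi_length, restart_prob,
                    Q=None, seed=None):
    """Return one restart walk per start node; also put the result in Q if given."""
    rng = random.Random(seed)
    steps = length * multi_length  # assumption: walk longer than the subgraph size
    rwl = []
    for start in start_nodes:
        trace, current = [start], start
        for _ in range(steps):
            neighbours = adj.get(current, [])
            if not neighbours or rng.random() < restart_prob:
                current = start  # restart at the target node
            else:
                current = rng.choice(neighbours)
            trace.append(current)
        rwl.append(trace)
    if Q is not None:
        Q.put(rwl)  # hand the walks to whoever consumes the queue
    return rwl


adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
q = queue.Queue()
walks = walks_for_block(adj, [0, 2], length=4, multi_length=3,
                        restart_prob=0.5, Q=q)
print(len(walks), len(walks[0]))  # 2 walks, each with 1 + 4 * 3 entries
```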

dgld.utils.sample.generate_random_walk_multiThread(g, start_nodes, length, multi_length, restart_prob)[source]

Generates random walks from a block of target nodes, accelerated with multithreading.

Parameters
  • g (dgl.graph) – the graph to generate random walk

  • start_nodes (list) – target nodes from which the random walks start

  • length (int) – the size of the subgraph (the sampler classes above default to 4)

  • multi_length (int) – multiple of the subgraph size to walk, so that more nodes are visited

  • restart_prob (float) – probability of restart, i.e. of returning to the target node after each hop

Returns

rwl – random walk from target nodes

Return type

list[list]
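
The block-wise threading idea can be sketched with the standard library. This is an illustration under stated assumptions, not the dgld implementation: the start nodes are split into contiguous blocks, each block is walked by a worker from a `ThreadPoolExecutor`, and the helper names `split_into_blocks`, `walk_block` and `walks_multithreaded` are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(items, n_blocks):
    """Split a list into at most `n_blocks` contiguous blocks."""
    size = max(1, -(-len(items) // n_blocks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def walk_block(adj, block, steps, restart_prob):
    """Run one restart walk of `steps` hops for every start node in the block."""
    rng = random.Random()  # one generator per block, not shared across threads
    traces = []
    for start in block:
        trace, current = [start], start
        for _ in range(steps):
            neighbours = adj.get(current, [])
            if not neighbours or rng.random() < restart_prob:
                current = start
            else:
                current = rng.choice(neighbours)
            trace.append(current)
        traces.append(trace)
    return traces


def walks_multithreaded(adj, start_nodes, steps, restart_prob, n_workers=4):
    blocks = split_into_blocks(start_nodes, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in block order, so walks line up with start_nodes.
        results = pool.map(lambda b: walk_block(adj, b, steps, restart_prob), blocks)
        return [trace for block_result in results for trace in block_result]


adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
walks = walks_multithreaded(adj, [0, 1, 2, 3, 4], steps=6, restart_prob=0.3)
print([w[0] for w in walks])
```

Because of CPython's global interpreter lock, threads do not speed up this pure-Python loop; the sketch only shows how the work is divided and reassembled.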

dgld.utils.sample.generate_random_walk_multiThread_high_level(g, start_nodes_block, paces_block, length, multi_length, restart_prob, Q=None)[source]

Generates random walks from a block of target nodes, accelerated with multithreading. If the pace from a target node is not long enough, a new random walk is generated for it. The result is stored in a Queue if one is given.

Parameters
  • g (dgl.graph) – the graph to generate random walk

  • start_nodes_block (list) – target nodes from which the random walks start

  • paces_block (list[list]) – rough (preliminary) paces, one per target node

  • length (int) – the size of the subgraph (the sampler classes above default to 4)

  • multi_length (int) – multiple of the subgraph size to walk, so that more nodes are visited

  • restart_prob (float) – probability of restart, i.e. of returning to the target node after each hop

  • Q (multiprocessing.Queue) – Queue to store random walk, default None

Returns

rwl – random walk from target nodes

Return type

list[list]
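
The "generate a new random walk if the pace is not long enough" step might look like the following sketch. It is an assumption-laden illustration rather than the dgld code: "long enough" is read as "contains at least `length` distinct nodes", and the names `restart_walk`, `complete_paces`, `max_retries` and `seed` are hypothetical.

```python
import random


def restart_walk(adj, start, steps, restart_prob, rng):
    """Walk `steps` hops from `start`, restarting with probability `restart_prob`."""
    trace, current = [start], start
    for _ in range(steps):
        neighbours = adj.get(current, [])
        if not neighbours or rng.random() < restart_prob:
            current = start
        else:
            current = rng.choice(neighbours)
        trace.append(current)
    return trace


def complete_paces(adj, start_nodes_block, paces_block, length, multi_length,
                   restart_prob, max_retries=10, seed=None):
    """Top up each rough pace until it holds `length` nodes."""
    rng = random.Random(seed)
    rwl = []
    for start, pace in zip(start_nodes_block, paces_block):
        # Order-preserving de-duplication, with the target node first.
        unique = list(dict.fromkeys([start] + list(pace)))
        retries = 0
        while len(unique) < length and retries < max_retries:
            extra = restart_walk(adj, start, length * multi_length, restart_prob, rng)
            unique = list(dict.fromkeys(unique + extra))
            retries += 1
        # Still too few distinct nodes: repeat the available ones.
        while len(unique) < length:
            unique.append(rng.choice(unique))
        rwl.append(unique[:length])
    return rwl


adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}
print(complete_paces(adj, [0, 5], [[0, 0, 0], [5]], length=4,
                     multi_length=3, restart_prob=0.5))
```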

dgld.utils.sample.generate_random_walk_singleThread(g, start_nodes, length, multi_length, restart_prob, Q=None)[source]

Generates random walks from a block of target nodes in a single thread and stores them in a Queue if one is given.

Parameters
  • g (dgl.graph) – the graph to generate random walk

  • start_nodes (list) – target nodes from which the random walks start

  • length (int) – the size of the subgraph (the sampler classes above default to 4)

  • multi_length (int) – multiple of the subgraph size to walk, so that more nodes are visited

  • restart_prob (float) – probability of restart, i.e. of returning to the target node after each hop

  • Q (multiprocessing.Queue) – Queue to store random walk, default None

Returns

rwl – random walk from target nodes

Return type

list[list]