WIPER

What is WIPER?
WHY use WIPER?
What is the input and output?
What are the parameters?

WIPER is the abbreviation of Weighted In-Path Edge Ranking, which can help evaluate all the biomolecular interactions/associations ("edges") in a network model to generate a rank order of them, based on their in-path traversal scores and a statistical significance test result.

WHY USE WIPER?

WIPER can help biology users to highlight potentially critical molecular signaling or gene regulatory events in the network model. To validate whether WIPER worked as designed, we first tested the algorithm on a synthetic network model. Our test results showed that WIPER can reliably discover both critical “well traversed in-path edges” which are statistically more traversed than normal edges and “peripheral in-path edges” which are less traversed than normal edges. Compared with other simple measures such as betweenness centrality, WIPER provides better biological interpretations.

What is the input and output?

WIPER can take the input of weighted edges and rank them in a probabilistic network. In contrast, a conventional “edge betweenness” measurement that simply counts the number of shortest paths for any given edge would not be useful in practice. This makes WIPER a more pragmatic tool to solve biomedical problems such as finding a therapeutic strategy by "targeting the right interactions in the interactome". WIPER can provide a list of ranked-order edges and attributes together as follows:

Attributes	Description
Edge	The edge identifier using the colon to separate the two endpoints, e.g. a:b.
Degree	The edge degree calculated in the "edge-to-edge" network
W[0]	WIPER's initial edge score calculated by the method same to r_p score calculation described.
UFC	Usage Fold Change score calculate using the WIPER score divided by the medium of the WIPER scores' distribution.
logUFC[0]	log2-based Usage Fold Change score
UFC[0] rank	The WIPER edge ranking based on UFC[0] score
UFC[0] rank in network	The WIPER edge ranking based on UFC[0] score in each isolated networks, e.g. 1(a:b) means the edge is ranked the 1st in the network containing the edge a:b.
p-value[0]	The UFC[0] score's p-value from the distribution of UFC[0] scores.
significance[0]	The UFC[0] score's significance from the distribution of UFC scores. - represents p-value>0.05, * represents p-value≤0.05 and p-value≤0.01, ** represents p-value≤0.01
W[2]	WIPER's iterative edge score calculated by the method described (WIPER: Weighted in-path edge ranking for biomolecular association networks, under submission.).
extended	'No' indicates seed edges, 'Yes' indicates novel edges.

WHAT ARE THE PARAMETERS?

Parameters	Description
Interaction	A list of the edges consists of "nodeA", "nodeB", and "edge confidant score". The elements and edges are seperated by space.
Sigma	A damping factor varied from [0,1] determines the probability of a node continue in an information flow.
ppi	The threshold of edge confidant score.
method	"ant" represents the ant colony algorithm described in the paper "WIPER: Weighted in-path edge ranking for biomolecular association networks". 'page' represents a page-rank algorithm
interaction	The default iteration is 200 in the paper method.

METHODS

Entity prioritization:
Firstly, we apply the heuristic scoring function (Initial Ranking Algorithm) to calculate the initial relevant protein score (RP_(i:0)):
RP_(i:0) = e^{(k × ln(∑_(i,j)conf(i,j)) - ln(∑_(i,j)N(i,j)))},
where i and j are the indexes of proteins from the selected module, k is a constant (k=2 in this study). The term conf(p, q) is the interaction confidence score assigned by HAPPI-2, where conf(p, q) 𝞊 [0,1].
Secondly, we perform iterative scoring and ranking to calculate the protein iterative score (RP_(i:t)) similar to page rank algorithm:
RP_(i:t) = (1-σ) × RP_(i:0) + σ × (∑_(i,j)conf(i,j) × RP_(j:t-1))/(∑_(i,j)N(i,j)),
Where i and j are indices of proteins in the sub-network, k is and empirical constant (k > 1; we set k = 2), conf(i,j) is the initial interaction confidant score between protein i and j, N(i,j) is the degree of the protein i in the sub-network. t is the number of iterations (default = 200), σ is the damping factor (default = 0.8).
Thirdly, we also perform iterative scoring and ranking to calculate the protein iterative score (RP_(i:t)) similar to ant colony algorithm:
RP_(i:t) = (1-σ) × RP_(i:0) + σ × (∑_(i,j)conf(i,j) × RP_(j:t-1) - ∑_(i,j)conf(i,j) × RP_(i:t-1))/(∑_(i,j)N(i,j)),
Where i and j are indices of proteins in the sub-network, k is and empirical constant (k > 1; we set k = 2), conf(i,j) is the initial interaction confidant score between protein i and j, N(i,j) is the degree of the protein i in the sub-network. t is the number of iterations (default = 200), σ is the damping factor (default = 0.8).
P-value estimation:
We rank the RP-score and take the 1% of RP-score from two-tail and mark as very significant (**), then we take the 5% of RP-score from two-tail and mark as significant (*).

Edge prioritization:
Firstly, we perform path weighted matrix (PWM) construction using every node-to-node highest weighted pathway by applying the dijkstra likewise algorithm (multiple the PPI socre).
Secondly, we perform the edge weighted matrix (EWM) construction using the path weighted matrix score in formula:
EWM_{(E(A_i,A_j) - E(B_i,B_j))} = conf(A_i,A_j)×conf(B_i,B_j) × max⁡(PWM_(A(i)-B(i)),PWM_(A(j)-B(i)),PWM_(A(j)-B(i)),PWM_(A(j)-B(j))),
Where conf(A_i,A_j) is the PPI confidant score of edge A consisting of node i and node j.
Thirdly, we drop off the bottom 25% values from EWM and feed remain values to the iterative scoring and ranking formula same above to calculate the edge score.