Table 1 Statistical metrics

From: Using an ensemble of statistical metrics to quantify large sets of plant transcription factor binding sites

| Metric | Equation | Output range | Reference |
| --- | --- | --- | --- |
| Confidence (CF) | $\max\left(P(G_a \mid t_i),\ P(t_i \mid G_a)\right)$ | 0…1 | [26] |
| Cosine (CO) | $\dfrac{P(t_i, G_a)}{\sqrt{P(t_i)\,P(G_a)}}$ | $0 \ldots \sqrt{P(t_i, G_a)} \ldots 1$ | [27] |
| Jaccard (JAC) | $\dfrac{P(t_i, G_a)}{P(t_i) + P(G_a) - P(t_i, G_a)}$ | 0…1 | [28] |
| Kappa coefficient (K) | $\dfrac{P(t_i, G_a) + P(\bar{t_i}, \bar{G_a}) - P(t_i)\,P(G_a) - P(\bar{t_i})\,P(\bar{G_a})}{1 - P(t_i)\,P(G_a) - P(\bar{t_i})\,P(\bar{G_a})}$ | −1…1 | [29] |
| Laplace Correction (LP) | $\max\left(\dfrac{N P(t_i, G_a) + 1}{N P(t_i) + 2},\ \dfrac{N P(t_i, G_a) + 1}{N P(G_a) + 2}\right)$ | 0…1 | [30] |
| Lift (LI) | $\dfrac{P(t_i, G_a)}{P(t_i)\,P(G_a)}$ | 0…∞ | [31] |
| Phi coefficient (PHI) | $\dfrac{P(t_i, G_a) - P(t_i)\,P(G_a)}{\sqrt{P(t_i)\,P(G_a)\,(1 - P(t_i))\,(1 - P(G_a))}}$ | −1…1 | [32] |

  1. Given a group, $G_a$, and a TFBS, $t_i$, the magnitude of TFBS over-representation can be determined using a set of statistical metrics.
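
The metrics in Table 1 can all be derived from four counts, which the following minimal Python sketch illustrates. It is not the authors' implementation. The function name `tfbs_metrics` and its parameter names are hypothetical, and the sketch assumes that N in the Laplace formula is the total number of items (for example, promoters) examined, so that N·P(·) is a raw count. It also assumes that the Kappa term with overbars denotes the joint probability of both t_i and G_a being absent.

```python
import math


def tfbs_metrics(n_both, n_tfbs, n_group, n_total):
    """Compute the seven Table 1 metrics from raw counts.

    n_both  : items in group G_a that contain TFBS t_i
    n_tfbs  : items that contain t_i
    n_group : items that belong to G_a
    n_total : all items examined (assumed to be N in the Laplace formula)
    """
    p_tg = n_both / n_total   # P(t_i, G_a)
    p_t = n_tfbs / n_total    # P(t_i)
    p_g = n_group / n_total   # P(G_a)

    # P(not t_i, not G_a), obtained by inclusion-exclusion (assumed reading).
    p_neither = 1.0 - p_t - p_g + p_tg
    # Agreement expected by chance, used by the Kappa coefficient.
    chance = p_t * p_g + (1.0 - p_t) * (1.0 - p_g)

    return {
        # max(P(G_a | t_i), P(t_i | G_a))
        "CF": max(p_tg / p_t, p_tg / p_g),
        "CO": p_tg / math.sqrt(p_t * p_g),
        "JAC": p_tg / (p_t + p_g - p_tg),
        "K": (p_tg + p_neither - chance) / (1.0 - chance),
        # N * P(.) equals the corresponding raw count.
        "LP": max((n_both + 1) / (n_tfbs + 2), (n_both + 1) / (n_group + 2)),
        "LI": p_tg / (p_t * p_g),
        "PHI": (p_tg - p_t * p_g)
        / math.sqrt(p_t * p_g * (1.0 - p_t) * (1.0 - p_g)),
    }
```

A call such as `tfbs_metrics(30, 50, 40, 200)` returns a dictionary keyed by the abbreviations used in Table 1. The sketch does not guard against degenerate inputs (a TFBS or group that covers none or all of the items), where several denominators become zero.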