| Source of dataset | PPDB | Wiki |
|---|---|---|
| #Phrase-Paraphrase pairs | 26,455 | 13,954 |
| P(paraphrase/phrase) [49] | 0.414 | - |
| Heuristic scoring in PPDB 1.0 [50] | 0.407 | - |
| Cosine similarity based on word embeddings of rare words [46] | 0.463 | |
| Supervised scoring model in PPDB 2.0 [46] | 0.713 | - |
| WEEM4PG (Ours) | 0.435 | 0.382 |