Aug 14, 2022
I'm referring to using Shannon entropy to define entropy. MMR (as described in http://www.cs.cmu.edu/~jgc/publication/The_Use_MMR_Diversity_Based_LTMIR_1998.pdf) uses pairwise document similarity, which has its benefits but doesn't scale well if you need to compute a similarity among a large number of documents. If you're going to do something that n^2, you want to avoid having a large n.