Mining of Massive Datasets by J.D. Ullman and A. Rajaraman (Cambridge University Press, UK 2012)
http://i.stanford.edu/~ullman/mmds/book.pdf
Introduction to Information Retrieval by Christopher Manning, Prabhakar Raghavan and Hinrich Schütze (Cambridge University Press, UK 2008)
http://nlp.stanford.edu/IR-book/
Coursera is currently running Web Intelligence and Big Data, which seems to cover this topic.