What is Text Mining?
Text Min-ing (noun): It’s no different from data mining, except that your data is text rather than numerical statistics. If you want to measure words (such as how often John Oliver urges his audience to “be fair”, or who the most-cited authors are in a given field), you will need to do some text mining. People call this type of analysis “mining” because the amount of data can be vast, and you need to sift through it to find the valuable parts. For instance, the most common words in any set of texts (or ‘corpus’) will be what are called “stop words”: ‘the’, ‘and’, ‘but’, ‘like’, etc. If we start by stripping these words from our data, we will have a much smaller pile to dig through. To read more on this research methodology, click here. And to see how text mining becomes data visualizations, click here.
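To make the idea of stripping stop words a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a full text-mining workflow: the stop-word list is a tiny made-up sample (real lists run to hundreds of words), and the names `STOP_WORDS`, `tokenize`, and `top_words` are hypothetical helpers invented for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "and", "but", "like", "a", "an", "of", "to", "in", "is", "it"}


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=3, stop_words=STOP_WORDS):
    """Return the n most frequent words that are not stop words."""
    words = [w for w in tokenize(text) if w not in stop_words]
    return Counter(words).most_common(n)


sample = "The cat and the dog like the park, but the cat likes the dog."
print(top_words(sample))
```

In this sample sentence, ‘the’ is by far the most frequent word, but once the stop words are stripped out, ‘cat’ and ‘dog’ (two occurrences each) rise to the top. Notice that ‘likes’ survives even though ‘like’ is on the list: this simple sketch matches exact words only, which is one reason real projects tend to use fuller stop-word lists and extra cleanup steps.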