Gratis dating advertenties
In this paper, we start modestly, by attempting to derive just the gender of the authors 1 automatically, purely on the basis of the content of their tweets, using author profiling techniques.For our experiment, we selected 600 authors for whom we were able to determine with a high degree of certainty a) that they were human individuals and b) what gender they were.The authors do not report the set of slang words, but the non-dictionary words appear to be more related to style than to content, showing that purely linguistic behaviour can contribute information for gender recognition as well.Gender recognition has also already been applied to Tweets. (2010) examined various traits of authors from India tweeting in English, combining character N-grams and sociolinguistic features like manner of laughing, honorifics, and smiley use.A group which is very active in studying gender recognition (among other traits) on the basis of text is that around Moshe Koppel. 2002) they report gender recognition on formal written texts taken from the British National Corpus (and also give a good overview of previous work), reaching about 80% correct attributions using function words and parts of speech.
In this case, the Twitter profiles of the authors are available, but these consist of freeform text rather than fixed information fields.
One gets the impression that gender recognition is more sociological than linguistic, showing what women and men were blogging about back in A later study (Goswami et al.
2009) managed to increase the gender recognition quality to 89.2%, using sentence length, 35 non-dictionary words, and 52 slang words.
In the following sections, we first present some previous work on gender recognition (Section 2). Currently the field is getting an impulse for further development now that vast data sets of user generated data is becoming available. (2012) show that authorship recognition is also possible (to some degree) if the number of candidate authors is as high as 100,000 (as compared to the usually less than ten in traditional studies).
Then we describe our experimental data and the evaluation method (Section 3), after which we proceed to describe the various author profiling strategies that we investigated (Section 4). Gender Recognition Gender recognition is a subtask in the general field of authorship recognition and profiling, which has reached maturity in the last decades(for an overview, see e.g. Even so, there are circumstances where outright recognition is not an option, but where one must be content with profiling, i.e.