Datasets for the PLOS ONE Articles
Datasets described in this section are related to articles published or under review in the journal PLOS ONE, an open access scientific journal published by the Public Library of Science (PLOS).
The Role of Datasets on Scientific Influence within Conflict Research
by Tracy Van Holt, Jeffery C. Johnson, Shiloh Moates, and Kathleen M. Carley
- OriginalText
- Conflictall_noreviews.txt: The original, uncleaned text file, as extracted from the Web of Science.
- clean.WoS: The Web of Science file after cleaning to standardize its records. This file can be imported into Pajek.
- Files for the Critical Path (Fig 1 in the manuscript)
- Cite.net, Cite.xml (DyNetML): The Pajek file produced with WoS2Pajek, which is the input file for the critical path analysis (see the code in Appendix 2).
- Figure1_criticalpath.vna: The UCINET critical path file (vna format)
- Keywords from only critical path articles (Fig S2)
- WKAff.##d and WKAff.##h: works by keywords files for UCINET, built from only the 49 critical path articles. The corresponding DyNetML file is WK-Aff.xml.
- The Excel sheet with all keywords from the WK affiliation matrix.
- kwfigurefinal2.vna: The final figure file, which contains only the keywords that were retained.
- Keywords from 2010 only (Figure 1 in the manuscript)
- 2010deg59+.vna: A UCINET file (vna format) that is the affiliation matrix of a works by keyword file for 2010 only. For two keywords to be linked, more than 59 articles had to discuss that keyword pair in common.
- Other Files (not used in manuscript)
- WA.net, WA.xml (DyNetML): works by author file as generated by WoS2Pajek from 1.2
- WJ.net, WJ.xml (DyNetML): works by journal file as generated by WoS2Pajek from 1.2
- WK.net, WK.xml (DyNetML): works by keywords file as generated by WoS2Pajek from 1.2
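The 2010 keyword network listed above links two keywords only when more than 59 articles share that pair. The following is a minimal sketch of that kind of thresholded co-occurrence count, not the authors' actual procedure (which used the Pajek and UCINET tools listed under Software Used). The function name `keyword_cooccurrence`, the `min_articles` parameter, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(works, min_articles=60):
    """Return keyword pairs shared by at least `min_articles` works.

    `works` maps an article id to its list of keywords (a works by
    keywords affiliation in dictionary form). The default of 60
    corresponds to the ">59 articles" rule described in the text.
    """
    counts = Counter()
    for keywords in works.values():
        # Count each unordered keyword pair at most once per article.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_articles}


# Toy data (hypothetical, not taken from the dataset).
toy_works = {
    "w1": ["conflict", "war", "peace"],
    "w2": ["conflict", "war"],
    "w3": ["conflict", "peace"],
}

# A low threshold is used here only because the toy data is tiny.
edges = keyword_cooccurrence(toy_works, min_articles=2)
print(edges)
```

Sorting each article's keywords before pairing means every pair is stored in a single canonical order, so ("war", "conflict") and ("conflict", "war") are never counted separately.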
Software Used:
- Batagelj V. WoS2Pajek Networks from Web of Science Version 0.7. University of Ljubljana; 2009.
- Batagelj V, Mrvar A. Pajek-analysis and visualization of large networks version 2.03. 2003.
- Borgatti, S.P., Everett, M.G. and Freeman, L.C. 2002. Ucinet for Windows: Software for Social Network Analysis. Harvard, MA: Analytic Technologies.
How to Cite: Van Holt, T., Johnson, J. C., Moates, S., Carley, K.M. (2016) The role of datasets on scientific influence within conflict research. PLOS ONE. DOI: 10.1371/journal.pone.0154148.
The Broad Reach of Online Extremism: Understanding the ISIS Supporting Community on Twitter
by Matthew Benigni, Kenneth Joseph and Kathleen M. Carley
- Description
- This article is under review by PLOS ONE. The dataset can be analyzed using the R source code provided at: https://github.com/mbenigni/OSNThreatGroups.
- Files
- The following files are contained in this dataset:
- deIdentified_attributes.csv - contains node attribute information for users associated with the 2-hop snowball sample described in the aforementioned work. The file contains the following fields: anonID, followingCount, followerCount, tweetCount, lastTweet, creation_date, lang, suspended, official. The anonID field is a unique identifier assigned to each user and corresponds to the nodes in the provided edge lists. The suspended field marks accounts that were suspended by Twitter between November 2014 and March 2015. Some of these suspended accounts were used as positive case labels; a full explanation is provided in the article. The official field marks accounts from a list of human-verified media, government, and celebrity accounts used to train the 'official classifier' in the presented work. All other fields correspond to fields provided by the Twitter API.
- deIdentified_friend_edges.csv, deIdentified_friend_edges.xml (DyNetML) - a directed network edge list of the following or friend ties associated with all nodes listed in deIdentified_attributes.csv.
- deIdentified_mention_edges.csv, deIdentified_mention_edges.xml (DyNetML) - a directed network edge list of the mention ties associated with all nodes listed in deIdentified_attributes.csv. Additionally, the epoch time for each edge is provided in the 'date' field.
- deIdentified_user_ht_edges.csv, deIdentified_user_ht_edges.xml (DyNetML) - a bipartite network edge list of the user-to-hashtag ties associated with all nodes listed in deIdentified_attributes.csv. Additionally, the epoch time for each edge is provided in the 'date' field.
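As a rough illustration of how the attribute file and a timestamped edge list described above might be read together, here is a minimal Python sketch using only the standard library. Several details are assumptions rather than facts about the dataset: the edge list column names `source` and `target` are hypothetical (only the 'date' field is documented), the epoch times are assumed to be in seconds, the 0/1 encoding of suspended and official is a guess, and the inline sample rows are made up. The authors' own analysis code is the R source linked in the description.

```python
import csv
import io
from datetime import datetime, timezone

# Made-up sample rows standing in for deIdentified_attributes.csv.
# The header matches the documented fields; the values are invented.
ATTRIBUTES_CSV = """anonID,followingCount,followerCount,tweetCount,lastTweet,creation_date,lang,suspended,official
1,10,200,50,,,en,0,1
2,300,40,900,,,ar,1,0
"""

# Made-up sample rows standing in for deIdentified_mention_edges.csv.
# ASSUMPTION: 'source' and 'target' are hypothetical column names;
# only the 'date' field is named in the dataset description.
MENTIONS_CSV = """source,target,date
1,2,1416787200
2,1,1425168000
"""


def load_attributes(handle):
    """Map anonID -> attribute row (a dict of strings)."""
    return {row["anonID"]: row for row in csv.DictReader(handle)}


def load_mentions(handle, src="source", dst="target"):
    """Return (source, target, UTC datetime) tuples from an edge list."""
    edges = []
    for row in csv.DictReader(handle):
        # ASSUMPTION: 'date' holds epoch time in seconds, not milliseconds.
        when = datetime.fromtimestamp(int(row["date"]), tz=timezone.utc)
        edges.append((row[src], row[dst], when))
    return edges


users = load_attributes(io.StringIO(ATTRIBUTES_CSV))
mentions = load_mentions(io.StringIO(MENTIONS_CSV))

# Out-degree in the mention network, keeping only edges whose
# endpoints both appear in the attribute file.
out_degree = {}
for s, t, _ in mentions:
    if s in users and t in users:
        out_degree[s] = out_degree.get(s, 0) + 1

print(out_degree)
```

To run this against the real files, replace the `io.StringIO(...)` objects with `open(path, newline="")` handles and pass the actual column names through the `src` and `dst` arguments.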
How to Cite: Matthew Benigni, Kenneth Joseph and Kathleen M. Carley, 2017-forthcoming, "Online Extremism and the Communities that Sustain It: Detecting the ISIS Supporting Community on Twitter," PLOS ONE.