sex drugs and intellectual freedom

Twitter/LoC, Part IV: Internet Research Ethics

I have reached the final installment of my look at the recent Twitter/Library of Congress agreement. It is a busy time in the semester for me, and thinking through these posts has taken up some time and energy that I maybe should have been using elsewhere, but – regardless – I think it is important that we think through the implications of this deal. I’d like to thank everyone who has encouraged me to see this series of posts through. (For reference, here are parts one, two, and three.)

At this point, I would like to turn my attention to the issue of Internet research ethics – another area where the LoC Twitter archive presents some deep and pressing issues. I would also like to note that this post is directed at researchers, and not specifically the Library of Congress or Twitter. As a young researcher and scholar myself, I approach the ever-changing, always-expanding area of Internet research ethics in the spirit of conversation and community.

That being said, I think the topics I presented in parts one through three each pose real, and perhaps irreconcilable, dilemmas for Internet research:

From part 1: How do we conduct meaningful research in the archive and still respect the rights and privacy of individual Twitterers who did not necessarily consent to being researched? It takes more than simply scrubbing the data. In the analog world, researchers often struggle to find the right amount of detail for describing a research site or participant, so that the end result is meaningful but does not betray the anonymity of the site or participant. This only gets harder with tweets. For example, an intrepid ethnographer might want to report some compelling narratives extracted in 140-character increments from the archive, and any such account would demand directly quoting at least the most illustrative examples. But any quoted tweet will likely (and easily) be discoverable through a number of online search tools, and the anonymity of Twitter users (i.e., the participants) would be betrayed.

Now, I can see two objections to this. First, one might point to Twitter’s terms of service, the agreement between the Library of Congress and Twitter, and whatever terms of use the Library of Congress develops for the archive, and claim that, so long as researchers do not violate these contracts, they are off the hook. Second, one might point out that the only tweet streams collected were public in the first place. But to conclude, for these reasons, that a more rigorous definition of consent does not apply is to 1) set the ethical standard for Internet research awfully low and 2) exploit the false public/private dichotomy in ways that are hardly different from Facebook, Google, or any other company that has routinely abused user privacy. To do so is to fundamentally misunderstand the complex nature of information shared online; it does not take seriously the epistemological divide that separates the analog from the digital, the industrial society from the informational one. Further, by setting the ethical bar so low, we run the risk of compromising our intellectual authority as Internet researchers.

From part 2: How do we make sense of this data in a way that is meaningful anywhere outside the context of Twitter itself? If we are honest with ourselves, we know that this archive is simply a convenience sample made up of relatively young, connected (albeit ethnically diverse) Internet users willing to share (and overshare) in ways generally dictated by the system itself (140 characters, hashtags, @replies, etc.). In that respect, the data in the archive is not representative of ordinary people. Sure, researchers hedge and work to offset the limitations of their samples all the time, in order to create meaning and lend validity to their findings. But how do we do this within the Twitter archive without running into the concerns raised in the previous paragraphs? There will not be some magic formula – some hypothetical “sweet spot” – that researchers will be able to reference in order to determine the precise point at which they’ve delineated their sample in ways that are meaningful without having betrayed the anonymity of the participants.

From part 3: How will we handle the issue of intercultural information ethics and representation when we conduct research on this archive? Whatever tools are developed to make the archive functional for researchers, we can be sure of one thing: they will not be neutral. In this sense, it is imperative that we be more critical of the tools we are using than of anything else. Without a clear understanding of how they work and how they influence our research, our methods will be flawed from the outset.

In light of this recent agreement, I believe these are a few of the issues Internet researchers should be discussing today. Moreover, the same questions can be asked of research in a variety of digital settings, and not just the LoC’s Twitter archive. And if we, as Internet researchers, don’t have answers to these questions, then we run the risk of losing credibility and compromising our intellectual authority. In other words: what defines us, if not our methods and our ethics?

(Previously: Parts one, two and three.)

One Response

  1. david michel said, on 25 August 10 at 3:53 AM

    people are stupid
