BITC / Data Capture - What is Biodiversity Data - Q&A

[Participant] Something just came to my mind as you were talking about primary data versus secondary data. I remember trying to come up with such an initiative to put the primary data that we were digitizing onto our organization's website. I was met with a lot of resistance. How are others dealing with intellectual property rights and the authorities not wanting everything to be online because they feel as though they will be bypassed and that they will become obsolete?
[Town] That's a very good question. A bunch of us remember back 15-20 years when these arguments were really, really hot. The reasoning is: I've spent a century taking care of this collection. Or, I spent years assembling this dataset. I know it's valuable. The data are valuable. The specimens are valuable. If it's valuable, I ought to be able to get some value out of it. And yet, I never seem to get any value out of it. I take care of a pretty big bird collection, and nobody's ever given me money to use the bird collection. So, where's the value in it? This is the thinking that goes on. The last thing I'm going to do is just open the doors and let anybody who wants the information come in and take the information. That's giving away something valuable for free, without getting any benefit. The very interesting realization has been that many times the value is multiplied by the use to which the data are put. You can have an unbelievably valuable insect collection locked inside of a bank safe and nobody ever takes information away from it. If it's never used, does it matter to the future of the world whether that collection exists or not?
[Participant] I suppose not.
[Town] Yeah, exactly. That bank safe could be full of wood; or it could be full of beautiful insect collections. My point is that it's a sociological transition where people start to realize, 'oh, people using, accessing, analyzing, and publishing on our data does not reduce the value of our data'. That's the lesson.
Now, how do you get that across to people? That's the hard part. One of the best lessons is simply that there are around 400 institutions already doing this. And, I know of no major problems of information theft or misuse. Around the world, this is the standard and the norm. Now, there are discussions about intellectual property. But, we can also ask, 'Okay, that bug that was collected in the southeastern extreme of Zimbabwe: who does that belong to? The museum owns the specimen. But does whatever intellectual content is in that bug belong to the museum? Or maybe to the local community? Or the nation?' Many of us see biodiversity information as a common good. So, it's simultaneously the property of the local community, the nation, the owner institution, and the world because it's all global biodiversity. I'm not helping you concretely, but I'm giving you a thinking framework. Institutions around the world have analyzed the degree to which they are able to assert ownership of information associated with biodiversity material. And, they've come to the conclusion that it's perfectly fine to open access. If anybody else has comments.... John? Let me come stand next to you.
[John] I don't know anything about the legal side of intellectual property rights, especially because it varies across countries around the world. However, as Town said, we have been through this painful progression of collections being uncertain about sharing their data. In the beginning, there were many difficulties. Some institutions had had bad experiences in the past, where they had shared their data and someone else then began to sell the data. In fact, selling it back to the institution that gave it to them initially. This set a very bad precedent that we had to overcome in the beginning. We did that with a simple strategy. One part was to create our distributed database networks with those collections that had no doubts about sharing their data. They became the experiment. An example. Will we survive?
Will everything be OK if we share our data? And, we created a very successful distributed database network with those people. In the meantime, there were very large institutions that were uncomfortable with that. They weren't ready to take that step. They were uncertain and they had several kinds of excuses for not doing so, including fairly lame excuses such as, 'our data are not perfect.' 'We don't want to share our data because they're not clean'. Well, for an institution like our national museum, because it's so big, the data are never going to be clean. So, it's an excuse to never share their data. It wasn't the real reason, I don't think. As time progressed, and they saw other institutions successfully sharing their data (and getting benefit from it that had not been foreseen), they began to say, 'hey, can we share our data in your network also?' Now, with the vertebrate network, we're in a position where we cannot keep up with the demand to participate in the network, with open sharing. We are now trying to get the institutions to commit formally to a data license, which has never been done before. Always before, we allowed them to create a statement about the use of their data if they wanted to do so. Basically, an intellectual property rights statement. But, it had no legal foundation; and, it gave the data consumer no real way to understand what they were supposed to do. And it varied from one institution to another. In a big aggregation, that makes it very difficult for the data user to use the data responsibly. Now, we're trying to get the institutions to commit to a formal data use agreement: a license. We're promoting Creative Commons licenses. In fact, we're promoting the most open of them, which is called 'CC0'. It's not even a license. It's a public domain dedication, which means these data are free for any use. Period. That's what we're trying to promote.
A surprising number of institutions are saying, 'Yes. These data were collected under public funds. It is our responsibility to share them as a public product.' And, the majority so far are doing so. Again, the bigger institutions are a little more conservative. They're asking questions in committees inside the institutions. 'This is a big step for us. What do we do? What do we want back?' And so on. So, they're considering other alternatives that are open licenses for the use of the data, but that require that the use be attributed. If the data are used, you must say where they came from and what you used them for. Which is a good alternative. It's harder for the data consumer to use those data, but at least it's clear how they can use those data. This is not the case if there's no license. We can probably share some articles. There are a couple of very interesting blog posts about this problem from the perspective of someone who wants to use data. Without these licenses, using the data is almost impossible.

Posted by: townpeterson on Jun 22, 2016

In English. Portion of course that covers biodiversity data capture, held 13-22 January 2014, in Accra, Ghana. Experts included Melissa Tulig, Kim Watson, Christiane Weirauch, John Wieczorek, and Town Peterson.

