what to do about supporting information

March 9, 2009 at 3:54 pm | | literature, open thread, science community, wild web

I met with David Martinsen from ACS Pubs today to discuss the interface between publishing and technology (internet, Kindle, Facebook, CiteULike, etc.). Interesting discussion. One topic that came up a few times was the supporting information (SI) for articles.

Page limits as well as increasingly complex experimental methods have caused SIs to balloon to sometimes ridiculous lengths. Combine this with the fact that SIs are often barely readable, marginally refereed (if at all), and crammed with unexplained figures, and SIs become ridiculous. Sometimes, “see the SI” is a ploy to lull readers and referees into feeling that a statement is supported by data, even when the data is total crap. I’ve seen spectra in SIs that wouldn’t pass muster in an undergrad lab, much less a peer-reviewed journal, but they were the primary data of the letter. Nevertheless, much of the core scientific information is often buried in these monstrosities.

Authors need to be encouraged to compile their SIs to be coherent, clean, correct, and scientific supplements to their papers. I have some suggestions. I start with simple suggestions, and move to more fundamental ones:

  1. Offer a single-PDF option. Have an option to download a single PDF that contains all the content for the article (i.e., the main text as well as the SI). Because the SI often contains information vital to interpreting or repeating the results, it’s important to have this data (or synthesis) along with the main text. ACS already has two PDF options (hi-res PDF and PDF with links), so it would be as simple as adding a third link to the page.
  2. Format SIs. This can be as simple as providing a Word template just like there is available for the main text and figures of the article. This shouldn’t increase the editors’ responsibilities, but would make SIs more readable.
  3. Referee SIs. Encourage referees to carefully read and scrutinize the SI, just as they do the main text. If something is unscientific, sloppy, or wrong in the SI, it should not be included. Referees should be encouraged to request SI data or methods be clarified, tested, eliminated, or repeated if necessary as a condition for publication.
  4. Include raw data. Provide space for authors to present raw data if they wish (e.g. structure, NMR, crystallographic, spectral, etc.) in the SI. I personally don’t think this is as important as careful writing, editing, and refereeing of the SI; however, providing raw data could be another level of evidence for the authors’ claims. I don’t think this should be a requirement, because we should be able to trust researchers and not spend all our time redoing someone else’s analysis.
  5. Offer full articles online. Putting 1-3 together, there could be a “full” article form of any paper, which includes all information from the SI, but formatted and organized like a lengthy paper. That way, the paper form of the journal could stay short, while online forms could be complete, yet still professional, readable, and cogent. However, this would add some burden to editors, referees, authors, copy-editors, etc. Paper journals could become more of a collection of executive summaries, with the full scientific data online. (I believe that some Nature journals do this to some extent: having a short Methods section added to the PDF form but not in the paper journal. But there’s still an SI in addition.)

I think #5 is where journals should be going. True, it will add expense to publishers, but that will help them justify the subscription fees when no one receives paper journals anymore.

Other ideas?

UPDATE: Here are some other ideas, which I’ll update as needed:

  • Allow authors to republish. Make it easier for authors to republish data and figures from an SI in the main text of a subsequent paper. This would permit the SI data to someday be published in a fully refereed form in a follow-up paper. Otherwise, data and figures in the SI will never be properly reported. Of course, there may be a lot of complications with copyright and getting all authors to agree.

18 Comments »

  1. raw NMR data? how many hundreds of MB-GB do you want us to upload?

    Comment by Rick — March 9, 2009 #

  2. true. same with huge microscopy movies in my research.

    like i said, i don’t think the raw data is as important as reporting it well. sharing raw data for every paper would be a radical change in the way science is reported.

    Comment by sam — March 9, 2009 #

  3. Raw data can be faked, too.

    Pet peeve: authors that hide their experimental procedures in supporting information, as a reference to a previous paper, which in turn references a 1948 German paper.

    Comment by joel — March 11, 2009 #

  4. I don’t know how the cost of subscribing to a print journal compares to online but there are certainly some journals that we have in our libraries that are not online. Unless we had access to all journals online I would hate to see print journals reduced to summaries.

    Comment by David — March 11, 2009 #

  5. They already are reduced to summaries. Have you read an Angew. paper or a JACS letter lately? Not to mention Science or Nature. I’m saying that journals should supply a full article in addition to what they currently publish: only offering more online, not less in print than they already do.

    Comment by sam — March 11, 2009 #

  6. I’m not trying to solve the problem of scientific fraud. That’s a different discussion. I’m trying to fix the problem that a bulk of important scientific information is vomited into SIs instead of carefully presented in the main paper.

    But I don’t blame authors for relegating their experimental methods to SIs: length limits on articles and letters leave barely enough room to explain your results and properly cite other work. If you have interesting results, it seems that your only option is to try to publish a short explanation in JACS, Angew., Science, and the like; these journals don’t leave enough room for detailed results, much less full experimental procedures. Are we supposed to only publish in J. Phys. Chem. and other lower-impact journals that allow much more space to fully explain yourself?

    Comment by sam — March 11, 2009 #

  7. Here’s a study about the fleeting nature of online supplementary information
    http://www.fasebj.org/cgi/reprint/19/14/1943.pdf
    Not like this isn’t a fixable problem but still.

    I don’t know much about sci-publishing, but is there any way to “chunk” the supplementary info? That’s what we do with law journal articles. We try to break down a slew of “authorities” (that’s kind of our raw data) and spew it at the reader in yummy bite-sized increments. My guess is that your supp-info plotzes down all at once causing your lil beady scientific eyeballs to glaze over.

    Or maybe you guys could have an online repository of the supplementary info, even raw data. And each paper, online or otherwise, could link there. Journals, together, could pay a small fee for a 3rd party to maintain the site, and those journals, of course, could pass on some of that cost to their subscribers.

    Comment by jordan — March 11, 2009 #

  8. jordan, why are you searching for “faseBJ”? ;)

    Comment by sam — March 11, 2009 #

  9. I’m all for a template for SI. Seems like most people don’t put any effort into it, even though that’s where some of the most subtle but important details can be found.

    I think a lot of people use it to include the dirty laundry: the stuff that takes away from the main point, but reviewers made them include. Maybe that’s just me.

    I’m also for larger and duller SI, if it means that it includes enough detail to allow independent reproduction of the results. I think the SI is supposed to be dull. It’s like the appendix to the paper, including details that otherwise interrupt the flow of the story in the main text.

    Comment by PI — March 12, 2009 #

  10. Charming, Sam.

    Comment by jordan — March 12, 2009 #

  11. What I don’t understand is why a short, exciting paper in Science/Nature/etc isn’t then followed up with a proper publication in one of the “lower-impact” journals. It seems like it would solve both the “oh snap I have to publish in a glamour mag” and “jeez I wish I knew how other people actually did their science” problems at once. And you’d get a second publication out of it.

    Comment by PhilipJ — March 12, 2009 #

  12. Good point: I don’t care if it is dull, so long as it is carefully refereed!

    Comment by sam — March 12, 2009 #

  13. But there’s a dilemma: in order to get published in a hot journal, you need to supply a lot of supporting data and explanation and methods in the SI, and leave the main text for hype. Then, you can’t just publish a follow-up paper in a lower-impact journal, because all the data and figures have already been published in the SI of the hypey paper. I’m not sure about the rules, but I doubt most journals would be happy to have authors publish as new in the main text figures and data that had been reported in the SI of a previous paper. Therefore, any follow-up paper will be of new data. So the data in the SI is never carefully reported in a real paper, anywhere.

    Comment by sam — March 12, 2009 #

  14. Whether it is technically “allowed” or not, lots of people already do this with minor-to-zero modification of their figures. Even something as simple as changing font size is adequate to get around the fact that a figure was previously published.

    Even with SI, there’s often not enough information to reproduce an experiment, so we’re only kidding each other. Even if a second, follow-up paper was not a full-blown paper but a methods article, that would be great. With Nature Methods and similar journals becoming popular, it shouldn’t be so hard to make science transparent enough to be reproducible.

    Comment by PhilipJ — March 13, 2009 #

  15. raw crystallographic data? are you nuts? that’s at least a GB for one crystal.

    the cif should provide everything considering it says how many reflections are used and what’s over 2 sigma and stuff.

    Comment by boyie — March 14, 2009 #

  16. agreed. raw crystallographic data should only be of interest in cases in which the crystal structure is the primary result in the paper. as i said in a comment above, raw data is not nearly as important as reporting the results clearly and correctly.

    Comment by sam — March 14, 2009 #

  17. I agree, this is fairly commonplace. Don’t some journals accept communications with the acknowledgement that a follow up paper may come along shortly thereafter?

    It would be great to promote sending these methods publications to open-access journals, too.

    Comment by joel — March 15, 2009 #

  18. […] that an 8-page average means that casually reading AC is going to be difficult. On the other hand, I don’t think that dumping results into the SI is an adequate solution: SIs are less carefully written, refereed, and read, and therefore are not an appropriate medium to […]

    Pingback by Everyday Scientist » royce doesn’t like 8-page papers — June 10, 2009 #
