Sanjeev Arora: Thoughts on Paper Publishing in the Digital Age
In this guest post, Sanjeev Arora shares some thoughts about the future of scientific publishing in our community. This is not unrelated to our last post, and is also aimed at initiating discussion ahead of FOCS 2013, which starts this coming weekend.
As always, comments are most welcome, with the reminder that WindowsOnTheory maintains a policy of keeping the discussion respectful and on point. And now to Sanjeev:
Thoughts on paper publishing in the digital age.
What role should journals and conferences play in the age of arxiv, twitter and other yet-to-be-invented digital wonders? Detecting among many colleagues a general impatience with the status quo, I wrote this blog post to generate more public discussion. I am thinking here purely of theoretical computer science, not other fields.
It is possible now to envision a world where journals and conferences are replaced by arxiv and other depositories, which come with the added benefit of instant “impact metrics” (pageviews, number of tweets, etc.). Machine learning researcher Yann Le Cun has taken this viewpoint to its logical end, arguing for a “free-market” system with papers subject to a distributed market model for refereeing/commenting. Independent consortia of reviewers (basically a new name for journals and conferences of the future??) would decide to publish reviews of arxiv-ed papers, with or without the permission of the papers’ authors.
At the other end of the spectrum, Lance Fortnow is troubled by the steady decline of journals and proliferation of conferences, and its bad effects on scholarship. He seeks to restore the importance of journals, and lower the prestige of conferences (by greatly raising acceptance rates).
Is Arxiv enough?
Physics is one of several fields that have taken enthusiastically to e-publishing. A physics paper on arxiv may have followup papers within weeks or even days.
While this model has some plus points, one also sees dangers: (a) incentive to write shallow and incremental papers; (b) more priority disputes and the temptation to publish sketchy ideas in order to later claim full or at least partial credit; (c) lack of incentive for good reviewers to volunteer time reviewing enough papers (despite the attempted analogy to free market in Le Cun’s proposal).
Let me elaborate on (b). The following already seems to be an axiom among my younger colleagues: “If a result appears on arxiv, you have a few days to put up your independent manuscript. After that you can’t claim independent discovery.” Some other disciplines have already moved on to a more cutthroat model: “Whoever gets to arxiv first wins.”
Surely, this must incentivize hasty writing and incomprehensible papers, or maybe even papers with errors that get fixed in subsequent versions over the following weeks or months. In the past we relied on conference committees and journals to adjudicate such disputes. How would that happen in a distributed market? One can imagine systems involving feedback buttons, reliability ratings, etc., but it seems a dubious method of doing science.
An attraction of the “free market” approach at first sight is that unknown researchers can publish on equal footing with established ones. But in reality things may turn out less fair than the current conference system. Prominent researchers will have bigger megaphones (e.g., more twitter followers or friends willing to review their papers) and tend to benefit. Power centers will inevitably form in any system.
By contrast, conferences in theoretical CS —perhaps because each PC is a fresh set of 25 individuals—have a good track record of showcasing great work by grad students and postdocs, and giving best paper awards to people I —and probably many others—had never heard of before. And plenty of Turing award winners get their papers rejected.
Conferences vs Journals?
Historically, conferences came to dominate computer science because they allowed fast dissemination and a convenient place to catch up on the latest research/gossip. Today, both goals can increasingly be met by other means, so can we still justify conferences? It is interesting that Fortnow and Le Cun, despite being on opposite ends of the spectrum, agree about the irrelevance of conferences.
Let’s now list factors why conferences still make a lot of sense. My focus here is on promoting better science — I worry less about promotion/tenure policies since they will quickly adjust to accommodate any new dissemination method we choose (including arxiv and twitter). Also, I apologize in advance for occasional forays into pop psychology.
(a) Incentive system for good researchers to (sort of) review lots of papers.
There simply isn’t enough refereeing capacity to properly referee all the papers that get written. When fields rely solely on journals (e.g., economics), backlogs can make it difficult for young researchers to get published; many have no publications when they finish their PhD. Also, acceptance rates of 5% force editors to be risk-averse.
The PC review system in theoretical CS is not perfect, but better than those in many other fields. The social pressure of a face-to-face PC meeting seems to make members take their reviewing quite seriously.
Also, many researchers seem happier to serve on a STOC/FOCS PC once every 3-4 years rather than on a journal board for 3-4 years. Perhaps humans prefer shorter but more intense pain to a longer and less intense one. Or perhaps journal boards are less interesting because you end up handling papers in your own speciality, including those you already saw 2+ years ago.
(b) Incentive system for researchers to produce a substantial piece of work, and then write it up, sort of comprehensibly, in 10 pages.
The incentives in the arxiv model are quite the opposite: more frequent, insubstantial, and hastily written works.
The 10-page limit (an archaic relic of the papyrus era) and the PC model have led to our tradition of writing papers that are sort-of comprehensible to nonspecialists.
(c) Clearing point for deciding upon priority, novelty, correctness etc. of claimed results.
Conferences can do it faster and better than journals in most cases, at least under current rules (a jury of 20-25 PC members versus a jury of one editor and 2-3 random reviewers). The informal refereeing system at conferences at first glance seems to invite abuse but I can think of very few accepted papers at STOC/FOCS in the last 30 years that turned out to be very flawed (and often those were recognized as controversial when accepted).
(d) A stamp of authority, or a recommendation if you will.
We increasingly need this guidance from conference PCs to stay afloat in the sea of new papers, especially outside our sub-specialities. That is why I go to STOC/FOCS these days, not for social networking (which is a nice bonus though). I could stay at home and watch videotaped talks but, really, who does that?
(e) A synchronization mechanism for our field.
Is it just my imagination, or do conference deadlines actually enhance collaborations and improve productivity/creativity? Half-imagined results get fleshed out as people get together in the months or weeks before the deadline (and I am not referring to caffeine-fueled late-night finishes, which I avoid). We need this synchronization to structure our busy lives, and neither arxiv nor journals provide it. If you don’t care for the human weaknesses this argument stands on, I should mention Boaz Barak’s alternative explanation: sometimes correlated equilibria are superior to Nash equilibria.
Proposals to improve conferences in theoretical CS
The ongoing experimentation—a day of workshops, poster session, recorded talks, no paper proceedings, better feedback from PC—has been good. Here are my thoughts for further improvement.
- Keep the conference format (say 12 pages, 11pt). But to reduce work and give an incentive to produce a readable version, make the submission format identical to the published format. (I admit to having done my share of grumbling about the conference format, but on balance it is important for our field that 50-page arxived papers should be accompanied by shorter, more readable, versions. If you think 12 pages is too few, try vying for the privilege of publishing your result in Science or Nature, in 2-4 pages!)
- Increase the number of acceptances moderately. (Beware though of Parkinson’s law: submissions increase to fill all available refereeing capacity. So don’t agonize if acceptance rates stay below 30% despite this increase.) But reduce the number of talks. (In other words, not all accepted papers are treated equally by the PC.) Have more plenary talks.
- To combine some of the speed of arxiv with reliable timestamping, have two submission deadlines a year —papers appear on the website as soon as they are accepted in the first cycle, and papers rejected in the first cycle cannot be resubmitted for another year. Variants of this model have been tested in other fields (databases, ML). This spreads out a PC’s work over a longer period, which has its pluses and minuses.
- It would be nice —perhaps independent of conferences—to have a forum for posting reviews/comments on theory papers. (Hints of Le Cun’s ideas here.) To be useful we must avoid the vicious smallness of blog comments. Requiring posters to use verifiable identities should preclude the worst abuses (the system only needs to scale to a couple thousand users).
- Last but most important: keep the various points made in this article (or any other set of principles discussed and agreed upon collectively) in mind when proposing new changes.
Despite its small size, theoretical CS has been remarkably successful. An incredible edifice of ideas was created together with an open culture that values the need to address papers and talks to nonspecialists. This allows ideas and techniques to jump rapidly across subspecialities. Theory conferences played an important role in creating that culture, and we should think hard about maintaining their good elements in the digital era.
To finish, I must admit that when I started this thought process (and discussions with colleagues) I was somewhat skeptical of conferences but ended up strongly in favor. I have decided to be more willing to combat the cynicism I often see in such discussions; hence this blog post.
People’s views tend to be colored by their last conference rejection. Typically, senior people fume about youngsters who value technical sophistication over conceptual contributions. Young researchers in turn feel anxious about being judged by a power structure that they don’t fully understand or feel part of. Such anxieties have existed since prehistoric times; there is no way to do research and not have it be misjudged at times. Cynicism is not a good response.
“Time for computer science to grow up” by Lance Fortnow.
“New publishing model for computer science” by Yann Le Cun.
Some comic relief (article from 1967): The future of scientific journals. A computer-based system will enable a subscriber to receive a personalized stream of papers.
(Acknowledgements: Useful discussions in recent weeks with all my Princeton colleagues, Moses, Bernard, Avi, Mark, and Zeev, and with ex-Princetonians Boaz Barak and Ankur Moitra.)