Long paper review scores across areas:
| Area | Avg. before response | Avg. after response | Min | Max |
| --- | --- | --- | --- | --- |
| Speech | 3.9 | 3.75 | 2.5 | 5 |
| Vision, Robotics and Other Grounding | 3.83 | 3.65 | 2 | 5 |
| Tagging, Chunking, Syntax and Parsing | 3.55 | 3.56 | 1.67 | 5 |
| Machine Translation | 3.37 | 3.34 | 1.33 | 4.67 |
| Theory and Formalisms | 3.37 | 3.38 | 2 | 4.67 |
| Dialogue and Interactive Systems | 3.36 | 3.41 | 2 | 5 |
| NLP Applications | 3.36 | 3.33 | 1.61 | 4.96 |
| Information Extraction | 3.35 | 3.54 | 1.67 | 5 |
| Sentiment Analysis | 3.33 | 3.27 | 1.5 | 5 |
| Social Media and Computational Social Science | 3.33 | 3.32 | 1.33 | 4.67 |
| Summarization | 3.33 | 3.40 | 2 | 4.67 |
| Machine Learning for NLP | 3.31 | 3.41 | 1.33 | 5 |
| Generation | 3.3 | 3.41 | 1.67 | 4.67 |
| Discourse and Pragmatics | 3.27 | 3.60 | 1.33 | 4.67 |
| Phonology, Morphology and Word Segmentation | 3.26 | 3.24 | 1.33 | 5 |
| Semantics | 3.25 | 3.27 | 1.44 | 5.58 |
| Cognitive Modeling and Psycholinguistics | 3.24 | 3.86 | 1 | 4.67 |
| Question Answering | 3.16 | 3.00 | 1.67 | 4.33 |
| Text Mining | 3.05 | 3.08 | 1.38 | 4.88 |
Heng has mainly been working in the IE area and had always thought that IE reviewers are harsh; for example, they rarely nominate papers from the IE area for awards. The table above changed her impression for the better.
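The per-area shifts can be recomputed directly from the table above. A minimal sketch, transcribing only a subset of the areas for brevity (the full table would work the same way):

```python
# Average long-paper review scores (before response, after response),
# transcribed from a subset of rows in the table above.
scores = {
    "Information Extraction": (3.35, 3.54),
    "Discourse and Pragmatics": (3.27, 3.60),
    "Cognitive Modeling and Psycholinguistics": (3.24, 3.86),
    "Question Answering": (3.16, 3.00),
}

# Change in average score after author response, per area.
deltas = {area: round(after - before, 2)
          for area, (before, after) in scores.items()}

# Areas sorted by how much author responses helped.
for area, d in sorted(deltas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{area}: {d:+.2f}")
```

On these rows, Cognitive Modeling gains the most (+0.62) and IE also moves up (+0.19), consistent with the positive impression described above.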
Long paper review scores comparison across years:
| Score | NAACL-HLT 2013 (Daumé, 2013) | NAACL-HLT 2018 |
| --- | --- | --- |
| 1 | 1% | 4.1% |
| 2 | 17% | 20.4% |
| 3 | 30% | 25.1% |
| 4 | 44% | 35.4% |
| 5 | 7% | 15.0% |
| 6 | – | 0% |
From these scores, it looks like reviews are harsher than those from five years ago. However, we have a much larger and younger reviewer pool this year.
Did Author Response Help?
| Score | Before Response | After Response |
| --- | --- | --- |
| 1 | 4.3% | 4.1% |
| 2 | 22.9% | 20.4% |
| 3 | 21.6% | 25.1% |
| 4 | 36.5% | 35.4% |
| 5 | 14.7% | 15.0% |
| 6 | 0.11% | 0% |
From the changes in the score distributions, we can see that more reviews moved to the medium score of 3. In total, 38 reviews increased their scores and 30 reviews decreased them.
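As a sanity check, the overall average score can be recomputed from these distributions. A minimal sketch; note the published "before" percentages sum to slightly more than 100 due to rounding, so we normalize by the total weight:

```python
# Score distributions (percent) before and after author response,
# transcribed from the table above.
before = {1: 4.3, 2: 22.9, 3: 21.6, 4: 36.5, 5: 14.7, 6: 0.11}
after = {1: 4.1, 2: 20.4, 3: 25.1, 4: 35.4, 5: 15.0, 6: 0.0}

def weighted_mean(dist):
    """Mean score, normalized by total weight (the rounded percentages
    do not sum to exactly 100)."""
    total = sum(dist.values())
    return sum(score * pct for score, pct in dist.items()) / total

mean_before = weighted_mean(before)  # ~3.35
mean_after = weighted_mean(after)    # ~3.37
```

The tiny upward shift (about +0.02 overall) is consistent with the observation that author responses changed relatively few scores.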
Generally speaking, reviews were harsh.
Very few papers got Best Paper nominations from reviewers, while area chairs identified some excellent submissions for nominations.
Some reviews are too generic, e.g., “the method is more complicated than previous methods” (without naming the methods referred to), or “I really like the paper” (without explaining its merits). The PC chairs and area chairs urged these reviewers to refine their comments to make them more informative and constructive.
We could all be nicer. Authors really don’t have to criticize all previous papers to make their own ideas stand out; reviewers really don’t have to give harsh comments just because the authors did not cite the reviewers’ own (sometimes quite irrelevant) papers. :-)
The Semantics track has a max score of 5.58, but 0% of reviews gave a 6 after author response. I’m just wondering whether that is due to a rounding error?
The first table shows scores before author response/review discussion. I tried to make it clearer. Would you rather see a table that includes scores after response instead? Thanks.
Could we also have average scores after author’s response?
Added. Please advise on any other type of analysis you would like to see. :) Thanks.
Would it be possible to see a table that includes scores (max) after response?
I have not received any “author response” for my paper! Why?
If you are talking about a paper you are reviewing, that means the authors of the paper did not submit a response. If you are talking about a paper you submitted as an author, you submit the author response yourself (rather than receiving one).
Thanks.
When will we get the notifications?
In a couple of hours today. :)
The review form was a pain. I dreaded filling it out, and I had a very difficult time discerning the intent of the reviewers during the rebuttal period. I hope this form is never used again.
The feedback has been summarized in the general chair’s blog: https://naacl2018.wordpress.com/2018/02/03/new-review-form-draws-widely-varying-opinions/ No further discussion is needed here. We will conduct a comprehensive survey after the short-paper review process is done to formally assess the new review form.