I have a few questions after reading your code alongside the paper.
- The conf head in your paper uses a 1x1 kernel size, but in the code they are all 1x5.


- For the prior boxes, does only the first feature layer have a 72-d output with 12 prior boxes, while the other layers have a 78-d output with 14 prior boxes?