Deep linear networks can benignly overfit when shallow ones do