Neural Clustering based Visual Representation Learning